AST discussion #12
Comments
Since mathsteps is educational in nature, what are your thoughts on a good way to add metadata to the nodes to support visually identifying various parts of a tree, like what is shown in the following examples?
This is a great write-up! I appreciate defining relations, programs and sequences, which essentially encapsulates many more types of mathematical statements than we currently have. Should the value of a Number node be a number or a string? In the first example, there are both:
Although mathjs also uses string values, I'm curious what the reasoning is for using a string over just a number?
@tkosan not sure if you already knew this, but what we currently do is add a changeGroup attribute to the nodes that have changed during certain steps to accomplish that.
@tkosan the plan is to use the same
@aelnaiem the reason for using a string is so that we can represent decimal numbers with the correct precision, e.g.
Right, that makes sense!
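To illustrate the precision point from the exchange above, here is a minimal sketch. The node shape (`type`, `value`) and the helper `decimalPlaces` are assumptions made for illustration, not taken from the spec. It only shows that a JavaScript number forgets how the decimal was written, while a string keeps it.

```javascript
// Hypothetical node shape: `type` and `value` are illustrative field names.
const source = '1.50';
const asNumber = { type: 'Number', value: Number(source) };
const asString = { type: 'Number', value: source };

// Count the digits written after the decimal point.
function decimalPlaces(node) {
  const text = String(node.value);
  const dot = text.indexOf('.');
  return dot === -1 ? 0 : text.length - dot - 1;
}

console.log(decimalPlaces(asNumber)); // 1, the trailing zero is gone
console.log(decimalPlaces(asString)); // 2, the input is kept as typed

// Inputs with more digits than a double can hold do not round-trip either.
const long = '0.1234567890123456789';
console.log(String(Number(long)) === long); // false
```

Keeping the string means a later step can still decide how many digits to display or hand the text to an arbitrary-precision library.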
@kevinbarabash, what are your thoughts on adding support for parsing input that contains LaTeX? A few years ago I obtained a copy of a math parser from the Khan Academy khan-exercises repository on GitHub that included support for parsing LaTeX. It is present near the bottom of the following JavaScript program if you would like to look at it:
@tkosan the khan-exercises parser has been extracted into a sub-library called KAS which is still used by KA. I was thinking of having different settings for the parser that produce behavior compatible with mathjs, KAS, asciimath, etc. KAS treats
@kevinbarabash the parser I copied is the one KA used before they switched to the KAS parser. I found that the KAS parser was not suitable for the kind of step-by-step equation solving I was implementing (but I don't remember why). I like the idea of math-parser having different settings. A setting I would like is one that produces trees that are not flattened. On the LaTeX issue, my thought is that support for even a limited number of LaTeX commands would be sufficient for parsing the problems that Ahmed posted. I have been studying the math-parser code, and it is beautifully written! I am looking forward to having it in mathsteps.
@tkosan I had a look at the file that @aelnaiem posted. Here's the list of all of the LaTeX commands being used:
As the capabilities of
Regarding non-flattened trees, is the main use case generating binary expression trees to show computation order?
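For reference, here is a rough sketch of what turning a flattened tree into a binary one could look like. The node shape (`type`, `op`, `args`) and the function name `unflatten` are assumptions for illustration, not math-parser's actual API. It nests operands to the left, which only matches evaluation order for left-associative operators such as + and *.

```javascript
// Hypothetical node shape: { type, op, args }. These field names are assumed.
function unflatten(node) {
  // Leaves (identifiers, numbers) have no operands to regroup.
  if (!Array.isArray(node.args)) return node;
  const args = node.args.map(unflatten);
  // Unary and binary operations, and non-operation nodes, keep their shape.
  if (node.type !== 'Operation' || args.length <= 2) return { ...node, args };
  // Fold n operands into left-nested binary nodes: ((a + b) + c) + ...
  return args.slice(1).reduce(
    (left, right) => ({ type: 'Operation', op: node.op, args: [left, right] }),
    args[0]
  );
}

const id = (name) => ({ type: 'Identifier', name });
const flat = { type: 'Operation', op: '+', args: [id('a'), id('b'), id('c')] };
const nested = unflatten(flat);

console.log(JSON.stringify(nested));
```

With the nested form, each node corresponds to exactly one computation step, which is what a step-by-step display would need.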
I've opened #14 for limited LaTeX support. For further discussion about the AST we want to produce, please see math-ast/blob/master/spec.md and file issues at math-ast/issues.
An AST can be produced from any input that is syntactically valid. Operations and functions may (or may not) be semantically valid depending on the operands. Moreover, the operands may affect the meaning and/or the properties of an operation or function, e.g. a * b commutes if a and b are scalars, whereas A * B does not, in general, commute if A and B are matrices.
Semantic knowledge should be kept out of the AST, but can be used with an AST for purposes of validating, evaluating, and transforming a mathematical statement.
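As a sketch of keeping semantic knowledge beside the AST rather than inside it: the lookup table `operandKinds`, the function `canReorderProduct`, and the node fields below are all hypothetical names chosen for illustration. The AST for a * b and A * B has the same shape, and only the external table decides whether the operands may be swapped.

```javascript
// Hypothetical semantic table kept outside the AST: identifier name -> kind.
const operandKinds = { a: 'scalar', b: 'scalar', A: 'matrix', B: 'matrix' };

// Returns true only when every operand of a "*" node is known to be a scalar,
// so reordering the operands is safe. Anything else is treated conservatively.
function canReorderProduct(node, kindOf) {
  if (node.type !== 'Operation' || node.op !== '*') return false;
  return node.args.every(
    (arg) =>
      arg.type === 'Number' ||
      (arg.type === 'Identifier' && kindOf[arg.name] === 'scalar')
  );
}

// Both products have identical structure; only the names differ.
const product = (x, y) => ({
  type: 'Operation',
  op: '*',
  args: [
    { type: 'Identifier', name: x },
    { type: 'Identifier', name: y },
  ],
});

console.log(canReorderProduct(product('a', 'b'), operandKinds)); // true
console.log(canReorderProduct(product('A', 'B'), operandKinds)); // false
```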
A mathematical statement could be an expression, equation, or definition.
Multiple mathematical statements can be grouped into a linear sequence to form a computation, manipulation, or proof. These specific types of sequences do not need to be explicitly modeled; there only needs to be a way to describe a sequence of statements. Let's call this a "Program" for lack of a better term.
Some design goals for the nodes:
Program is a sequence of statements, each of which can be any other node.
"Program"
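A Program holding two steps of a manipulation might look roughly like this. Every field name here (`type`, `body`, `rel`, `op`, `args`, `name`, `value`) is an assumption for illustration, since the description above only fixes the "Program" name and the idea of an ordered list of statements.

```javascript
// Hypothetical shape: `body` and the other field names are assumed, not from the spec.
const program = {
  type: 'Program',
  body: [
    // Statement 1: 2 * x = 4
    {
      type: 'Relation',
      rel: '=',
      args: [
        {
          type: 'Operation',
          op: '*',
          args: [
            { type: 'Number', value: '2' },
            { type: 'Identifier', name: 'x' },
          ],
        },
        { type: 'Number', value: '4' },
      ],
    },
    // Statement 2: x = 2
    {
      type: 'Relation',
      rel: '=',
      args: [
        { type: 'Identifier', name: 'x' },
        { type: 'Number', value: '2' },
      ],
    },
  ],
};

// Walk the statements in order, as a step-by-step display might.
const labels = program.body.map((statement, i) => `${i + 1}: ${statement.type}`);
console.log(labels.join('\n'));
```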
Operation handles unary, binary, and n-ary operations. Unary minus is used to represent negation.
"Operation"
"+", "*", "/", "\u00b7", ^, etc.
Relation or whatever we call a sequence

Relation is used for equivalence relations, e.g. =, <, <=, etc., set relations, e.g. "is a subset of", and any other relations that might come up.
"Relation"
=, <, <=, etc. (TODO: figure out subset)
Statement or

Identifier stores information about variables, constants, and function names. The reason not to have separate nodes for Variable and Constant is that an identifier may be either depending on the context.
"Identifier"
"x", "pi", "atan2", etc.
null, Identifier, Number, or Sequence (sequence is useful for matrix indices)

Function can be used to represent either a function definition or function application.
"Function" (can be refined to "FunctionDefinition" or "FunctionApplication")
Identifier
Relation or Program.

Number
"Number"
Sequence is a comma separated sequence of non-program nodes.
"Sequence"
A Sequence of Relations could represent a system of equations; a sequence of numbers could represent an ordered pair or a vector.

Brackets encompasses parentheses as well as other related symbols. Can be used for standard parentheses, open/closed/half-open intervals, ordered tuples, sets, etc.
"Brackets"
"(", "[", "{", etc.
")", "]", "}", etc.

Weird edge cases:
^-1 for function, trigonometric function, and matrix inverses
^T for transpose

Stuff that hasn't been described but should be at some point:
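To make the node descriptions above more concrete, here is a small sketch that builds a few of them as plain objects and prints them back out. The field names (`op`, `rel`, `args`, `name`, `subscript`, `fn`, `value`, `items`) and the helper functions are assumptions for illustration. Only the type strings ("Operation", "Relation", "Identifier", "Function", "Number", "Sequence") come from the descriptions.

```javascript
// Hypothetical constructors; all field names besides the type strings are assumed.
const ident = (name, subscript = null) => ({ type: 'Identifier', name, subscript });
const num = (value) => ({ type: 'Number', value }); // value kept as a string
const op = (symbol, ...args) => ({ type: 'Operation', op: symbol, args });
const rel = (symbol, ...args) => ({ type: 'Relation', rel: symbol, args });
const fn = (name, ...args) => ({ type: 'Function', fn: ident(name), args });
const seq = (...items) => ({ type: 'Sequence', items });

// Render a node back to text; binary and n-ary operations are parenthesized.
function print(node) {
  switch (node.type) {
    case 'Number':
      return node.value;
    case 'Identifier':
      return node.subscript ? `${node.name}_${print(node.subscript)}` : node.name;
    case 'Function':
      return `${print(node.fn)}(${node.args.map(print).join(', ')})`;
    case 'Operation':
      // A single operand means a unary operation such as negation.
      if (node.args.length === 1) return `${node.op}${print(node.args[0])}`;
      return `(${node.args.map(print).join(` ${node.op} `)})`;
    case 'Relation':
      return node.args.map(print).join(` ${node.rel} `);
    case 'Sequence':
      return node.items.map(print).join(', ');
    default:
      throw new Error(`Unknown node type: ${node.type}`);
  }
}

// x_1 + (-2) <= atan2(y, x)
const inequality = rel(
  '<=',
  op('+', ident('x', num('1')), op('-', num('2'))),
  fn('atan2', ident('y'), ident('x'))
);
console.log(print(inequality)); // (x_1 + -2) <= atan2(y, x)

// A Sequence of Relations standing in for a system of equations.
const system = seq(
  rel('=', op('+', ident('x'), ident('y')), num('3')),
  rel('=', op('-', ident('x'), ident('y')), num('1'))
);
console.log(print(system)); // (x + y) = 3, (x - y) = 1
```

Note that unary minus reuses the Operation node with a single operand, and that the same Identifier node serves for a variable (x), a subscripted variable (x_1), and a function name (atan2).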