Feature: next generation mathJSON #500

Closed
arnog opened this issue Jun 15, 2020 · 46 comments
Labels
feature Request for a new feature or functionality, as opposed to a defect in an existing feature. roadmap An issue describing a substantial, potentially backward compatibility breaking, in a future release

Comments

@arnog
Owner

arnog commented Jun 15, 2020

Introduction

mathJSON (MASTON) has been useful to represent the content of a mathfield as an Abstract Syntax Tree in a format that can be parsed and manipulated. For example, it's used on mathlive.io to power a computation engine that is used to evaluate expressions and plot them.

However, it has some limitations:

  • it is relatively verbose, even for simple expressions, and not as easy to parse as it could be
  • it is too close to the Latex syntactic conventions to be completely generalizable.
    For example, some constructs are represented in ways that are specific to how those operations are typeset: exponentiation (e.g. x^2) is represented with a sup property.
  • the syntax and semantics are closely bound, and the current implementation has no mechanism to customize either.
    For example, it would be desirable to specify how arithmetic operations are performed (using native JavaScript numbers, BigInt, a third-party numerics library, etc.).
    It would also be desirable to specify the syntactic rules used to parse Latex in order to support custom conventions, for example how to interpret fences such as ]-5, +∞) or other syntactic constructs, including specialized operators, functions and fences.

As another example from #293 \frac{d}{dx} a + b could be interpreted/parsed as:

  • if d is a known variable: "((d / (d * x)) * a) + b"
  • or it's "(a + b) derived for x"
  • or it's "(a derived for x) + b"

The 'correct' interpretation depends entirely on the context, and there is currently no way to control this.

Proposal

Therefore, we propose a new version of mathJSON that will feature the following:

  • Clear separation between syntax and semantics. In particular, the semantics will be completely independent of the Latex syntax. A "translator" from/to another syntax (MathML, ASCIIMath, etc.) could be provided.
  • The syntax will be represented by a set of rules transforming a Latex stream into a mathJSON expression. These rules can be complemented or overridden. If no syntactic rules are provided, the result is a valid mathJSON expression representing a stream of Latex tokens (i.e. ["latex", "\\frac", "{", "1", "}", "{", "2", "}"] for \frac{1}{2}). An option to include parsing of Latex commands (but not their interpretation) would result in ["latex", ["\\frac", "1", "2"]]. A default rule would specify that \frac should map to the divide function, in which case the output would be ["divide", 1, 2].
  • The semantics will be provided by a dictionary of symbols, specifying what each symbol represents (constant, variable, function), with associated methods to evaluate it, etc.
  • Default syntax and semantics will be provided for various domains (arithmetic, algebra, calculus, etc.). Only those dictionaries relevant to the application need to be loaded.
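
To make the "no syntactic rules" case concrete, here is a minimal sketch of how a Latex string could be turned into the raw token-stream expression described above. The names tokenizeLatex and toRawExpression are hypothetical and are not part of any MathLive or MathJSON API; the tokenization rule (a command, an escaped character, or a single non-space character) is an assumption made for illustration.

```typescript
// Hypothetical helper (not a MathLive API): split a Latex string into tokens.
function tokenizeLatex(latex: string): string[] {
  // A command such as \frac, an escaped character such as \{,
  // or any single non-whitespace character.
  const tokenPattern = /\\[a-zA-Z]+|\\.|[^\s]/g;
  return latex.match(tokenPattern) ?? [];
}

// With no syntactic rules applied, the expression is just the token stream
// wrapped in a "latex" function.
function toRawExpression(latex: string): string[] {
  return ["latex", ...tokenizeLatex(latex)];
}

console.log(JSON.stringify(toRawExpression("\\frac{1}{2}")));
```

A rule-based parser would then consume this token stream and, with the default rule for \frac, produce ["divide", 1, 2].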

Examples

Latex                        mathJSON
\frac{a}{1+x}                ["divide", "a", ["add", 1, "x"]]
e^{\imaginaryI \pi }+1=0     ["eq", ["add", ["power", "e", ["multiply", "i", "pi"]], 1], 0]

For comparison, that last expression was represented in the previous mathJSON version as:

{
  "fn": "equal",
  "arg": [
    {
      "fn": "add",
      "arg": [
        {
          "sym": "e",
          "sup": {
            "fn": "multiply",
            "arg": [{ "sym": "" }, { "sym": "π" }]
          }
        },
        { "num": "1" }
      ]
    },
    { "num": "0" }
  ]
}

Backward Compatibility

The new format is not backward compatible with the previous version of mathJSON. Although a "translator" between the formats could be written, we do not plan to provide one.

Related Issues

This feature will address the following related issues: #437, #396, #380, #379, #293.

@arnog arnog added feature Request for a new feature or functionality, as opposed to a defect in an existing feature. roadmap An issue describing a substantial, potentially backward compatibility breaking, in a future release labels Jun 15, 2020
@arnog arnog self-assigned this Jun 15, 2020
@arnog arnog pinned this issue Jun 15, 2020
@bengolds
Contributor

Very LISPy! Some questions that come to mind for me:

  1. How do you ensure that different implementations of MASTON don't drift apart? e.g. one program uses divide, and another uses div or frac? Is this done by having a standardized set of dictionaries? Or are users free to make their own dictionaries?
  2. What does the dictionary format look like?
  3. Are dictionaries sent along with each expression? Or is a dictionary loaded once, ahead of time?

@arnog
Owner Author

arnog commented Jun 15, 2020

Very LISPy!

Yes! They might have been onto something... As it turns out, a number of historically significant computer algebra systems were implemented in Lisp: Macsyma, Reduce, Maxima, Axiom...

How do you ensure that different implementations of MASTON don't drift apart? e.g. one program uses divide, and another uses div or frac? Is this done by having a standardized set of dictionaries? Or are users free to make their own dictionaries?

The default dictionaries should help with this. They'll cover a broad range of domains.

In addition, the expressions can be annotated with some metadata, specifically a wikidata token that should help disambiguate. If you decide to use a dictionary that defines π as 'PI', and I use the default dictionary that defines it as 'pi', both entries will have a wikidata token of 'Q167'. I've considered using OpenMath content dictionaries as well, but wikidata seems to have better coverage (and the wikidata info often includes the corresponding OpenMath ID, e.g. nums1#pi).
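
As a rough illustration of how a shared wikidata token could bridge two dictionaries that name the same concept differently, consider the sketch below. The types and the translateSymbol function are hypothetical and only mirror the 'PI' / 'pi' example above; they are not the actual dictionary API.

```typescript
// Hypothetical shapes, for illustration only.
type SymbolInfo = { wikidata?: string };
type SymbolDictionary = Record<string, SymbolInfo>;

const myDictionary: SymbolDictionary = { PI: { wikidata: "Q167" } };
const defaultDictionary: SymbolDictionary = { pi: { wikidata: "Q167" } };

// Find the name used in the target dictionary for the concept that
// `name` denotes in the source dictionary, using the wikidata token.
function translateSymbol(
  name: string,
  from: SymbolDictionary,
  to: SymbolDictionary
): string | undefined {
  const token = from[name]?.wikidata;
  if (token === undefined) return undefined;
  return Object.keys(to).find((key) => to[key].wikidata === token);
}

console.log(translateSymbol("PI", myDictionary, defaultDictionary));
```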

What does the dictionary format look like?

The 'translation' dictionary (i.e. the one that maps from/to Latex) looks something like this:

[
        { name: 'pi', trigger: { symbol: '\\pi' } },
        {
            name: 'mu-0',
            trigger: { symbol: ['\\mu', '_', '0'] },
        },
        {
            name: 'set',
            trigger: { matchfix: '\\lbrace' },
            separator: ',',
            closeFence: '\\rbrace',
        },
        {
            name: 'divide',
            trigger: { function: '\\frac' },
            emit: '\\frac',
            requiredLatexArg: 2,
        },
        {
            name: 'eq',
            trigger: { infix: '=' },
            associativity: 'right',
            precedence: 260,
        },
        {
            name: 'abs',
            trigger: { matchfix: '|' },
            precedence: 880,
            closeFence: '|',
        },
        {
            name: 'factorial',
            trigger: { postfix: '!' },
            precedence: 880,
        }
]

The 'function' dictionary looks something like this:

{
        pi: {
            wikidata: 'Q167',
            isConstant: true,
            domain: 'R+',
        },
        'mu-0': {
            isConstant: true,
            wikidata: 'Q1515261',
            domain: 'R+',
            value: 1.25663706212e-6,
            unit: ['multiply', 'H', ['power', 'm', -1]],
        },
        multiply: {
            wikidata: 'Q40276',
            isPure: true,
            isCommutative: true,
            isAssociative: true,
        },
        abs: {
            wikidata: 'Q3317982',
            isPure: true,
            isListable: true,
        },
        factorial: {
            wikidata: 'Q120976',
            isPure: true,
        },
        eq: {
            // mathematical relationship asserting that two quantities have the same value
            wikidata: 'Q842346',
            isPure: true,
            isCommutative: true,
        },
}
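
To show how flags such as isAssociative might be consumed, here is a small sketch that flattens nested applications of associative functions. The Expr type, the functions table and the flatten function are assumptions made for this example; the real engine may process these flags differently.

```typescript
// Hypothetical expression type: a number, a symbol, or [head, ...args].
type Expr = number | string | Expr[];
type FunctionInfo = { isAssociative?: boolean; isCommutative?: boolean };

// A cut-down function dictionary using the flags shown above.
const functions: Record<string, FunctionInfo> = {
  multiply: { isCommutative: true, isAssociative: true },
  power: {},
};

// Merge nested calls of an associative function into a single n-ary call.
function flatten(expr: Expr): Expr {
  if (!Array.isArray(expr) || expr.length === 0) return expr;
  const [head, ...rawArgs] = expr;
  const args = rawArgs.map((arg) => flatten(arg));
  if (typeof head !== "string" || !functions[head]?.isAssociative) {
    return [head, ...args];
  }
  const flat: Expr[] = [];
  for (const arg of args) {
    if (Array.isArray(arg) && arg[0] === head) flat.push(...arg.slice(1));
    else flat.push(arg);
  }
  return [head, ...flat];
}

console.log(JSON.stringify(flatten(["multiply", 2, ["multiply", "x", "y"]])));
```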

Are dictionaries sent along with each expression? Or is a dictionary loaded once, ahead of time?

No, the dictionaries are not included in the expression. The wikidata tokens should be sufficient to do the appropriate mapping, but a reference to the dictionary could be included as well (i.e. a URL pointing to the definition of the dictionary).

@rmeniche

Hello,

We appreciate this change as it seems to be better for our project and will fix some bugs.

Thanks for this !

@arnog
Owner Author

arnog commented Jun 21, 2020

Update:

The work is in progress, with the core functionality implemented.

This includes support for several "forms", including a canonical form that transforms expressions so they are written as sums of products, with sorted arguments (for commutative functions) and using a lexdeg sort order for polynomials.

What's left to do:

  • handling of integrals, generalized derivatives, sums and products
  • handling of environments (matrix, cases, etc...)
  • propagation of optional Latex/wikidata metadata
  • write more test cases
  • switch over to the new MathJSON in MathLive
  • update mathlive.io (which currently depends on parsing the old format)
  • complete Latex bindings (i.e. the intent is that each of the 800+ Latex commands that MathLive knows about will have a default binding to MathJSON, and from MathJSON to Latex)

Once this lands, it will be able to handle some pretty gnarly notations that were difficult/impossible to handle before, for example:

\sin^{-1}\prime x -> [["derivative", 1, ["inverse-function", "sin"]], "x"]

@arnog
Owner Author

arnog commented Jun 24, 2020

The WIP code has been committed, but it's not hooked up to anything yet, so the old MathJSON implementation is still used.

Coming soon:

  • finish the full latex bindings
  • propagation of optional Latex/wikidata metadata
  • actually hooking up the next gen MathJSON to the MathLive APIs

@BChip

BChip commented Jul 1, 2020

@arnog Not sure if this is related. However, is there a way to customize the output of MathLive? For example, instead of outputting as 2^2, it output as Pow(2,2)?

@arnog
Owner Author

arnog commented Jul 1, 2020

@BChip Yes! Using the new MathJSON output you would get ["Power", 2, 2] as a JSON data structure that you can then easily transform into "Pow(2,2)" or whatever custom syntax you'd prefer.
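
A transformation of that kind could look roughly like the sketch below. It is not part of MathLive: the Expr type, the functionNames renaming table and the toCallSyntax function are all hypothetical, and only the ["Power", 2, 2] to "Pow(2,2)" mapping comes from the discussion above.

```typescript
// Hypothetical expression type: a number, a symbol, or [head, ...args].
type Expr = number | string | Expr[];

// Assumed renaming table: MathJSON function name -> custom name.
const functionNames: Record<string, string> = { Power: "Pow" };

// Serialize an expression as nested function calls, e.g. Pow(2,2).
function toCallSyntax(expr: Expr): string {
  if (typeof expr === "number") return String(expr);
  if (typeof expr === "string") return expr;
  const [head, ...args] = expr;
  const name =
    typeof head === "string"
      ? functionNames[head] ?? head
      : `(${toCallSyntax(head)})`;
  return `${name}(${args.map((arg) => toCallSyntax(arg)).join(",")})`;
}

console.log(toCallSyntax(["Power", 2, 2]));
```

Functions without an entry in the renaming table keep their MathJSON name, so ["Add", 1, ["Power", "x", 2]] would serialize as Add(1,Pow(x,2)).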

@stefnotch
Contributor

stefnotch commented Jul 10, 2020

The syntax will be represented by a set of rules transforming a Latex stream into a mathJSON expression. These rules can be complemented or overridden

Does this also mean that it will be possible to override symbols like =?

For what it's worth, my use-case is implementing a custom equals sign that contains the result of an expression. Here, the equals sign and the placeholder should be a single, customized symbol.
[image]

Furthermore, will it be possible to set captureSelection for our symbols?

In a similar vein, will it be possible to turn off captureSelection, especially for functions (e.g. so that when you type sin(3x), you can delete individual characters from sin)?

@arnog
Owner Author

arnog commented Jul 11, 2020

It will be possible to customize the output generated for MathJSON. So, you could indicate that the Latex input 3+3=\placeholder{} generates ["MyEqualSign", ["Add", 3, 3]] instead of the default output ["Equal", ["Add", 3, 3], "Missing"].

However, I'm not sure if that's exactly what you're looking for. It sounds like you might need to customize the input. You might be able to do what you are looking for by providing an inline shortcuts dictionary (with the inlineShortcuts config property). In your shortcut definitions, you could have a "sin" shortcut that generates the Latex sin instead of \sin, although it will then display in italics, not roman.

@NSoiffer
Collaborator

I'm a little late to the game and you have likely already thought about cases like these, but I list them here just in case...

  1. Any thought to recovering the original presentation? It seems like this is something divorced from mathJSON, but keeping both semantics and presentation seems to keep coming up in MathML. We are moving towards a solution to "easily" combine both forms in Presentation MathML via a "semantics" attr that points off to the children that are the key to the semantics.

  2. nary operators -- will you support these or will they be turned into binary operators. In particular:

    1. a+b+c -- does this become ["plus", "a", "b", "c"] or ["plus", "a", ["plus", "b", "c"]]
    2. a+b-c+d -- does this become ["plus", "a", "b", ["minus", "c"], "d"] (or maybe ["times", -1, "c"] instead of "minus")
    3. a = b < c = d -- does this magically get conjunctions added ["and", ["eq", "a", "b"], ["lessthan", "b", "c"], ...] or is it ["relational-ops", "a", "=", "b", "<", ...]
  3. function composition

    1. (f ∘g)x
    2. sin sin^{-1} x
  4. integral

    1. \int \frac{dx}{x} -- ["integral", ["div", 1, "x"], "x"] -- the interesting thing here is numerator needs to be "1" once the "dx" is pulled out. Just a detail to remember.
    2. double integrals -- do these turn into two (nested) integrals? Could be tricky for limits when the limits aren't separated. Also, it can be tricky figuring out the inner/outer variables of integration.
  5. higher order partial differential equations \frac{\partial^3}{\partial x^2 \partial y }f(x,y) -- nothing deep here, just need to keep in mind that there might be multiple bound variables. Total derivatives should "naturally" extend to partials

How smart are your rules for going from syntax to semantics going to be? E.g., you have a matchfix rule for '| ... |' which becomes absolute value. But if the contents are a table, then that's not right, and it's probably not right for a capital letter as in '|M|'. There are lots of other cases like this, so maybe there needs to be some pattern matching and ordering for your rules if that's not there already.

@arnog
Owner Author

arnog commented Jul 13, 2020

@NSoiffer Good questions!

Any thought to recovering the original presentation?

Yes: there is an option to attach an attribute to an expression that corresponds to the input:

{ "latex": "\\frac{x}{12}", "fn": [ "Divide", { "latex": "x", "sym": "x"}, { "num": "12", "latex": "12" } ] }

nary operators -- will you support these or will they be turned into binary operators.

It all depends on how the operators are defined, since the precise definition of the operators can be overridden. If an operator is defined as associative, an n-ary form will be generated. In the default dictionary, "Add" is an associative operator.

In particular:

a+b+c -- does this become ["plus", "a", "b", "c"] or ["plus", "a", ["plus", "b", "c"]]
[ "Add", "a", "b", "c" ]

a+b-c+d -- does this become ["plus", "a", "b", ["minus", "c"], "d"] (or maybe ["times", -1, "c"] instead of "minus")

There will be several "forms" supported. A "form" is a transformation applied to an expression, usually with the intent to use a canonical representation for easier processing. The "Full Form" is a form that does not apply transformations, and is therefore closer to the input. The "Canonical Form" applies more simplifications. In particular, the canonical form transforms subtraction into additions, and divisions into multiplications.

Canonical Form: [ "Add", [ "Multiply", -1, "c" ], "a", "b", "d"]
Full Form: [ "Add", "a", ["Subtract", "b", "c"], "d"]
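
As a sketch of one step of such a transformation, the function below rewrites Subtract as Add with a Multiply by -1 and then flattens nested Add calls. It is an illustration under assumptions, not the library's canonical form: the name toSumForm is hypothetical, and unlike the Canonical Form shown above it does not sort the arguments.

```typescript
// Hypothetical expression type: a number, a symbol, or [head, ...args].
type Expr = number | string | Expr[];

// Rewrite ["Subtract", a, b] as ["Add", a, ["Multiply", -1, b]],
// then merge nested "Add" calls. Arguments are NOT sorted here.
function toSumForm(expr: Expr): Expr {
  if (!Array.isArray(expr)) return expr;
  const [head, ...args] = expr.map((item) => toSumForm(item));
  const rewritten: Expr[] =
    head === "Subtract" && args.length === 2
      ? ["Add", args[0], ["Multiply", -1, args[1]]]
      : [head, ...args];
  if (rewritten[0] !== "Add") return rewritten;
  const flat: Expr[] = [];
  for (const arg of rewritten.slice(1)) {
    if (Array.isArray(arg) && arg[0] === "Add") flat.push(...arg.slice(1));
    else flat.push(arg);
  }
  return ["Add", ...flat];
}

console.log(JSON.stringify(toSumForm(["Add", "a", ["Subtract", "b", "c"], "d"])));
```

Applied to the Full Form above, this yields an Add with the terms a, b, -1·c and d in their original order; a sorting pass would still be needed to reach the ordering shown for the Canonical Form.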

a = b < c = d -- does this magically get conjunctions added ["and", ["eq", "a", "b"], ["lessthan", "b", "c"], ...] or is it ["relational-ops", "a", "=", "b", "<", ...]

The relational operators are currently non-associative, so this would result in a syntax error, but it could make sense to have them do the first option.

function composition

(f ∘g)x
sin sin^{-1} x

Yes, this is supported. The "head" of a function can itself be an expression, so:
[ "Sin", [ [ "InverseFunction", "Sin" ], "x" ]]

integral

\int \frac{dx}{x} -- ["integral", ["div", 1, "x"], "x"] -- the interesting thing here is numerator needs to be "1" once the "dx" is pulled out. Just a detail to remember.

Yes.

double integrals -- do these turn into two (nested) integrals? Could be tricky for limits when the limits aren't separated. Also, it can be tricky figuring out the inner/outer variables of integration.

Yes, double integrals become two nested integrals.

higher order partial differential equations \frac{\partial^3}{\partial x^2 \partial y }f(x,y) -- nothing deep here, just need to keep in mind that there might be multiple bound variables. Total derivatives should "naturally" extend to partials

Yes, this will be supported as well by the default dictionary.

How smart are your rules for going from syntax to semantics going to be? E.g., you have a matchfix rule for '| ... |' which becomes absolute value. But if the contents are a table, then that's not right, and it's probably not right for a capital letter as in '|M|'. There are lots of other cases like this, so maybe there needs to be some pattern matching and ordering for your rules if that's not there already.

Yes, the rules are ordered and they can match on patterns as well (including checking the domain of arguments, etc.).

@NSoiffer
Collaborator

Great! You've put a lot of thought into this and it sounds like it will be very powerful.

@stefnotch
Contributor

stefnotch commented Aug 18, 2020

I've been taking a sneak peek at the upcoming features and wanted to ask if it will be possible to make "non-greedy" postfix operators.

Example:
I wanted a postfix = so that typing the following becomes valid

a=

Adding that isn't an issue. However, it means that

a=3

will be parsed as a= (postfix equals) multiplied by 3. But maybe I've just been doing something wrong. Or maybe there is another, more reasonable approach?

@arnog
Owner Author

arnog commented Aug 18, 2020

What would you like a=3 to be interpreted as?

(As an aside, it is possible to customize the parsing by providing a parse function associated with the symbol. This would give you great latitude in how the symbol is interpreted. But I'm not sure this is necessary, depending on what you expect to get from a=3.)

@stefnotch
Contributor

stefnotch commented Aug 19, 2020

Ideally, I would like it to get interpreted as ["Equals", "a", "3"].

Edit:
I came up with this solution that covers my use case. It basically overrides the parse function of the built-in equal and lets it handle the case where there isn't a right hand side. I hope I got it down correctly.

// Yes, I know, this is probably quite a bit of a hack
// Override the parse function of the built-in Equal entry so that a missing
// right-hand side is accepted. 260 is the precedence of Equal.
dictionary["inequalities"].find(v => v.name === "Equal").parse = function (lhs, scanner, minPrec, _latex) {
    if (260 < minPrec) return [lhs, null];
    const rhs = scanner.matchExpression(260);
    if (rhs == null) return [null, ["Equal", lhs]];
    return [null, ["Equal", lhs, rhs]];
};

@arnog
Owner Author

arnog commented Aug 19, 2020

OK, yeah, this can kinda work.
A couple of things:

  • The "Equal" function should have two arguments, so you should probably use something like:
          if (rhs == null) return [null, ["Equal", lhs, "Missing"]];
  • Instead of brute force overriding the default definition of Equal, you should specify a custom dictionary with a new definition for Equal (although I believe the overriding is currently broken, so what you have right now is OK)
  • You can define both a postfix and an infix version as a trigger. I think that if you do that it would do what you want (although you may still need a custom parse function to generate the desired output, but it would avoid triggering an error notification (for a missing argument), which I believe with your solution as is would still happen)

But I'm also curious as to why you're trying to do this. Would "a=" really be a valid expression, or are you trying to handle syntax errors in the input? Trying to figure out if the default definitions shouldn't handle your use case.

@stefnotch
Contributor

Oh, thank you for pointing those things out!

Regarding defining both a postfix and an infix trigger, I actually did try that, but no matter what I did, the postfix trigger always got triggered. a=b would result in ["Multiply",["Equal","a"],"b"].
The code that I tried out

{
        name: 'Equal',
        trigger: { infix: '=', postfix: '=' },
        associativity: 'right',
        precedence: 260,
    },

Now, as to why I'm doing this: no, a= would not be a valid expression. Rather, the idea is to have a little computer algebra system running in the background that would then realize that an argument is missing and calculate the result. A more realistic example would be 3*4-e^7=, with the computer algebra system taking the corresponding MathJSON and inserting the result after the = sign.

@arnog
Owner Author

arnog commented Aug 19, 2020

OK, that makes sense. So, yeah, in that case it would probably make sense for the default dictionary to return ["Equal", "a", "Missing"] in this situation. I'll look into it. Thanks for the feedback!

@arnog
Owner Author

arnog commented Aug 19, 2020

Right now when a missing operand is encountered (as in a=) an 'expected-operand' error is signaled.

This behavior could be enhanced with a missingOperand property on the dictionary entry for Equal that could be set to a substitution symbol, for example Missing (which is a standard symbol that gets translated to Latex \placeholder{} and rendered accordingly).

There could also be a global missingOperand option for the scanner that would work similarly but would apply to all dictionary entries. This means that a+ would produce ["Add", "a", "Missing"] which is perhaps less desirable.

So, an open question right now is whether there should be (1) a per-dictionary entry option, (2) a global option to control the behavior of missing operands, or (3) both.

Another question is should the default dictionary entries produce ["Equal", "a", "Missing"] for a= or throw an error. I'm leaning towards throwing an error because otherwise the default behavior is not round-tripable (i.e. rendering to Latex would not generate the same as the input), but of course this default behavior could be changed using either the global or per-entry missingOperand option.

Any thoughts?

@NSoiffer
Collaborator

I like having the option for it not to be an error. The example of some-expr = and being able to evaluate that is compelling and exists today in some browsers when you type that in the addr bar for linear exprs. The case for doing it for other operations is maybe less compelling, but an (overly simplified) example would be 3+ =7 and having an equation solver figure out the solution. In common computation systems, you would write something like solve(3+x=7, x), or maybe just solve(3+x=7), but having the option of not having to specify a variable and essentially it being a blank/placeholder could be useful.

@stefnotch
Contributor

stefnotch commented Aug 19, 2020

I also like the option for it to not be an error (by default it should probably be an error). Another symbol where this would be useful would be an interval symbol. For example, 1..10 would be an integer interval from 1 to 10, and 1.. would be from 1 to Infinity.

However, it should definitely be customizable per symbol, otherwise catching errors such as 3*/7 would become impossible.

@arnog
Owner Author

arnog commented Aug 19, 2020

OK, so right now I'm leaning towards:

  • have a missingOperand attribute for each dictionary entry. Its default value would be undefined, which would indicate that an error should be signaled
  • have no global missingOperand
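
The per-entry behavior could be sketched with a toy precedence-climbing parser like the one below. Everything here is hypothetical: the InfixEntry shape, the parse function and the precedence given to Add are assumptions for illustration (only the precedence 260 for Equal and the Missing symbol come from this thread), and for brevity every operator is treated as left-associative.

```typescript
// Hypothetical expression type: a number, a symbol, or [head, ...args].
type Expr = number | string | Expr[];

interface InfixEntry {
  name: string;
  precedence: number;
  // Substitution symbol for a missing operand; undefined means "signal an error".
  missingOperand?: string;
}

const infixEntries = new Map<string, InfixEntry>([
  ["=", { name: "Equal", precedence: 260, missingOperand: "Missing" }],
  ["+", { name: "Add", precedence: 275 }], // assumed precedence
]);

// Parse a list of single tokens, e.g. ["a", "=", "3"].
function parse(tokens: string[]): Expr {
  let pos = 0;

  function parsePrimary(): Expr | null {
    const token = tokens[pos];
    if (token === undefined || infixEntries.has(token)) return null;
    pos += 1;
    return /^\d+$/.test(token) ? Number(token) : token;
  }

  function parseExpression(minPrecedence: number): Expr | null {
    let lhs = parsePrimary();
    if (lhs === null) return null;
    while (pos < tokens.length) {
      const entry = infixEntries.get(tokens[pos]);
      if (entry === undefined || entry.precedence < minPrecedence) break;
      pos += 1;
      let rhs = parseExpression(entry.precedence + 1);
      if (rhs === null) {
        if (entry.missingOperand === undefined) {
          throw new Error("expected-operand");
        }
        rhs = entry.missingOperand;
      }
      lhs = [entry.name, lhs, rhs];
    }
    return lhs;
  }

  const result = parseExpression(0);
  if (result === null || pos < tokens.length) throw new Error("syntax-error");
  return result;
}

console.log(JSON.stringify(parse(["a", "="])));
```

With these entries, a= is accepted and filled in with Missing, while a+ still signals 'expected-operand' because the Add entry leaves missingOperand undefined.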

@stefnotch
Contributor

This includes support for several "forms", including a canonical form that transforms expressions so they are written as sum of products, with sorted argument (for commutative functions) and using a lexdeg sort order for polynomials.

I'd love it if MathLive were to expose a function to convert between these forms or to apply certain forms.
e.g.

const expression = latexToMathjson(mathfield.value?.$text("latex-expanded") + "", { form: "full" });

// Do stuff with the full expression

// Now get a "simplified" version of the expression that can, say, be submitted to a CAS backend
const simplified = mathjsonApplyForm(expression, ["canonical-subtract", "canonical-root"]);

@arnog
Owner Author

arnog commented Aug 31, 2020

This function is called form(), e.g. form(dictionary, expression, ['flatten', 'sorted', 'full']). It's in math-json/forms.ts and although it's not public in the current build, it will be.

And MathJSON will include a CAS engine.

@arnog
Owner Author

arnog commented Oct 9, 2020

The MathJSON implementation is being extracted from MathLive and moved to https://github.com/cortex-js/math-json so that it can be used to manipulate expressions without having to load MathLive.

@michelLenczner

michelLenczner commented Nov 16, 2020

Hi Arno, I am taking over the work of rmeniche, and would like to know if there is a timeline for when the new functionality related to mathJSON ASTs will be usable?

@arnog
Owner Author

arnog commented Nov 16, 2020

Hi @michelLenczner . There is some documentation on the format here: http://cortexjs.io/guides/math-json/
I expect to have the work completed in order to switch over to the new format before the end of the year. Does that answer your question and does that work for you?

@michelLenczner

Thank you. Of course, I am familiar with this documentation. I take note of this agenda. For us, the work will start again intensively in March. At the moment we are in a design phase before new implementations.

@michelLenczner

Hi Arno. First of all, happy new year. I also wanted to ask whether you have had time to make progress on the new MathJSON?

@arnog
Owner Author

arnog commented Jan 5, 2021

Thank you. Yes, there has been some progress... I'm a bit behind, but I'll try to get something out as quickly as possible.

@saivan

saivan commented Jan 30, 2021

I just read through this, and it seems really solid. Great ideas here @arnog :)

@stefnotch
Contributor

I was wondering if the mathjson format also supports numbers in a different base. The most important ones would be binary and hexadecimal.

@arnog
Owner Author

arnog commented Feb 4, 2021

Yes, this can be represented using the BaseForm function: ["BaseForm", 42, 16] (42 is 2A in hexadecimal; JSON itself has no hexadecimal number literals).

In Latex, this is represented as (2A)_{16}, up to base 36.
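
For illustration, the Latex form described above could be produced from a value and a base roughly as follows. The function name baseFormToLatex is hypothetical and this is not the library's serializer; it relies only on the standard Number.prototype.toString(radix), which supports bases 2 to 36.

```typescript
// Hypothetical helper: render a non-negative integer in the given base
// using the (digits)_{base} notation.
function baseFormToLatex(value: number, base: number): string {
  if (!Number.isInteger(value) || value < 0) {
    throw new RangeError("value must be a non-negative integer");
  }
  if (!Number.isInteger(base) || base < 2 || base > 36) {
    throw new RangeError("base must be an integer between 2 and 36");
  }
  return `(${value.toString(base).toUpperCase()})_{${base}}`;
}

console.log(baseFormToLatex(42, 16));
```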

@arnog
Owner Author

arnog commented Feb 10, 2021

Progress update: an implementation is now available at https://github.com/cortex-js/math-json

The documentation of the API is lacking, but you can get an idea of how to use it by looking at test/index.html.

@michelLenczner

Thank you very much. I will look at it carefully.

@saivan

saivan commented Feb 11, 2021

So how will you go about integrating this into mathlive? I'm assuming MASTON is going to be completely superseded by this, so can it be used in mathlive right away?

@michelLenczner

It is a bit early to draw conclusions since at this stage I don't really see the general principles guiding the construction. Nevertheless, I would like to know why it is necessary to have a double representation of certain objects, in list form and in dictionary form, such as
["Add", 1, "x"]
{"fn": ["Add", 1, {"sym": "x"}]}
For what I intend to do with it, the list representation is more complicated to use. Is there a plan to normalize all expressions in dictionary form?

@arnog
Owner Author

arnog commented Feb 11, 2021

@saivan yes, MASTON is going to be removed from Mathlive. You can use MathJSON right now by importing the package separately, and using the Latex.parse() method on the mathfield value property. See test/index.html for an example in the math-json repo.

@arnog
Owner Author

arnog commented Feb 11, 2021

@michelLenczner The form using arrays to represent functions (and numbers to represent numbers, and strings to represent symbols) has the benefit of being more concise. However, the object literal form is necessary to attach metadata to the expressions.

It is possible to customize the representation to suit your needs using ComputeEngine.format(). Right now, there is a form that uses the compact form. There currently isn't one to force the use of the object-literal (expanded) form, but that's a very reasonable addition; you would then be able to call ComputeEngine.format(expr, 'object-literal') to transform the expression to this format.
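
To illustrate the relationship between the two representations, here is a standalone sketch that expands a compact expression into the object-literal shape used in the examples in this thread (num as a string, sym for symbols, fn for functions with a plain string head). It is not ComputeEngine.format(); the types and the toObjectLiteral function are hypothetical.

```typescript
// Compact form: a number, a symbol, or [head, ...args].
type Expr = number | string | Expr[];

// Expanded (object-literal) form, modeled on the examples in this thread.
type ExpandedExpr =
  | { num: string }
  | { sym: string }
  | { fn: (string | ExpandedExpr)[] };

function toObjectLiteral(expr: Expr): ExpandedExpr {
  if (typeof expr === "number") return { num: String(expr) };
  if (typeof expr === "string") return { sym: expr };
  const [head, ...args] = expr;
  return {
    fn: [
      // A plain function name stays a string; an expression head is expanded.
      typeof head === "string" ? head : toObjectLiteral(head),
      ...args.map((arg) => toObjectLiteral(arg)),
    ],
  };
}

console.log(JSON.stringify(toObjectLiteral(["Add", 1, "x"])));
```

Once every node is an object, metadata such as a latex or wikidata property can be attached to any of them, which is the reason given above for keeping the expanded form.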

@arnog
Owner Author

arnog commented Feb 11, 2021

I've added the object-literal form now.

@saivan

saivan commented Feb 14, 2021

Great, I look forward to trying it out then!

@michelLenczner

For our part, the date for using this feature has been moved to April. I don't think I will be able to provide feedback before then.

@arnog
Owner Author

arnog commented Feb 21, 2021

Progress update

Documentation

The documentation has been significantly beefed up at http://cortexjs.io/guides/math-json/.

New Atomic Type: Dictionary

I have come to the conclusion that adding a fourth atomic element to MathJSON in addition to 'number', 'symbol' and 'function' would be very valuable, namely an element to represent dictionaries (aka associative arrays or maps).

This will be added shortly to the documentation and the core library.

Domains: Feedback Requested

I am looking for feedback on the definition of 'domains' in the MathJSON library.

Domains are not strictly speaking part of the core MathJSON format, but the default symbol dictionary will make use of them.

Domains are analogous to "types" in programming languages. They will allow for optimizations when compiling expressions and for reasoning by inference on expressions (for example: ["ElementOf", 7, "PrimeNumber"] will return True when evaluated).

Have a look at http://cortexjs.io/guides/compute-engine-domains/ and let me know of any domain that should be included (or ones that shouldn't) and of any domain relationship that I may have gotten wrong.

This does not need to be an exhaustive list of domains, since it will be possible to dynamically define new domains, but since this is the default dictionary this should be a list of domains that would be frequently convenient to have.

Compute Engine

I have also made progress on the Compute Engine that can evaluate, compile, and otherwise manipulate MathJSON expressions.

It's in the same repo, and the documentation is here: http://cortexjs.io/guides/compute-engine/. Still a work in progress, though.

New Language: Cortex

I have also decided to build a new language, Cortex, that will be essentially syntactic sugar on top of MathJSON expressions, so ["Evaluate", ["Add", 1, "x", 3]] will be equivalent to Evaluate(1+x+3).

While it's nice to be able to express math formulas using Latex, more 'functional' programming is better represented with a different syntax.

I'll shortly add a parser that will generate MathJSON from this syntax, as well as a serializer from MathJSON to this syntax.

@michelLenczner

michelLenczner commented Feb 23, 2021

This is very interesting, but it would be useful to demarcate the ambitions.
I see that you want to define types on mathematical objects with one of your goals being to make type inference.
As I don't know any type theory and therefore no type inference algorithms I can't say much of relevance.
Nevertheless, this proposal is certainly relevant since type theories are at work in some proof assistants and checkers like Coq.
Assuming we have an ad-hoc inference engine, I suppose we would then have to define inference rules associated with the type classes you propose.
In the proposed schema there are only subtype relationships. This seems quite feasible.
Note that in this framework you could perhaps take into account the integrability of functions in addition to their derivability. But this becomes more complicated if you then take into account the integrability of function derivatives.

This kind of difficulty occurs at almost all levels of these types. For example, if we consider functions as relations (which is not done here) then things get complicated. I guess you didn't want to do it to avoid complications.
Sorry I can't help any further.

@arnog
Owner Author

arnog commented May 31, 2021

MathJSON has been integrated in mathlive@0.68 🥳

To get the value of a mathfield as MathJSON, use mf.getValue('math-json').

The documentation about MathJSON is available here: https://cortexjs.io/math-json/
The MathJSON repository is here: https://github.com/cortex-js/math-json

@arnog arnog closed this as completed May 31, 2021
@arnog
Owner Author

arnog commented Jun 11, 2021

The MathJSON repo has been renamed to @cortex-js/compute-engine.

@arnog arnog unpinned this issue Jun 11, 2021

8 participants