
Parsing preorder reasoning expressions takes very, very long! #3209

Closed

nicolabotta opened this issue Jun 7, 2016 · 36 comments

@nicolabotta

With 0.11.2-git:fc3e88c and on my machine, it takes almost 5 minutes for

> import Syntax.PreorderReasoning
> %default total
> multDistributesOverPlusRightEq : {x, y, z : Fraction} -> x * (y + z) `Eq` (x * y) + (x * z)
> multDistributesOverPlusRightEq {x = (m, d')} {y = (n, e')} {z = (o, f')} = 
>   let d       =  toNat d' in
>   let e       =  toNat e' in
>   let f       =  toNat f' in
>     ( (m * (n * f + o * e)) * (toNat ((d' * e') * (d' * f'))) )
>   ={ cong {f = \ ZUZU => (m * (n * f + o * e)) * ZUZU} toNatMultLemma }=
>     ( (m * (n * f + o * e)) * ((toNat (d' * e')) * (toNat (d' * f'))) )
>   ={ cong {f = \ ZUZU => (m * (n * f + o * e)) * (ZUZU * (toNat (d' * f')))} toNatMultLemma }=
>     ( (m * (n * f + o * e)) * ((d * e) * (toNat (d' * f'))) )  
>   ={ cong {f = \ ZUZU => (m * (n * f + o * e)) * ((d * e) * ZUZU)} toNatMultLemma }=
>     ( (m * (n * f + o * e)) * ((d * e) * (d * f)) )  
>   ={ cong {f = \ ZUZU => (m * (n * f + o * e)) * ZUZU} (multFlipCentre d e d f) }=
>     ( (m * (n * f + o * e)) * ((d * d) * (e * f)) )  
>   ={ cong {f = \ ZUZU => (m * (n * f + o * e)) * ZUZU} (sym (multAssociative d d (e * f))) }=
>     ( (m * (n * f + o * e)) * (d * (d * (e * f))) )  
>   ={ multAssociative (m * (n * f + o * e)) d (d * (e * f)) }=
>     ( ((m * (n * f + o * e)) * d) * (d * (e * f)) )  
>   ={ cong {f = \ ZUZU => (ZUZU * d) * (d * (e * f))} (multDistributesOverPlusRight m (n * f) (o * e)) }=
>     ( ((m * (n * f) + m * (o * e)) * d) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => ((ZUZU + m * (o * e)) * d) * (d * (e * f))} (multAssociative m n f) }=
>     ( (((m * n) * f + m * (o * e)) * d) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => (((m * n) * f + ZUZU) * d) * (d * (e * f))} (multAssociative m o e) }=  
>     ( (((m * n) * f + (m * o) * e) * d) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => ZUZU * (d * (e * f))} (multDistributesOverPlusLeft ((m * n) * f) ((m * o) * e) d)}=  
>     ( (((m * n) * f) * d + ((m * o) * e) * d) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => (ZUZU + ((m * o) * e) * d) * (d * (e * f))} (sym (multAssociative (m * n) f d)) }=
>     ( ((m * n) * (f * d) + ((m * o) * e) * d) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => ((m * n) * (f * d) + ZUZU) * (d * (e * f))} (sym (multAssociative (m * o) e d)) }=  
>     ( ((m * n) * (f * d) + (m * o) * (e * d)) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => ((m * n) * (f * d) + (m * o) * ZUZU) * (d * (e * f))} (multCommutative e d) }=
>     ( ((m * n) * (f * d) + (m * o) * (d * e)) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => ((m * n) * ZUZU + (m * o) * (d * e)) * (d * (e * f))} (multCommutative f d) }=
>     ( ((m * n) * (d * f) + (m * o) * (d * e)) * (d * (e * f)) )
>   ={ cong {f = \ ZUZU => ((m * n) * (d * f) + (m * o) * (d * e)) * (d * ZUZU)} (sym toNatMultLemma) }=  
>     ( ((m * n) * (d * f) + (m * o) * (d * e)) * (d * (toNat (e' * f'))) )
>   ={ cong {f = \ ZUZU => ((m * n) * (d * f) + (m * o) * (d * e)) * ZUZU}  (sym toNatMultLemma) }=  
>     ( ((m * n) * (d * f) + (m * o) * (d * e)) * (toNat (d' * (e' * f'))) )
>   ={ cong {f = \ ZUZU => ((m * n) * (d * f) + (m * o) * ZUZU) * (toNat (d' * (e' * f')))} 
>           (sym toNatMultLemma) }=
>     ( ((m * n) * (d * f) + (m * o) * (toNat (d' * e'))) * (toNat (d' * (e' * f'))) )
>   ={ cong {f = \ ZUZU => ((m * n) * ZUZU + (m * o) * (toNat (d' * e'))) * (toNat (d' * (e' * f')))} 
>           (sym toNatMultLemma) }=
>     ( ((m * n) * (toNat (d' * f')) + (m * o) * (toNat (d' * e'))) * (toNat (d' * (e' * f'))) )
>   QED

to fail to type check:

$ time idris --check Gugu.lidr
Gugu.lidr:3:32:When checking type of Main.multDistributesOverPlusRightEq:
No such variable Fraction

real    4m44.659s
user    4m43.536s
sys 0m1.292s

Thus, parsing proofs based on preorder reasoning no longer scales: slightly longer expressions yield unacceptable type checking times.

As shown in #3208, this is the major reason for the extremely long type checking times originally reported in #3197.

I have been routinely type checking this and similar examples (with suitable import directives) for months but, if I am not mistaken, the issue has come up (or worsened) only recently.

Any idea how to circumvent this problem? Thanks, Nicola

@ahmadsalim

@nicolabotta If it is a parsing issue, then it may be due to the way external syntax is handled now, which is unfortunately not very efficient and requires a lot of backtracking.

I am not completely sure what the optimal workaround is, but my guess is that something like where clauses or top-level declarations could be used instead, more efficiently (if more verbosely).
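For concreteness, a minimal sketch of that idiom (the lemma names here are made up): each step becomes a named equation under a where clause, and the chain is glued together with trans from the Prelude.

fourIsTwoSquared : 2 + 2 = 2 * 2
fourIsTwoSquared = trans stepOne stepTwo
  where
    -- each reasoning step is a separate, independently checked lemma
    stepOne : 2 + 2 = 4
    stepOne = Refl
    stepTwo : 4 = 2 * 2
    stepTwo = Refl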

@ahmadsalim

ahmadsalim commented Jun 7, 2016

Actually, another alternative would be to use step and qed from the package directly.
I.e. having qed (step («left hand side») («prf») («right hand side»))
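A minimal sketch of that spelling (with qed sealing the chain), assuming step and qed from Syntax.PreorderReasoning have (roughly) the types step : (x : a) -> (x = y) -> (y = z) -> (x = z) and qed : (x : a) -> x = x:

twoPlusTwo : 2 + 2 = 4
twoPlusTwo = step (2 + 2) Refl (qed 4)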

@nicolabotta
Author

@ahmadsalim You are right: in the original FractionEqProperties.lidr file we also had proofs with long chains of let ... in expressions, and they had a negligible effect on the parsing time. But let clauses are not analysed for erasure, see #2837. Thus, if I were to avoid the preorder reasoning syntax, I would rather use an idiom based on where clauses and trans than one based on let.

On the other hand, I do not like the idea of sacrificing readability for the sake of type checking time: in principle, proofs should be easy for humans to read. In fact, I always tend to explicitly add trivial steps (steps that can be implemented in terms of Refl alone) to my proofs for the human reader, and it could be that some of the trouble I am experiencing is a consequence of this habit.

Thus, the question here is whether parsing preorder reasoning proofs can be significantly improved or not. If it can, I am happy to wait until this is done (provided it is not going to take years). If, on the other hand, the misbehaviour reported in this thread turns out to be a symptom of a more fundamental problem (e.g. parsing time that is more than linear in the length of proofs), then it would probably be sensible to issue a warning or recommendation against using preorder reasoning in all but very short proofs.

@ahmadsalim

@nicolabotta I think the parsing should be improved, but rewriting the parser requires a bit of work (if it is to be done correctly).
Could you take inspiration from something like:

module Test

import Syntax.PreorderReasoning
import Control.Isomorphism
infixl 1 =**
infixl 1 **=

data Step : (a : Type) -> (x : a) -> (y : a) -> Type where
  MkStep : (x : a) -> (x = y) -> Step a x y


(=**) : (x : a) -> (x = y) -> Step a x y
x =** prf = MkStep x prf

(**=) : Step a x y -> y = z -> x = z
(MkStep x {y = y} prf) **= prf2 = rewrite prf in rewrite prf2 in Refl


stupidProof : 2 + 2 = 4
stupidProof = 2 + 2 =** Refl **= 4 QED

for your current proof?

@nicolabotta
Author

@ahmadsalim Do you expect the introduction of Step, =** and **= to improve parsing times? In the above example, this does not appear to be the case:

$ time idris --check Gugu.lidr
Gugu.lidr:17:32:When checking type of Main.multDistributesOverPlusRightEq:
No such variable Fraction

real    9m42.603s
user    9m37.472s
sys 0m3.528s

with Gugu.lidr:

> import Syntax.PreorderReasoning
> import Control.Isomorphism
> infixl 1 =**
> infixl 1 **=

> data Step : (a : Type) -> (x : a) -> (y : a) -> Type where
>   MkStep : (x : a) -> (x = y) -> Step a x y

> (=**) : (x : a) -> (x = y) -> Step a x y
> x =** prf = MkStep x prf

> (**=) : Step a x y -> y = z -> x = z
> (MkStep x {y = y} prf) **= prf2 = rewrite prf in rewrite prf2 in Refl

> %default total
> multDistributesOverPlusRightEq : {x, y, z : Fraction} -> x * (y + z) `Eq` (x * y) + (x * z)
> multDistributesOverPlusRightEq {x = (m, d')} {y = (n, e')} {z = (o, f')} = 
>   let d       =  toNat d' in
>   let e       =  toNat e' in
>   let f       =  toNat f' in
>     ( (m * (n * f + o * e)) * (toNat ((d' * e') * (d' * f'))) )
>   =** cong {f = \ ZUZU => (m * (n * f + o * e)) * ZUZU} toNatMultLemma **=
>     ( (m * (n * f + o * e)) * ((toNat (d' * e')) * (toNat (d' * f'))) )
>   =** cong {f = \ ZUZU => (m * (n * f + o * e)) * (ZUZU * (toNat (d' * f')))} toNatMultLemma **=
>     ( (m * (n * f + o * e)) * ((d * e) * (toNat (d' * f'))) )  
>   =** cong {f = \ ZUZU => (m * (n * f + o * e)) * ((d * e) * ZUZU)} toNatMultLemma **=
>     ( (m * (n * f + o * e)) * ((d * e) * (d * f)) )  
>   =** cong {f = \ ZUZU => (m * (n * f + o * e)) * ZUZU} (multFlipCentre d e d f) **=
>     ( (m * (n * f + o * e)) * ((d * d) * (e * f)) )  
>   =** cong {f = \ ZUZU => (m * (n * f + o * e)) * ZUZU} (sym (multAssociative d d (e * f))) **=
>     ( (m * (n * f + o * e)) * (d * (d * (e * f))) )  
>   =** multAssociative (m * (n * f + o * e)) d (d * (e * f)) **=
>     ( ((m * (n * f + o * e)) * d) * (d * (e * f)) )  
>   =** cong {f = \ ZUZU => (ZUZU * d) * (d * (e * f))} (multDistributesOverPlusRight m (n * f) (o * e)) **=
>     ( ((m * (n * f) + m * (o * e)) * d) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => ((ZUZU + m * (o * e)) * d) * (d * (e * f))} (multAssociative m n f) **=
>     ( (((m * n) * f + m * (o * e)) * d) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => (((m * n) * f + ZUZU) * d) * (d * (e * f))} (multAssociative m o e) **=  
>     ( (((m * n) * f + (m * o) * e) * d) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => ZUZU * (d * (e * f))} (multDistributesOverPlusLeft ((m * n) * f) ((m * o) * e) d)**=  
>     ( (((m * n) * f) * d + ((m * o) * e) * d) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => (ZUZU + ((m * o) * e) * d) * (d * (e * f))} (sym (multAssociative (m * n) f d)) **=
>     ( ((m * n) * (f * d) + ((m * o) * e) * d) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => ((m * n) * (f * d) + ZUZU) * (d * (e * f))} (sym (multAssociative (m * o) e d)) **=  
>     ( ((m * n) * (f * d) + (m * o) * (e * d)) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => ((m * n) * (f * d) + (m * o) * ZUZU) * (d * (e * f))} (multCommutative e d) **=
>     ( ((m * n) * (f * d) + (m * o) * (d * e)) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => ((m * n) * ZUZU + (m * o) * (d * e)) * (d * (e * f))} (multCommutative f d) **=
>     ( ((m * n) * (d * f) + (m * o) * (d * e)) * (d * (e * f)) )
>   =** cong {f = \ ZUZU => ((m * n) * (d * f) + (m * o) * (d * e)) * (d * ZUZU)} (sym toNatMultLemma) **=  
>     ( ((m * n) * (d * f) + (m * o) * (d * e)) * (d * (toNat (e' * f'))) )
>   =** cong {f = \ ZUZU => ((m * n) * (d * f) + (m * o) * (d * e)) * ZUZU}  (sym toNatMultLemma) **=  
>     ( ((m * n) * (d * f) + (m * o) * (d * e)) * (toNat (d' * (e' * f'))) )
>   =** cong {f = \ ZUZU => ((m * n) * (d * f) + (m * o) * ZUZU) * (toNat (d' * (e' * f')))} 
>           (sym toNatMultLemma) **=
>     ( ((m * n) * (d * f) + (m * o) * (toNat (d' * e'))) * (toNat (d' * (e' * f'))) )
>   =** cong {f = \ ZUZU => ((m * n) * ZUZU + (m * o) * (toNat (d' * e'))) * (toNat (d' * (e' * f')))} 
>           (sym toNatMultLemma) **=
>     ( ((m * n) * (toNat (d' * f')) + (m * o) * (toNat (d' * e'))) * (toNat (d' * (e' * f'))) )
>   QED

Am I missing something obvious? My guess is that something has changed in Idris lately that makes parsing certain expressions a pain. How can it be that it takes almost 10 minutes to parse 42 lines of code?

@ahmadsalim

@nicolabotta I did, but I guess my initial assumption that this was primarily due to external syntax is wrong.

@melted
Contributor

melted commented Jun 8, 2016

Can you try to lower the version bound on trifecta to see if it's a recent regression?

@nicolabotta
Author

I have no idea what trifecta is, but it takes about 50 seconds for my first example to fail to type check with 88d8075 (Mar 1, 2016), against about 5 minutes with fc3e88c. Now, 50 seconds is still a lot of time for 40 lines of code. But the test shows that something very bad for parsing efficiency happened in the last three months.

@melted
Contributor

melted commented Jun 8, 2016

Trifecta is the parsing library that Idris uses. One can test with an older version of it by putting a lower upper bound on its version in idris.cabal and rebuilding.

@nicolabotta
Author

@melted It does not appear to be possible to make relib with

, trifecta >= 1.1 && < 1.5

in idris.cabal instead of

, trifecta >= 1.1 && < 1.6

because of dependency problems. Thus, I assume that trifecta 1.5 is currently used. I am not sure that the misbehaviour can be explained in terms of trifecta, however. Both with 88d8075 (Mar 1, 2016) and with c2b01cc (Apr 15, 2016) it takes about 50 seconds to parse the example. In both commits, the constraints on trifecta were

, trifecta >= 1.1 && < 1.6

as in the current Idris version (which needs about 5 minutes to parse the example). I am going to bisect a little more and report.

@nicolabotta
Author

The culprit appears to be commit 536372f ("Rename Lazy' => Delayed", committed by edwinb 23 days ago). Here are the results of the bisection exercise:

76d57f2 (May 27, 2016): 435 sec.
...
f8e66f9 (May 18, 2016): 402 sec.
...
fc9490e (May 17, 2016): 405 sec.
...
d2a8c3e (May 16, 2016): 414 sec.
...
536372f (May 16, 2016): 446 sec.
5925424 (May 16, 2016): 41 sec.
b20cff5 (May 16, 2016): 59 sec.
0917cf8 (May 15, 2016): 56 sec.
...
c2b01cc (Apr 15, 2016): 59 sec.
...
88d8075 (Mar 1, 2016): 50 sec.

It would probably be good to add examples like the one above to the test suite, to catch sudden deteriorations in parsing and type checking times.

@jfdm
Contributor

jfdm commented Jun 8, 2016

Putting in my two cents: rather than the test suite, I think it would be better to add it to the benchmarks, if anything is to be added.

@nicolabotta
Author

I think that the first step is to get the parsing times back from 400 seconds to 40 seconds. Something happened with commit 536372f that made parsing almost 10 times slower. This might or might not be directly related to the commit itself: it is conceivable that some external library (trifecta?) is responsible for the misbehaviour. Once the problem is understood and fixed, we should think about tests that can effectively prevent things like this from happening. I am happy to help here, but I need some advice.

@ahmadsalim

@nicolabotta There are no changes to the parser in that commit, so maybe it is a previous commit, or trifecta, as you mentioned?

@nicolabotta
Author

nicolabotta commented Jun 8, 2016

As you can see from the bisection results, commit 5925424 (and the commits before it) take about 40 seconds to parse, and commit 536372f (and the commits after it) take about 400 seconds! As I mentioned, the commit itself need not be directly responsible for the misbehaviour: it could be something that happened between 5925424 and 536372f in external libraries. Still, a proper check of parsing or type checking times would have avoided this pitfall.



@ahmadsalim

@nicolabotta So the latest changes in the parser happened with commit 5a937cc . Could you please see if your issues also happen with that commit? Thanks, and sorry for the trouble.

@nicolabotta
Author

Done. With 5a937cc it takes about 330 seconds to parse the example!

This is a bit surprising because, as reported above, all the Idris versions that I had tried before 536372f parsed the example in 40 to 60 seconds, and all the versions from 536372f onwards needed between 400 and 450 seconds. But of course I had only sampled commits, since I was bisecting through Idris versions.

Now, the new results suggest that there is more than one point in the commit sequence where the jump between 40-60 and 400-450 seconds occurs. Two commits after 5a937cc (commit 0917cf8) we are back to 40 seconds!

@ahmadsalim

@nicolabotta That is because I looked for parent commits of the 536372f commit you referenced (I did it on GitHub), and 5a937cc seems to be the ancestor commit that modified the parser.

@nicolabotta
Author

@ahmadsalim OK, now the question is: why do we revert back to 40 seconds with commit 0917cf8 (or perhaps even a3b40c2)? Assuming that commit 5a937cc has "spoiled" the parser, how can it be that commit 0917cf8 (or a3b40c2) "fixes" the problem without even touching the parser? I suspect that the issue is related to external libraries and that different commits make the parser use different external libraries. Does this make sense? How do you plan to proceed? Please let me know if you want me to run further tests.

@melted
Contributor

melted commented Jun 9, 2016

That sounds like an artifact of merging. The commit that is fast was likely committed before the parser changes were merged, but after they had been committed on their branch.

@nicolabotta
Author

@melted Fine. In this case, let's start with the hypothesis that commit 5a937cc is the culprit. The question is: are there obvious reasons why this commit might have increased the parsing time of the example by a factor of about 10? Is it possible to undo the changes introduced by 5a937cc and check whether this brings back reasonable parsing times?

@ahmadsalim

@nicolabotta I think the major rule that was added was constraintsPi, which does seem to be inside a try and so might race with other expression rules.
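For reference, and assuming constraintsPi is the rule for constrained function types, it has to accept shapes like the one below, whose parenthesised, comma-separated prefix looks like an ordinary expression until the "=>" finally appears:

showBoth : (Show a, Show b) => a -> b -> String
showBoth x y = show x ++ " / " ++ show y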

@ahmadsalim

ahmadsalim commented Jun 9, 2016

@nicolabotta To be honest, I am not sure how to proceed. I think that if there is going to be some rollback then I would guess it has to be done by @edwinb. Alternatively, the parser needs to be rewritten, but working on Idris is not my primary job (I contribute a bit during breaks) and I unfortunately have too much work to be able to do so soon.

@melted
Contributor

melted commented Jun 9, 2016

I can take a look at it after work. The first thing is to make a profile and see where the time is spent, then see whether that place can be refactored to be more performant.

@nicolabotta
Author

@melted Sounds great, thanks! I'll try to find out whether the commit has also affected the parsing time of other expressions, but at first glance it seems that the major impact has been on equational reasoning proofs. One thing I can say is that eliminating all the explicitly supplied implicit arguments from the cong-based justifications has no significant effect on the parsing times. Thus, it seems that the problem is parsing expressions like

(m * (n * f + o * e)) * (d * (d * (e * f)))

not their justifications. The parsing time also seems to be reasonably linear in the number of such expressions: it takes roughly 30 seconds per line!
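That observation suggests a simple probe (a sketch; the name probeStep is made up): the declaration below type checks by Refl alone, so virtually all the time spent on it is parsing, and duplicating it N times should scale the checking time roughly linearly.

probeStep : (m, n, o, e, f, d : Nat) ->
            (m * (n * f + o * e)) * (d * (d * (e * f))) = (m * (n * f + o * e)) * (d * (d * (e * f)))
probeStep m n o e f d = Refl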

@melted
Contributor

melted commented Jun 9, 2016

Ok, I think I know what it is. The expression parser has a number of "try"s that are recursive, that is, they contain a parse of another expression. That makes Idris do something like a depth-first search, with loads of backtracking for big expressions. This can be fixed by restricting the scope of the "try"s so that they aren't recursive. I made a census, and most of them seem straightforward. There is one important place where I need to think some more.

This was obviously not caused by the commit mentioned above; it just made it worse. 40 seconds is way too long for a parse, too.
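To make the failure mode concrete, here is a toy model of such a recursive "try" in plain Idris. It is an illustration only, not the actual Idris parser (which is Haskell on top of trifecta), and every name in it is made up.

Parser : Type
Parser = List Char -> Maybe (List Char)  -- Just = the remaining input

-- Unlimited backtracking, like wrapping the first alternative in "try":
-- if p fails, q re-reads the same input from the start.
alt : Parser -> Parser -> Parser
alt p q s = case p s of
              Nothing => q s
              r       => r

mutual
  -- First try "<inner> => <expr>"; if the "=>" never materialises,
  -- backtrack and parse the very same <inner> again as a plain expression.
  expr : Parser
  expr = alt (\s => case inner s of
                      Just ('=' :: '>' :: rest) => expr rest
                      _                         => Nothing)
             inner

  inner : Parser
  inner ('(' :: s)  = case expr s of
                        Just (')' :: rest) => Just rest
                        _                  => Nothing
  inner (c :: rest) = if isAlpha c then Just rest else Nothing
  inner []          = Nothing

On input like ((((x)))), every nesting level runs inner twice (once inside the failing first alternative, once as the fallback), so the work is O(2^depth): a depth-first search with loads of backtracking, exactly as described.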

@nicolabotta
Author

@melted Thanks Niklas, that looks very promising!

@melted
Contributor

melted commented Jun 11, 2016

The reason that it is slower from that commit is that it tries constraintPi before unboundPi on line 1140 in Parser/Expr.idr. Unfortunately, they can't simply be swapped, as there are constraint lists for which the unboundPi parse would succeed. And a constraint list can contain anything, as long as it is comma-separated and followed by "=>", so that alternative is never rejected out of hand.
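In user-facing terms (a hedged illustration), both declarations below start with a parenthesised, comma-separated list, and only what comes after the closing parenthesis tells the parser which rule wins:

first : (a, b) -> a                          -- an unbound pi over a pair type
first (x, _) = x

decide : (Eq a, Show a) => a -> a -> String  -- a constraint list before "=>"
decide x y = if x == y then "equal: " ++ show x else "different"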

@melted
Contributor

melted commented Jun 11, 2016

I think the best bet here is to let the constraint parser also handle the case where it finds itself with an opExpr and no following "=>"; then there will be no backtracking.

@melted
Contributor

melted commented Jun 11, 2016

But of course that doesn't work, because an opExpr in a constraint can't contain implicits.
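A hedged illustration of the obstacle: a justification is an ordinary expression and may apply an implicit in braces, as the cong steps earlier in this thread do, but a brace application like {f = ...} is not legal inside a constraint list such as (Eq a, Show a) =>, so a single shared opExpr rule cannot cover both shapes.

doubleEq : (n, m : Nat) -> n = m -> n * 2 = m * 2
doubleEq n m p = cong {f = \ zu => zu * 2} p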

@melted
Contributor

melted commented Jun 11, 2016

There is plenty of low-hanging fruit for improving performance in the expression parser; this just isn't one of them.

@ahmadsalim

I knew there were issues with the parser that made it hard to change, but I was hoping that a fresh set of eyes could see whether there was an easy fix.

@melted
Contributor

melted commented Jun 11, 2016

I added a check that an unboundPi isn't a constraint list, by putting in a check for "=>". Then I could put it first and restore the performance. It still isn't good, of course, but one thing at a time.
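In the toy model sketched earlier, the corresponding cure is to recognise the shared prefix once and then commit on whether "=>" follows, so nothing is ever re-parsed. (The real fix reorders the alternatives behind a "=>" check rather than left-factoring, so this is just the idea in miniature.)

-- Left-factored variant of expr from the earlier sketch: parse the shared
-- prefix once, then branch on the next tokens with no backtracking.
-- (In a full version, inner would recurse into exprFixed as well.)
exprFixed : Parser
exprFixed s = case inner s of
                Just ('=' :: '>' :: rest) => exprFixed rest
                Just rest                 => Just rest
                Nothing                   => Nothing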

@nicolabotta
Author

Then I could put it first and restore the performance. It still isn't good of course, but one thing at a time.

Thanks for addressing this issue, Niklas! Is it really the case that you have managed to restore the parsing times to their values before commit 5a937cc? With 635d448 (the latest commit) I am still getting parsing times of about 5 minutes for the example above. Best, Nicola

@melted
Contributor

melted commented Jun 13, 2016

That's probably because I haven't merged the PR yet :)

@nicolabotta
Author

I can confirm that we are now back to more humane parsing and type checking times: from 5 minutes down to 40 seconds for the example, and from 25 minutes down to 15 minutes for the https://github.com/nicolabotta/SeqDecProbs/tree/master/frameworks/14- framework! Thanks again!

4 participants