You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
moduleMainwhereimportText.ParsecimportSystem.EnvironmentimportControl.Monadmain::IO()
main =do
n <-read.head<$> getArgs
print$ runParser (replicateM_ n $pure()) ()"test"""
Running it with +RTS -s(n = 500000), I get:
$ ./leak 500000 +RTS -s Right ()
212,105,040 bytes allocated in the heap
275,152,536 bytes copied during GC
56,962,568 bytes maximum residency (8 sample(s))
1,384,952 bytes maximum slop
114 MiB total memory in use (0 MB lost due to fragmentation)
Productivity 21.8% of total user, 21.7% of total elapsed
With n = 5000000:
$ ./leak 5000000 +RTS -s Right ()
2,120,105,552 bytes allocated in the heap
2,977,111,368 bytes copied during GC
439,420,112 bytes maximum residency (12 sample(s))
10,239,688 bytes maximum slop
1259 MiB total memory in use (0 MB lost due to fragmentation)
Productivity 19.8% of total user, 19.7% of total elapsed
Clearly it consumes memory at least linearly in n.
To investigate, I used (the as yet unreleased) Info Table Profiling, which resulted in the following graph:
The source locations it points to are all in parserReturn and parserBind.
The first ParseError thunk originates in parserReturn, in the call to unknownError:
parserReturn x
= ParsecT $ \s _ _ eok _ ->
eok x s (unknownError s)
The second and third FUN closures originate in parserBind, in particular peok and peerr.
parserBind m k
= ParsecT $ \s cok cerr eok eerr ->
...
-- empty-ok case for m
meok x s err =
let
-- in these cases, (k x) can return as empty
pcok = cok
peok x s err' = eok x s (mergeError err err')
pcerr = cerr
peerr err' = eerr (mergeError err err')
in unParser (k x) s pcok pcerr peok peerr
...
in unParser m s mcok mcerr meok meerr
If we look at the definition of mergeError and unknownError, we see that unknownError s is a neutral element for mergeError. This means that our program doesn't actually need to allocate all these closures, since unknownError is the only kind of ParseError that is generated by the program. In particular, peok and peerr would be the same as eok and eerr respectively under this identity.
After applying these changes:
diff --git a/src/Text/Parsec/Prim.hs b/src/Text/Parsec/Prim.hs
index 378da8d..75b503c 100644
--- a/src/Text/Parsec/Prim.hs+++ b/src/Text/Parsec/Prim.hs@@ -318,7 +318,9 @@ parserBind m k
= ParsecT $ \s cok cerr eok eerr ->
let
-- consumed-okay case for m
- mcok x s err =+ mcok x s err+ | errorIsUnknown err = unParser (k x) s cok cerr cok cerr+ | otherwise =
let
-- if (k x) consumes, those go straigt up
pcok = cok
@@ -335,7 +337,9 @@ parserBind m k
in unParser (k x) s pcok pcerr peok peerr
-- empty-ok case for m
- meok x s err =+ meok x s err+ | errorIsUnknown err = unParser (k x) s cok cerr eok eerr+ | otherwise =
let
-- in these cases, (k x) can return as empty
pcok = cok
we get
$ ./leak 5000000 +RTS -s
Right ()
1,680,105,552 bytes allocated in the heap
122,976 bytes copied during GC
63,296 bytes maximum residency (2 sample(s))
30,728 bytes maximum slop
2 MiB total memory in use (0 MB lost due to fragmentation)
Productivity 98.8% of total user, 98.6% of total elapsed
and a flat graph:
The text was updated successfully, but these errors were encountered:
I have the following program:
Running it with
+RTS -s
(n = 500000), I get:With
n = 5000000
:Clearly it consumes memory at least linearly in
n
.To investigate, I used (the as yet unreleased) Info Table Profiling, which resulted in the following graph:
The source locations it points to are all in
parserReturn
andparserBind
.The first
ParseError
thunk originates inparserReturn
, in the call tounknownError
:The second and third
FUN
closures originate inparserBind
, in particularpeok
andpeerr
.If we look at the definition of
mergeError
andunknownError
, we see thatunknownError s
is a neutral element formergeError
. This means that our program doesn't actually need to allocate all these closures, sinceunknownError
is the only kind ofParseError
that is generated by the program. In particular,peok
andpeerr
would be the same aseok
andeerr
respectively under this identity.After applying these changes:
we get
and a flat graph:
The text was updated successfully, but these errors were encountered: