Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow in some pathological cases #19

Closed
purcell opened this issue Nov 3, 2019 · 4 comments
Closed

Slow in some pathological cases #19

purcell opened this issue Nov 3, 2019 · 4 comments

Comments

@purcell
Copy link
Owner

purcell commented Nov 3, 2019

In the case of reformatting this Haskell file with ormolu v0.0.1.0, replacing the buffer contents takes an extraordinary and unacceptable amount of time (from 30s to many minutes), even though the formatter program itself executes quickly (< 0.25s).

Inputs/outputs are captured in this gist. They are somewhat pathological, in that the input source file is 64kb and the resulting diff is 36kb with 202 modified regions.

The problem is visible in Emacs 26.3: in Emacs > 26.1 we use Emacs' replace-buffer-contents function, which aims to preserve text properties, regions, markers etc. as much as possible during operations such as this. (The closest alternative insert-file-contents makes only limited attempts to do this, but generally preserves point quite effectively.)

It turns out that this is a known issue with replace-buffer-contents, which has even subsequently gained a new argument to limit execution time, after which it will revert to a simpler buffer content replacement method. This seems like a strange design choice.

The options for dealing with this in reformatter.el are, roughly:

  • Skip using replace-buffer-contents entirely. Most reformatting code "in the wild" does this anyway, of course, because it pre-dates replace-buffer-contents.
  • Use replace-buffer-contents only if it supports the time limit argument, and set the time limit to something like 0.5s. This makes the results somewhat nondeterministic, but may generally give the best results for users in the usual fast-replacement case. Here, also, the fallback behaviour may not work as well as just insert-file-contents, so this would require a little testing.
  • Use replace-buffer-contents only if the current file is sufficiently small, ie. unlikely to be pathological. This limit might be around 5kb.

I'd probably favour the first approach (ie. avoid replace-buffer-contents).

/cc @mrkkrp, who raised this issue, and @wbolster & @ludwigpacifici, who encouraged the use of replace-buffer-contents in #10.

@purcell
Copy link
Owner Author

purcell commented Nov 3, 2019

I'd also note that replace-buffer-contents is not used by any internal Emacs code or libraries, so I wouldn't particularly describe it as battle-tested.

purcell added a commit that referenced this issue Nov 3, 2019
@purcell
Copy link
Owner Author

purcell commented Nov 3, 2019

For now, I have pushed the simple fix, which is to avoid replace-buffer-contents entirely.

@ludwigpacifici
Copy link

ludwigpacifici commented Nov 3, 2019

Hello @purcell,

I did a quick benchmark when formatting OCaml code with replace-buffer-contents and insert-file-contents ocaml-ppx/ocamlformat#570 (comment). However, I never witness slowness above 30 seconds.

I wanted OCamlFormat to use reformatter.el (see issue ocaml-ppx/ocamlformat#570). To do that, the new Elisp implementation should be as close as possible with their current implementation - they currently use replace-buffer-contents. As of today, it is unclear if OCamlFormat will use reformatter.el. For information, I'll link this topic in the OCamlFormat issue.

In practice, I never had any issue whether replace-buffer-contents or insert-file-contents was used under the hood. My personal opinion would be to keep the code simple, i.e. go for the first approach.

@purcell
Copy link
Owner Author

purcell commented Jan 26, 2020

Closing, since there is no known speed issue currently.

@purcell purcell closed this as completed Jan 26, 2020
jimeh added a commit to jimeh/reformatter.el that referenced this issue May 23, 2020
Borrowed from go-mode's gofmt and releated functions:
https://github.com/dominikh/go-mode.el/blob/734d5232455ffde088021ea5908849ac570e890f/go-mode.el#L1850-L1956

Hence I'm not the original author of most this code, but I have used a
number of borrowed variations of it over the years for hand-rolled
formatting solutions.

It works by passing the current buffer content to `diff` via STDIN to
compare it against the target file on disk producing a RCS formatted
diff that easy to parse. It then iterates through the said diff only
modifying relevant lines to make the buffer be identical to the target
file.

In my experience it does a much better job of maintaining cursor
position compared to `insert-file-contents`.

Performance is also excellent and near instant. On a late-2016 13-inch
MacBook Pro with a Dual-Core i7 at 3.3Ghz, I can't really tell the
difference between this solution and `insert-file-contents`.

I did also test it against the Haskell example from:
purcell#19

Replacing buffer content of `Req.hs` with that of `Req-reformatted.hs`
is most of the time just as fast as `insert-file-contents` (around
50ms), and only sometimes a little slower of around 100-150ms based on
my totally unscientific measurement technique of guesstimating it.

I did also test the old `replace-buffer-contents` based variant too with
`Req.hs`, and that took around 15 seconds.
jimeh added a commit to jimeh/reformatter.el that referenced this issue May 23, 2020
Borrowed from go-mode's gofmt and releated functions:
https://github.com/dominikh/go-mode.el/blob/734d5232455ffde088021ea5908849ac570e890f/go-mode.el#L1850-L1956

Hence I'm not the original author of most this code, but I have used a
number of borrowed variations of it over the years for hand-rolled
formatting solutions.

It works by passing the current buffer content to `diff` via STDIN to
compare it against the target file on disk producing a RCS formatted
diff that is easy to parse. It then iterates through the said diff only
modifying relevant lines to make the buffer be identical to the target
file.

In my experience it does a much better job of maintaining cursor
position compared to `insert-file-contents`, as it only modifies lines
that got corrected, and leaves all other lines in the buffer untouched.

Performance is also excellent and near instant. On my late-2016 13-inch
MacBook Pro with a Dual-Core i7 at 3.3Ghz, I can't really tell the
difference between this solution and `insert-file-contents`.

I did also test it against the Haskell example from:
purcell#19

Replacing buffer content of `Req.hs` with that of `Req-reformatted.hs`
is most of the time just as fast as `insert-file-contents` (around
50ms), and only sometimes a little slower of around 100-150ms based on
my totally unscientific measurement technique of guesstimating it. I
find that rather impressive as the RCS-formatted diff itself is 17KB.

I also tested the old `replace-buffer-contents`-based variant to see how
it perform on my machine. It took around 15 seconds.
jimeh added a commit to jimeh/reformatter.el that referenced this issue May 23, 2020
Borrowed from go-mode's gofmt and releated functions:
https://github.com/dominikh/go-mode.el/blob/734d5232455ffde088021ea5908849ac570e890f/go-mode.el#L1850-L1956

Hence I'm not the original author of most this code, but I have used a
number of borrowed variations of it over the years for hand-rolled
formatting solutions.

It works by passing the current buffer content to `diff` via STDIN to
compare it against the target file on disk producing a RCS formatted
diff that is easy to parse. It then iterates through the said diff only
modifying relevant lines to make the buffer be identical to the target
file.

In my experience it does a much better job of maintaining cursor
position compared to `insert-file-contents`, as it only modifies lines
that got corrected, and leaves all other lines in the buffer untouched.

Performance is also excellent and near instant. On my late-2016 13-inch
MacBook Pro with a Dual-Core i7 at 3.3Ghz, I can't really tell the
difference between this solution and `insert-file-contents`.

I did also test it against the Haskell example from:
purcell#19

Replacing buffer content of `Req.hs` with that of `Req-reformatted.hs`
is most of the time just as fast as `insert-file-contents` (around
50ms), and only sometimes a little slower of around 100-150ms based on
my totally unscientific measurement technique of guesstimating it. I
find that rather impressive as the RCS-formatted diff itself is 17KB.

I also tested the old `replace-buffer-contents`-based variant to see how
it performs on my machine. It took around 15 seconds.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants