Skip to content

Indentation has been spoiled #782

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mrkkrp opened this issue Jul 27, 2015 · 53 comments
Closed

Indentation has been spoiled #782

mrkkrp opened this issue Jul 27, 2015 · 53 comments

Comments

@mrkkrp
Copy link
Contributor

mrkkrp commented Jul 27, 2015

I haven't experienced it before. Try to copy this sample:

module Main (main) where

Now try to type the word import and press TAB, or press it when you're at the beginning of line -- it goes to 4 column and says "Sole indentation". I remember it behaved differently before -- it would stay at 0 column and say "Sole indentation" -- this behavior is better, I think. Could you please fix that or if it's on purpose please explain me why does it behave this way.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Jul 27, 2015

Sometimes it's 2nd column, not 4th.

@gracjan
Copy link
Contributor

gracjan commented Jul 27, 2015 via email

@gracjan
Copy link
Contributor

gracjan commented Jul 27, 2015

Do you know which version have you been using before? I tried to revert some recent commits and I see no change in bahvior.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Jul 27, 2015

@gracjan, I haven't done Haskell for some time, maybe a couple of months, so I cannot tell you. Also, unfortunately I cannot help you with the tests, at least not right now.

@gracjan
Copy link
Contributor

gracjan commented Jul 27, 2015 via email

@vwyu
Copy link
Contributor

vwyu commented Aug 10, 2015

This happens when haskell-indent-mode is enabled.

In contrast, when haskell-indentation-mode is enabled, import stays correctly at column 0.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 10, 2015

So which mode should one use haskell-indent-mode or haskell-indentation-mode and what's the difference between the two? Why on the earth do you keep two modes that do the same thing? @vwyu, fyi, I've just tried it with haskell-indentation-mode -- the problem persists.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 10, 2015

@gracjan, One more thing: does it make sense to enable indentation machinery automatically, just like all other language-specific modes do?

@gracjan
Copy link
Contributor

gracjan commented Aug 10, 2015

That is the goal, to make haskell-indentation-mode the default and the only.

Sadly, none of the indentation modes work as advertised so people learned to live with their limitations. haskell-indentation-mode requires a lot more work to be able to subsume everything else.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 10, 2015

@gracjan, I have to admit that I've tried both the modes without success. Now I'm using haskell-indent-mode, but after a recent change it's broken even more. Now I'm literally fighting the indentation... this is unbearable. I think I better turn it off. Could you please test all changes yourself before pushing them into the repository, that applies to pull requests. Currently it feels like some hacking is being done without good testing. I see you have some tests, but certainly they cannot catch this sort of bug.

I'm not trying to say that your work is bad, I would rather thank you for your work on this project, but I'm using almost exclusively open-source/free software at home and work and this package is most unstable thing I've ever seen. You probably need more people, but unfortunately I cannot help you right now, because I'm already busy and my free time is consumed by another open source effort.

@gracjan
Copy link
Contributor

gracjan commented Aug 11, 2015

Sad state of indentation is known and it is our top priority. We need help there as you noticed.

Having said that haskell-indent-mode has not been touched for months and all work goes into haskell-indentation-mode.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 12, 2015

@gracjan, OK, enough is enough. Describe me briefly what should I tweak/know to fix this haskell-indentation-mode, please.

@gracjan
Copy link
Contributor

gracjan commented Aug 12, 2015

@kuribas and @nilcons are authors of this part. Can you guys help?

Usually the best starting point is test cases. There are some but nowhere nearly enough.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 13, 2015

OK, I've started reading the code on my own. You have various nasty things there.

  1. Haskell should be capitalized everywhere.
  2. Don't use Emacs `' quoting for anything expect Emacs Lisp functions. You're using it for quoting Haskell keywords, creating nonsensical links (Haskell keyword is meant, hyperlink to Emacs Lisp function is created). I propose using of simple double-quotes for this: "if".
  3. Do you really need to have a lighter for this mode? I cannot imagine that anyone would like to waste his mode line space for this really minor mode.
  4. You shouldn't have lines longer than 80 characters. It's simple to adhere to this rule in all cases.
  5. You're setting "higher" limit for recursion (max max-lisp-eval-depth 600). If you really need it, I don't yet see why. At any rate this is suspicious. Given that default of value of max-lisp-eval-depth is 800, I don't see any reason for this magic. Removing this.
  6. Every top-level form should be separated by at least one empty line from other top-level forms. Otherwise code is hard to read because it's unclear where one form ends and another one begins. You can do without empty lines only if you have many similar forms that are readable without space between them.
  7. Some functions could really enjoy some documentation done in accordance to Emacs Lisp conventions.
  8. Documentation strings should end with a full stop.
  9. Some functions names are not Lispy enough: haskell-indentation-bird-outside-codep should be haskell-indentation-bird-outside-code-p because it's multi-word predicate. See Lisp predicate naming conventions. Note sure if this can be changed at this point.
  10. Use imperative style in doc-strings everywhere.
  11. Please don't quote lambdas with #'. lambda is a macro that quotes itself and expands exactly to (function (lambda (...) ...)). This style is obsolete and redundant.

I will soon open a pull request with cosmetic corrections. I have not yet analyzed the code and its design because I'm too tired right now.

@gracjan
Copy link
Contributor

gracjan commented Aug 13, 2015 via email

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

@gracjan, Why don't you run tests on every commit with Travis CI? How do you run tests at all, I don't see .cask file or something. What is the recommended way?

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

@gracjan, your haskell-indentation-check is not powerful enough. Worse yet, it imposes ugly way to write tests. Emacs lisp supports multi-line lines (you use it in doc-strings all the time). So test text should be one long (multiline) line, not this collection of lines.

As it currently implemented, you cannot place point in arbitrary context, it's always placed for you at the end of testing text. This is unacceptable. We should be able to specify exact position of point. Better yet, it should be possible to specify several positions for point with corresponding expected "indentations". Better yet, this should be parametrized over variables like haskell-indentation-layout-offset, where it makes sense, so when you change these variables your tests are still helpful without rewriting.

Doc-string for haskell-indentation-check says that it tests haskell-indentation-find-indentations function. This function doesn't have doc-string, I'll try to write all missing doc-strings. Even worse, haskell-indentation-find-indentations is not actually used in code, haskell-indentation-find-indentations-safe is used. Well...

First, you should tell me preferable way to run tests. Second, you should tell Travis CI to run tests on every commit. This is a must for project of this size.


Fixing of all the stuff will take time. Given all that hacking I fear major changes may be needed to make it work decently.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

What do these variables mean? What DYN stands for?

(defvar haskell-indentation-dyn-first-position)
(defvar haskell-indentation-dyn-last-direction)
(defvar haskell-indentation-dyn-last-indentations)

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

@gracjan, OK, let's start with #812. Next we will need better helper function for testing and more tests.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

@gracjan, what about :expected-result :failed tests? Should they be fixed after all?

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

I've written more powerful helper to test indentations. It shows that haskell-indentation-find-indentations returns different results when it's at the end of line than when it's at the beginning at line. If also returns nil sometimes (from my observations it's when it's inside some token). Is it how it's supposed to work?

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

Now it's possible to write more dense tests:

(ert-deftest haskell-indentation-check-3 ()
  "Import statememnt symbol list 1."
  (haskell-indentation-check
   "
import Control.Concurrent
  ( forkIO,
    killThread )"
   '((1 0) 0)
   '((2 0) 0)
   '((3 0) 0 2)
   '((4 0) 4)
   '((5 0) 0)))

This exposes bug that I described in this issue. For cursor position (5 0) it returns (2), not (0). This sort of tests is the way to go, I guess.

@gracjan
Copy link
Contributor

gracjan commented Aug 14, 2015

@mrkkrp: lots of comments, let me address one by one.

  1. Stylistic changes much appreciated. Spasibo.
  2. Lighter is/was useful because there are other indentation modes and people switched frequently due to bugs. We can get rid of it now.
  3. TravisCI builds and tests every pull request. We merge using github Merge button, only when it is green. This workflow is good enough for project of our size.
  4. Run tests by running make check or make check EMACS=/path/to/emacs.
  5. haskell-indentation-check was first (actually second) stab at the problem and of course it can be improved. Note though that ERT has this limitation that test either totally fails or totally succeeds so if you put and lot of should inside a single test there is mental energy needed to guess which one of the tests failed. Minimizing mental energy needed on failed tests is crucial, so I would love to keep the ratio of shoulds per ert-testcase to be below number four.
  6. ERT can has 'explainers', we haven't experimented with those but this seems to be a good idea. Guessing what was the problem when a test fails is a mental exercise that could be avoided or at least helped with.
  7. The dyn variables stick out as if another author put them there. I do not know the history behind those.
  8. haskell-indentation-find-indentations is a workhorse and I do not think edge cases were perfectly defined for this function. It should find indentations.

@gracjan
Copy link
Contributor

gracjan commented Aug 14, 2015

And note about :expected-result: this is supposed to document current bugs and give a good starting point for somebody that has a bit of time and energy. Yes, those should be eventually fixed.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

@gracjan, I will try to use those 'explainers' to improve my current testing helper. Many more tests will fail with such detailed tests in place.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

@gracjan, should position in current line (column) affect results of ...-find-indentations?

@gracjan
Copy link
Contributor

gracjan commented Aug 14, 2015

This is indent LINE function so it should work on whole LINES. Position within line should not matter.

(Same with indent REGION: position within region should not matter).

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 14, 2015

Well, my question was about haskell-indentation-find-indentations which does work differently depending on column and it doesn't do anything with line itself. Nevermind, I will need to figure out how this works on all levels.


About Travis CI testing: sorry, I've missed that you do it in your makefile, I should have checked it better.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

It may be better to use external parsing library which is tested, robust, and neat. We could then define necessary stuff using combinators from that library. This will eliminate bugs and make maintaining of the haskell-indentation-mode much easier in future.

I need to understand how entire haskell-indentation-mode works in details. Then I could write powerful enough library with haskell-indentation-mode in mind if there is no such library already.

This is what I've found: https://github.com/skeeto/rdp. It may be not powerful enough, though.

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

I have been thinking to rewrite the parser. The main problem is handling the layout rule correctly. For example, when an illegal token is encountered, the current layout context should be left, and parsing should resume in the outer context. When using an external parser, these rules should be handled correctly.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, Could you please describe how it currently works? Where is starting point of all that parsing business. What your parser returns, how its result is used to calculate indentation positions? All I can see is quite hairy and mostly not documented, so I'm having hard time understanding this. Every function seems to get dirty right away messing with all possible things.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, you probably don't re-parse entire file every time, so how do you determine which part of file should be parsed? Do you extract string from buffer and then work with that?

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

Parsing starts from the first line before point with zero indendation. Parsing is done just by walking over the buffer.

Basicly, the read-next-token, etc functions handle the tokenization, and also the layout-rule (using special tokens 'layout-item and 'layout-end). 'layout-end signifies the end of the parse stream.

Each parser consumes tokens, and sets the (dynamic) indentation variables, or adds a new (dynamic) scope. At the end of parsing, those indentation variables are used to specify each indentation (for each scope).

It can be slow when you have a large function and a slow computer. I plan to rewrite the parser:

  • no dynamic variables. All state is passed explicitly. The state is also memoized for each line of the source. When the source changes the parsestate is updated in idle time. Parsing can be done incrementally, so should be faster.
  • A macro for describing the parser should make it easier to read and write parsers. For example recognizing tokens, sequencing parsers, alternatives, setting the parser variables, setting indentations when parsing ends.
  • Instead of using a function, the parser could be a hashtable from token to a primitive state machine. That could make parsing faster and incremental parsing easier.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, one more question: does contents of current line (line at which point is placed when user tries to indent his code) affect collection of returned indentation positions, or it's completely defined by context that previous lines provide?

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

It does, for keywords like of or in.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, this may be not enough... Consider the test-case:

(hindent-test "17a A type for a function""
fun :: Int
    -> Int"
  ((1 0) 0)
  ((2 0) 4)
  ((3 0) 0 4))

This is from improved tests I'm working on. It works like this:

  • when point is at line 1 and column 0, haskell-indentation-find-indentations should return (0);
  • when point is at line 2 and column 0, haskell-indentation-find-indentations should return (4);
  • et cetera.

If second like starts with -> there must be only one indentation position — 4, otherwise only 0. If however the line is empty, it's natural to assume two possible indentation positions: 0 and 4, see third line (it's missing in example, but my helper adds one empty line after every snippet, so it's OK).

Is this sort of behavior supported?

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

That's not right, indenting after -> should be different from indenting before ->

better example:

f = let x 

indentation before x is different than indentation after x:

f = let
  x

but:

f = let x = 10
        y

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

Don't really see what is not right with my example. Of course it should be different. First line is "top-level" statement and indentation should stay at 0 there. Then second line may or may not continue type declaration. This depends on ->. If this thing is present, then it's part of type declaration and only possible indentation position is 4. If not, it's another "top-level" form, then only possible position is 0. Finally, third line is again may be "top-level" form as well as part of type declaration. When it's empty it's impossible to tell which one it is.

Your let example is simpler because it's clear that the next line will be continuation of let statement, because it is not complete.

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

Well, the current behaviour is more like:

    (hindent-test "17a A type for a function""
    fun :: Int
        -> Int"
      ((1 0) 0)
      ((1 10) 0 7)
      ((2 0) 4 7)
      ((2 7) 0 4)
      ((3 0) 0 4))

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

The reason is choice, the user may want the -> to align with ::, or with Int.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, Yes I know. Initially I wanted to test at the end of lines too. I was surprised that at the end of line collection of indentations is different. Column position shouldn't matter, imho. ((1 10) 0 7) looks strange to me, when we are at the first line, why should it propose anything but 0 (indentation of first line of a top-level statement).

Choice is OK, in conjunction with all other logic described above.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, I'm also considering more stupid solution, collection of rules that look at previous line and current line (possibly at many previous lines in some cases to get more indentation positions). This can be done in form of domain specific language with macros, for example. This stupid approach may be faster and simpler.

Decent parsing of Haskell syntax tree may be overkill for the task at hand. Maybe I'm naive in this thought, though.

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

Well, consider:

fun :: Maybe
       (String, String)

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

Yeah, it already simplifies haskell syntax. However I don't want it more stupid than it is, because it should give the correct indentations with respect to haskell syntax, which has some quirks.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, I will try to come up with implementation of my simplified approach. I believe it can work very well (some conceptions should be introduced and relations between them, it may be actually even elegant). At least it can fix bugs that annoy people right now and we could try how good it can be. Then you may replace it with decent parser when it's ready.

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

I don't want to discourage you, but making it simpler than it is right now will probably introduce more bugs. If you would like to help with the table based parser, that would be great!

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

@kuribas, one more thing. haskell-indent-mode supported insertion of function names in some cases if you press TAB twice, if memory serves me well. Does haskell-indentation-mode have something like this?

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

Not at this time.

@kuribas
Copy link
Contributor

kuribas commented Aug 15, 2015

Here's an idea for of a parser macro (which would compile to a hashtable):

(defparser case ()
  (keyword "case" expr
    (keyword "of" expr
      (alt
       (end-tokens (indentations current-indent))
       (keyword "|" (separated case-alt "|" nil))
       (seq "->" (statement-right expr))))))

(defparser statement-right (p)
  (alt
   (end-tokens
    (parse-end left-indent current-indent)))
   (seq
    (set current-indent)
    p))

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

Looks much better. I hope it's possible to write most part of the parser in this clear style, using decent abstractions.

@mrkkrp
Copy link
Contributor Author

mrkkrp commented Aug 15, 2015

Please write doc-strings for every function describing in details every layer of your new parser. This way, I and other people in future will be able to understand it and contribute. I am glad to help with this cleaner parser.

I won't work at my simplified parser for now. After all, I've other things to do :-)

@gracjan
Copy link
Contributor

gracjan commented Sep 4, 2015

Just checked the particular issue that spawned this discussion and it works. So closing this issue.

If you wish to continue discussing indentation by all means please open other discussion issue about it.

@gracjan gracjan closed this as completed Sep 4, 2015
@mrkkrp
Copy link
Contributor Author

mrkkrp commented Nov 8, 2015

@gracjan, For quite a while I wanted to point out that this still does not work. Here is how to reproduce:

module Main
  ()
where

Interestingly, if I put where at the beginning of line, all lines below it are indented. And since this is my style of writing module declaration, I'm constantly annoyed by this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants