feat: split command filename to old/new file #4

Open: wants to merge 2 commits into base: main

Conversation

thatlittleboy

This PR is mainly motivated by the following problems:

  1. The diff command line carries two filenames, but the parser currently parses everything after diff --git as a single filename node, which is wrong.
  2. As a result, these filenames are highlighted differently from the ones in the ---/+++ lines, whereas ideally they should carry the same semantic meaning. I propose parsing the line as diff --git (old_file) (new_file) so that the filenames get the same highlighting and semantic meaning as those in the diff output, --- (old_file) and +++ (new_file).

With this change, the filenames get the same syntax highlighting as the `---`/`+++` filenames below.
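
As a rough illustration of the proposed split, here is a minimal Python sketch. It is not the grammar change itself: the `HEADER` pattern and the `split_header` helper are hypothetical names, and the pattern assumes that neither filename contains whitespace.

```python
import re

# Hypothetical pattern modelling `diff --git (old_file) (new_file)`.
# Assumption: neither filename contains whitespace.
HEADER = re.compile(r"^diff --git (?P<old_file>\S+) (?P<new_file>\S+)$")


def split_header(line):
    """Return (old_file, new_file) for a `diff --git` line, or None if it does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    return match.group("old_file"), match.group("new_file")


print(split_header("diff --git a/.gitmodules b/.gitmodules"))
```

Under that assumption, the two names line up with the `---` and `+++` filenames of the same diff.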
@thatlittleboy
Author

On this input (taken from your playground):

diff --git a/.gitmodules b/.gitmodules
index d5bd61c9e..422671b4e 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -174,3 +174,7 @@
 	path = helix-syntax/languages/tree-sitter-git-commit
 	url = https://github.com/the-mikedavis/tree-sitter-git-commit.git
 	shallow = true
+[submodule "helix-syntax/languages/tree-sitter-git-diff"]
+	path = helix-syntax/languages/tree-sitter-git-diff
+	url = https://github.com/the-mikedavis/tree-sitter-git-diff.git
+	shallow = true

the query output is now:

a.diff
  pattern: 4
    capture: 4 - variable.builtin, start: (0, 0), end: (0, 38), text: `diff --git a/.gitmodules b/.gitmodules`
  pattern: 1
    capture: 1 - keyword, start: (0, 11), end: (0, 24), text: `a/.gitmodules`
  pattern: 0
    capture: 0 - string, start: (0, 25), end: (0, 38), text: `b/.gitmodules`
  pattern: 2
    capture: 2 - constant, start: (1, 6), end: (1, 15), text: `d5bd61c9e`
  pattern: 2
    capture: 2 - constant, start: (1, 17), end: (1, 26), text: `422671b4e`
  pattern: 1
    capture: 1 - keyword, start: (2, 0), end: (2, 17), text: `--- a/.gitmodules`
  pattern: 0
    capture: 0 - string, start: (3, 0), end: (3, 17), text: `+++ b/.gitmodules`
  pattern: 3
    capture: 3 - attribute, start: (4, 0), end: (4, 19), text: `@@ -174,3 +174,7 @@`
  pattern: 0
    capture: 0 - string, start: (8, 0), end: (8, 58), text: `+[submodule "helix-syntax/languages/tree-sitter-git-diff"]`
  pattern: 0
    capture: 0 - string, start: (9, 0), end: (9, 52), text: `+  path = helix-syntax/languages/tree-sitter-git-diff`
  pattern: 0
    capture: 0 - string, start: (10, 0), end: (10, 65), text: `+    url = https://github.com/the-mikedavis/tree-sitter-git-diff.git`
  pattern: 0
    capture: 0 - string, start: (11, 0), end: (11, 16), text: `+    shallow = true`

Notice that the a/.gitmodules and b/.gitmodules from the diff --git a/.gitmodules b/.gitmodules line are now picked up by the query. They get the same captures (keyword and string) as the --- a/.gitmodules and +++ b/.gitmodules lines, respectively.

@the-mikedavis the-mikedavis self-requested a review January 2, 2023 13:05
@the-mikedavis
Owner

I was interested in adding this but it's not straightforward if you have filenames with spaces in them:

diff --git a/a b.txt b/a b.txt
index 86e041d..46add00 100644
--- a/a b.txt   
+++ b/a b.txt   
@@ -1,3 +1,3 @@
 foo
-bar
+baz
 baz

On this branch:

$ tree-sitter parse f.diff
(source [0, 0] - [9, 0]
  (command [0, 0] - [0, 20]
    (old_file [0, 11] - [0, 14])
    (new_file [0, 15] - [0, 20]))
  (ERROR [0, 21] - [0, 30]
    (ERROR [0, 21] - [0, 30]))
  (index [1, 0] - [1, 29]
    (commit [1, 6] - [1, 13])
    (commit [1, 15] - [1, 22])
    (mode [1, 23] - [1, 29]))
  (old_file [2, 0] - [2, 7]
    (filename [2, 4] - [2, 7]))
  (ERROR [2, 8] - [2, 13]
    (ERROR [2, 8] - [2, 13]))
  (new_file [3, 0] - [3, 7]
    (filename [3, 4] - [3, 7]))
  (ERROR [3, 8] - [3, 13]
    (ERROR [3, 8] - [3, 13]))
  (location [4, 0] - [4, 15]
    (linerange [4, 3] - [4, 7])
    (linerange [4, 8] - [4, 12]))
  (context [5, 0] - [5, 4])
  (deletion [6, 0] - [6, 4])
  (addition [7, 0] - [7, 4])
  (context [8, 0] - [8, 4]))
f.diff	0 ms	(ERROR [0, 21] - [0, 30])
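
To see which text each node on the first line covers, the column ranges from the tree above can be applied to the header. This is only a sketch: the `spans` table is copied by hand from the parse output, and nothing here calls tree-sitter.

```python
header = "diff --git a/a b.txt b/a b.txt"

# (node name, start column, end column) on row 0, copied from the parse tree above
spans = [
    ("old_file", 11, 14),
    ("new_file", 15, 20),
    ("ERROR", 21, 30),
]

# Slice the header with each half-open column range.
covered = {name: header[start:end] for name, start, end in spans}

for name, text in covered.items():
    print(f"{name}: {text!r}")
```

This prints `old_file: 'a/a'`, `new_file: 'b.txt'` and `ERROR: 'b/a b.txt'`: the first filename is cut at its space, and the whole second filename ends up in the ERROR node.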

@thatlittleboy
Author

I see, that's a good point. Let me think about it and revisit this PR once I have a solution.
