
Subversion history is incomplete when a folder has been renamed #3443

Closed
harrisric opened this issue Mar 2, 2021 · 4 comments · Fixed by #4172

Comments

@harrisric
Contributor

Describe the bug
For a Subversion repository where a folder has been renamed and a file within that folder has subsequently been amended, the cached history created by the indexer is incomplete. This is similar to #761.
If the cached history is deleted or otherwise bypassed, the history gathered via SubversionHistoryParser for the single file is correct.

This is reproducible in the latest version of OpenGrok.

To Reproduce
This can be seen by replacing the existing svnlog.dump file in the testdata with the one attached which has 4 additional commits.
svnlog.zip

Using a test over the Subversion repository in FileHistoryCacheTest, you can then verify the history of the file, which should have 3 history entries:

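        // File inside the renamed folder, taken from the attached svnlog.dump.
        File testFileInRenamedFolder = new File(reposRoot.toString() +
          File.separatorChar + "renamedFolder" + File.separatorChar + "FileInRenamedFolder.txt");
        // History as stored in the file history cache.
        History cachedHistory = cache.get(testFileInRenamedFolder, repo, false);
        // History fetched directly from the repository, bypassing the cache.
        History fullHistory = repo.getHistory(testFileInRenamedFolder);
        // The two should match; with this bug the cached history is incomplete.
        assertEquals(fullHistory.getHistoryEntries(), cachedHistory.getHistoryEntries());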

Expected behavior
The history produced by the initial indexing of a Subversion repository should contain the complete history for a file in a folder that has been renamed.
The correct history in this case can be seen by bypassing the cache (as shown in the test snippet). The cached history would match it if the initial repository History were able to correctly identify the file as renamed.

Additional context
The subversion history contains renames of folders in the following format:

<path
   text-mods="false"
   kind="dir"
   copyfrom-path="/repo/originalfoldername/src"
   copyfrom-rev="564779"
   action="A"
   prop-mods="false">/repo/newfoldername/src</path>
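
To make the format concrete, here is a minimal sketch of how such entries could be picked out of `svn log --xml -v` output. It is not the code in SubversionHistoryParser: the class and method names are hypothetical, and it uses the JDK's DOM parser purely for brevity. It treats every added directory that carries a copyfrom-path attribute as a copied (and possibly renamed) directory.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of OpenGrok.
final class DirectoryCopyFinder {

    /** Maps each copied directory's new path to its copyfrom-path. */
    static Map<String, String> findCopiedDirectories(String svnLogXml) {
        NodeList paths;
        try {
            paths = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(svnLogXml.getBytes(StandardCharsets.UTF_8)))
                    .getElementsByTagName("path");
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse svn log XML", e);
        }

        Map<String, String> copies = new LinkedHashMap<>();
        for (int i = 0; i < paths.getLength(); i++) {
            Element path = (Element) paths.item(i);
            boolean isDirectory = "dir".equals(path.getAttribute("kind"));
            boolean isAdded = "A".equals(path.getAttribute("action"));
            // An added directory with a copy source is a copy or a rename.
            if (isDirectory && isAdded && path.hasAttribute("copyfrom-path")) {
                copies.put(path.getTextContent().trim(), path.getAttribute("copyfrom-path"));
            }
        }
        return copies;
    }
}
```

For the entry shown above, the returned map would contain /repo/newfoldername/src mapped to /repo/originalfoldername/src. It says nothing about which files were inside the folder at that revision, which is the gap discussed next.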

I don't think there is an easy way to determine the contents of the folder at the point where the parser encounters such an entry, so it is not obvious how the history parser could turn it into a list of renamedFiles.
The approach I have considered is to extend History to track renamed directories as well. History.isRenamed could then check both the file and the directory.
The downside is that this catches too many files: any file in a directory that was ever renamed is treated as a renamed file, which may degrade the performance of initial indexing. In extreme cases where much of the repository has been renamed, consuming the complete history might become largely redundant, and generating the history for each file independently might be preferable. This possibly overlaps with the comments on #3243 about chunking the history consumption.
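
As a rough illustration of the directory-tracking idea, the sketch below keeps a second set of renamed directories and has isRenamed walk up a path's ancestors. RenamedPaths is a hypothetical stand-in, not OpenGrok's History class, and it assumes paths are plain strings separated by '/'.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the renamed-path bookkeeping; not OpenGrok's History.
final class RenamedPaths {
    private final Set<String> renamedFiles = new HashSet<>();
    private final Set<String> renamedDirectories = new HashSet<>();

    void addRenamedFile(String path) {
        renamedFiles.add(path);
    }

    void addRenamedDirectory(String path) {
        renamedDirectories.add(path);
    }

    /** True if the file itself, or any directory above it, was recorded as renamed. */
    boolean isRenamed(String path) {
        if (renamedFiles.contains(path)) {
            return true;
        }
        // Check every ancestor directory, from the closest one upwards.
        int slash = path.lastIndexOf('/');
        while (slash > 0) {
            if (renamedDirectories.contains(path.substring(0, slash))) {
                return true;
            }
            slash = path.lastIndexOf('/', slash - 1);
        }
        return false;
    }
}
```

The over-matching described above is visible here: once /repo/newfoldername/src is recorded, every file below it is reported as renamed, including files added long after the rename.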

@vladak
Member

vladak commented Mar 2, 2021

I think the Subversion history implementation in OpenGrok needs to receive proper support for renamed files, as is done for Git/Mercurial.

@harrisric
Contributor Author

harrisric commented Mar 2, 2021

@vladak initial support for renamed files, similar to the Mercurial approach, was implemented under #3095. The problem here is that folder renames have a more problematic structure.
The Git approach seems to use a specific command to find renamed files separately from the core history parse.

It looks like in svn, if we hit a folder rename, we could use a separate command to find the folder contents at that revision, e.g.
svn ls repo/renamedFolder -r12345
(where 12345 is the revision at which we found the folder rename). This would give a simple list of the files present at that revision, which could then become the renamedFiles.
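
A minimal sketch of that idea, assuming the listing is fetched with the svn command-line client. SvnFolderListing and its methods are hypothetical names. The `--depth infinity` option is an addition to the command above so that files in subfolders are listed too, and the parser relies on `svn ls` printing directories with a trailing slash.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenGrok.
final class SvnFolderListing {

    /** Builds the svn command that lists a folder's contents at a given revision. */
    static List<String> listCommand(String folder, long revision) {
        return List.of("svn", "ls", "--depth", "infinity",
                "-r", Long.toString(revision), folder);
    }

    /** Turns the output of the command into file paths, skipping directories. */
    static List<String> filesFromListing(String folder, String svnLsOutput) {
        List<String> files = new ArrayList<>();
        for (String line : svnLsOutput.split("\\R")) {
            String entry = line.trim();
            // svn ls marks directories with a trailing slash; only files are wanted.
            if (entry.isEmpty() || entry.endsWith("/")) {
                continue;
            }
            files.add(folder + "/" + entry);
        }
        return files;
    }
}
```

The command list can be handed to `new ProcessBuilder(SvnFolderListing.listCommand("repo/renamedFolder", 12345))`, and the process output passed to filesFromListing to obtain the candidate renamedFiles. Each folder rename then costs one extra svn invocation, so this trades the over-matching of the directory-tracking approach for additional calls during indexing.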

@sreehari83

Any update on this request?

@rkrishnas81

Any update or any workaround? Please help me.
