Treebank development in master branch #520

Closed
msklvsk opened this issue Jan 2, 2018 · 4 comments
@msklvsk
Member

msklvsk commented Jan 2, 2018

UD’s current model is to push new commits to dev and merge to master twice a year. This often causes confusion: people expect new commits to be on master, so they visit a treebank’s main page (which shows master) to see what’s happening. Instead of news, they see a state that is, on average, three months old.
You tell a friend to go and check your updates, and he writes back that there aren’t any, because you forgot to mention he should switch to dev.

What stops us from using a classical model, with stable releases branching out from master? You can even push bugfixes for old releases this way.

@jnivre
Contributor

jnivre commented Jan 2, 2018 via email

@msklvsk
Member Author

msklvsk commented Jun 14, 2018

Let me change this from a question to a suggestion.

The current workflow is anti-github.

  • Just recently, I found people using our treebank from master and stumbling upon errors we’ve already fixed in dev.
  • Somebody is constantly pushing to master, as happened yesterday: “Move README to README.md so it displays correctly” (UD_Japanese-GSD#5).
  • People also visit the repo’s index, which is master, and expect it to show what’s happening: latest commits, announcements in the readme, etc.

You can’t blame people for using GitHub the way it is supposed to be used. The policy of protecting users who are doing it wrong at the expense of others is questionable. If you “prefer to download from GitHub instead”, you should know that master is the latest. And for the uninitiated, there’s Lindat. However, I would give our users more credit.

I believe the current solution does more harm than good, and I propose developing in master and branching releases out.

@msklvsk msklvsk reopened this Jun 14, 2018
@msklvsk msklvsk removed the question label Jun 14, 2018
@dan-zeman
Member

We did not come up with the current policy out of the blue. Unfortunately, there were already paper submissions presenting results obtained on a GitHub revision of a treebank, instead of a numbered release. The current policy is a reaction to that experience. UD treebanks are not used the same way as typical open software, so it should not be too surprising if the workflow differs from the typical GitHub workflow.

It would have been possible to leave it to paper authors to make sure they do not take their data from GitHub. Papers with unreproducible and unverifiable results should be rejected anyway. But the collective decision was to adopt the current policy. With 144 treebanks, reversing the policy would no longer be a simple step, and the benefits do not outweigh the drawbacks. (Personal note: I felt quite indifferent about the switch to the current policy when it was being adopted. But now I feel strongly against changing it unless there are extremely convincing reasons that the current policy is evil.)

@dan-zeman dan-zeman added this to the later milestone Jun 17, 2018
@jnivre
Contributor

jnivre commented Jun 18, 2018

I agree with Dan. While it may be unfortunate if our policy clashes with the expectations of GitHub users, our use case is in many ways different from the standard one. And changing it now could potentially create total chaos, so we would have to think this through very carefully before doing anything like that.
