Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Github migration fails with error 500 when issue comment is too long #7905

Closed
1 of 7 tasks
ashimokawa opened this issue Aug 18, 2019 · 11 comments
Closed
1 of 7 tasks
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/bug

Comments

@ashimokawa
Copy link
Contributor

  • Gitea version (or commit ref): g2d0b90c96
  • Git version: 2.20.1
  • Operating system: Debian 10
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • Yes (provide example URL)
    • No
    • Not relevant
  • Log gist:

Description

Trying to migrate https://github.com/Freeyourgadget/Gadgetbridge to codeberg-test.org running gitea master yields error 500 with the following in log:

2019/08/18 11:50:58 routers/repo/repo.go:320:MigratePost() [E] MigratePost: Error 1406: Data too long for column 'content' at row 1

@lunny
Copy link
Member

lunny commented Aug 22, 2019

Comment content have ~64KB size. Isn't it enough for almost all the situation?

@ashimokawa
Copy link
Contributor Author

@lunny
Yes, 64KB ought to be enough for everyone.
Anyway, I got that error above and that should not happen.
I will investigate which post exactly triggers the problem.
In any case 500 should be prevented by cutting off the text at a valid utf-8 boundary.

@guillep2k
Copy link
Member

I agree. The 64K limit sounds reasonable. But whatever the limit, we should check the available size before inserting. A post-migration list of warnings explaining what compromises had to be made during migration would be ideal, so the admin will be aware of the problems and can take corrective/compensatory actions.

@typeless
Copy link
Contributor

Besides a warning in the log, We can maybe reserve a little space to insert a line like ----- cut here, comment length exceeding 64KB -----\n when the text is too long.

@mrsdizzie
Copy link
Member

mrsdizzie commented Aug 26, 2019

I'm pretty sure Github comment restriction is the same 65536 characters as a TEXT column, so it shouldn't be possible to fill it up when migrating.

Perhaps some other bug is happening in this case and this is missing the bigger picture.

I tried to migrate the example repo to see what comment might have been causing it but the migration was successful and I wasn't able to reproduce (on sqlite3).

@guillep2k
Copy link
Member

Perhaps this could be relevant:

Values in VARCHAR columns are variable-length strings. The length can be specified as a value from 0 to 255 before MySQL 5.0.3, and 0 to 65,535 in 5.0.3 and later versions. The effective maximum length of a VARCHAR in MySQL 5.0.3 and later is subject to the maximum row size (65,535 bytes, which is shared among all columns) and the character set used.

I believe blob types (e.g. TEXT) are stored in a separate location and don't add up to row size.

Also, character encoding could be set to an improper value and increase the size of the encoded values (e.g. UCS-2).

@ashimokawa
Copy link
Contributor Author

OK, I think found out why this does not fail hard on try.gitea.io but on my test instance.
sql mode is strict on my server, so exceeding column size fails hard, but is just truncated when sql mode is more relaxed (I assume this is the case on try.gitea.io)

Nevertheless, comment size is bigger on github, so we should somehow handle this.

I added a quick hack to slice the string in go, and it worked. But doing this properly on a Rune boundary is the way to I assume.

@lunny
Copy link
Member

lunny commented Sep 27, 2019

We need a PR to check comment size before insert it to database.

@ashimokawa
Copy link
Contributor Author

@lunny
This is what I did locally, but I think someone with more proficiency in go should do the PR which checks the Rune boundary. Just cutting bytes is not nice.

@stale
Copy link

stale bot commented Nov 27, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

@stale stale bot added the issue/stale label Nov 27, 2019
@lunny lunny added the issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented label Nov 27, 2019
@stale stale bot removed the issue/stale label Nov 27, 2019
juergenhoetzel added a commit to juergenhoetzel/gitea that referenced this issue Jul 5, 2020
Prevents invalid UTF-8 encoding for Description and Website. Refs go-gitea#7905
lafriks added a commit that referenced this issue Jul 7, 2020
* Trim to 255 runes instead of bytes

Prevents invalid UTF-8 encoding for Description and Website. Refs #7905

* Apply suggestions from code review

Co-authored-by: zeripath <art27@cantab.net>

Co-authored-by: techknowlogick <matti@mdranta.net>
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
techknowlogick added a commit that referenced this issue Jul 7, 2020
* Trim to 255 runes instead of bytes

Prevents invalid UTF-8 encoding for Description and Website. Refs #7905

* Apply suggestions from code review

Co-authored-by: zeripath <art27@cantab.net>

Co-authored-by: techknowlogick <matti@mdranta.net>
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
ydelafollye pushed a commit to ydelafollye/gitea that referenced this issue Jul 31, 2020
* Trim to 255 runes instead of bytes

Prevents invalid UTF-8 encoding for Description and Website. Refs go-gitea#7905

* Apply suggestions from code review

Co-authored-by: zeripath <art27@cantab.net>

Co-authored-by: techknowlogick <matti@mdranta.net>
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
@wxiaoguang
Copy link
Contributor

The issue comment problem has been fixed in 1.16

@go-gitea go-gitea locked and limited conversation to collaborators Apr 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/bug
Projects
None yet
Development

No branches or pull requests

7 participants