Wiki filenames for non-latin character are escaped / not UTF8 #24907

Open
bserem opened this issue May 24, 2023 · 4 comments
Labels
topic/wiki type/enhancement An improvement of existing functionality

Comments

@bserem

bserem commented May 24, 2023

Description

When creating a wiki page with non-Latin characters in the title (Greek in my case), the resulting Markdown file gets a name made of escaped characters. Please see the screenshots below.
It seems that wiki filenames are not using UTF-8?

Gitea Version

1.19.3

Can you reproduce the bug on the Gitea demo site?

Yes

Log Gist

No response

Screenshots

image

Git Version

2.38.5

Operating System

Linux gitea-gitea1 4.4.180+ #42962 SMP Sat Apr 8 00:14:24 CST 2023 x86_64 Linux

How are you running Gitea?

Official Docker image from: https://hub.docker.com/r/gitea/gitea

Database

SQLite

@wxiaoguang
Contributor

That's a legacy problem. Historically, the wiki filename system was not well designed. The current rule is that most characters are encoded in URL query-parameter format.
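To make the rule above concrete, here is a minimal Go sketch of what "encoded in URL parameter format" does to a Greek title. It assumes the encoding behaves like the standard library's `url.QueryEscape`; the helper name `escapeTitle` and the sample title are hypothetical and not taken from the Gitea source.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeTitle is a hypothetical helper: it percent-encodes a wiki title
// the way URL query parameters are encoded and appends the ".md" suffix.
// Assumption: this only approximates the rule described in the thread.
func escapeTitle(title string) string {
	return url.QueryEscape(title) + ".md"
}

func main() {
	// Each Greek letter is two bytes in UTF-8, and every byte becomes %XX.
	fmt.Println(escapeTitle("Γεια")) // prints %CE%93%CE%B5%CE%B9%CE%B1.md
}
```

The escaped name is still UTF-8 underneath: decoding the `%XX` sequences yields the original UTF-8 bytes of the title.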

@bserem
Author

bserem commented May 24, 2023

@wxiaoguang thanks for the response.

Is it safe to say the issue lies in this function?
https://github.com/go-gitea/gitea/blob/main/services/wiki/wiki.go#L49

	foundEscaped := false
	for _, filename := range filesInIndex {
		switch filename {
		case unescaped:
			// if we find the unescaped file return it
			return true, unescaped, nil
		case gitPath:
			foundEscaped = true
		}
	}

As I am not familiar with Gitea decisions, I'll ask: is there a reason to keep this as is? Will it break things if it gets updated?
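For readers who want to try the quoted loop in isolation, the following is a self-contained sketch of the same lookup. The function name `resolveWikiFile`, its signature, and the return after the loop are assumptions made for illustration (the quoted snippet does not show what happens after the loop); the real function in `wiki.go` may differ.

```go
package main

import "fmt"

// resolveWikiFile is a hypothetical stand-in for the quoted lookup.
// It reports whether a page file exists and which filename to use.
func resolveWikiFile(filesInIndex []string, unescaped, gitPath string) (bool, string) {
	foundEscaped := false
	for _, filename := range filesInIndex {
		switch filename {
		case unescaped:
			// If we find the unescaped file, return it immediately.
			return true, unescaped
		case gitPath:
			foundEscaped = true
		}
	}
	// Assumption: otherwise fall back to the escaped path.
	return foundEscaped, gitPath
}

func main() {
	escaped := "%CE%93%CE%B5%CE%B9%CE%B1.md"
	files := []string{"Home.md", escaped}
	found, name := resolveWikiFile(files, "Γεια.md", escaped)
	fmt.Println(found, name)
}
```

Under these assumptions the loop only decides which existing file to read; the name given to a newly created file is chosen by the caller through `gitPath`.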

@wxiaoguang
Contributor

wxiaoguang commented May 24, 2023

The real problem is more complicated than that.

If you have a chance to work on 1.20, you can take a look at my PR for the details (especially wiki_path.go):

Make wiki title supports dashes and improve wiki name related features #24143

(The purpose of that PR is just to clarify the details; it doesn't change the legacy behavior much, because there was already a lot of technical debt.)

@silverwind
Member

silverwind commented May 24, 2023

Note that we cannot dump the given title as-is into the file system, because not all strings are valid filenames; names containing / or :, for example, are not. URL-encoding them is the safe option. See here for regexes that match invalid filenames:

https://github.com/sindresorhus/filename-reserved-regex/blob/main/index.js

Also, on Windows, IIRC, there may be different issues because NTFS uses UTF-16 in filenames, but the UI sends UTF-8.
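As a rough illustration of the point about invalid filenames, the Go sketch below checks a title against a character class of commonly reserved filename characters and shows that URL-encoding removes them. The pattern is an approximation written for this example, not a copy of the linked regex, and the helper names `hasReservedChars` and `safeName` are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// reserved approximates characters that are problematic in filenames on
// common file systems: < > : " / \ | ? * and ASCII control characters.
// Assumption: this is an illustrative pattern, not the linked one.
var reserved = regexp.MustCompile(`[<>:"/\\|?*\x00-\x1f]`)

// hasReservedChars reports whether name contains a reserved character.
func hasReservedChars(name string) bool {
	return reserved.MatchString(name)
}

// safeName percent-encodes a title so that no reserved character remains.
func safeName(title string) string {
	return url.QueryEscape(title)
}

func main() {
	title := "a/b:c"
	fmt.Println(hasReservedChars(title))           // true
	fmt.Println(safeName(title))                   // a%2Fb%3Ac
	fmt.Println(hasReservedChars(safeName(title))) // false
}
```

The trade-off is the one reported in this issue: the same encoding that neutralizes / and : also turns every non-ASCII letter into a run of `%XX` sequences.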

@silverwind silverwind added type/enhancement An improvement of existing functionality topic/wiki and removed type/bug labels May 24, 2023