UTF-8, MIME Type Declaration, BOMs, and other encoding standards #198
Comments
Moved the UTF-8 bug hunting work to #200.
If a githubRepoPage has multiple javascriptBlobs, open the import page in a new tab. Accepts POST requests on the import page as well.
This and the following commits should be PR ready, and should not break existing routes. This means refactored code that affects more than one route will be duplicated and renamed. We can easily clean up the extra code after implementing the entire refactor.
I just imported `/ZrenTest/TestUserScript/folder/folder/%E3%83%86%E3%82%B9%E3%83%88.user.js`. On dev, it redirects okay. On production it redirects to https://openuserjs.org/scripts/zrentest/%C6%B9%C8 instead. The script itself imports fine.
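The two URLs in this report can be compared byte by byte. The Node.js sketch below decodes the imported file name as UTF-8, then shows that keeping only the low byte of each UTF-16 code unit (which is what Node's `latin1` encoding does) gives the same bytes as the production redirect. This is only one plausible explanation of the difference, not a confirmed diagnosis, and the helper name `percentEncodeBytes` is made up for illustration.

```javascript
// Hypothetical helper: percent-encode every byte of a Buffer.
function percentEncodeBytes(buf) {
  return Array.from(buf, function (b) {
    return '%' + b.toString(16).toUpperCase().padStart(2, '0');
  }).join('');
}

// The file name segment from the imported path, decoded as UTF-8.
const name = decodeURIComponent('%E3%83%86%E3%82%B9%E3%83%88');

// Re-encoding the decoded name as UTF-8 round-trips to the original escapes.
const asUtf8 = percentEncodeBytes(Buffer.from(name, 'utf8'));

// Encoding it as latin1 truncates each code unit to its low byte.
const asLatin1 = percentEncodeBytes(Buffer.from(name, 'latin1'));

console.log(name, asUtf8, asLatin1);
```

Here `name` is `テスト` (U+30C6 U+30B9 U+30C8) and `asLatin1` is `%C6%B9%C8`, which matches the path seen on production. That would be consistent with a single-byte encoding step somewhere in the production redirect path.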
One thing I've noticed is there usually isn't a clear path of:
With a specific example... in our search routines, Input is currently converted to a Working ISO 8859-1 "sort of" regular-expression-compatible syntax via deprecated Ideally, the Working encoding should always be UTF-8, but we know regular expressions have some issues (as do some other support routines expecting something else)... which is why they aren't always reliable to use with variant Input. Another reason why I opted out of using a re with sanitizing certain linked For the docs, we should clarify for PRs that all transformations like this should be commented briefly in the source as to what the I/W/O is, and perhaps use a variable name identifier prefix... unless it's in HTML/XML, where the tags could of course denote this.
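Reading I/W/O as Input/Working/Output, the commenting and prefix convention suggested here might look like the sketch below. The function `buildSearchRegExp` and the `input`/`working`/`output` prefixes are hypothetical. This is not the project's actual search routine, only an illustration of annotating each transformation step.

```javascript
// Hypothetical search helper illustrating I/W/O comments and prefixes.
function buildSearchRegExp(inputQuery) {
  // Input: raw user text as a JavaScript (UTF-16) string, any characters.
  // Working: the same string with RegExp metacharacters backslash-escaped.
  const workingPattern = inputQuery.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

  // Output: case-insensitive, Unicode-aware RegExp matching the literal text.
  const outputRegExp = new RegExp(workingPattern, 'iu');
  return outputRegExp;
}
```

With this shape a reviewer can see at a glance which form of the data each variable holds, for example `buildSearchRegExp('a.b')` matches `A.B` but not `aXb`.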
If running Linux, a good recursive method for detecting these is:

``` sh-session
$ find . -type f -print0 | xargs -0r awk '/^\xEF\xBB\xBF/ {print FILENAME} {nextfile}'
```

Applies to OpenUserJS#200 and OpenUserJS#198
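On systems without `find`, `xargs`, and `awk`, roughly the same check could be sketched in Node.js. The names `hasUtf8Bom` and `findBomFiles` are illustrative, not part of the project. The sketch walks every directory it is given, so a real run would probably want to skip `node_modules` and `.git`.

```javascript
const fs = require('fs');
const path = require('path');

// True when the buffer starts with the UTF-8 BOM bytes EF BB BF.
function hasUtf8Bom(buf) {
  return buf.length >= 3 && buf[0] === 0xEF && buf[1] === 0xBB && buf[2] === 0xBF;
}

// Recursively collect regular files under `dir` that start with a BOM.
function findBomFiles(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...findBomFiles(full));
    } else if (entry.isFile()) {
      // Read only the first three bytes rather than the whole file.
      const fd = fs.openSync(full, 'r');
      const head = Buffer.alloc(3);
      const bytesRead = fs.readSync(fd, head, 0, 3, 0);
      fs.closeSync(fd);
      if (hasUtf8Bom(head.subarray(0, bytesRead))) {
        found.push(full);
      }
    }
  }
  return found;
}
```

Calling `findBomFiles('.')` from the repository root returns the paths of the offending files, which could then be printed or used to fail a pre-commit check.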
…ion-4)

Applies to OpenUserJS#198 and OpenUserJS#200

**NOTE**
* Why is there a `defer` flag for ace? Could be related to OpenUserJS#148
* Currently ignoring `./public/js`
This has more or less been determined... closing.
* Similar to @janekptacijarabaci's fix in greasemonkey/greasemonkey#1940
* Fix compliance with STYLEGUIDE.md and usage of pre-initialized identifiers
* Currently **do not** propagate BOM with meta routine or user.js source with and without installation count increment
* BOM currently shows up in Ace as an exclamation triangle with "This character may get silently deleted by one or more browsers"

**NOTE**: Many thanks to the report by @cvzi. Applies to OpenUserJS#200 and partially outlined in OpenUserJS#198.
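Not propagating the BOM when serving source could be sketched as below. The helper name `stripBom` is hypothetical and this is not the project's actual code. The sketch assumes the source has already been decoded from UTF-8 into a JavaScript string.

```javascript
// Hypothetical helper: drop a leading BOM from decoded source text.
// After UTF-8 decoding, the bytes EF BB BF become the single character U+FEFF.
function stripBom(source) {
  return source.charCodeAt(0) === 0xFEFF ? source.slice(1) : source;
}

// Buffer#toString('utf8') keeps the BOM, so it has to be removed explicitly.
const raw = Buffer.from([0xEF, 0xBB, 0xBF, 0x2F, 0x2F, 0x20, 0x68, 0x69]); // BOM + "// hi"
const text = stripBom(raw.toString('utf8'));
```

Text without a BOM, including the empty string, passes through unchanged, so the helper is safe to apply to every script before it is sent.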
…keeps things consistent *hopefully*

* Using uppercase as mentioned at OpenUserJS#198 (comment)

Applies to OpenUserJS#678

Related to:
* OpenUserJS#348 discovery
* OpenUserJS#200
* OpenUserJS#198
This is going to be a very general issue ticket in order to discuss and address a few inconsistencies with node projects, including ours. And I'd like to hear some experiences encountered with our node project and any others.
DONE - UTF-8: All node projects should be using uppercase `UTF-8` and not `utf-8`, according to the spec. When this is not consistent, it can lead to unpredictable behavior between deployments, whether on dev or production servers. Some book page websites may say it can be lowercase, but some software components may not be smart enough to distinguish the difference.

DONE - Byte Order Mark (BOM): BOMs are said to be not used everywhere... currently we have one in at least one file. This should be remedied on a system that can control that. In my experience, this usually happens when a Unicode character is inserted into a file and saved. Our current STYLEGUIDE hints at this with "Avoid use of international characters because they may not read well or be understood everywhere." Unfortunately, I don't see an easier way to detect whether a PR or commit is generating these or not.
In general encodings may need to be explicitly defined in contradiction to our current STYLEGUIDE saying the server handles it.
_EDIT_:
MIME types: These should always be included rather than having the server/client guessing off of file extensions.
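Declaring the MIME type and charset explicitly, so that neither side has to guess from a file extension, could look like this in a plain Node.js response handler. The helper `sendText` is an assumption for illustration. The project may set its headers differently, for example through its web framework.

```javascript
// Hypothetical helper: send a text body with an explicitly declared
// MIME type and an uppercase UTF-8 charset label.
function sendText(res, mimeType, body) {
  // Encode once so Content-Length counts bytes, not UTF-16 code units.
  const payload = Buffer.from(body, 'utf8');
  res.writeHead(200, {
    'Content-Type': mimeType + '; charset=UTF-8',
    'Content-Length': payload.length
  });
  res.end(payload);
}
```

A route serving a userscript would then call something like `sendText(res, 'text/javascript', source)`, naming the type at the call site instead of deriving it from the `.user.js` suffix.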
See also:
Applies to and isolated from #19. Most of this will go in either STYLEGUIDE and/or CONTRIBUTING