Extension xhtml ignored by Google translate #153

Open
KhronosWebservices opened this issue May 21, 2024 · 13 comments

Comments

@KhronosWebservices

A Russian speaker has discovered that .xhtml pages cannot be translated by Google Translate. For example, this .html page translates fine:

https://registry-khronos-org.translate.goog/OpenGL-Refpages/gl4/?_x_tr_sl=en&_x_tr_tl=ru

This page does not:
https://registry-khronos-org.translate.goog/OpenGL-Refpages/gl4/html/all.xhtml?_x_tr_sl=en&_x_tr_tl=ru

Instead, the request gets redirected to:
https://registry-khronos-org.translate.goog/OpenGL-Refpages/gl4/html/all.xhtml

After changing the extension to just .html, the page translates without any issues. I imagine others are having translation issues on our ref pages as well.

Is there anything that can be done to fix this issue?

@oddhack
Contributor

oddhack commented May 22, 2024

The toolchain is Docbook 4 -> XHTML Transitional. Switching to Docbook 5 -> HTML would be a huge amount of work. I do not know what happens these days if we simply rename .xhtml -> .html, as I have barely touched this in a decade. If you're confident that it would be benign on all the major browsers and platforms, we could try that, although we would also have to establish redirects.

@KhronosWebservices
Author

I'll set up a test folder with .html files inside and see how that goes.

@KhronosWebservices
Author

This appears to work as expected: https://registry.khronos.org/OpenGL-Refpages/gl4/html_test/

The only significant change was to modify the <script src=/> tag to <script src=""></script>; otherwise it seems to work well.

@BuslikDrev

BuslikDrev commented May 22, 2024

https://registry.khronos.org/OpenGL-Refpages/gl4/html_test/
I checked all the pages; translation now works via https://translate.google.com/?op=websites.
Thank you.

@KhronosWebservices
Author

Great, thank you for helping us resolve this issue.

@KhronosWebservices
Author

@oddhack Changing to .html fixed the issue. If we have any other areas that are only .xhtml, it might be worth setting up something similar there.

In the meantime, what is the best way to make the new .html extension files the permanent go-to for the OpenGL 4 RefPages?

@oddhack
Contributor

oddhack commented May 23, 2024

All the various versions of the GL refpages and the EGL refpages use the same toolchain. I will need to understand exactly what you did aside from the file renaming and the script tag and replicate that in the Makefiles, then set up redirects for .xhtml -> .html and change the index generation scripts.

@KhronosWebservices
Author

No other changes were required apart from modifying the script tag, renaming .xhtml to .html and updating all the links in all the files to point to .html instead of .xhtml.
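The rename-and-relink step described above could be sketched roughly as follows. This is only an illustration, not the script that was actually used on the test folder: the function names `rewrite_links` and `convert_tree` are hypothetical, and the sketch assumes that links are relative `href`/`src` attributes in double quotes. It copies into a separate output directory so the original .xhtml files stay in place.

```python
import re
from pathlib import Path

# Matches double-quoted href/src values that end in .xhtml, optionally followed
# by a #fragment. The negative lookahead skips absolute URLs with a scheme
# (https:, mailto:, ...), so only relative links are rewritten.
LINK_RE = re.compile(
    r'((?:href|src)\s*=\s*")(?![A-Za-z][\w+.-]*:)([^"#?]*)\.xhtml((?:#[^"]*)?")'
)


def rewrite_links(markup: str) -> str:
    """Point relative .xhtml link targets at the equivalent .html file."""
    return LINK_RE.sub(r'\1\2.html\3', markup)


def convert_tree(src_dir: Path, dst_dir: Path) -> int:
    """Copy each .xhtml file in src_dir to dst_dir as .html with links updated.

    Returns the number of files written. src_dir itself is not modified.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(src_dir.glob("*.xhtml")):
        text = src.read_text(encoding="utf-8")
        target = dst_dir / (src.stem + ".html")
        target.write_text(rewrite_links(text), encoding="utf-8")
        count += 1
    return count
```

A call such as `convert_tree(Path("html"), Path("html_test"))` would mirror the test-folder approach; the directory names here are placeholders.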

AFAIK once the Makefiles are updated from xhtml to html, <script/> and other various tags should automatically be rendered as <script></script> and ?

@oddhack
Contributor

oddhack commented May 23, 2024

> AFAIK once the Makefiles are updated from xhtml to html, <script/> and other various tags should automatically be rendered as <script></script> and ?

"Automatically" in the sense of needing to run a script over the output .xhtml document. There is no way to generate HTML5 output from Docbook 4 source, which predates HTML5 and is obsolete, but maybe this will patch around it. Did you happen to run an HTML5 validator over the test directory?

This might be good motivation to finally convert the refpage source to asciidoc markup.

@KhronosWebservices
Author

> Did you happen to run an HTML5 validator over the test directory?

No, a little nervous of the output... but I will today.

@oddhack
Contributor

oddhack commented May 23, 2024

My impression is that invalid HTML is likely to downrank a page in search results. Right now the right thing generally happens when you search for a GL entry point: the top result is likely to be the XHTML refpage, which in turn is probably valid XHTML because it comes from the Docbook toolchain. If we convert it to HTML5 by postprocessing but the result is not valid HTML5, that might change.

@KhronosWebservices
Author

KhronosWebservices commented May 23, 2024

There are some issues with the current xhtml pages and some general issues from converting:

  1. <table style="cellpadding: 0; cellspacing: 0;"> should be <table style="border-collapse: collapse;">. This is a CSS error, as cellpadding and cellspacing do not exist in CSS.
  2. Self-closing tags for non-void elements need to be either closed explicitly or removed. <footer/> and <header/> are the culprits here; I've removed those on the test page. There was also a <span class="trademark"/> tag, which I think was supposed to be <span class="trademark">©</span>?
  3. Void elements such as <col/>, <link/> and <meta/> should be written as <col>, <link> and <meta>. <script/> is not a void element and should become <script></script>.
  4. Remove type="text/javascript", which is no longer needed in script tags.
  5. Update <html> to be <html lang="en">.
  6. Add the character encoding meta tag <meta charset="utf-8">.
  7. Remove the XML declaration from line 1.

Test page with the changes applied, which validates green: https://registry.khronos.org/OpenGL-Refpages/gl4/html_test/glBindFragDataLocationIndexed.html

Applying the same changes to a different, randomly chosen page also produced a page that validates.
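For illustration, the seven changes listed above could be approximated with a regex-based postprocessing pass like the one below. This is a rough sketch under several assumptions, not the procedure actually applied to the test pages: the function name `xhtml_to_html5` is hypothetical, the void-element tuple is a partial list, attributes are assumed not to contain a literal `>`, and the empty `<span class="trademark"/>` is only given an explicit closing tag because its intended content is uncertain. Regexes over markup are fragile, so any output would still need to go through a validator.

```python
import re

# Partial list of HTML void elements; these must not have a closing tag.
VOID_ELEMENTS = ("col", "link", "meta", "br", "hr", "img", "input")


def xhtml_to_html5(markup: str) -> str:
    """Apply the listed XHTML -> HTML5 fixes to one page (heuristic sketch)."""
    # Item 7: drop the XML declaration at the top of the file, if present.
    markup = re.sub(r'^\s*<\?xml[^>]*\?>\s*', '', markup)
    # Item 1: cellpadding/cellspacing are not CSS properties.
    markup = markup.replace('style="cellpadding: 0; cellspacing: 0;"',
                            'style="border-collapse: collapse;"')
    # Item 2: remove the empty self-closed <header/> and <footer/> elements.
    markup = re.sub(r'<(?:header|footer)\s*/>', '', markup)
    # Item 4: type="text/javascript" is the default and can be dropped.
    markup = re.sub(r'\s+type="text/javascript"', '', markup)
    # Item 3 (void elements): <col/> becomes <col>, and so on.
    void = '|'.join(VOID_ELEMENTS)
    markup = re.sub(rf'<({void})\b([^>]*?)\s*/>', r'<\1\2>', markup)
    # Item 3 (non-void elements): <script .../> becomes <script ...></script>.
    markup = re.sub(r'<([A-Za-z][\w-]*)\b([^>]*?)\s*/>', r'<\1\2></\1>', markup)
    # Item 5: add lang="en" to <html> unless a lang attribute already exists.
    markup = re.sub(r'<html\b(?![^>]*\slang=)([^>]*)>', r'<html lang="en"\1>',
                    markup, count=1)
    # Item 6: declare the character encoding right after <head>.
    if not re.search(r'<meta\s+charset=', markup, re.IGNORECASE):
        markup = re.sub(r'<head\b([^>]*)>', r'<head\1>\n<meta charset="utf-8">',
                        markup, count=1)
    return markup
```

The order matters in this sketch: the `type` attribute and the `<header/>`/`<footer/>` elements are handled before the generic self-closing rewrite, and the charset tag is inserted last so that it is not touched by the earlier substitutions.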

@oddhack
Contributor

oddhack commented May 24, 2024

TBH I would rather work on an asciidoc conversion. Postprocessing XHTML to HTML 5 looks really fragile / heuristic and I think the time will be better spent on a more modern toolchain though it will not be an instant fix. I'm sorry for the person reporting the translation problem as this will not immediately solve their problem.

BTW, if you write comments with HTML tags in GitHub (or GitLab), the tags are prone to disappearing in the web view, and a lot of your comment above suffers from this. Compare

(self-closing 'col' tag, which does not render)

with

<col/> (same tag with a backslash before the left angle bracket).

Comments are treated as GitHub-Flavored Markdown, and angle brackets get special treatment. Extremely annoying.
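As a small illustration of the backslash workaround described above, the helper below escapes each `<` that appears to open an HTML tag before the text is pasted into a comment. The function name `escape_tags_for_markdown` is hypothetical, and the sketch makes no attempt to skip text that is already inside backtick code spans, where escaping is unnecessary.

```python
import re

# A '<' that is not already escaped and is followed by a tag name
# (optionally preceded by '/'), i.e. something that looks like an HTML tag.
TAG_OPEN_RE = re.compile(r'(?<!\\)<(?=/?[A-Za-z])')


def escape_tags_for_markdown(text: str) -> str:
    """Backslash-escape '<' wherever it appears to start an HTML tag."""
    return TAG_OPEN_RE.sub(r'\\<', text)


if __name__ == "__main__":
    print(escape_tags_for_markdown("<script/> should become <script></script>"))
```

Wrapping tags in backticks is another common way to keep them visible in rendered Markdown.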
