Add canonical links to the PySpark docs page for published docs #482
So far, this PR only changes the published docs for version 3.1.1. If this approach looks fine, I will use a tool to apply the same changes to the other versions. This PR will touch a lot of files.
I will apply the same changes to the HTML documents of the other versions.
This PR has completed the
Merged to asf-site
@panbingkun thanks for doing this. However, I discovered that some of the generated canonical links are not valid URLs, for example: https://spark.apache.org/docs/latest/api/python/reference/api/pyspark.sql.DataFrame.groupBy.html
Yes, I noticed this too. The cause is not a bug in the update logic. According to normal logic, But in the latest version, this document has been moved to a different location: This issue will always exist. If the position of the document changes in the new version, the
Yes, exactly, and we should not change the URL structure of any documentation published in the future. I think the URL structure stays the same for docs after 3.2 (correct me if I am wrong). @panbingkun if it's not too much trouble, can we manually update the canonical links for docs before version 3.2? And we need to make sure we don't change the doc URL structure again in the future. cc @allanf-db @HyukjinKwon @zhengruifeng
@allisonwang-db Of course. I wrote a small tool last time; with a slight change to its logic it should be able to handle this, but I need to check carefully to make sure it is completely accurate this time. Please bear with me 😄 A new follow-up PR: #483
…atest version of the document
This PR is a follow-up to #482. #482 (comment)
As discussed above, because some document addresses changed after version `3.3.0`, the canonical link is incorrect. We are now correcting it, based on this rule: for files that do not exist in version `latest`, delete the corresponding ref; for the rest, update it.
Author: panbingkun <pbk1982@gmail.com>
Closes #483 from panbingkun/canonical_links_followup.
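The rule in the follow-up (delete the canonical link when the page is missing from `latest`, update it otherwise) can be sketched roughly as below. This is only an illustration, not the tool actually used in #483: the function name `fix_canonical`, the `latest_paths` set, and the assumption that a page keeps the same relative path in `latest` are all hypothetical. The base URL is taken from the example link quoted earlier in this conversation.

```python
import re

# Assumed base URL, copied from the example link in the discussion above.
LATEST_BASE = "https://spark.apache.org/docs/latest/api/python/"

# Matches a <link rel="canonical" ...> tag. Assumes rel appears before href,
# which is an assumption about how the tags were generated.
CANONICAL_RE = re.compile(r'<link\s+rel="canonical"[^>]*>', re.IGNORECASE)


def fix_canonical(html: str, rel_path: str, latest_paths: set) -> str:
    """Hypothetical helper applying the follow-up rule to one page.

    rel_path is the page's path relative to the PySpark docs root;
    latest_paths is the set of relative paths that exist in `latest`.
    """
    if rel_path not in latest_paths:
        # The page does not exist in `latest`: delete the canonical link.
        return CANONICAL_RE.sub("", html)
    # The page exists in `latest`: point the canonical link at it.
    tag = f'<link rel="canonical" href="{LATEST_BASE}{rel_path}" />'
    return CANONICAL_RE.sub(lambda _match: tag, html)
```

For example, `fix_canonical(page_html, "reference/api/pyspark.sql.DataFrame.groupBy.html", latest_paths)` would strip the tag if that relative path is not in `latest_paths`.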
This PR aims to add canonical links to the PySpark docs pages for published docs.
As discussed, we aim to fix the aforementioned issues by directly repairing the files of versions that have already been released.
Included versions:
3.1.1, 3.1.2, 3.1.3, 3.2.0, 3.2.1, 3.2.2, 3.2.3, 3.2.4, 3.3.0, 3.3.1, 3.3.2, 3.3.3, 3.4.0, 3.4.1
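As a rough sketch of how such a bulk edit over the listed versions might look (not the author's actual tool), the snippet below inserts a canonical link right after `<head>` in each HTML page. The directory layout `<docs_root>/<version>/api/python`, and the names `add_canonical` and `process_version`, are assumptions made for illustration only.

```python
from pathlib import Path

# Assumed base URL, copied from the example link in the conversation.
LATEST_BASE = "https://spark.apache.org/docs/latest/api/python/"

# Versions listed in the PR description.
VERSIONS = [
    "3.1.1", "3.1.2", "3.1.3", "3.2.0", "3.2.1", "3.2.2", "3.2.3",
    "3.2.4", "3.3.0", "3.3.1", "3.3.2", "3.3.3", "3.4.0", "3.4.1",
]


def add_canonical(html: str, rel_path: str) -> str:
    """Insert a canonical link after <head>, unless one is already present."""
    if 'rel="canonical"' in html:
        return html  # already has a canonical link; leave the page untouched
    head_pos = html.find("<head>")
    if head_pos == -1:
        return html  # no plain <head> tag found; skip rather than guess
    insert_at = head_pos + len("<head>")
    tag = f'<link rel="canonical" href="{LATEST_BASE}{rel_path}" />'
    return html[:insert_at] + "\n    " + tag + html[insert_at:]


def process_version(docs_root: Path, version: str) -> int:
    """Rewrite every HTML page of one version; returns the number changed.

    The <docs_root>/<version>/api/python layout is an assumption.
    """
    py_root = docs_root / version / "api" / "python"
    changed = 0
    for page in py_root.rglob("*.html"):
        rel_path = page.relative_to(py_root).as_posix()
        original = page.read_text(encoding="utf-8")
        updated = add_canonical(original, rel_path)
        if updated != original:
            page.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Running `process_version(Path("site/docs"), v)` for each `v` in `VERSIONS` would apply the change everywhere; the `site/docs` path is hypothetical. Skipping pages that already carry a canonical link keeps the operation safe to re-run.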