Generate a split sitemap (also fix robots.txt) #4639

Merged
sneridagh merged 3 commits into master from ree-split-sitemap on Apr 14, 2023

Conversation

reebalazs (Member) commented Apr 3, 2023

`sitemap.xml.gz` remains identical.

`sitemap-index.xml` can be used instead as an index file, which links to `sitemap1.xml.gz`, `sitemap2.xml.gz`, ...

The default index size is 2000 entries per file, chosen so that each file also stays under Google's limits (50k URLs per sitemap).

Also:

  • Fix robots.txt to not contain an internal link
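
To illustrate the split described above, here is a minimal sketch of how the index could be derived from the item count. It is not the code merged in this PR: `DEFAULT_BATCH_SIZE`, `sitemapFileCount`, `buildSitemapIndex`, and the `https://example.com` URL are hypothetical names chosen for illustration. Only the file naming (`sitemapN.xml.gz`) and the default of 2000 come from the description.

```typescript
// Hypothetical constant: the PR description states a default of 2000 entries per file.
const DEFAULT_BATCH_SIZE = 2000;

// Number of sitemapN.xml.gz files needed to cover all items.
function sitemapFileCount(
  totalItems: number,
  batchSize: number = DEFAULT_BATCH_SIZE,
): number {
  return Math.ceil(totalItems / batchSize);
}

// Build the body of sitemap-index.xml, one <sitemap> entry per batch.
// Assumes publicUrl contains no characters that need XML escaping.
function buildSitemapIndex(
  publicUrl: string,
  totalItems: number,
  batchSize: number = DEFAULT_BATCH_SIZE,
): string {
  const base = publicUrl.replace(/\/+$/, ''); // drop trailing slashes
  const count = sitemapFileCount(totalItems, batchSize);
  const entries: string[] = [];
  for (let i = 1; i <= count; i++) {
    entries.push(
      `  <sitemap>\n    <loc>${base}/sitemap${i}.xml.gz</loc>\n  </sitemap>`,
    );
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    '</sitemapindex>',
  ].join('\n');
}

// Usage: a site with 4500 items is split into three files.
const indexXml = buildSitemapIndex('https://example.com/', 4500);
```

With 2000 entries per file, each batch stays far below the 50k URL limit, and the index only grows by one entry per 2000 items.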

reebalazs (Member, Author) commented

Fixes #4638

netlify bot commented Apr 3, 2023

Deploy Preview for volto ready!

Name Link
🔨 Latest commit b18e3aa
🔍 Latest deploy log https://app.netlify.com/sites/volto/deploys/64394b4dc3e5050007871b61
😎 Deploy Preview https://deploy-preview-4639--volto.netlify.app

cypress bot commented Apr 3, 2023

Passing run #4939

0 489 20 0 Flakiness 0

Details:

Fix robots.txt to contain a public link
Project: Volto
Commit: b18e3aabf5
Status: Passed
Duration: 12:23
Started: Apr 14, 2023 12:50 PM
Ended: Apr 14, 2023 1:03 PM


@reebalazs reebalazs changed the title from "WIP Generate a split sitemap" to "WIP Generate a split sitemap (also fix robots.txt)" on Apr 5, 2023
@reebalazs reebalazs changed the title from "WIP Generate a split sitemap (also fix robots.txt)" to "Generate a split sitemap (also fix robots.txt)" on Apr 5, 2023
@sneridagh sneridagh requested review from davisagli and a team April 11, 2023 15:22
Although Google treats the file the same either way, it is more
correct not to send the `Content-Encoding: gzip` header: we want
to transfer the gzipped file as a binary, not have the browser
decode it on download.

(The other option would be to keep the content encoding header and
name the file `.xml` without the `.gz` ending. However, that would
only produce larger files when saved, with no extra benefit. It would
also be a backwards-incompatible change.)
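
The header choice discussed above can be sketched as follows. This is an illustration under assumptions, not the middleware from this PR: `gzippedSitemapHeaders` and `willClientDecompress` are hypothetical helpers, and the `application/gzip` media type and `Content-Disposition` value are assumed rather than taken from the Volto source.

```typescript
type ResponseHeaders = Record<string, string>;

// Headers for serving a .xml.gz file as an opaque binary download.
// Note that Content-Encoding is deliberately absent, so the client
// stores the gzipped bytes as-is instead of decoding them.
function gzippedSitemapHeaders(
  filename: string,
  byteLength: number,
): ResponseHeaders {
  return {
    'Content-Type': 'application/gzip', // assumed media type
    'Content-Disposition': `attachment; filename="${filename}"`,
    'Content-Length': String(byteLength),
  };
}

// True if the headers ask the client to gunzip the body transparently.
function willClientDecompress(headers: ResponseHeaders): boolean {
  return Object.keys(headers).some(
    (name) =>
      name.toLowerCase() === 'content-encoding' &&
      headers[name].toLowerCase().includes('gzip'),
  );
}

// Usage: the binary variant is not decoded; adding the header changes that.
const binaryHeaders = gzippedSitemapHeaders('sitemap1.xml.gz', 1234);
const encodedHeaders = { ...binaryHeaders, 'Content-Encoding': 'gzip' };
```

With `Content-Encoding: gzip` present, a browser would save decompressed XML under a `.gz` name, which is the mismatch the commit avoids.
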
Replace `http://backend` in the robots.txt provided by the backend with
the public-facing URL.

Also, publish the index file instead of the single sitemap file that
Google would reject.
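
A rough sketch of the robots.txt rewrite described above, assuming the backend emits its own internal address in the `Sitemap:` line. The function name `rewriteRobotsTxt` and the URLs `http://backend:8080/Plone` and `https://example.com` are hypothetical; the actual implementation in this PR may differ.

```typescript
// Replace the internal backend URL with the public one, and point the
// Sitemap entry at the index file instead of the single sitemap file.
function rewriteRobotsTxt(
  robotsTxt: string,
  internalUrl: string,
  publicUrl: string,
): string {
  const stripSlash = (url: string): string => url.replace(/\/+$/, '');
  // split/join replaces every literal occurrence without regex escaping
  const withPublicUrl = robotsTxt
    .split(stripSlash(internalUrl))
    .join(stripSlash(publicUrl));
  return withPublicUrl.replace(/\/sitemap\.xml\.gz\b/g, '/sitemap-index.xml');
}

// Usage with a hypothetical backend response:
const backendRobots =
  'Sitemap: http://backend:8080/Plone/sitemap.xml.gz\nUser-agent: *\nDisallow: /search\n';
const publicRobots = rewriteRobotsTxt(
  backendRobots,
  'http://backend:8080/Plone',
  'https://example.com/',
);
```
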
@sneridagh sneridagh merged commit ba727b3 into master Apr 14, 2023
@sneridagh sneridagh deleted the ree-split-sitemap branch April 14, 2023 13:26
sneridagh added a commit that referenced this pull request Apr 14, 2023
Co-authored-by: Balázs Reé <ree@greenfinity.hu>
sneridagh added a commit that referenced this pull request Apr 17, 2023
* master: (22 commits)
  Release changelog notes for 16.20.1
  Release 17.0.0-alpha.5
  Generate a split sitemap (also fix robots.txt) (#4639)
  Fix search block in edit mode re-queries multiple blocks with an empty search text (#4694)
  Fix Move to top of folder ordering in folder content view (#4691)
  Changelog
  Revert "Add current page parameter to the route in the listing and search block pagination (#4159)" (#4695)
  Release generate-volto 7.0.0-alpha.4
  Force the resolution of the `react-error-overlay` package to `6.0.9` (#4687)
  Fix training links (#4635)
  Release 17.0.0-alpha.4
  Release changelog notes for 16.20.0 (#4684)
  Update to latest backend versions (#4682)
  Support RelationList field with StaticCatalogVocabulary and SelectWidget. (#4614)
  Load a theme via a `theme` key in `volto.config.js` or in `package.json` (#4625)
  docs: improve creating view documentation (#4636)
  fix sitemap.xml.gz is not compressed #4622 (v2) (#4663)
  Make URL a literal string to fix broken link (#4667)
  Move developer guidelines to contributing #4665 (#4666)
  Update Volto contributing to align with and refer to the new Plone co… (#4634)
  ...
sneridagh added a commit that referenced this pull request May 10, 2023
* master: (83 commits)
  Apply suggestion from browser for password field (#4524)
  (fix):Object.normaliseMail: Cannot read properties of null (#4558)
  Fix link in Volto, remove from linkcheck ignore in Documentation v6.0 (#4742)
  added documentation regarding the static middleware #4518 (#4736)
  Closes issue #4567 (#4570)
  Fix whitespace in locales created by the generator (#4737)
  Tidy up from synch with 16.x.x (#4728)
  Release notes from 16.20.2, 16.20.3, 16.20.4 (#4729)
  Use new URL 6.docs.plone.org (#4726)
  Security upgrade for momentjs (#4715)
  Don't decode querystring while adding apiExpanders (#4719)
  (Synchronize redundant block id in listing block on pasting) Fix duplicating listing block (#4239)
  Translate add-on control panel (cleaned up PR) (#4620)
  Fix Move to top of folder ordering in folder content view by searchin… (#4709)
  Fix robot.txt - the sitemap link should respect x-forwarded headers (#4704)
  Refactor faulty reorder elements in ObjectBrowserList widget (#4703)
  ...
sneridagh added a commit that referenced this pull request May 11, 2023
* master: (185 commits)
  fix: unresponsive add page (#4507)
  ...