I have a large DB with more than 100k records, so I need to split the sitemap into multiple files. Following your example code, I use SitemapAndIndexStream to generate all of the sitemap files, for example sitemap-1.xml through sitemap-9.xml.
Now I want to regenerate the sitemap every day, and each run should only cover the records added since the last item in sitemap-9.xml, up to the newest item in the DB. If I run SitemapAndIndexStream again, it queries every item in the DB and regenerates every file, which wastes resources.
How can I do this?
Yeah, you don't want to run SitemapAndIndexStream again.
Although... the DB index you use for this should already hold all of the sitemap records sequentially, so the actual cost of the query should be very low: it should not use much DB I/O or much DB CPU.
100k records is not a lot. That would only need to write a 2-3 record sitemap index and 2-4 sitemap xml files. Regenerating all of that would only take a minute or so with the XML serialization being the longest part.
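As a rough check on those numbers, here is a small sketch of the file-count arithmetic. The 50,000 URLs-per-file ceiling comes from the sitemaps.org protocol; the 45,000 figure in the second call is only an assumed example of a lower limit you might configure, and `sitemapFileCount` is a hypothetical helper, not part of the library.

```typescript
// Upper bound from the sitemaps.org protocol: 50,000 URLs per sitemap file.
const PROTOCOL_MAX_URLS = 50000;

// Number of sitemap files (and therefore sitemap index entries) needed
// to hold `records` URLs when each file holds at most `limitPerFile`.
function sitemapFileCount(
  records: number,
  limitPerFile: number = PROTOCOL_MAX_URLS
): number {
  if (limitPerFile < 1 || limitPerFile > PROTOCOL_MAX_URLS) {
    throw new RangeError("limitPerFile must be between 1 and 50000");
  }
  return Math.ceil(records / limitPerFile);
}

console.log(sitemapFileCount(100000));        // 2
console.log(sitemapFileCount(100000, 45000)); // 3 (assumed lower limit)
```

So "a bit over 100k" lands at two or three files depending on the per-file limit, which is why a full regeneration stays cheap at this size.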
The alternative is much more complex. You'll need to pull back the index file, get the latest sitemap XML file out of it, fetch that file, parse the latest sitemap into items, add to the items collection, rotate into a new file (and add it to the index) when the last file fills up, then put back the old latest file, any newly added files, and the index file.
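To give a feel for the rotation step described above, here is a minimal sketch of just the "append, and roll over when the last file is full" logic. It works on a hypothetical in-memory model (`SitemapFile`, `appendToSitemaps`, and the `sitemap-N.xml` naming are all assumptions made for illustration); fetching, parsing, and writing the XML and the index file are deliberately left out, and that is where most of the real complexity lives.

```typescript
// Hypothetical in-memory model: an ordered list of sitemap files, each with
// the URLs it already contains. Real code would fetch and parse the XML.
interface SitemapFile {
  name: string;
  urls: string[];
}

interface AppendResult {
  files: SitemapFile[]; // full updated list (what the index should describe)
  touched: string[];    // names of files that must be rewritten and uploaded
}

function appendToSitemaps(
  files: SitemapFile[],
  newUrls: string[],
  limit: number
): AppendResult {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  // Copy so the caller's data is not mutated.
  const result: SitemapFile[] = files.map((f) => ({
    name: f.name,
    urls: f.urls.slice(),
  }));
  const touched: string[] = [];

  for (const url of newUrls) {
    let last = result[result.length - 1];
    // Rotate into a new file when there is none yet or the last one is full.
    if (last === undefined || last.urls.length >= limit) {
      last = { name: "sitemap-" + (result.length + 1) + ".xml", urls: [] };
      result.push(last);
    }
    last.urls.push(url);
    if (touched.indexOf(last.name) === -1) touched.push(last.name);
  }
  return { files: result, touched: touched };
}

// Usage with a tiny limit of 3 URLs per file:
const existing: SitemapFile[] = [
  { name: "sitemap-1.xml", urls: ["/a", "/b", "/c"] },
  { name: "sitemap-2.xml", urls: ["/d"] },
];
const updated = appendToSitemaps(existing, ["/e", "/f", "/g"], 3);
console.log(updated.touched); // [ 'sitemap-2.xml', 'sitemap-3.xml' ]
```

Only the files listed in `touched` need to be written back, and the index only needs rewriting when a new file name appears in `files`.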
It's... not trivial. I have a system that I'm trying to open source that does all of that, but I can't make any promises.