Slow query for totalTiles in MBTiles -> PMTiles conversion #127
Is running in quiet mode with no logged output acceptable for your use case?
As long as the side effect means the query doesn't happen, it's definitely an improvement over the status quo. Ideally we'd also have a solution that allows for some progress reporting in CI, but of course we'd face similar problems to those mentioned in #117 related to thousands of log lines. As a good final state, I'd advocate for a more CI-friendly style of progress reporting behind a flag.
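To make the "CI-friendly" idea concrete, here is a minimal sketch of interval-based progress logging. It is not how pmtiles reports progress; the class name `IntervalProgress` and its parameters are hypothetical. It prints at most one plain line per interval and needs no total, so no count query is required up front.

```python
import sys
import time


class IntervalProgress:
    """Print one plain progress line at most every `interval` seconds.

    Hypothetical helper for illustration; it tracks only what has been
    processed so far, so it never needs the total number of tiles.
    """

    def __init__(self, label, interval=30.0, out=sys.stderr, clock=time.monotonic):
        self.label = label
        self.interval = interval
        self.out = out
        self.clock = clock  # injectable so the behaviour can be tested
        self.count = 0
        self.start = clock()
        self.last = self.start

    def add(self, n=1):
        """Record n processed tiles and log if the interval has elapsed."""
        self.count += n
        now = self.clock()
        if now - self.last >= self.interval:
            self.last = now
            self._emit(now)

    def done(self):
        """Always log a final summary line."""
        self._emit(self.clock())

    def _emit(self, now):
        elapsed = now - self.start
        rate = self.count / elapsed if elapsed > 0 else 0.0
        print(
            f"{self.label}: {self.count} tiles, {elapsed:.0f}s elapsed, {rate:.0f} tiles/s",
            file=self.out,
        )
```

With a 30 second interval, an hour-long conversion produces roughly 120 log lines instead of thousands, which keeps CI logs readable.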
I'm not sure if this is related but I'm experiencing really slow performance on converting My dataset is a 158 GB I'm currently on Pass 2: writing tiles with the progress bar saying:
I know it is a large dataset, but surely it shouldn't take 40 days on a high-spec desktop. The PC doesn't appear to be working hard (10% CPU, 8% Memory, 1% Disk)
That sounds like a separate issue, can you upload your 158GB .mbtiles somewhere so others can reproduce?
@bdon A share link to my |
@mem48 did you try running with |
…127] * redundant work because getting the count(*) of tiles requires a table scan.
@mem48 your mbtiles is missing an index:
@lseelenbinder are your mbtiles missing the index too? it's not a |
@mem48 on my laptop converting your file with the index added and |
Yes, there's an index, though the mbtiles uses the tiles view with a mapping + data table to deduplicate, so it's a bit slower than a simple setup. It looks like #142 fully fixes this for our purposes though.
Thanks @bdon. Adding the index fixed my problem and only took 45 minutes even with deduplication. Is it worth a check and warning in In case anybody else wants to reproduce my fix on Ubuntu 22 was:
Adding an index with |
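For readers who want a starting point, here is a minimal sketch of adding the index from Python. It is not the exact command used above. It assumes `tiles` is a plain table with the column names from the MBTiles spec (`zoom_level`, `tile_column`, `tile_row`); `tile_index` is the index name the spec uses, and the function name `add_tile_index` is hypothetical.

```python
import sqlite3


def add_tile_index(path):
    """Create the unique tile index from the MBTiles spec if it is absent.

    Assumes `tiles` is a real table. If `tiles` is a view (the
    deduplicated layout), SQLite raises sqlite3.OperationalError,
    because views cannot be indexed.
    """
    con = sqlite3.connect(path)
    try:
        con.execute(
            "CREATE UNIQUE INDEX IF NOT EXISTS tile_index "
            "ON tiles (zoom_level, tile_column, tile_row)"
        )
        con.commit()
    finally:
        con.close()
```

On a file of this size, building the index is itself a full pass over the table, so expect it to take a while. Creation also fails if the table contains duplicate (zoom_level, tile_column, tile_row) rows.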
can you find a way to detect the missing index in SQL? |
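One possible approach, sketched in Python with the standard `sqlite3` module: look up `tiles` in `sqlite_master`, then use SQLite's index pragmas to see whether a unique index covers the three coordinate columns. The function name `tiles_index_status` and its return values are hypothetical, and the sketch assumes the MBTiles spec column names. When `tiles` is a view, it only reports that fact, since indexes would live on the underlying tables.

```python
import sqlite3


def tiles_index_status(con):
    """Classify `tiles` as 'missing', 'view', 'indexed' or 'unindexed'."""
    row = con.execute(
        "SELECT type FROM sqlite_master "
        "WHERE name = 'tiles' AND type IN ('table', 'view')"
    ).fetchone()
    if row is None:
        return "missing"
    if row[0] == "view":
        # Views cannot carry indexes; inspect the backing tables instead.
        return "view"

    wanted = {"zoom_level", "tile_column", "tile_row"}
    # index_list rows start with (seq, name, unique, ...).
    indexes = con.execute("PRAGMA index_list('tiles')").fetchall()
    for idx in indexes:
        name, unique = idx[1], idx[2]
        cols = {
            r[0]
            for r in con.execute(
                "SELECT name FROM pragma_index_info(?)", (name,)
            )
        }
        if unique and cols == wanted:
            return "indexed"
    return "unindexed"
```

A primary key declared on the three columns also shows up in `index_list` as an automatic unique index, so it counts as indexed here.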
1.18 removes the query for count because there's not much point in executing the same thing twice (table scan) for two passes. Looks like this is improved if there's a unique index on tiles per the MBTiles spec. Related to the missing index, we can't make any assumptions about whether the |
Appreciate these changes... I don't have timings from before, but it is way way faster working with my last conversion of 300 GB. Only took 1 hour (macOS M2 Max) writing on an NVMe
We're using pmtiles convert to do some large conversion jobs (including global data). Along the way, I realized that there's a select count(*) from tiles that only exists to produce accurate progress bars.
What would be your preferred approach to eliminating this query (it can take 10s of minutes on large enough archives for essentially no real result)? I could see coordinating this with #117 or adding a flag something along the lines of --impreciseProgress. I'm happy to do either, but would like to know what you'd prefer.