Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make migration stream more efficient #102

Merged
merged 7 commits into from
Dec 4, 2017
Merged

Conversation

springmeyer
Copy link
Contributor

While reviewing the migration stream code, I noticed that in the case of v2 tiles we were calling mapnik.VectorTile.info twice on the same buffer. This call is not cheap because it:

  • the function is sync and blocks the main event loop, which will slow down the ability of the stream to process concurrently
  • decodes the entire compressed tile (which requires significant memory allocation)
  • iterates over all the layers and features in the vector tile
  • Makes a number of calls to create v8 C++ objects to pass back to JS

This is likely slowing down unpacking of v2 vector tiles in a meaningful way. This PR fixes the problem.

/cc @mapbox/unpacker

@mapsam
Copy link
Contributor

mapsam commented Nov 8, 2017

@springmeyer I'd like to bring this along with #107 for the 7.x release. Is this branch still viable?

@springmeyer
Copy link
Contributor Author

@mapsam - sounds good to bring this along. The branch should be correct, but I would love some second eyes to make sure it is working right: 1) that mapnik.VectorTile.info is only getting called once for v2 tiles and 2) that nothing else is broken, exports work fine.

@springmeyer
Copy link
Contributor Author

@mapsam - just resolved the conflict with master via the github UI - hopefully should be good to review now.

@mapsam
Copy link
Contributor

mapsam commented Nov 9, 2017

Thanks @springmeyer! I'll work on getting tests passing.

@springmeyer
Copy link
Contributor Author

springmeyer commented Nov 9, 2017 via email

@mapsam
Copy link
Contributor

mapsam commented Nov 22, 2017

@springmeyer I added a sinon spy to the mapnik.VectorTile.info method e0483b6#diff-e9d67ed2ee5f98cf552e2ad8f6190725 to assert we are only calling it once per tile. Despite your changes, it looks like .info is called twice still - confirmed the same number of calls between master and this branch. Perhaps we're still calling it elsewhere?

@springmeyer
Copy link
Contributor Author

springmeyer commented Dec 1, 2017

@mapsam - sorry for the delay in responding to this one, and thanks again for helping get this out.

copy mbtiles with v1 tile logging

The test you are referring to e0483b6#diff-e9d67ed2ee5f98cf552e2ad8f6190725 looks like it is testing v1 tiles.

It is expected still that mapnik.VectorTile.info is called for v1 tiles before and after this PR. What this PR does is ensures only one call for the v2 case (where we don't need to migrate). See

var info = mapnik.VectorTile.info(tile.buffer);
var v2 = info.layers.every(function(layer) {
return layer.version === 2;
});
if (v2 === true) {
// Check for errors on the v2 tile
var err = checkForErrors(info);
if (err) return callback(err);
migrationStream.push(tile);
return callback();
}
migrate(tile, function(err, new_tile) {
if (err) return callback(err);
// Check for errors on newly-minted v2 tile
var err = checkForErrors(mapnik.VectorTile.info(new_tile.buffer));
if (err) return callback(err);
migrationStream.push(new_tile);
return callback();
});
.

So I think we need a test to ensure it is called only once for v2 tiles. That sound right?

@mapsam
Copy link
Contributor

mapsam commented Dec 4, 2017

Ah, that makes sense. Thanks @springmeyer - I've updated the test with a valid v2 fixture... what do you think of this?

test('confirm mapnik.VectorTile.info is only called once for v2 tiles', function(t) {
process.env.LOG_V1_TILES = true;
var fixture = path.resolve(__dirname, 'fixtures', 'valid-v2.mbtiles');
var src = 'mbtiles://' + fixture;
var dst = dsturi('valid.mbtiles');
sinon.spy(tilelive, 'copy');
sinon.spy(mapnikVT, 'info');
tileliveCopy(src, dst, {}, function(err) {
t.ifError(err, 'copied');
tileCount(dst, function(err, count) {
t.equal(tilelive.copy.getCall(0).args[2].type, 'list', 'uses list scheme for mbtiles');
t.equal(tilelive.copy.getCall(0).args[2].retry, undefined, 'passes options.retry to tilelive.copy');
t.equal(mapnikVT.info.callCount, count, 'called mapnik info as many times as there are tiles');
tilelive.copy.restore();
mapnikVT.info.restore();
t.end();
});
});
});

@mapsam
Copy link
Contributor

mapsam commented Dec 4, 2017

@springmeyer updated so the spy is in the pre-existing v2 tests, plus rearranged the migration stream code to make it a little more readable.

@springmeyer
Copy link
Contributor Author

Great, looks good to merge and roll out when you have time @mapsam - thanks for the help!

@mapsam
Copy link
Contributor

mapsam commented Dec 4, 2017

🍏 - thanks much @springmeyer for noticing this. Will plan on releasing a new minor version for this.

@mapsam mapsam merged commit f541fe9 into master Dec 4, 2017
@mapsam mapsam deleted the avoid-extra-info-call branch December 4, 2017 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants