applySourceMap should be lossless #216

Open
mariusGundersen opened this issue Oct 24, 2015 · 7 comments
Labels
feat New feature

Comments

@mariusGundersen

Using applySourceMap loses detail when one of the sourcemaps has less detail than the other. This causes problems when using gulp-uglify and gulp-concat together.

@mariusGundersen
Author

Here is a test case that shows that an identity source map applied to another sourcemap results in data loss:

var sourceMap = require('source-map');
var SourceMapConsumer = sourceMap.SourceMapConsumer;
var SourceMapGenerator = sourceMap.SourceMapGenerator;

var firstSourceMap = {
  version: 3,
  file: 'firstResult.js',
  names: [],
  sources: ['source.js'],
  sourceRoot: 'http://example.com/www/js/',
  mappings: 'AAAA'
};

var secondSourceMap = {
  version: 3,
  file: 'secondResult.js',
  names: [],
  sources: ['firstResult.js'],
  sourceRoot: 'http://example.com/www/js/',
  mappings: 'AAAA,CAAC,CAAC,CAAC,CAAC'
};

var sm1 = new SourceMapConsumer(firstSourceMap);
var sm2 = new SourceMapConsumer(secondSourceMap);

console.log('==sm1==');
sm1.eachMapping(function (m) {
  console.log(m);
});
console.log('==sm2==');
sm2.eachMapping(function (m) {
  console.log(m);
});

var smg = SourceMapGenerator.fromSourceMap(sm2);
smg.applySourceMap(sm1);

var sm3 = new SourceMapConsumer(smg.toJSON());

console.log('==sm3==');
sm3.eachMapping(function (m) {
  console.log(m);
});

Here is the result:

==sm1==
{ source: 'http://example.com/www/js/source.js',
  generatedLine: 1,
  generatedColumn: 0,
  originalLine: 1,
  originalColumn: 0,
  name: null }
==sm2==
{ source: 'http://example.com/www/js/firstResult.js',
  generatedLine: 1,
  generatedColumn: 0,
  originalLine: 1,
  originalColumn: 0,
  name: null }
{ source: 'http://example.com/www/js/firstResult.js',
  generatedLine: 1,
  generatedColumn: 1,
  originalLine: 1,
  originalColumn: 1,
  name: null }
{ source: 'http://example.com/www/js/firstResult.js',
  generatedLine: 1,
  generatedColumn: 2,
  originalLine: 1,
  originalColumn: 2,
  name: null }
{ source: 'http://example.com/www/js/firstResult.js',
  generatedLine: 1,
  generatedColumn: 3,
  originalLine: 1,
  originalColumn: 3,
  name: null }
{ source: 'http://example.com/www/js/firstResult.js',
  generatedLine: 1,
  generatedColumn: 4,
  originalLine: 1,
  originalColumn: 4,
  name: null }
==sm3==
{ source: 'http://example.com/www/js/source.js',
  generatedLine: 1,
  generatedColumn: 0,
  originalLine: 1,
  originalColumn: 0,
  name: null }
{ source: 'http://example.com/www/js/source.js',
  generatedLine: 1,
  generatedColumn: 1,
  originalLine: 1,
  originalColumn: 0,
  name: null }
{ source: 'http://example.com/www/js/source.js',
  generatedLine: 1,
  generatedColumn: 2,
  originalLine: 1,
  originalColumn: 0,
  name: null }
{ source: 'http://example.com/www/js/source.js',
  generatedLine: 1,
  generatedColumn: 3,
  originalLine: 1,
  originalColumn: 0,
  name: null }
{ source: 'http://example.com/www/js/source.js',
  generatedLine: 1,
  generatedColumn: 4,
  originalLine: 1,
  originalColumn: 0,
  name: null }

Notice that the generated source map (sm3) has all generated columns point to the same original column.

@lydell
Contributor

lydell commented Oct 25, 2015

This is completely expected. sm2.applySourceMap(sm1) goes through every mapping in sm2, and updates what each mapping points to, by asking sm1. In this case, sm1 says that any column on line 1 maps to source.js:1:0. As a result, all mappings in sm2 are updated to point to that same location.

In other words, when applying multiple source maps to each other, the resulting source map cannot be of higher resolution than the input source map with the lowest resolution. This is also mentioned in the documentation:

Note: The resolution for the resulting mappings is the minimum of this map and the supplied map.
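The lookup described above can be modelled on plain arrays. This is only a minimal sketch, not the library's code: it assumes a single line, represents each mapping as a `{ generatedColumn, originalColumn }` pair, and the helper names `lookup` and `applyLikeToday` are made up for illustration.

```javascript
// Simplified single-line model of the behaviour described above.
// A "map" is a sorted list of { generatedColumn, originalColumn } entries.
// This is an illustration, not the source-map library's implementation.

// Find the last mapping at or before `column`, as a consumer lookup would.
function lookup(mappings, column) {
  var found = null;
  for (var i = 0; i < mappings.length; i++) {
    if (mappings[i].generatedColumn <= column) {
      found = mappings[i];
    }
  }
  return found;
}

// Re-point every mapping of `outer` at whatever `inner` reports for it.
function applyLikeToday(outer, inner) {
  return outer.map(function (m) {
    var hit = lookup(inner, m.originalColumn);
    return {
      generatedColumn: m.generatedColumn,
      originalColumn: hit ? hit.originalColumn : m.originalColumn
    };
  });
}

// sm1: one mapping at the start of the line (low resolution).
var sm1 = [{ generatedColumn: 0, originalColumn: 0 }];
// sm2: one mapping per column (high resolution).
var sm2 = [0, 1, 2, 3, 4].map(function (c) {
  return { generatedColumn: c, originalColumn: c };
});

var sm3 = applyLikeToday(sm2, sm1);
console.log(sm3.map(function (m) { return m.originalColumn; }));
// prints [ 0, 0, 0, 0, 0 ]
```

Every lookup in sm1 lands on its single mapping, so all five results collapse to original column 0, matching the sm3 output in the test case.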

sm1 seems to be an “identity” map made by putting one mapping at the beginning of each line. While that is a very simple method, it has the downside of losing column information. Such a source map only gives you the line number (the column number is always zero).

To create a higher resolution identity map, one mapping per token is needed. That’s how grunt-concat does it. As further examples, source-map-dummy is a library that creates higher resolution identity maps, and source-map-concat uses source-map-dummy to concatenate files.
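As a rough sketch of the one-mapping-per-token idea, the snippet below builds identity mappings for a piece of code. The regex tokenizer is a naive stand-in for a real lexer, and `identityMappings` is a hypothetical helper, not part of any of the libraries mentioned. Each object has the `{ generated, original, source }` shape that `SourceMapGenerator.prototype.addMapping` accepts, so the list could be fed to a generator.

```javascript
// Build one identity mapping per token.
// Assumption: a crude regex tokenizer (words and single punctuation
// characters) is good enough for illustration; a real tool would use a lexer.
function identityMappings(code, source) {
  var mappings = [];
  code.split('\n').forEach(function (lineText, index) {
    var tokenPattern = /\w+|[^\s\w]/g;
    var match;
    while ((match = tokenPattern.exec(lineText)) !== null) {
      mappings.push({
        generated: { line: index + 1, column: match.index },
        original: { line: index + 1, column: match.index },
        source: source
      });
    }
  });
  return mappings;
}

var mappings = identityMappings('var something = foo + bar;', 'source.js');
console.log(mappings.map(function (m) { return m.generated.column; }));
// prints [ 0, 4, 14, 16, 20, 22, 25 ]
```

With seven mappings on the line instead of one, a later lookup can resolve to the token's own column rather than to column 0.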

@mariusGundersen
Author

So if this is the expected behavior for apply, then maybe we could make a new method that doesn't take the lowest resolution, but instead finds the union of the two resolutions. For example, wouldn't "AAAA,GAAG" also imply CAAC and EAAE? So that if the applied sourcemap points to column 2 of the base sourcemap, it would end up pointing to column 2 in the generated sourcemap too?
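For readers who do not read the encoded format, here is a small decoder sketch for a single-line `mappings` string. It assumes every segment has at least four fields (generated column, source index, original line, original column) and only tracks the two column fields; `decodeSegment` and `decodeLine` are illustrative names.

```javascript
var BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Decode one Base64 VLQ segment such as "GAAG" into its signed fields.
function decodeSegment(segment) {
  var fields = [];
  var value = 0;
  var shift = 0;
  for (var i = 0; i < segment.length; i++) {
    var digit = BASE64.indexOf(segment.charAt(i));
    value += (digit & 31) << shift;   // the low 5 bits carry data
    if (digit & 32) {                 // continuation bit: more digits follow
      shift += 5;
    } else {
      var negative = value & 1;       // the lowest data bit is the sign
      value = value >> 1;
      fields.push(negative ? -value : value);
      value = 0;
      shift = 0;
    }
  }
  return fields;
}

// Decode a single-line mappings string into absolute columns.
// Assumption: every segment has four or more fields.
function decodeLine(mappings) {
  var generatedColumn = 0;
  var originalColumn = 0;
  return mappings.split(',').map(function (segment) {
    var fields = decodeSegment(segment);
    generatedColumn += fields[0];     // delta from the previous segment
    originalColumn += fields[3];      // delta from the previous segment
    return { generatedColumn: generatedColumn, originalColumn: originalColumn };
  });
}

console.log(decodeLine('AAAA,GAAG'));
// prints [ { generatedColumn: 0, originalColumn: 0 },
//          { generatedColumn: 3, originalColumn: 3 } ]
```

So "AAAA,GAAG" maps column 0 to column 0 and column 3 to column 3. The fields are deltas from the previous segment, which means CAAC and EAAE in the question are best read as "column 1 to column 1" and "column 2 to column 2" relative to the first segment.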

I am working on a fix to this, as it is breaking the expectations of many users of sourcemaps, but I am fine with implementing it as a new method in addition to applySourceMap.

@lydell
Contributor

lydell commented Oct 25, 2015

For example, wouldn't "AAAA,GAAG" also imply CAAC and EAAE?

Could we please not discuss using the encoded format, but instead using some human understandable representation?

@mariusGundersen
Author

Could we please not discuss using the encoded format, but instead using some human understandable representation?

Sorry, typed it out on my phone; couldn't be bothered to type out a full JSON object.

The issue I'm trying to solve is when a mapping in the first sourceMap (A) doesn't match up with a mapping in the second sourceMap (B). There are two scenarios when this can happen: when the mapping in A covers a larger area than the mapping in B, and vice versa. Strictly speaking, mappings go from a point in the generated file to a point in the source file, but the characters following a mapping point can be considered to belong to that mapping. Let's simplify things a bit and assume we only have one source file and only one line; that makes it easier to describe, diagram and reason about. So we have the following source and generated code:

//source
var something = foo + bar;
//generated
var a=b+c;

The sourcemap would look like this:

//the numbers represent mappings
var something = foo + bar;
1   2         3 4   5 6  7
--------------------------
1   234567
var a=b+c;

//in ASCII-art, with the source at the top and the generated code at the bottom: 
var something = foo + bar;
|   |         / /   / /  /
|   |        / /   / /  /
|   |       / /   / /  /
|   |      / /   / /  /
|   |     / /   / /  /
|   |    / /   / /  /
|   |   / /   / /  /
|   |  / /   / /  /
|   | / /   / /  /
|   |/ /   / /  /
|   ||/   / /  /
|   |||  / /  /
|   ||| / /  /
|   |||/ /  /
|   ||||/  /
|   ||||| /
|   |||||/
var a=b+c;

This is what one sourcemap looks like. When we use applySourceMap we talk about two sourcemaps where one maps to the output of the other. So the above sourcemap could either be A (the first one, the parameter to applySourceMap) or B (the second one, on which we call applySourceMap). If it is B and A is the "identity" sourcemap, then it would look like this:

var something = foo + bar;
|
var something = foo + bar;
|   |         / /   / /  /
|   |        / /   / /  /
|   |       / /   / /  /
|   |      / /   / /  /
|   |     / /   / /  /
|   |    / /   / /  /
|   |   / /   / /  /
|   |  / /   / /  /
|   | / /   / /  /
|   |/ /   / /  /
|   ||/   / /  /
|   |||  / /  /
|   ||| / /  /
|   |||/ /  /
|   ||||/  /
|   ||||| /
|   |||||/
var a=b+c;

More generally, this is the scenario where sourcemap A has a lower resolution than sourcemap B. Or even more generally, one of the mappings in A covers several mappings in B:

 +---------+
A|1        |
 +----+----+
B|1   |2   |
 +----+----+

The problem now is finding what an arbitrary point below B maps to above A:

 ?    ?
 +---------+
A|1        |
 +----+----+
B|1   |2   |
 +----+----+
      ^
      |

Currently applySourceMap will map the point below B to the first question mark, but I would argue that it makes more sense to map it to the second question mark. This is pretty straightforward to implement, as I have done in this commit.
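One way to read "map it to the second question mark" is to keep the column offset inside A's mapping. The sketch below is an assumption about how that could work, not a copy of the commit: it uses a single-line model with `{ generatedColumn, originalColumn }` pairs, and `applyWithOffset` is a hypothetical name. It only makes sense where the text covered by A's mapping is unchanged, as with an identity map.

```javascript
// Find the last mapping at or before `column`.
function lookup(mappings, column) {
  var found = null;
  for (var i = 0; i < mappings.length; i++) {
    if (mappings[i].generatedColumn <= column) {
      found = mappings[i];
    }
  }
  return found;
}

// Compose B with a lower-resolution A, carrying over how far into A's
// mapping each lookup landed (assumption: A does not move text around
// inside the range that one of its mappings covers).
function applyWithOffset(b, a) {
  return b.map(function (m) {
    var hit = lookup(a, m.originalColumn);
    if (!hit) {
      return m;
    }
    var offset = m.originalColumn - hit.generatedColumn;
    return {
      generatedColumn: m.generatedColumn,
      originalColumn: hit.originalColumn + offset
    };
  });
}

// A: identity map with a single mapping at the start of the line.
var a = [{ generatedColumn: 0, originalColumn: 0 }];
// B: "var something = foo + bar;" minified to "var a=b+c;".
var b = [
  { generatedColumn: 0, originalColumn: 0 },
  { generatedColumn: 4, originalColumn: 4 },
  { generatedColumn: 5, originalColumn: 14 },
  { generatedColumn: 6, originalColumn: 16 },
  { generatedColumn: 7, originalColumn: 20 },
  { generatedColumn: 8, originalColumn: 22 },
  { generatedColumn: 9, originalColumn: 25 }
];

var combined = applyWithOffset(b, a);
console.log(combined.map(function (m) { return m.originalColumn; }));
// prints [ 0, 4, 14, 16, 20, 22, 25 ]
```

In this model the identity map no longer flattens B: each mapping keeps its original column instead of collapsing to column 0.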


The other scenario is the reverse, where B is the "identity" sourcemap. It would look like this:

var something = foo + bar;
|   |         / /   / /  /
|   |        / /   / /  /
|   |       / /   / /  /
|   |      / /   / /  /
|   |     / /   / /  /
|   |    / /   / /  /
|   |   / /   / /  /
|   |  / /   / /  /
|   | / /   / /  /
|   |/ /   / /  /
|   ||/   / /  /
|   |||  / /  /
|   ||| / /  /
|   |||/ /  /
|   ||||/  /
|   ||||| /
|   |||||/
var a=b+c;
|
var a=b+c;

More generally, this is the scenario where sourcemap B has a lower resolution than sourcemap A. Or even more generally, one of the mappings in B covers several mappings in A:

 +----+----+
A|1   |2   |
 +----+----+
B|1        |
 +---------+

The problem is again finding what an arbitrary point below B maps to above A:

 ?    ?
 +----+----+
A|1   |2   |
 +----+----+
B|1        |
 +---------+
      ^
      |

With the current implementation of applySourceMap the result is the first question mark, but I would (again) argue that the second question mark is the right answer.

I have not quite figured out how to implement this in code, but I'm looking into it.
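One possible direction, offered purely as a sketch and not as a worked-out design: for each mapping in B, also emit the mappings of A that start inside the range B's mapping covers, shifted into B's generated coordinates. The single-line model, the `{ generatedColumn, originalColumn }` shape and the name `applyUnion` are all assumptions, and the shift is only valid if B leaves the covered text unchanged.

```javascript
// Find the last mapping at or before `column`.
function lookup(mappings, column) {
  var found = null;
  for (var i = 0; i < mappings.length; i++) {
    if (mappings[i].generatedColumn <= column) {
      found = mappings[i];
    }
  }
  return found;
}

// Compose a low-resolution B with a higher-resolution A by splitting each
// mapping of B wherever A starts a new mapping inside it.
function applyUnion(b, a) {
  var result = [];
  b.forEach(function (m, i) {
    var next = b[i + 1];
    // Width of the range this mapping covers in B's generated line.
    var width = next ? next.generatedColumn - m.generatedColumn : Infinity;
    // The mapping itself is resolved with a plain lookup.
    var hit = lookup(a, m.originalColumn);
    result.push({
      generatedColumn: m.generatedColumn,
      originalColumn: hit ? hit.originalColumn : m.originalColumn
    });
    // Add A's mappings that begin strictly inside the covered range.
    a.forEach(function (inner) {
      var offset = inner.generatedColumn - m.originalColumn;
      if (offset > 0 && offset < width) {
        result.push({
          generatedColumn: m.generatedColumn + offset,
          originalColumn: inner.originalColumn
        });
      }
    });
  });
  return result;
}

// A: "var something = foo + bar;" minified to "var a=b+c;".
var a = [
  { generatedColumn: 0, originalColumn: 0 },
  { generatedColumn: 4, originalColumn: 4 },
  { generatedColumn: 5, originalColumn: 14 },
  { generatedColumn: 6, originalColumn: 16 },
  { generatedColumn: 7, originalColumn: 20 },
  { generatedColumn: 8, originalColumn: 22 },
  { generatedColumn: 9, originalColumn: 25 }
];
// B: identity map with a single mapping at the start of the line.
var b = [{ generatedColumn: 0, originalColumn: 0 }];

var union = applyUnion(b, a);
console.log(union.length); // prints 7
```

Here the result has the same seven mappings as A, which is the outcome the diagram above argues for.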


Sorry for the long comment and the many bad ASCII diagrams. The point I'm trying to get across is that either applySourceMap or a new method should find the union of the two sourcemaps rather than the intersection of the two sourcemaps as it does today.

@qraynaud

qraynaud commented May 16, 2016

@mariusGundersen I totally agree that applySourceMap is not doing the useful thing, though it does what its documentation says.

I think it would be better to change the algorithm to do a union like you suggest, bump the major version number, and add a "min: true" option to the function to keep the old behaviour (for those who might find it useful, though I really can't think of a use case for it).

Right now, 99% of applySourceMap calls are done by build systems trying to merge sourcemaps and the current algorithm is less than useful for this use case.

It would be awesome if it could be fixed in source-map because it would fix all build systems in one package update rather than needing more complex modifications to each of them (though I would immediately patch those I use if it's implemented in a new method).

Thanks in advance!

@robpalme

@mariusGundersen Thank you for your very clear explanation of the existing and desired behaviour. Has there been any progress on this topic?

The current behaviour forces even simple transforms to generate excessively large sourcemaps purely to ensure one mapping per token. So I see lots of value in providing this union behaviour as an additional option to the existing intersection behaviour.
