CSVMapper does not correctly parse objects with duplicate values. #41
Comments
Which version is this with? |
Sorry, I should have specified earlier. |
OK. I still don't quite understand the problem; maybe I should run the code. I don't see any duplication here, except in the values "foo", "bar", "foo". If those are taken to be column names, then yes, duplicate property names are not supported. |
The column names are defined in the schema as Col1, Col2, and Col3.
The data for those columns is: "foo","bar","foo"
I would expect the map to contain the following:
Instead I get the following
So the column names are lost and the data values become the column names. |
Ah yes; that looks wrong. Thank you for clarifying this; I hope to look into what is causing the problem. |
I've been looking into this, and it looks like the issue occurs because CsvParser doesn't differentiate between a token that is JsonToken.FIELD_NAME and one that is JsonToken.VALUE_STRING in getText():

    package com.fasterxml.jackson.dataformat.csv;
    ...
    public class CsvParser
        extends ParserMinimalBase
    {
        ...
        @Override
        public String getText() throws IOException, JsonParseException {
            return _currentValue;
        }
        ...
    }

That distinction is necessary because UntypedObjectDeserializer.mapObject() iterates over the tokens:

    package com.fasterxml.jackson.databind.deser.std;
    ...
    public class UntypedObjectDeserializer
        extends StdDeserializer<Object>
        implements ResolvableDeserializer, ContextualDeserializer
    {
        ...
        protected Object mapObject(JsonParser jp, DeserializationContext ctxt)
            throws IOException, JsonProcessingException
        {
            ...
            String field1 = jp.getText(); // CsvParser will return _currentValue
            jp.nextToken();
            Object value1 = deserialize(jp, ctxt); // Calls jp.getText() internally for JsonToken.VALUE_STRING
            ...
            return result;
        }
        ...
    }

Proposed fix:

    package com.fasterxml.jackson.dataformat.csv;
    ...
    public class CsvParser
        extends ParserMinimalBase
    {
        ...
        @Override
        public String getText() throws IOException, JsonParseException {
            if (_currToken == JsonToken.FIELD_NAME) {
                return _currentName;
            }
            return _currentValue;
        }
        ...
    } |
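For anyone following along, here is a minimal, self-contained sketch of why the getText() change matters. It does not use Jackson at all: ToyRowParser, its nameAware flag, and readRow are hypothetical stand-ins invented for illustration, and the loop only mimics the shape of the mapObject() code quoted above. It is a toy model under those assumptions, not the actual CsvParser behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GetTextSketch {
    enum Token { FIELD_NAME, VALUE_STRING }

    /** Hypothetical toy parser for one CSV row; NOT the real CsvParser. */
    static class ToyRowParser {
        private final String[] names;
        private final String[] values;
        private final boolean nameAware; // true = apply the proposed getText() check
        private int column = -1;
        private Token current;

        ToyRowParser(String[] names, String[] values, boolean nameAware) {
            this.names = names;
            this.values = values;
            this.nameAware = nameAware;
        }

        /** Emits FIELD_NAME then VALUE_STRING per column; null at end of row. */
        Token nextToken() {
            if (current == Token.FIELD_NAME) {
                current = Token.VALUE_STRING;
            } else if (column + 1 < names.length) {
                column++;
                current = Token.FIELD_NAME;
            } else {
                current = null;
            }
            return current;
        }

        /** Without the token check, a field name is reported as the cell value. */
        String getText() {
            if (nameAware && current == Token.FIELD_NAME) {
                return names[column];
            }
            return values[column];
        }
    }

    /** Reads the row from the issue the way a name/value loop would. */
    static Map<String, String> readRow(boolean nameAware) {
        ToyRowParser p = new ToyRowParser(
                new String[] {"Col1", "Col2", "Col3"},
                new String[] {"foo", "bar", "foo"},
                nameAware);
        Map<String, String> row = new LinkedHashMap<>();
        while (p.nextToken() == Token.FIELD_NAME) {
            String key = p.getText();   // key taken from getText(), as in mapObject()
            p.nextToken();
            row.put(key, p.getText());  // duplicate keys overwrite earlier entries
        }
        return row;
    }

    public static void main(String[] args) {
        System.out.println(readRow(false)); // prints {foo=foo, bar=bar}
        System.out.println(readRow(true));  // prints {Col1=foo, Col2=bar, Col3=foo}
    }
}
```

In this toy model the unchecked getText() yields only two entries because the repeated "foo" collides as a map key, which matches the symptom described in this issue; with the token check all three columns survive.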
@gribr Thank you for digging into this. I will have a look now. |
Thank you for troubleshooting this; fixed for 2.4.0 (and 2.3.4). Will also change |
Duplicate column values when parsing objects from the input stream with a schema cause dropped data.
It appears that the mapper uses the value as the column name, and therefore, when there are duplicate data elements, only one is returned.
Example contained in code below.