SAV reader: Force all strings through iconv conversion
See note in code and also Roche/pyreadstat#67
evanmiller committed Aug 18, 2020
1 parent feb6e9d commit a8b0466
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion src/spss/readstat_sav_read.c
@@ -934,7 +934,13 @@ static readstat_error_t sav_parse_machine_integer_info_record(const void *data,
         }
         ctx->input_encoding = src_charset;
     }
-    if (src_charset && dst_charset && strcmp(src_charset, dst_charset) != 0) {
+    if (src_charset && dst_charset) {
+        // You might be tempted to skip the charset conversion when src_charset
+        // and dst_charset are the same. However, some versions of SPSS insert
+        // illegally truncated strings (e.g. the last character is three bytes
+        // but the field only has room for two bytes). So to prevent the client
+        // from receiving an invalid byte sequence, we ram everything through
+        // iconv, even if most of the time it will be a no-op.
         iconv_t converter = iconv_open(dst_charset, src_charset);
         if (converter == (iconv_t)-1) {
             return READSTAT_ERROR_UNSUPPORTED_CHARSET;
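
The effect described in the code comment can be sketched in isolation. The helper below is hypothetical and not part of ReadStat: it opens an iconv converter whose source and destination charsets are identical and pushes a buffer through it. It assumes an iconv implementation that still validates its input in that case (glibc and GNU libiconv are believed to), so a string cut off in the middle of a multibyte character comes out shortened to its last complete character, with errno set to EINVAL, instead of being passed along as an invalid byte sequence.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not ReadStat code: convert `in` from `charset` to the
 * same `charset`. Returns the number of bytes written to `out`, or -1 if the
 * converter cannot be opened. On return, *err is 0 on success or the errno
 * value reported by iconv (EINVAL for a character truncated at the end of the
 * input, EILSEQ for an invalid sequence, E2BIG if `out` is too small). */
static long same_charset_convert(const char *charset,
                                 const char *in, size_t in_len,
                                 char *out, size_t out_cap, int *err)
{
    assert(charset != NULL && in != NULL && out != NULL && err != NULL);

    iconv_t cd = iconv_open(charset, charset);
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;   /* iconv takes char **, but does not modify the input bytes */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    *err = 0;
    if (iconv(cd, &src, &src_left, &dst, &dst_left) == (size_t)-1)
        *err = errno;         /* conversion stopped early; bytes already written stay valid */

    iconv_close(cd);
    return (long)(out_cap - dst_left);
}
```

With the assumptions above, passing the four bytes "caf\xC3" (the first byte of a two-byte UTF-8 character with its second byte missing) as UTF-8 would yield only the three bytes of "caf", while the complete five-byte "caf\xC3\xA9" passes through unchanged. How ReadStat itself handles the error return is outside the part of the diff shown here.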