Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If a string has too many invalid UTF-8 characters, json::dump attempts to index an array out of bounds. #1445

Closed
dgavedissian opened this issue Jan 18, 2019 · 1 comment · Fixed by #1446

Comments

@dgavedissian
Copy link
Contributor

dgavedissian commented Jan 18, 2019

  • What is the issue you have?
    On Windows, with a specifically crafted JSON string object (initialised with std::string), json::dump crashes if ensure_ascii is set to true, and the error handler is set to error_handler_t::replace. Looking at the stack trace, it appears that dump_escaped inside serializer.hpp doesn't do any bounds checking inside the UTF8_REJECT case for string_buffer, which is hardcoded to be exactly 512 bytes.

  • Please describe the steps to reproduce the issue. Can you provide a small but working code example?
    Compile and run the following code on Windows (with optimisations disabled and in debug mode):

  nlohmann::json dump_test;
  std::vector<char> data = {
      (char)109,  (char)108,  (char)103,  (char)125,  (char)-122, (char)-53,  (char)115,
      (char)18,   (char)3,    (char)0,    (char)102,  (char)19,   (char)1,    (char)15,
      (char)-110, (char)13,   (char)-3,   (char)-1,   (char)-81,  (char)32,   (char)2,
      (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,
      (char)8,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,
      (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)-80,  (char)2,
      (char)0,    (char)0,    (char)96,   (char)-118, (char)46,   (char)-116, (char)46,
      (char)109,  (char)-84,  (char)-87,  (char)108,  (char)14,   (char)109,  (char)-24,
      (char)-83,  (char)13,   (char)-18,  (char)-51,  (char)-83,  (char)-52,  (char)-115,
      (char)14,   (char)6,    (char)32,   (char)0,    (char)0,    (char)0,    (char)0,
      (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,
      (char)64,   (char)3,    (char)0,    (char)0,    (char)0,    (char)35,   (char)-74,
      (char)-73,  (char)55,   (char)57,   (char)-128, (char)0,    (char)0,    (char)0,
      (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,    (char)0,
      (char)0,    (char)0,    (char)33,   (char)0,    (char)0,    (char)0,    (char)-96,
      (char)-54,  (char)-28,  (char)-26};
  dump_test["1"] = std::string{data.data(), data.size()};
  dump_test.dump(-1, ' ', true, nlohmann::json::error_handler_t::replace);
  • What is the expected behavior?
    It works fine.

  • And what is the actual behavior instead?
    It crashes on serializer.hpp on line 445 (in debug mode). The state of the local variables are:

-		this	0x0000006792b7e8c0 {o=shared_ptr {str="{\"1\":\"" } [1 strong ref] [make_shared] number_buffer={ size=64 } ...}	nlohmann::detail::serializer<nlohmann::basic_json<std::map,std::vector,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,bool,__int64,unsigned __int64,double,std::allocator,nlohmann::adl_serializer> > *
+		o	shared_ptr {str="{\"1\":\"" } [1 strong ref] [make_shared]	std::shared_ptr<nlohmann::detail::output_adapter_protocol<char> >
+		number_buffer	{ size=64 }	std::array<char,64>
+		loc	ucrtbased.dll!0x00007ffc1e3ce020 (load symbols for additional information) {decimal_point=0x00007ffc1e3ce458 "." ...}	const lconv *
		thousands_sep	0 '\0'	const char
		decimal_point	46 '.'	const char
+		string_buffer	{ size=512 }	std::array<char,512>
		indent_char	32 ' '	const char
+		indent_string	"                                                                                                                                                                                                        ...	std::basic_string<char,std::char_traits<char>,std::allocator<char> >
		error_handler	replace (1)	const nlohmann::detail::error_handler_t
		byte	230 'æ'	const unsigned char
		bytes	512	unsigned __int64
		bytes_after_last_accept	511	unsigned __int64
		codepoint	294	unsigned int
		ensure_ascii	true	const bool
		i	106	unsigned __int64
+		s	"mlg}†Ës\x12\x3"	const std::basic_string<char,std::char_traits<char>,std::allocator<char> > &
		state	1 '\x1'	unsigned char
		undumped_chars	1	unsigned __int64

The stack trace is:

 	json_test.exe!std::array<char,512>::operator[](unsigned __int64 _Pos) Line 152	C++
>	json_test.exe!nlohmann::detail::serializer<nlohmann::basic_json<std::map,std::vector,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,bool,__int64,unsigned __int64,double,std::allocator,nlohmann::adl_serializer> >::dump_escaped(const std::basic_string<char,std::char_traits<char>,std::allocator<char> > & s, const bool ensure_ascii) Line 445	C++
 	json_test.exe!nlohmann::detail::serializer<nlohmann::basic_json<std::map,std::vector,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,bool,__int64,unsigned __int64,double,std::allocator,nlohmann::adl_serializer> >::dump(const nlohmann::basic_json<std::map,std::vector,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,bool,__int64,unsigned __int64,double,std::allocator,nlohmann::adl_serializer> & val, const bool pretty_print, const bool ensure_ascii, const unsigned int indent_step, const unsigned int current_indent) Line 234	C++
 	json_test.exe!nlohmann::detail::serializer<nlohmann::basic_json<std::map,std::vector,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,bool,__int64,unsigned __int64,double,std::allocator,nlohmann::adl_serializer> >::dump(const nlohmann::basic_json<std::map,std::vector,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,bool,__int64,unsigned __int64,double,std::allocator,nlohmann::adl_serializer> & val, const bool pretty_print, const bool ensure_ascii, const unsigned int indent_step, const unsigned int current_indent) Line 165	C++
 	json_test.exe!nlohmann::basic_json<std::map,std::vector,std::basic_string<char,std::char_traits<char>,std::allocator<char> >,bool,__int64,unsigned __int64,double,std::allocator,nlohmann::adl_serializer>::dump(const int indent, const char indent_char, const bool ensure_ascii, const nlohmann::detail::error_handler_t error_handler) Line 1979	C++
  • Which compiler and operating system are you using? Is it a supported compiler?
    Windows 10. Visual Studio 2015 (latest version).

  • Did you use a released version of the library or the version from the develop branch?
    Version 3.5.0

  • If you experience a compilation error: can you compile and run the unit tests?
    Yes. All pass with Visual Studio 2015.

@dgavedissian dgavedissian changed the title If a string has too many invalid UTF-8 characters, json::dump indexes an array out of bounds. If a string has too many invalid UTF-8 characters, json::dump attempts to index an array out of bounds. Jan 18, 2019
@nlohmann
Copy link
Owner

Thanks for reporting!

nlohmann added a commit that referenced this issue Jan 20, 2019
attempt to fix #1445, flush buffer in serializer::dump_escaped in UTF8_REJECT case.
dnsmichi pushed a commit to Icinga/icinga2 that referenced this issue Dec 13, 2019
This includes the following fixes:

nlohmann/json#1436

> For a deeply-nested JSON object, the recursive implementation of json_value::destroy function causes stack overflow.

nlohmann/json#1708
nlohmann/json#1722

Stack size

nlohmann/json#1693 (comment)

Integer Overflow

nlohmann/json#1447

UTF8, json dump out of bounds

nlohmann/json#1445

Possibly influences #7532
Al2Klimov pushed a commit to Icinga/icinga2 that referenced this issue Dec 16, 2019
This includes the following fixes:

nlohmann/json#1436

> For a deeply-nested JSON object, the recursive implementation of json_value::destroy function causes stack overflow.

nlohmann/json#1708
nlohmann/json#1722

Stack size

nlohmann/json#1693 (comment)

Integer Overflow

nlohmann/json#1447

UTF8, json dump out of bounds

nlohmann/json#1445

Possibly influences #7532
N-o-X pushed a commit to Icinga/icinga2 that referenced this issue May 8, 2020
This includes the following fixes:

nlohmann/json#1436

> For a deeply-nested JSON object, the recursive implementation of json_value::destroy function causes stack overflow.

nlohmann/json#1708
nlohmann/json#1722

Stack size

nlohmann/json#1693 (comment)

Integer Overflow

nlohmann/json#1447

UTF8, json dump out of bounds

nlohmann/json#1445

Possibly influences #7532
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants