Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heap-buffer-overflow (OSS-Fuzz issue 585) #452

Closed
nlohmann opened this issue Feb 14, 2017 · 9 comments
Closed

Heap-buffer-overflow (OSS-Fuzz issue 585) #452

nlohmann opened this issue Feb 14, 2017 · 9 comments

Comments

@nlohmann
Copy link
Owner

Detailed report: https://clusterfuzz-external.appspot.com/testcase?key=5009340075343872

Project: json
Fuzzer: libFuzzer_json_parse_afl_fuzzer
Fuzz target binary: parse_afl_fuzzer
Job Type: libfuzzer_asan_json
Platform Id: linux

Crash Type: Heap-buffer-overflow READ 8
Crash Address: 0x6020000000b7
Crash State:
nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<cha
bool nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_strin
bool nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_strin

Sanitizer: address (ASAN)

Recommended Security Severity: Medium

Regressed: https://clusterfuzz-external.appspot.com/revisions?job=libfuzzer_asan_json&range=201702132129:201702140531

Reproducer Testcase: https://clusterfuzz-external.appspot.com/download/AMIfv97mm1JZ4V5k5VCl5P6E-ceJRwUXfsTrq4rlo9BIjo21iQKC195ix4UzRVSlS5NdE_d2mmLbqFB3kJw2nXzOxlNVi6d1s76lCZXC5I0n66KdH7a_lG_lxxBEB4Q_0bSXt9E6XKIstsm5R2om6cpc1OmVjAQ4j75qGxUILIki2AKyK2mVjoma5KsaVg5jG-QKXsyr8fZl3z0l1ccuc0na1dhUTUhdVwAf4VwBxbvEFYtgAKcqtU_LA22GE7DmfiGWXjbvsaxyyf_1BH7oe2fc94qkMwBvJqjh9-VV80Ma4YZRZ3hwohfhJfYN23ijQ0K_Ra6Z-37IE1eb_rri9IndPpkrTS6HUJKeVNsB3UxBCMfCq0uXQy0-Osh3J9PrX_kqrESs4mIxO6Us-YDeLsP5OzppvbaJZ2x6CSnNqfO7geUeYG7PP9o?testcase_id=5009340075343872


Issue filed automatically.

See https://github.com/google/oss-fuzz/blob/master/docs/reproducing.md for more information.

This bug is subject to a 90 day disclosure deadline. If 90 days elapse
without an upstream patch, then the bug report will automatically
become visible to the public.

Input: -012274

READ of size 8 at 0x6020000000b7 thread T0
SCARINESS: 23 (8-byte-read-heap-buffer-overflow)
#0 0x449a13 in StrtolFixAndCheck(void*, char const*, char**, char*, int) /src/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors.inc:2940
#1 0x449f41 in strtoll _asan_rtl_
#2 0x5c0beb in nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::lexer::strtonum::parse_integral(char**, std::__1::integral_constant<bool, true>) const /src/json/src/json.hpp:11048:24
#3 0x5c07d1 in bool nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::lexer::strtonum::parse<long>(long&, std::__1::integral_constant<bool, true>) const /src/json/src/json.hpp:11061:32
#4 0x5bfd58 in bool nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::lexer::strtonum::to<long, void>(long&) const /src/json/src/json.hpp:10953:24
#5 0x5a716e in nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::lexer::get_number(nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>&, nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::lexer::token_type) const /src/json/src/json.hpp:11125:39
#6 0x5954cf in nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::parser::parse_internal(bool) /src/json/src/json.hpp:11409:45
#7 0x59216d in nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::parser::parse() /src/json/src/json.hpp:11221:33
#8 0x584c36 in nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer> nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::parse<unsigned char const*, 0>(unsigned char const*, unsigned char const*, std::__1::function<bool (int, nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>::parse_event_t, nlohmann::basic_json<std::__1::map, std::__1::vector, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, bool, long, unsigned long, double, std::__1::allocator, nlohmann::adl_serializer>&)>) /src/json/src/json.hpp:6479:40
#9 0x582c7d in LLVMFuzzerTestOneInput /src/json/test/src/fuzzer-parse_json.cpp:34:19
#10 0x53593e in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:550:13
#11 0x536188 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long) /src/libfuzzer/FuzzerLoop.cpp:501:3
#12 0x513658 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) /src/libfuzzer/FuzzerDriver.cpp:268:6
#13 0x518b1f in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) /src/libfuzzer/FuzzerDriver.cpp:517:9
#14 0x512cc8 in main /src/libfuzzer/FuzzerMain.cpp:20:10
#15 0x7f5acb09c82f in __libc_start_main
@nlohmann
Copy link
Owner Author

Since 82fb613 , c95ff86

@nlohmann
Copy link
Owner Author

Observations:

  • std::strtoll seems to read past the input buffer
  • It happens for inputs like -0123 and longer, but parsing -01, -012 do not trigger the ASAN.

The error seems to be a missing NULL byte.

This works

std::vector<uint8_t> vec = {'-', '0', '1', '\0'};
char* m_start = reinterpret_cast<char*>(vec.data());
std::strtoll(m_start, nullptr, 10);

This does not:

std::vector<uint8_t> vec = {'-', '0', '1'};
char* m_start = reinterpret_cast<char*>(vec.data());
std::strtoll(m_start, nullptr, 10);

@nlohmann
Copy link
Owner Author

(related to #450, #379, and #302)

@gregmarr
Copy link
Contributor

Probably need to check for a null byte and either return an error or add it before calling strtoll.

@nlohmann
Copy link
Owner Author

Right. I think this is really an edge case which should not be triggered too often.

@nlohmann
Copy link
Owner Author

It seems I just have to call fill_line_buffer() before the conversion to make sure I am reading from a buffer rather than the raw stream.

@nlohmann
Copy link
Owner Author

nlohmann commented Feb 15, 2017

Ok. One fix would be to check if m_line_buffer is empty before a number is processed. If so, fill_line_buffer() must be called. Then, continuing is safe, because we read from a NULL-terminated string.

This approach works: all tests (including a regression test for this issue) pass with ASAN. But it is ugly, because (1) the lexer would properly read a number wrt. the JSON specification, (2) the parser would check if the line buffer is empty and refill it (totally the job of the lexer) and then (3) process the number and maybe reading more bytes than the lexer did in (1).

I could move the fill_line_buffer() call to the lexer's constructor. This would be symmetric to the case where we read from a stream. It is also nice to have this issue addressed inside the lexer leaving the parser as is. But it would mean that we always copy a given buffer into a member of the lexer. So far, I tried to avoid this, but maybe I misunderstand re2c's approach here.

Once we would accept that we always parse from a buffer that we control, we can exploit this and not only add \0 to the end, but also replace commas with the locale-version to make life easier. However, I would feel much more comfortable if we could process the input in a read-only manner without making copies.

This all may be possible (and is out of scope of this issue), but I am uncomfortable fixing this issue by switching to copying the input buffer in any case. (However, we may already do this, since at some point we reach the end of the input and must assert that there are sufficient bytes between the current character and the end of the buffer...) (Edit: We do eventually call fill_line_buffer(), but only copy/extend the part from m_start to the end which is usually a much smaller portion of the buffer rather than always copying the whole string.)

Any ideas?

@nlohmann nlohmann added the state: help needed the issue needs help to proceed label Feb 15, 2017
@nlohmann
Copy link
Owner Author

Another observation: In the first step when re2c recognizes numbers, it takes care of resizing the buffer if necessary. We only get into troubles if:

  1. the lexer does not see the need for resizing;
  2. std::strtoll reads past the number recognized by re2c

The second point can only occur if re2c stops reading a number before std::strtoll would. So far that I can see this, this can only happen for numbers of the form -?0([0-9])+

So basically, the grammar can be extended to reject such numbers, leaving everything else unchanged.

nlohmann added a commit that referenced this issue Feb 15, 2017
@nlohmann nlohmann removed the state: help needed the issue needs help to proceed label Feb 15, 2017
@nlohmann nlohmann self-assigned this Feb 16, 2017
@nlohmann nlohmann added this to the Release 2.1.1 milestone Feb 16, 2017
@nlohmann
Copy link
Owner Author

Reported as fixed with 973402c.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants