[CompilerRT] Add numerical sanitizer #94322

alexander-shaposhnikov
Collaborator

This PR contains the compiler-rt changes that were split out from #85916.
A follow-up patch with numerical tests will be sent once the initial changes (instrumentation, clang, runtime) land.

Test plan:

  1. cd build/runtimes/runtimes-bins && ninja check-nsan
  2. ninja check-all

@llvmbot
Member

llvmbot commented Jun 4, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Alexander Shaposhnikov (alexander-shaposhnikov)

Changes

Patch is 90.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/94322.diff

23 Files Affected:

  • (modified) compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake (+1)
  • (modified) compiler-rt/cmake/config-ix.cmake (+12-1)
  • (added) compiler-rt/include/sanitizer/nsan_interface.h (+75)
  • (added) compiler-rt/lib/nsan/CMakeLists.txt (+61)
  • (added) compiler-rt/lib/nsan/nsan.cc (+828)
  • (added) compiler-rt/lib/nsan/nsan.h (+224)
  • (added) compiler-rt/lib/nsan/nsan.syms.extra (+2)
  • (added) compiler-rt/lib/nsan/nsan_flags.cc (+78)
  • (added) compiler-rt/lib/nsan/nsan_flags.h (+35)
  • (added) compiler-rt/lib/nsan/nsan_flags.inc (+49)
  • (added) compiler-rt/lib/nsan/nsan_interceptors.cc (+363)
  • (added) compiler-rt/lib/nsan/nsan_platform.h (+135)
  • (added) compiler-rt/lib/nsan/nsan_stats.cc (+158)
  • (added) compiler-rt/lib/nsan/nsan_stats.h (+92)
  • (added) compiler-rt/lib/nsan/nsan_suppressions.cc (+76)
  • (added) compiler-rt/lib/nsan/nsan_suppressions.h (+31)
  • (added) compiler-rt/lib/nsan/tests/CMakeLists.txt (+54)
  • (added) compiler-rt/lib/nsan/tests/NSanUnitTest.cpp (+67)
  • (added) compiler-rt/lib/nsan/tests/nsan_unit_test_main.cpp (+18)
  • (added) compiler-rt/test/nsan/CMakeLists.txt (+20)
  • (added) compiler-rt/test/nsan/Unit/lit.site.cfg.py.in (+10)
  • (added) compiler-rt/test/nsan/lit.cfg.py (+1)
  • (added) compiler-rt/test/nsan/lit.site.cfg.py.in (+14)
diff --git a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
index 2fe06273a814c..a914b62cd5c5f 100644
--- a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+++ b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
@@ -61,6 +61,7 @@ else()
 endif()
 set(ALL_MSAN_SUPPORTED_ARCH ${X86_64} ${MIPS64} ${ARM64} ${PPC64} ${S390X}
     ${LOONGARCH64})
+set(ALL_NSAN_SUPPORTED_ARCH ${X86} ${X86_64})
 set(ALL_HWASAN_SUPPORTED_ARCH ${X86_64} ${ARM64} ${RISCV64})
 set(ALL_MEMPROF_SUPPORTED_ARCH ${X86_64})
 set(ALL_PROFILE_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM32} ${ARM64} ${PPC32} ${PPC64}
diff --git a/compiler-rt/cmake/config-ix.cmake b/compiler-rt/cmake/config-ix.cmake
index ba740af9e1d60..85609a903896e 100644
--- a/compiler-rt/cmake/config-ix.cmake
+++ b/compiler-rt/cmake/config-ix.cmake
@@ -623,6 +623,9 @@ if(APPLE)
   list_intersect(MSAN_SUPPORTED_ARCH
     ALL_MSAN_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
+  list_intersect(NSAN_SUPPORTED_ARCH
+    ALL_NSAN_SUPPORTED_ARCH
+    SANITIZER_COMMON_SUPPORTED_ARCH)
   list_intersect(HWASAN_SUPPORTED_ARCH
     ALL_HWASAN_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
@@ -692,6 +695,7 @@ else()
   filter_available_targets(SHADOWCALLSTACK_SUPPORTED_ARCH
     ${ALL_SHADOWCALLSTACK_SUPPORTED_ARCH})
   filter_available_targets(GWP_ASAN_SUPPORTED_ARCH ${ALL_GWP_ASAN_SUPPORTED_ARCH})
+  filter_available_targets(NSAN_SUPPORTED_ARCH ${ALL_NSAN_SUPPORTED_ARCH})
   filter_available_targets(ORC_SUPPORTED_ARCH ${ALL_ORC_SUPPORTED_ARCH})
 endif()
 
@@ -726,7 +730,7 @@ if(COMPILER_RT_SUPPORTED_ARCH)
 endif()
 message(STATUS "Compiler-RT supported architectures: ${COMPILER_RT_SUPPORTED_ARCH}")
 
-set(ALL_SANITIZERS asan;dfsan;msan;hwasan;tsan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;asan_abi)
+set(ALL_SANITIZERS asan;dfsan;msan;hwasan;tsan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;nsan;asan_abi)
 set(COMPILER_RT_SANITIZERS_TO_BUILD all CACHE STRING
     "sanitizers to build if supported on the target (all;${ALL_SANITIZERS})")
 list_replace(COMPILER_RT_SANITIZERS_TO_BUILD all "${ALL_SANITIZERS}")
@@ -911,4 +915,11 @@ if (GWP_ASAN_SUPPORTED_ARCH AND
 else()
   set(COMPILER_RT_HAS_GWP_ASAN FALSE)
 endif()
+
+if (COMPILER_RT_HAS_SANITIZER_COMMON AND NSAN_SUPPORTED_ARCH AND
+    OS_NAME MATCHES "Linux")
+  set(COMPILER_RT_HAS_NSAN TRUE)
+else()
+  set(COMPILER_RT_HAS_NSAN FALSE)
+endif()
 pythonize_bool(COMPILER_RT_HAS_GWP_ASAN)
diff --git a/compiler-rt/include/sanitizer/nsan_interface.h b/compiler-rt/include/sanitizer/nsan_interface.h
new file mode 100644
index 0000000000000..057ca0473bb3c
--- /dev/null
+++ b/compiler-rt/include/sanitizer/nsan_interface.h
@@ -0,0 +1,75 @@
+//===-- sanitizer/nsan_interface.h ------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Public interface for nsan.
+//
+//===----------------------------------------------------------------------===//
+#ifndef SANITIZER_NSAN_INTERFACE_H
+#define SANITIZER_NSAN_INTERFACE_H
+
+#include <sanitizer/common_interface_defs.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/// User-provided default option settings.
+///
+/// You can provide your own implementation of this function to return a string
+/// containing NSan runtime options (for example,
+/// <c>verbosity=1:halt_on_error=0</c>).
+///
+/// \returns Default options string.
+const char *__nsan_default_options(void);
+
+// Dumps nsan shadow data for a block of `size_bytes` bytes of application
+// memory at location `addr`.
+//
+// Each line contains application address, shadow types, then values.
+// Unknown types are shown as `__`, while known values are shown as
+// `f`, `d`, `l` for float, double, and long double respectively. Position is
+// shown as a single hex digit. The shadow value itself appears on the line that
+// contains the first byte of the value.
+// FIXME: Show both shadow and application value.
+//
+// Example: `__nsan_dump_shadow_mem(addr, 32, 8, 0)` might print:
+//
+//  0x0add7359:  __ f0 f1 f2 f3 __ __ __   (42.000)
+//  0x0add7361:  __ d1 d2 d3 d4 d5 d6 d7
+//  0x0add7369:  d8 f0 f1 f2 f3 __ __ f2   (-1.000) (12.5)
+//  0x0add7371:  f3 __ __ __ __ __ __ __
+//
+// This means that there is:
+//   - a shadow double for the float at address 0x0add7360, with value 42;
+//   - a shadow float128 for the double at address 0x0add7362, with value -1;
+//   - a shadow double for the float at address 0x0add736a, with value 12.5;
+// There was also a shadow double for the float at address 0x0add736e, but bytes
+// f0 and f1 were overwritten by one or several stores, so that the shadow value
+// is no longer valid.
+// The argument `reserved` can be any value. Its true value is provided by the
+// instrumentation.
+void __nsan_dump_shadow_mem(const char *addr, size_t size_bytes,
+                            size_t bytes_per_line, size_t reserved);
+
+// Explicitly dumps a value.
+// FIXME: vector versions ?
+void __nsan_dump_float(float value);
+void __nsan_dump_double(double value);
+void __nsan_dump_longdouble(long double value);
+
+// Explicitly checks a value.
+// FIXME: vector versions ?
+void __nsan_check_float(float value);
+void __nsan_check_double(double value);
+void __nsan_check_longdouble(long double value);
+
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
+#endif // SANITIZER_NSAN_INTERFACE_H
diff --git a/compiler-rt/lib/nsan/CMakeLists.txt b/compiler-rt/lib/nsan/CMakeLists.txt
new file mode 100644
index 0000000000000..00b16473bff0e
--- /dev/null
+++ b/compiler-rt/lib/nsan/CMakeLists.txt
@@ -0,0 +1,61 @@
+add_compiler_rt_component(nsan)
+
+include_directories(..)
+
+set(NSAN_SOURCES
+  nsan.cc
+  nsan_flags.cc
+  nsan_interceptors.cc
+  nsan_stats.cc
+  nsan_suppressions.cc
+)
+
+set(NSAN_HEADERS
+  nsan.h
+  nsan_flags.h
+  nsan_flags.inc
+  nsan_platform.h
+  nsan_stats.h
+  nsan_suppressions.h
+)
+
+append_list_if(COMPILER_RT_HAS_FPIC_FLAG -fPIC NSAN_CFLAGS)
+
+set(NSAN_DYNAMIC_LINK_FLAGS ${SANITIZER_COMMON_LINK_FLAGS})
+
+set(NSAN_CFLAGS ${SANITIZER_COMMON_CFLAGS})
+#-fno-rtti -fno-exceptions
+#    -nostdinc++ -pthread -fno-omit-frame-pointer)
+
+# Remove -stdlib= which is unused when passing -nostdinc++.
+# string(REGEX REPLACE "-stdlib=[a-zA-Z+]*" "" CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS})
+
+if (COMPILER_RT_HAS_NSAN)
+  foreach(arch ${NSAN_SUPPORTED_ARCH})
+    add_compiler_rt_runtime(
+      clang_rt.nsan
+      STATIC
+      ARCHS ${arch}
+      SOURCES ${NSAN_SOURCES}
+              $<TARGET_OBJECTS:RTInterception.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommon.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonLibc.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonCoverage.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonSymbolizer.${arch}>
+              $<TARGET_OBJECTS:RTUbsan.${arch}>
+      ADDITIONAL_HEADERS ${NSAN_HEADERS}
+      CFLAGS ${NSAN_CFLAGS}
+      PARENT_TARGET nsan
+    )
+  endforeach()
+
+  add_compiler_rt_object_libraries(RTNsan
+      ARCHS ${NSAN_SUPPORTED_ARCH}
+      SOURCES ${NSAN_SOURCES}
+      ADDITIONAL_HEADERS ${NSAN_HEADERS}
+      CFLAGS ${NSAN_CFLAGS})
+endif()
+
+if(COMPILER_RT_INCLUDE_TESTS)
+  add_subdirectory(tests)
+endif()
diff --git a/compiler-rt/lib/nsan/nsan.cc b/compiler-rt/lib/nsan/nsan.cc
new file mode 100644
index 0000000000000..29351ca111a3f
--- /dev/null
+++ b/compiler-rt/lib/nsan/nsan.cc
@@ -0,0 +1,828 @@
+//===-- nsan.cc -----------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// NumericalStabilitySanitizer runtime.
+//
+// This implements:
+//  - The public nsan interface (include/sanitizer/nsan_interface.h).
+//  - The private nsan interface (./nsan.h).
+//  - The internal instrumentation interface. These are function emitted by the
+//    instrumentation pass:
+//        * __nsan_get_shadow_ptr_for_{float,double,longdouble}_load
+//          These return the shadow memory pointer for loading the shadow value,
+//          after checking that the types are consistent. If the types are not
+//          consistent, returns nullptr.
+//        * __nsan_get_shadow_ptr_for_{float,double,longdouble}_store
+//          Sets the shadow types appropriately and returns the shadow memory
+//          pointer for storing the shadow value.
+//        * __nsan_internal_check_{float,double,long double}_{f,d,l} checks the
+//          accuracy of a value against its shadow and emits a warning depending
+//          on the runtime configuration. The middle part indicates the type of
+//          the application value, the suffix (f,d,l) indicates the type of the
+//          shadow, and depends on the instrumentation configuration.
+//        * __nsan_fcmp_fail_* emits a warning for an fcmp instruction whose
+//          corresponding shadow fcmp result differs.
+//
+//===----------------------------------------------------------------------===//
+
+#include <assert.h>
+#include <math.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "sanitizer_common/sanitizer_atomic.h"
+#include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_libc.h"
+#include "sanitizer_common/sanitizer_report_decorator.h"
+#include "sanitizer_common/sanitizer_stacktrace.h"
+#include "sanitizer_common/sanitizer_symbolizer.h"
+
+#include "nsan/nsan.h"
+#include "nsan/nsan_flags.h"
+#include "nsan/nsan_stats.h"
+#include "nsan/nsan_suppressions.h"
+
+using namespace __sanitizer;
+using namespace __nsan;
+
+static constexpr const int kMaxVectorWidth = 8;
+
+// When copying application memory, we also copy its shadow and shadow type.
+// FIXME: We could provide fixed-size versions that would nicely
+// vectorize for known sizes.
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+__nsan_copy_values(const char *daddr, const char *saddr, uptr size) {
+  internal_memmove((void *)getShadowTypeAddrFor(daddr),
+                   getShadowTypeAddrFor(saddr), size);
+  internal_memmove((void *)getShadowAddrFor(daddr), getShadowAddrFor(saddr),
+                   size * kShadowScale);
+}
+
+// FIXME: We could provide fixed-size versions that would nicely
+// vectorize for known sizes.
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+__nsan_set_value_unknown(const char *addr, uptr size) {
+  internal_memset((void *)getShadowTypeAddrFor(addr), 0, size);
+}
+
+namespace __nsan {
+
+const char *FTInfo<float>::kCppTypeName = "float";
+const char *FTInfo<double>::kCppTypeName = "double";
+const char *FTInfo<long double>::kCppTypeName = "long double";
+const char *FTInfo<__float128>::kCppTypeName = "__float128";
+
+const char FTInfo<float>::kTypePattern[sizeof(float)];
+const char FTInfo<double>::kTypePattern[sizeof(double)];
+const char FTInfo<long double>::kTypePattern[sizeof(long double)];
+
+// Helper for __nsan_dump_shadow_mem: Reads the value at address `Ptr`,
+// identified by its type id.
+template <typename ShadowFT> __float128 readShadowInternal(const char *Ptr) {
+  ShadowFT Shadow;
+  __builtin_memcpy(&Shadow, Ptr, sizeof(Shadow));
+  return Shadow;
+}
+
+__float128 readShadow(const char *Ptr, const char ShadowTypeId) {
+  switch (ShadowTypeId) {
+  case 'd':
+    return readShadowInternal<double>(Ptr);
+  case 'l':
+    return readShadowInternal<long double>(Ptr);
+  case 'q':
+    return readShadowInternal<__float128>(Ptr);
+  default:
+    return 0.0;
+  }
+}
+
+class Decorator : public __sanitizer::SanitizerCommonDecorator {
+public:
+  Decorator() : SanitizerCommonDecorator() {}
+  const char *Warning() { return Red(); }
+  const char *Name() { return Green(); }
+  const char *End() { return Default(); }
+};
+
+namespace {
+
+// Workaround for the fact that Printf() does not support floats.
+struct PrintBuffer {
+  char Buffer[64];
+};
+template <typename FT> struct FTPrinter {};
+
+template <> struct FTPrinter<double> {
+  static PrintBuffer dec(double Value) {
+    PrintBuffer Result;
+    snprintf(Result.Buffer, sizeof(Result.Buffer) - 1, "%.20f", Value);
+    return Result;
+  }
+  static PrintBuffer hex(double Value) {
+    PrintBuffer Result;
+    snprintf(Result.Buffer, sizeof(Result.Buffer) - 1, "%.20a", Value);
+    return Result;
+  }
+};
+
+template <> struct FTPrinter<float> : FTPrinter<double> {};
+
+template <> struct FTPrinter<long double> {
+  static PrintBuffer dec(long double Value) {
+    PrintBuffer Result;
+    snprintf(Result.Buffer, sizeof(Result.Buffer) - 1, "%.20Lf", Value);
+    return Result;
+  }
+  static PrintBuffer hex(long double Value) {
+    PrintBuffer Result;
+    snprintf(Result.Buffer, sizeof(Result.Buffer) - 1, "%.20La", Value);
+    return Result;
+  }
+};
+
+// FIXME: print with full precision.
+template <> struct FTPrinter<__float128> : FTPrinter<long double> {};
+
+// This is a template so that there are no implicit conversions.
+template <typename FT> inline FT ftAbs(FT V);
+
+template <> inline long double ftAbs(long double V) { return fabsl(V); }
+template <> inline double ftAbs(double V) { return fabs(V); }
+
+// We don't care about nans.
+// std::abs(__float128) code is suboptimal and generates a function call to
+// __getf2().
+template <typename FT> inline FT ftAbs(FT V) { return V >= FT{0} ? V : -V; }
+
+template <typename FT1, typename FT2, bool Enable> struct LargestFTImpl {
+  using type = FT2;
+};
+
+template <typename FT1, typename FT2> struct LargestFTImpl<FT1, FT2, true> {
+  using type = FT1;
+};
+
+template <typename FT1, typename FT2>
+using LargestFT =
+    typename LargestFTImpl<FT1, FT2, (sizeof(FT1) > sizeof(FT2))>::type;
+
+template <typename T> T max(T a, T b) { return a < b ? b : a; }
+
+} // end anonymous namespace
+
+} // end namespace __nsan
+
+void __sanitizer::BufferedStackTrace::UnwindImpl(uptr pc, uptr bp,
+                                                 void *context,
+                                                 bool request_fast,
+                                                 u32 max_depth) {
+  using namespace __nsan;
+  return Unwind(max_depth, pc, bp, context, 0, 0, false);
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __nsan_print_accumulated_stats() {
+  if (nsan_stats)
+    nsan_stats->print();
+}
+
+static void nsanAtexit() {
+  Printf("Numerical Sanitizer exit stats:\n");
+  __nsan_print_accumulated_stats();
+  nsan_stats = nullptr;
+}
+
+// The next three functions return a pointer for storing a shadow value for `n`
+// values, after setting the shadow types. We return the pointer instead of
+// storing ourselves because it avoids having to rely on the calling convention
+// around long double being the same for nsan and the target application.
+// We have to have 3 versions because we need to know which type we are storing
+// since we are setting the type shadow memory.
+template <typename FT>
+static char *getShadowPtrForStore(char *StoreAddr, uptr N) {
+  unsigned char *ShadowType = getShadowTypeAddrFor(StoreAddr);
+  for (uptr I = 0; I < N; ++I) {
+    __builtin_memcpy(ShadowType + I * sizeof(FT), FTInfo<FT>::kTypePattern,
+                     sizeof(FTInfo<FT>::kTypePattern));
+  }
+  return getShadowAddrFor(StoreAddr);
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE char *
+__nsan_get_shadow_ptr_for_float_store(char *store_addr, uptr n) {
+  return getShadowPtrForStore<float>(store_addr, n);
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE char *
+__nsan_get_shadow_ptr_for_double_store(char *store_addr, uptr n) {
+  return getShadowPtrForStore<double>(store_addr, n);
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE char *
+__nsan_get_shadow_ptr_for_longdouble_store(char *store_addr, uptr n) {
+  return getShadowPtrForStore<long double>(store_addr, n);
+}
+
+template <typename FT>
+static bool isValidShadowType(const unsigned char *ShadowType) {
+  return __builtin_memcmp(ShadowType, FTInfo<FT>::kTypePattern, sizeof(FT)) ==
+         0;
+}
+
+template <int kSize, typename T> static bool isZero(const T *Ptr) {
+  constexpr const char kZeros[kSize] = {}; // Zero initialized.
+  return __builtin_memcmp(Ptr, kZeros, kSize) == 0;
+}
+
+template <typename FT>
+static bool isUnknownShadowType(const unsigned char *ShadowType) {
+  return isZero<sizeof(FTInfo<FT>::kTypePattern)>(ShadowType);
+}
+
+// The three folowing functions check that the address stores a complete
+// shadow value of the given type and return a pointer for loading.
+// They return nullptr if the type of the value is unknown or incomplete.
+template <typename FT>
+static const char *getShadowPtrForLoad(const char *LoadAddr, uptr N) {
+  const unsigned char *const ShadowType = getShadowTypeAddrFor(LoadAddr);
+  for (uptr I = 0; I < N; ++I) {
+    if (!isValidShadowType<FT>(ShadowType + I * sizeof(FT))) {
+      // If loadtracking stats are enabled, log loads with invalid types
+      // (tampered with through type punning).
+      if (flags().enable_loadtracking_stats) {
+        if (isUnknownShadowType<FT>(ShadowType + I * sizeof(FT))) {
+          // Warn only if the value is non-zero. Zero is special because
+          // applications typically initialize large buffers to zero in an
+          // untyped way.
+          if (!isZero<sizeof(FT)>(LoadAddr)) {
+            GET_CALLER_PC_BP;
+            nsan_stats->addUnknownLoadTrackingEvent(pc, bp);
+          }
+        } else {
+          GET_CALLER_PC_BP;
+          nsan_stats->addInvalidLoadTrackingEvent(pc, bp);
+        }
+      }
+      return nullptr;
+    }
+  }
+  return getShadowAddrFor(LoadAddr);
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE const char *
+__nsan_get_shadow_ptr_for_float_load(const char *load_addr, uptr n) {
+  return getShadowPtrForLoad<float>(load_addr, n);
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE const char *
+__nsan_get_shadow_ptr_for_double_load(const char *load_addr, uptr n) {
+  return getShadowPtrForLoad<double>(load_addr, n);
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE const char *
+__nsan_get_shadow_ptr_for_longdouble_load(const char *load_addr, uptr n) {
+  return getShadowPtrForLoad<long double>(load_addr, n);
+}
+
+// Returns the raw shadow pointer. The returned pointer should be considered
+// opaque.
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE char *
+__nsan_internal_get_raw_shadow_ptr(const char *addr) {
+  return getShadowAddrFor(const_cast<char *>(addr));
+}
+
+// Returns the raw shadow type pointer. The returned pointer should be
+// considered opaque.
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE char *
+__nsan_internal_get_raw_shadow_type_ptr(const char *addr) {
+  return reinterpret_cast<char *>(
+      getShadowTypeAddrFor(const_cast<char *>(addr)));
+}
+
+static ValueType getValueType(unsigned char c) {
+  return static_cast<ValueType>(c & 0x3);
+}
+
+static int getValuePos(unsigned char c) { return c >> kValueSizeSizeBits; }
+
+// Checks the consistency of the value types at the given type pointer.
+// If the value is inconsistent, returns ValueType::kUnknown. Else, return the
+// consistent type.
+template <typename FT>
+static bool checkValueConsistency(const unsigned char *ShadowType) {
+  const int Pos = getValuePos(*ShadowType);
+  // Check that all bytes from the start of the value are ordered.
+  for (uptr I = 0; I < sizeof(FT); ++I) {
+    const unsigned char T = *(ShadowType - Pos + I);
+    if (!(getValueType(T) == FTInfo<FT>::kValueType && getValuePos(T) == I)) {
+      return false;
+    }
+  }
+  return true;
+}
+
+// The instrumentation automatically appends `shadow_value_type_ids`, see
+// maybeAddSuffixForNsanInterface.
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+__nsan_dump_shadow_mem(const char *addr, size_t size_bytes,
+                       size_t bytes_per_line, size_t shadow_value_type_ids) {
+  const unsigned char *const ShadowType = getShadowTypeAddrFor(addr);
+  const char *const Shadow = getShadowAddrFor(addr);
+
+  constexpr const int kMaxNumDecodedValues = 16;
+  __float128 DecodedValues[kMaxNumDecodedValues];
+  int NumDecodedValues...
[truncated]
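
The shadow-type helpers in the `nsan.cc` hunk above (`getValueType`, `getValuePos`, `checkValueConsistency`) suggest that each application byte has one type byte, with the value type in the low two bits and the byte's position within the value in the upper bits. A minimal sketch of that encoding follows. It assumes `kValueSizeSizeBits` is 2 and uses hypothetical enum values and helper names (`markStored`, `isComplete`); the real definitions are in `nsan.h`, which is not shown in the truncated diff.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: the real ValueType enum and kValueSizeSizeBits are
// defined in nsan.h, which is not part of the truncated diff above.
enum ValueType : unsigned char {
  kUnknownType = 0,
  kFloatType = 1,
  kDoubleType = 2,
  kLongDoubleType = 3
};
constexpr int kValueSizeSizeBits = 2; // Assumption: two low bits hold the type.

// Low bits: value type. Upper bits: byte position inside the value.
ValueType getValueType(unsigned char C) {
  return static_cast<ValueType>(C & 0x3);
}
int getValuePos(unsigned char C) { return C >> kValueSizeSizeBits; }

// Marks `Size` consecutive type bytes as belonging to one value of type `T`.
void markStored(unsigned char *ShadowType, ValueType T, std::size_t Size) {
  assert(Size <= 64 && "positions must fit in the upper six bits");
  for (std::size_t I = 0; I < Size; ++I)
    ShadowType[I] = static_cast<unsigned char>(T | (I << kValueSizeSizeBits));
}

// Returns true if all `Size` type bytes still describe one intact value.
bool isComplete(const unsigned char *ShadowType, ValueType T,
                std::size_t Size) {
  for (std::size_t I = 0; I < Size; ++I) {
    const unsigned char C = ShadowType[I];
    if (getValueType(C) != T || getValuePos(C) != static_cast<int>(I))
      return false;
  }
  return true;
}
```

In this sketch, zeroing a single type byte in the middle of a float (as `__nsan_set_value_unknown` does for a range) makes `isComplete` return false, which corresponds to the `nullptr` result documented for the `_load` helpers.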

@vitalybuka
Collaborator

I assume this one depends on #85916?

@alexander-shaposhnikov
Collaborator Author

This PR can be committed/reverted independently from #85916 since we've split out the integration tests. In particular, here check-nsan would run unit tests only.

github-actions bot commented Jun 9, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.


// On Linux/x86_64, memory is laid out as follows:
//
// +--------------------+ 0x800000000000 (top of memory)
Collaborator

Maybe @thurstond can provide feedback on these mappings?

@alexander-shaposhnikov alexander-shaposhnikov force-pushed the rebased_nsan_compiler_rt branch 2 times, most recently from 19c8a0d to 6dc2728 Compare June 18, 2024 04:49
@alexander-shaposhnikov alexander-shaposhnikov merged commit cae6d45 into llvm:main Jun 19, 2024
4 of 5 checks passed
@MaskRay MaskRay mentioned this pull request Jun 20, 2024
@MaskRay
Member

MaskRay commented Jun 20, 2024

I am sorry that I did not notice this patch earlier, but I am sad that it has been committed with many style issues. I fully support the numerical sanitizer, but I think we really should do a better job on code style for posterity. Created #96142.

@alexander-shaposhnikov
Collaborator Author

Ack

MaskRay added a commit that referenced this pull request Jun 20, 2024
The initial check-in of compiler-rt/lib/nsan #94322 has a lot of style
issues. Fix them before the history becomes more useful.

Pull Request: #96142
@ilovepi
Contributor

ilovepi commented Jun 20, 2024

Hi, we're seeing some test failures in our Linux CI bots for arm64 and x64 after this patch. It appears that the C++ file for gtest is being compiled with clang instead of clang++. It's not clear to me why that's happening from a quick look at the patch. Probably a small mistake in the CMake?

Error Message:

FAILED: compiler-rt/lib/nsan/tests/NsanTestObjects.gtest-all.cc.x86_64.o /b/s/w/ir/x/w/llvm_build/runtimes/runtimes-x86_64-unknown-linux-gnu-bins/compiler-rt/lib/nsan/tests/NsanTestObjects.gtest-all.cc.x86_64.o 
cd /b/s/w/ir/x/w/llvm_build/runtimes/runtimes-x86_64-unknown-linux-gnu-bins/compiler-rt/lib/nsan/tests && /b/s/w/ir/x/w/llvm_build/./bin/clang --target=x86_64-unknown-linux-gnu -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -ffile-prefix-map=/b/s/w/ir/x/w/llvm_build/runtimes/runtimes-x86_64-unknown-linux-gnu-bins=../../../llvm-llvm-project -ffile-prefix-map=/b/s/w/ir/x/w/llvm-llvm-project/= -no-canonical-prefixes -Wall -Wno-unused-parameter -Wno-unknown-warning-option -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta -I/b/s/w/ir/x/w/llvm-llvm-project/compiler-rt/include -nostdinc++ -g -Wno-covered-switch-default -Wno-suggest-override -DGTEST_NO_LLVM_SUPPORT=1 -DGTEST_HAS_RTTI=0 -I/b/s/w/ir/x/w/llvm-llvm-project/runtimes/../third-party/unittest/googletest/include -I/b/s/w/ir/x/w/llvm-llvm-project/runtimes/../third-party/unittest/googletest -I/b/s/w/ir/x/w/llvm-llvm-project/compiler-rt/lib/ -DSANITIZER_COMMON_REDEFINE_BUILTINS_IN_STD -O2 -g -fno-omit-frame-pointer -c -o NsanTestObjects.gtest-all.cc.x86_64.o /b/s/w/ir/x/w/llvm-llvm-project/third-party/unittest/googletest/src/gtest-all.cc
In file included from /b/s/w/ir/x/w/llvm-llvm-project/third-party/unittest/googletest/src/gtest-all.cc:38:
/b/s/w/ir/x/w/llvm-llvm-project/runtimes/../third-party/unittest/googletest/include/gtest/gtest.h:52:10: fatal error: 'cstddef' file not found
   52 | #include <cstddef>
      |          ^~~~~~~~~
1 error generated.

Failing bot: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8744662427054892657/overview

Bug: https://fxbug.dev/348377270

I'm not sure how this is passing in buildbot. Can you take a look and, if it's hard to fix quickly, revert until it can be relanded?

As an FYI on our build config: it isn't special. It's a normal Linux toolchain using the RUNTIMES build, and it fails with both stage1 and stage2 toolchains. Of note, our bots don't have a libc installed, and we build our toolchains hermetically with a curated Linux sysroot. There's a chance your patch isn't respecting the SYSROOT correctly, but I haven't looked closely enough to be sure.

That said, I'm fairly certain the error in this case is using the C compiler vs. the C++ compiler.

@alexander-shaposhnikov
Collaborator Author

alexander-shaposhnikov commented Jun 20, 2024

Thanks for the report, I'll look shortly (if I don't find the cause, I'll revert for now).
Update: I think I see the problem. It slipped past me; working on a fix.

@ilovepi
Contributor

ilovepi commented Jun 20, 2024

TY!


add_custom_target(NsanUnitTests)

# set(NSAN_UNITTEST_LINK_FLAGS ${COMPILER_RT_UNITTEST_LINK_FLAGS} -ldl)

Should these lines be commented out?

AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
This diff contains the compiler-rt changes / preparations for nsan.

Test plan:

1. cd build/runtimes/runtimes-bins && ninja check-nsan
2. ninja check-all
AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024
The initial check-in of compiler-rt/lib/nsan llvm#94322 has a lot of style
issues. Fix them before the history becomes more useful.

Pull Request: llvm#96142
MaskRay added a commit that referenced this pull request Jul 11, 2024
#94322 defines .preinit_array to initialize nsan early.
DT_PREINIT_ARRAY can only be used with the main executable. GNU ld would
complain when a DSO has .preinit_array.
MaskRay added a commit that referenced this pull request Jul 12, 2024
#94322 defines .preinit_array to initialize nsan early.
DT_PREINIT_ARRAY can only be used with the main executable. GNU ld would
complain when a DSO has .preinit_array. Therefore,
nsan_preinit.cpp cannot be linked into `libclang_rt.nsan.so` (#98415).

Working with @alexander-shaposhnikov, we noticed that `Nsan-x86_64-Test
--gtest_output=json` without `.preinit_array` will sigsegv. This is
because googletest with the JSON output calls `localtime_r`, which
calls `free(0)` and fails when `REAL(free)` remains uninitialized
(nullptr). This is benign with the default output because malloc/free
are all paired and `REAL(free)(ptr)` is not called.

To fix the unittest failure, `__nsan_init` needs to be called early
(.preinit_array).
`asan/tests/CMakeLists.txt:ASAN_UNITTEST_INSTRUMENTED_LINK_FLAGS` uses
`-fsanitize=address` to ensure `asan_preinit.cpp.o` is linked into the
unittest executable. Port the approach and remove
`NSAN_TEST_RUNTIME_OBJECTS`.

Fix #98523

Pull Request: #98564
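
The commit message above relies on `.preinit_array` entries running before ordinary global constructors in the main executable. A minimal sketch of that ordering on ELF/Linux is shown here, using hypothetical names (`earlyInit`, `PreinitEntry`, `ctorSawRuntime`) in place of `__nsan_init` and the real `nsan_preinit.cpp`, whose contents are not shown in this thread.

```cpp
namespace {

// Both flags are constant-initialized, so they hold `false` before any code
// runs. RuntimeReady is volatile so the optimizer cannot fold the constructor
// below using the static initial value (it does not model .preinit_array).
volatile bool RuntimeReady = false;
bool CtorSawRuntime = false;

// Hypothetical stand-in for __nsan_init.
void earlyInit() { RuntimeReady = true; }

// A global constructor. It runs from .init_array, after .preinit_array.
struct GlobalUser {
  GlobalUser() { CtorSawRuntime = RuntimeReady; }
};
GlobalUser User;

} // namespace

// Register earlyInit in .preinit_array. This is only valid in the main
// executable; as noted above, GNU ld complains when a DSO has this section.
__attribute__((section(".preinit_array"), used)) static void (*PreinitEntry)(
    void) = earlyInit;

// Reports whether the global constructor observed the early initialization.
bool ctorSawRuntime() { return CtorSawRuntime; }
```

Calling `ctorSawRuntime()` from `main` is expected to return true on a typical Linux toolchain, which is the property the unit test needs so that interceptors such as `REAL(free)` are resolved before googletest runs.
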
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
llvm#94322 defines .preinit_array to initialize nsan early.
DT_PREINIT_ARRAY can only be used with the main executable. GNU ld would
complain when a DSO has .preinit_array.
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
llvm#94322 defines .preinit_array to initialize nsan early.
DT_PREINIT_ARRAY can only be used with the main executable. GNU ld would
complain when a DSO has .preinit_array. Therefore,
nsan_preinit.cpp cannot be linked into `libclang_rt.nsan.so` (llvm#98415).

Working with @alexander-shaposhnikov, we noticed that `Nsan-x86_64-Test
--gtest_output=json` without `.preinit_array` will sigsegv. This is
because googletest with the JSON output calls `localtime_r`, which
calls `free(0)` and fails when `REAL(free)` remains uninitialized
(nullptr). This is benign with the default output because malloc/free
are all paired and `REAL(free)(ptr)` is not called.

To fix the unittest failure, `__nsan_init` needs to be called early
(.preinit_array).
`asan/tests/CMakeLists.txt:ASAN_UNITTEST_INSTRUMENTED_LINK_FLAGS` uses
`-fsanitize=address` to ensure `asan_preinit.cpp.o` is linked into the
unittest executable. Port the approach and remove
`NSAN_TEST_RUNTIME_OBJECTS`.

Fix llvm#98523

Pull Request: llvm#98564