Merge pull request llvm#3982 from haoNoQ/static-analyzer-cherrypicks-25
Static analyzer cherrypicks 25
haoNoQ authored Feb 23, 2022
2 parents eee7d51 + 3e8146f commit 2ea4f20
Showing 107 changed files with 7,111 additions and 1,576 deletions.
82 changes: 82 additions & 0 deletions clang/docs/analyzer/checkers.rst
@@ -2333,6 +2333,88 @@ A data is tainted when it comes from an unreliable source.
alpha.unix
^^^^^^^^^^^
.. _alpha-unix-StdCLibraryFunctionArgs:
alpha.unix.StdCLibraryFunctionArgs (C)
""""""""""""""""""""""""""""""""""""""
Check for calls of standard library functions that violate predefined argument
constraints. For example, it is stated in the C standard that for the ``int
isalnum(int ch)`` function the behavior is undefined if the value of ``ch`` is
not representable as unsigned char and is not equal to ``EOF``.
.. code-block:: c

  void test_alnum_concrete(int v) {
    int ret = isalnum(256); // \
    // warning: Function argument constraint is not satisfied
    (void)ret;
  }
If the argument's value is unknown, it is assumed to be within the valid range
after the call.

.. code-block:: c

  #define EOF -1
  int test_alnum_symbolic(int x) {
    int ret = isalnum(x);
    // after the call, x is assumed to be in the range [-1, 255]

    if (x > 255)      // impossible (infeasible branch)
      if (x == 0)
        return ret / x; // division by zero is not reported
    return ret;
  }
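The range constraint described for ``isalnum`` can be pictured as a simple predicate. The following is only a minimal sketch in plain C++, not the checker's implementation; ``isValidCtypeArgument`` is a hypothetical name introduced here for illustration.

```cpp
#include <cstdio>  // EOF
#include <limits>

// Hypothetical helper, not part of the analyzer: returns true when `ch` is an
// acceptable argument for isalnum(), i.e. it is representable as
// unsigned char or it equals EOF.
bool isValidCtypeArgument(int ch) {
  return ch == EOF ||
         (ch >= 0 && ch <= std::numeric_limits<unsigned char>::max());
}
```

A concrete value such as ``256`` fails this predicate, which corresponds to the warning in ``test_alnum_concrete``. For a symbolic value the analyzer instead assumes the predicate holds on the path that continues after the call.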
If the user disables the checker then the argument violation warning is
suppressed. However, the assumption about the argument is still modeled. This
is because exploring an execution path that already contains undefined behavior
is not valuable.
Several kinds of constraints are modeled: range constraints, not-null
constraints, and buffer size constraints. A **range constraint** requires the
argument's value to be in a specific range, see ``isalnum`` as an example above.
A **not null constraint** requires the pointer argument to be non-null.
A **buffer size constraint** specifies the minimum size of the buffer
argument. The size might be a known constant. For example, ``asctime_r`` requires
that the buffer argument's size must be greater than or equal to ``26`` bytes. In
other cases, the size is denoted by another argument or as a multiplication of
two arguments.
For instance, consider ``size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream)``.
Here, ``ptr`` is the buffer, and its minimum size is ``size * nmemb``.
.. code-block:: c

  void buffer_size_constraint_violation(FILE *file) {
    enum { BUFFER_SIZE = 1024 };
    wchar_t wbuf[BUFFER_SIZE];

    const size_t size = sizeof(*wbuf);   // 4
    const size_t nitems = sizeof(wbuf);  // 4096

    // Below we receive a warning because the 3rd parameter should be the
    // number of elements to read, not the size in bytes. This case is a known
    // vulnerability described by the ARR38-C SEI-CERT rule.
    fread(wbuf, size, nitems, file);
  }
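The buffer size constraint for ``fread`` amounts to comparing the buffer's size in bytes against ``size * nmemb``. The sketch below illustrates that arithmetic in plain C++; ``satisfiesBufferSize`` is a hypothetical helper, and treating multiplication overflow as a violation is an assumption of this sketch, not something the documentation states.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper, not part of the analyzer: returns true when a buffer
// of `bufferBytes` bytes is large enough to receive `size * nmemb` bytes.
bool satisfiesBufferSize(std::size_t bufferBytes, std::size_t size,
                         std::size_t nmemb) {
  // Assumption of this sketch: an overflowing product counts as a violation.
  if (size != 0 && nmemb > std::numeric_limits<std::size_t>::max() / size)
    return false;
  return size * nmemb <= bufferBytes;
}
```

With the values from the example above, a 4096-byte buffer satisfies the constraint for ``size = 4`` and ``nmemb = 1024``, but not for ``nmemb = 4096``, which is the case the checker warns about.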
**Limitations**
The checker is in alpha because the reports cannot provide notes about the
values of the arguments. Without this information it is hard to confirm if the
constraint is indeed violated. For example, consider the above case for
``fread``. We display in the warning message that the size of the 1st arg
should be equal to or less than the value of the 2nd arg times the 3rd arg.
However, we fail to display the concrete values (``4`` and ``4096``) for those
arguments.
**Parameters**
The checker models functions (and emits diagnostics) from the C standard by
default. The ``ModelPOSIX`` option enables the checker to model (and emit
diagnostics) for functions that are defined in the POSIX standard. This option
is disabled by default.
.. _alpha-unix-BlockInCriticalSection:
alpha.unix.BlockInCriticalSection (C)
2 changes: 1 addition & 1 deletion clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -557,7 +557,7 @@ def StdCLibraryFunctionArgsChecker : Checker<"StdCLibraryFunctionArgs">,
"or is EOF.">,
Dependencies<[StdCLibraryFunctionsChecker]>,
WeakDependencies<[CallAndMessageChecker, NonNullParamChecker, StreamChecker]>,
- Documentation<NotDocumented>;
+ Documentation<HasAlphaDocumentation>;

} // end "alpha.unix"

10 changes: 5 additions & 5 deletions clang/include/clang/StaticAnalyzer/Checkers/SValExplainer.h
@@ -32,7 +32,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
std::string Str;
llvm::raw_string_ostream OS(Str);
S->printPretty(OS, nullptr, PrintingPolicy(ACtx.getLangOpts()));
- return OS.str();
+ return Str;
}

bool isThisObject(const SymbolicRegion *R) {
@@ -69,7 +69,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
std::string Str;
llvm::raw_string_ostream OS(Str);
OS << "concrete memory address '" << I << "'";
- return OS.str();
+ return Str;
}

std::string VisitNonLocSymbolVal(nonloc::SymbolVal V) {
@@ -82,7 +82,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
llvm::raw_string_ostream OS(Str);
OS << (I.isSigned() ? "signed " : "unsigned ") << I.getBitWidth()
<< "-bit integer '" << I << "'";
- return OS.str();
+ return Str;
}

std::string VisitNonLocLazyCompoundVal(nonloc::LazyCompoundVal V) {
@@ -123,7 +123,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
OS << "(" << Visit(S->getLHS()) << ") "
<< std::string(BinaryOperator::getOpcodeStr(S->getOpcode())) << " "
<< S->getRHS();
- return OS.str();
+ return Str;
}

// TODO: IntSymExpr doesn't appear in practice.
@@ -177,7 +177,7 @@ class SValExplainer : public FullSValVisitor<SValExplainer, std::string> {
else
OS << "'" << Visit(R->getIndex()) << "'";
OS << " of " + Visit(R->getSuperRegion());
- return OS.str();
+ return Str;
}

std::string VisitNonParamVarRegion(const NonParamVarRegion *R) {
17 changes: 17 additions & 0 deletions clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -320,6 +320,11 @@ ANALYZER_OPTION(bool, ShouldDisplayCheckerNameForText, "display-checker-name",
"Display the checker name for textual outputs",
true)

ANALYZER_OPTION(bool, ShouldSupportSymbolicIntegerCasts,
"support-symbolic-integer-casts",
"Produce cast symbols for integral types.",
false)

ANALYZER_OPTION(
bool, ShouldConsiderSingleElementArraysAsFlexibleArrayMembers,
"consider-single-element-arrays-as-flexible-array-members",
@@ -336,6 +341,18 @@ ANALYZER_OPTION(
"might be modeled by the analyzer to never return NULL.",
false)

ANALYZER_OPTION(
bool, ShouldIgnoreBisonGeneratedFiles, "ignore-bison-generated-files",
"If enabled, any files containing the \"/* A Bison parser, made by\" "
"won't be analyzed.",
true)

ANALYZER_OPTION(
bool, ShouldIgnoreFlexGeneratedFiles, "ignore-flex-generated-files",
"If enabled, any files containing the \"/* A lexical scanner generated by "
"flex\" won't be analyzed.",
true)

//===----------------------------------------------------------------------===//
// Unsigned analyzer options.
//===----------------------------------------------------------------------===//
@@ -0,0 +1,179 @@
//===- CallDescription.h - function/method call matching --*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
/// \file This file defines a generic mechanism for matching for function and
/// method calls of C, C++, and Objective-C languages. Instances of these
/// classes are frequently used together with the CallEvent classes.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CALLDESCRIPTION_H
#define LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CALLDESCRIPTION_H

#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/Optional.h"
#include "llvm/Support/Compiler.h"
#include <vector>

namespace clang {
class IdentifierInfo;
} // namespace clang

namespace clang {
namespace ento {

enum CallDescriptionFlags : unsigned {
CDF_None = 0,

/// Describes a C standard function that is sometimes implemented as a macro
/// that expands to a compiler builtin with some __builtin prefix.
/// The builtin may as well have a few extra arguments on top of the requested
/// number of arguments.
CDF_MaybeBuiltin = 1 << 0,
};

/// This class represents a description of a function call using the number of
/// arguments and the name of the function.
class CallDescription {
friend class CallEvent;
using MaybeCount = Optional<unsigned>;

mutable Optional<const IdentifierInfo *> II;
// The list of the qualified names used to identify the specified CallEvent,
// e.g. "{a, b}" represent the qualified names, like "a::b".
std::vector<std::string> QualifiedName;
MaybeCount RequiredArgs;
MaybeCount RequiredParams;
int Flags;

public:
/// Constructs a CallDescription object.
///
/// @param QualifiedName The list of the name qualifiers of the function that
/// will be matched. The user is allowed to skip any of the qualifiers.
/// For example, {"std", "basic_string", "c_str"} would match both
/// std::basic_string<...>::c_str() and std::__1::basic_string<...>::c_str().
///
/// @param RequiredArgs The number of arguments that is expected to match a
/// call. Omit this parameter to match every occurrence of a call with a given
/// name regardless of the number of arguments.
CallDescription(CallDescriptionFlags Flags,
ArrayRef<const char *> QualifiedName,
MaybeCount RequiredArgs = None,
MaybeCount RequiredParams = None);

/// Construct a CallDescription with default flags.
CallDescription(ArrayRef<const char *> QualifiedName,
MaybeCount RequiredArgs = None,
MaybeCount RequiredParams = None);

CallDescription(std::nullptr_t) = delete;

/// Get the name of the function that this object matches.
StringRef getFunctionName() const { return QualifiedName.back(); }

/// Get the qualified name parts in reversed order.
/// E.g. { "std", "vector", "data" } -> "vector", "std"
auto begin_qualified_name_parts() const {
return std::next(QualifiedName.rbegin());
}
auto end_qualified_name_parts() const { return QualifiedName.rend(); }

/// It's false, if and only if we expect a single identifier, such as
/// `getenv`. It's true for `std::swap`, or `my::detail::container::data`.
bool hasQualifiedNameParts() const { return QualifiedName.size() > 1; }

/// @name Matching CallDescriptions against a CallEvent
/// @{

/// Returns true if the CallEvent is a call to a function that matches
/// the CallDescription.
///
/// \note This function is not intended to be used to match Obj-C method
/// calls.
bool matches(const CallEvent &Call) const;

/// Returns true if the CallEvent matches any of the CallDescriptions
/// supplied.
///
/// \note This function is not intended to be used to match Obj-C method
/// calls.
friend bool matchesAny(const CallEvent &Call, const CallDescription &CD1) {
return CD1.matches(Call);
}

/// \copydoc clang::ento::matchesAny(const CallEvent &, const CallDescription &)
template <typename... Ts>
friend bool matchesAny(const CallEvent &Call, const CallDescription &CD1,
const Ts &...CDs) {
return CD1.matches(Call) || matchesAny(Call, CDs...);
}
/// @}
};

/// An immutable map from CallDescriptions to arbitrary data. Provides a unified
/// way for checkers to react on function calls.
template <typename T> class CallDescriptionMap {
friend class CallDescriptionSet;

// Some call descriptions aren't easily hashable (eg., the ones with qualified
// names in which some sections are omitted), so let's put them
// in a simple vector and use linear lookup.
// TODO: Implement an actual map for fast lookup for "hashable" call
// descriptions (eg., the ones for C functions that just match the name).
std::vector<std::pair<CallDescription, T>> LinearMap;

public:
CallDescriptionMap(
std::initializer_list<std::pair<CallDescription, T>> &&List)
: LinearMap(List) {}

template <typename InputIt>
CallDescriptionMap(InputIt First, InputIt Last) : LinearMap(First, Last) {}

~CallDescriptionMap() = default;

// These maps are usually stored once per checker, so let's make sure
// we don't do redundant copies.
CallDescriptionMap(const CallDescriptionMap &) = delete;
CallDescriptionMap &operator=(const CallDescriptionMap &) = delete;

CallDescriptionMap(CallDescriptionMap &&) = default;
CallDescriptionMap &operator=(CallDescriptionMap &&) = default;

LLVM_NODISCARD const T *lookup(const CallEvent &Call) const {
// Slow path: linear lookup.
// TODO: Implement some sort of fast path.
for (const std::pair<CallDescription, T> &I : LinearMap)
if (I.first.matches(Call))
return &I.second;

return nullptr;
}
};

/// An immutable set of CallDescriptions.
/// Checkers can efficiently decide if a given CallEvent matches any
/// CallDescription in the set.
class CallDescriptionSet {
CallDescriptionMap<bool /*unused*/> Impl = {};

public:
CallDescriptionSet(std::initializer_list<CallDescription> &&List);

CallDescriptionSet(const CallDescriptionSet &) = delete;
CallDescriptionSet &operator=(const CallDescriptionSet &) = delete;

LLVM_NODISCARD bool contains(const CallEvent &Call) const;
};

} // namespace ento
} // namespace clang

#endif // LLVM_CLANG_STATICANALYZER_CORE_PATHSENSITIVE_CALLDESCRIPTION_H
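The header comments above say that qualifiers may be skipped when matching (so ``{"std", "basic_string", "c_str"}`` also matches ``std::__1::basic_string<...>::c_str()``) and that ``CallDescriptionMap`` uses a linear lookup. The sketch below models both ideas on plain strings, without any Clang types. ``matchesQualifiedName`` and ``lookup`` are hypothetical stand-ins, and treating the description's qualifiers as an ordered subsequence of the enclosing scopes is an assumption about the intended semantics, not a copy of the real implementation.

```cpp
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for name matching. `Description` and `Actual` list
// name parts from outermost scope to the function name itself.
bool matchesQualifiedName(const std::vector<std::string> &Description,
                          const std::vector<std::string> &Actual) {
  // The function name (last part) must match exactly.
  if (Description.empty() || Actual.empty() ||
      Description.back() != Actual.back())
    return false;
  // Walk the enclosing scopes innermost-first and consume description
  // qualifiers in order; scopes that are not mentioned are skipped.
  auto D = std::next(Description.rbegin());
  for (auto A = std::next(Actual.rbegin());
       A != Actual.rend() && D != Description.rend(); ++A)
    if (*A == *D)
      ++D;
  // Every qualifier in the description must have been found.
  return D == Description.rend();
}

// Linear lookup in the spirit of CallDescriptionMap::lookup: returns a
// pointer to the data of the first matching entry, or nullptr.
template <typename T>
const T *lookup(
    const std::vector<std::pair<std::vector<std::string>, T>> &LinearMap,
    const std::vector<std::string> &Actual) {
  for (const auto &Entry : LinearMap)
    if (matchesQualifiedName(Entry.first, Actual))
      return &Entry.second;
  return nullptr;
}
```

Because partially qualified descriptions are not easily hashable, a linear scan like this is the simplest structure that stays correct, which is the trade-off the TODO comments in the header point out.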