Schema Extraction Preprocessor #182

Merged
merged 58 commits into from
Aug 16, 2023
b1992bd
First iteration of schema extraction generator
Anilm3 Jul 6, 2023
5ce911e
Fix typo
Anilm3 Jul 6, 2023
1d2c651
Small improvements, basic type-inference
Anilm3 Jul 6, 2023
05d1f9c
Equals operator for schema pointer
Anilm3 Jul 7, 2023
5354ee2
Rough Preprocessor
Anilm3 Jul 7, 2023
a528cb4
Current work in progress
Anilm3 Jul 8, 2023
13d601f
Work in progress
Anilm3 Jul 9, 2023
2a108b9
Add objects for evaluation
Anilm3 Jul 9, 2023
06c8d19
End-to-end preprocessors (no parser)
Anilm3 Jul 10, 2023
b1b124a
Preprocessor parsing
Anilm3 Jul 10, 2023
8cccd44
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Jul 10, 2023
eaf8399
End-to-End schema inference through the context
Anilm3 Jul 10, 2023
422affe
Merge branch 'anilm3/schema-extract-preprocessor' of github.com:DataD…
Anilm3 Jul 10, 2023
568a307
Remove comment
Anilm3 Jul 10, 2023
0dad1d6
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Jul 11, 2023
56da350
Simple preprocessor test and schema json schema
Anilm3 Jul 12, 2023
c6c40e2
Remove type inference, add float and null types including helpers
Anilm3 Jul 12, 2023
439dfb8
Improve float and null values to/from parsers
Anilm3 Jul 12, 2023
930e29b
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Jul 26, 2023
0c6eb6e
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Jul 27, 2023
86d21b2
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Jul 27, 2023
c44a45e
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Jul 28, 2023
f65bb11
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Aug 2, 2023
a4d06b0
Add utils.cpp to build source
Anilm3 Aug 2, 2023
da1b381
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Aug 8, 2023
d6b0eb0
Use expressions on preprocessor, other fixes
Anilm3 Aug 8, 2023
94eb054
Cleanup merge mistakes
Anilm3 Aug 8, 2023
cd9f930
Stray mistake
Anilm3 Aug 8, 2023
9cc130c
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Aug 8, 2023
d9ac4bf
Use unique_ptr for generator pointer
Anilm3 Aug 8, 2023
9e05c68
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Aug 11, 2023
76db282
Merge branch 'anilm3/schema-extract-preprocessor' of github.com:DataD…
Anilm3 Aug 11, 2023
0aa249f
Fix incorrect header
Anilm3 Aug 11, 2023
727518c
Remove attributes
Anilm3 Aug 12, 2023
b5789e5
Add multiple tags per scalar, remove duplicate tools
Anilm3 Aug 12, 2023
1dd9128
Small benchmark for schema extraction
Anilm3 Aug 12, 2023
01089c6
Fixes
Anilm3 Aug 13, 2023
e83c452
Fix unordered comparison
Anilm3 Aug 13, 2023
5298723
Revert ranges
Anilm3 Aug 13, 2023
532ff94
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Aug 14, 2023
f72c262
Merge branch 'master' into anilm3/schema-extract-preprocessor
Anilm3 Aug 14, 2023
73b482d
Some tests
Anilm3 Aug 14, 2023
38ae222
More tests
Anilm3 Aug 14, 2023
a9f6638
More tests and add correct limits
Anilm3 Aug 14, 2023
a38ab96
Generalize json schema comparison
Anilm3 Aug 14, 2023
a35fe53
Basically no change
Anilm3 Aug 14, 2023
5afc7ea
Mostly unnecessary changes and a few more tests
Anilm3 Aug 15, 2023
9a299d7
Order-independent json equality
Anilm3 Aug 15, 2023
2cf6e44
Fix path on windows?
Anilm3 Aug 15, 2023
bb2513f
Another attempt...
Anilm3 Aug 15, 2023
087222f
Another attempt...
Anilm3 Aug 15, 2023
679672c
Minor improvements, more tests
Anilm3 Aug 15, 2023
c9dd556
Use std::variant
Anilm3 Aug 15, 2023
edb0689
Reduce allocations
Anilm3 Aug 15, 2023
b1697ad
Cleanup and small fixes
Anilm3 Aug 15, 2023
a10c4e0
Use json in infer_schema tool
Anilm3 Aug 15, 2023
c9e5486
More tests
Anilm3 Aug 16, 2023
2e01180
Address comments and parsing tests
Anilm3 Aug 16, 2023
3 changes: 3 additions & 0 deletions CMakeLists.txt
@@ -78,13 +78,16 @@ set(LIBDDWAF_SOURCE
${libddwaf_SOURCE_DIR}/src/rule.cpp
${libddwaf_SOURCE_DIR}/src/ruleset_info.cpp
${libddwaf_SOURCE_DIR}/src/ip_utils.cpp
${libddwaf_SOURCE_DIR}/src/preprocessor.cpp
${libddwaf_SOURCE_DIR}/src/iterator.cpp
${libddwaf_SOURCE_DIR}/src/log.cpp
${libddwaf_SOURCE_DIR}/src/obfuscator.cpp
${libddwaf_SOURCE_DIR}/src/utils.cpp
${libddwaf_SOURCE_DIR}/src/waf.cpp
${libddwaf_SOURCE_DIR}/src/exclusion/input_filter.cpp
${libddwaf_SOURCE_DIR}/src/exclusion/object_filter.cpp
${libddwaf_SOURCE_DIR}/src/exclusion/rule_filter.cpp
${libddwaf_SOURCE_DIR}/src/generator/extract_schema.cpp
${libddwaf_SOURCE_DIR}/src/parser/common.cpp
${libddwaf_SOURCE_DIR}/src/parser/parser.cpp
${libddwaf_SOURCE_DIR}/src/parser/parser_v1.cpp
3 changes: 2 additions & 1 deletion include/ddwaf.h
@@ -164,6 +164,8 @@ struct _ddwaf_result
ddwaf_object events;
/** Array of actions generated, this is guaranteed to be an array **/
ddwaf_object actions;
/** Map containing all derived objects in the format (address, value) **/
ddwaf_object derivatives;
/** Total WAF runtime in nanoseconds **/
uint64_t total_runtime;
};
@@ -332,7 +334,6 @@ ddwaf_object* ddwaf_object_invalid(ddwaf_object *object);
**/
ddwaf_object* ddwaf_object_null(ddwaf_object *object);


/**
* ddwaf_object_string
*
43 changes: 43 additions & 0 deletions schema/types.json
@@ -0,0 +1,43 @@
{
"title": "Serialized schema types for API Security",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$defs": {
"type": {
"oneOf": [
{
"enum": [0, 1, 2, 4, 8, 16],
"description": "scalar types"
},
{
"type": "array",
"description": "array of types",
"items": {
"$ref": "#"
},
"minItems": 1
},
{
"type": "object",
"description": "record type",
"additionalProperties": {
"$ref": "#"
}
}
]
},
"metadata": {
"type": "array",
"items": {
"type": "string",
"pattern": "^[A-Za-z][A-Za-z0-9\\.\\-\\_:\\/]{0,199}$"
},
"minItems": 1
}
},
"type": "array",
"prefixItems": [
{ "$ref": "#/$defs/type" },
{ "$ref": "#/$defs/metadata" }
],
"minItems": 1
}
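The two leaf constraints in `schema/types.json` are the scalar enum (`0, 1, 2, 4, 8, 16`) and the metadata string pattern; the nested `"$ref": "#"` entries point back at the schema root, so array items and record values are themselves `[type, metadata]` arrays. As a rough illustration only, the leaf checks could be written by hand as below. The function names are hypothetical and are not part of libddwaf, and the sketch does not assign any meaning to the individual scalar codes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// True for the scalar codes listed in the schema's enum: 0, 1, 2, 4, 8, 16
// (zero, or a single power of two no greater than 16).
inline bool is_scalar_code(int value)
{
    if (value == 0) {
        return true;
    }
    return value > 0 && value <= 16 && (value & (value - 1)) == 0;
}

inline bool is_ascii_alpha(char c) { return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'); }
inline bool is_ascii_digit(char c) { return c >= '0' && c <= '9'; }

// Hand-written equivalent of the schema pattern
//   ^[A-Za-z][A-Za-z0-9\.\-\_:\/]{0,199}$
// i.e. one ASCII letter followed by at most 199 characters from the class.
inline bool is_valid_metadata_string(const std::string &s)
{
    if (s.empty() || s.size() > 200 || !is_ascii_alpha(s[0])) {
        return false;
    }
    for (std::size_t i = 1; i < s.size(); ++i) {
        const char c = s[i];
        const bool allowed = is_ascii_alpha(c) || is_ascii_digit(c) || c == '.' || c == '-' ||
                             c == '_' || c == ':' || c == '/';
        if (!allowed) {
            return false;
        }
    }
    return true;
}

// Small usage check; the sample strings are made up for illustration.
inline void self_check()
{
    assert(is_scalar_code(8));
    assert(!is_scalar_code(3));
    assert(is_valid_metadata_string("category:example"));
    assert(!is_valid_metadata_string("9starts-with-digit"));
}

} // namespace sketch
```

The pattern is checked manually rather than through `std::regex` because standard library implementations differ in whether they accept the `\_` escape.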
33 changes: 30 additions & 3 deletions src/context.cpp
@@ -15,15 +15,14 @@

namespace ddwaf {

-DDWAF_RET_CODE context::run(
-const ddwaf_object &newParameters, optional_ref<ddwaf_result> res, uint64_t timeout)
+DDWAF_RET_CODE context::run(ddwaf_object &input, optional_ref<ddwaf_result> res, uint64_t timeout)
{
if (res.has_value()) {
ddwaf_result &output = *res;
output = DDWAF_RESULT_INITIALISER;
}

-if (!store_.insert(newParameters)) {
+if (!store_.insert(input, ruleset_->free_fn)) {
DDWAF_WARN("Illegal WAF call: parameter structure invalid!");
return DDWAF_ERR_INVALID_OBJECT;
}
@@ -48,8 +47,17 @@ DDWAF_RET_CODE context::run(

const event_serializer serializer(*ruleset_->event_obfuscator);

optional_ref<ddwaf_object> derived;
if (res.has_value()) {
ddwaf_result &output = *res;
ddwaf_object_map(&output.derivatives);
derived.emplace(output.derivatives);
}

memory::vector<ddwaf::event> events;
try {
eval_preprocessors(derived, deadline);

const auto &rules_to_exclude = filter_rules(deadline);
const auto &objects_to_exclude = filter_inputs(rules_to_exclude, deadline);
events = match(rules_to_exclude, objects_to_exclude, deadline);
@@ -96,6 +104,25 @@ const memory::unordered_map<rule *, filter_mode> &context::filter_rules(ddwaf::t
return rules_to_exclude_;
}

void context::eval_preprocessors(optional_ref<ddwaf_object> &derived, ddwaf::timer &deadline)
{
for (const auto &[id, preproc] : ruleset_->preprocessors) {
if (deadline.expired()) {
DDWAF_INFO("Ran out of time while evaluating preprocessors");
throw timeout_exception();
}

auto it = preprocessor_cache_.find(preproc.get());
if (it == preprocessor_cache_.end()) {
auto [new_it, res] =
preprocessor_cache_.emplace(preproc.get(), preprocessor::cache_type{});
it = new_it;
}

preproc->eval(store_, derived, it->second, deadline);
}
}

const memory::unordered_map<rule *, context::object_set> &context::filter_inputs(
const memory::unordered_map<rule *, filter_mode> &rules_to_exclude, ddwaf::timer &deadline)
{
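The find-then-emplace step in `eval_preprocessors` gives every preprocessor a cache entry that is created on first use and reused on later runs of the same context, with a deadline check before each evaluation. A minimal standalone sketch of that pattern follows. All types here (`fake_preprocessor`, `cache_type`, `deadline`, `timeout_error`, `fake_context`) are hypothetical stand-ins rather than libddwaf classes, and the sketch uses `try_emplace`, which folds the lookup and the insertion into one call; that is a substitution for illustration, not what the diff does.

```cpp
#include <cassert>
#include <chrono>
#include <memory>
#include <stdexcept>
#include <unordered_map>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical stand-ins for illustration; none of these are libddwaf types.
struct cache_type {
    int evaluations = 0;
};

struct fake_preprocessor {
    // Records each evaluation in the cache entry owned by the context.
    void eval(cache_type &cache) const { ++cache.evaluations; }
};

struct timeout_error : std::runtime_error {
    timeout_error() : std::runtime_error("deadline expired") {}
};

class deadline {
public:
    explicit deadline(std::chrono::milliseconds budget)
        : end_(std::chrono::steady_clock::now() + budget)
    {}
    bool expired() const { return std::chrono::steady_clock::now() >= end_; }

private:
    std::chrono::steady_clock::time_point end_;
};

class fake_context {
public:
    explicit fake_context(std::vector<std::unique_ptr<fake_preprocessor>> preprocessors)
        : preprocessors_(std::move(preprocessors))
    {}

    void eval_preprocessors(const deadline &limit)
    {
        for (const auto &preproc : preprocessors_) {
            assert(preproc != nullptr);
            if (limit.expired()) {
                throw timeout_error();
            }
            // Returns the existing entry, or inserts a default-constructed one.
            const fake_preprocessor *key = preproc.get();
            auto it = cache_.try_emplace(key).first;
            preproc->eval(it->second);
        }
    }

    // Number of evaluations recorded for a preprocessor, 0 if never evaluated.
    int evaluations(const fake_preprocessor *p) const
    {
        auto it = cache_.find(p);
        return it == cache_.end() ? 0 : it->second.evaluations;
    }

private:
    std::vector<std::unique_ptr<fake_preprocessor>> preprocessors_;
    std::unordered_map<const fake_preprocessor *, cache_type> cache_;
};

} // namespace sketch
```

Keying the cache on the preprocessor's address keeps the per-context state out of the shared preprocessor object, so the same preprocessor can serve several contexts.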
11 changes: 7 additions & 4 deletions src/context.hpp
@@ -28,7 +28,7 @@ class context {
public:
using object_set = std::unordered_set<const ddwaf_object *>;

-explicit context(ruleset::ptr ruleset) : ruleset_(std::move(ruleset)), store_(ruleset_->free_fn)
+explicit context(ruleset::ptr ruleset) : ruleset_(std::move(ruleset))
{
rule_filter_cache_.reserve(ruleset_->rule_filters.size());
input_filter_cache_.reserve(ruleset_->input_filters.size());
@@ -41,8 +41,9 @@ class context {
context &operator=(context &&) = delete;
~context() = default;

-DDWAF_RET_CODE run(const ddwaf_object &, optional_ref<ddwaf_result>, uint64_t);
+DDWAF_RET_CODE run(ddwaf_object &, optional_ref<ddwaf_result>, uint64_t);

void eval_preprocessors(optional_ref<ddwaf_object> &derived, ddwaf::timer &deadline);
// These two functions below return references to internal objects,
// however using them this way helps with testing
const memory::unordered_map<rule *, filter_mode> &filter_rules(ddwaf::timer &deadline);
@@ -62,7 +63,9 @@ class context {
using input_filter = exclusion::input_filter;
using rule_filter = exclusion::rule_filter;

-// Cache of filters and conditions
+memory::unordered_map<preprocessor *, preprocessor::cache_type> preprocessor_cache_;
+
+// Caches of filters and conditions
memory::unordered_map<rule_filter *, rule_filter::cache_type> rule_filter_cache_;
memory::unordered_map<input_filter *, input_filter::cache_type> input_filter_cache_;

@@ -94,7 +97,7 @@ class context_wrapper {
context_wrapper &operator=(context_wrapper &&) noexcept = delete;
context_wrapper &operator=(const context_wrapper &) = delete;

-DDWAF_RET_CODE run(const ddwaf_object &data, optional_ref<ddwaf_result> res, uint64_t timeout)
+DDWAF_RET_CODE run(ddwaf_object &data, optional_ref<ddwaf_result> res, uint64_t timeout)
{
memory::memory_resource_guard guard(&mr_);
return ctx_->run(data, res, timeout);
3 changes: 3 additions & 0 deletions src/context_allocator.hpp
@@ -7,6 +7,7 @@
#pragma once

#include "memory_resource.hpp"
#include <list>
#include <string>
#include <unordered_map>
#include <unordered_set>
@@ -104,4 +105,6 @@ using unordered_map =
template <class T, class Hash = std::hash<T>, class Pred = std::equal_to<T>>
using unordered_set = std::unordered_set<T, Hash, Pred, context_allocator<T>>;

template <class T> using list = std::list<T, context_allocator<T>>;

} // namespace ddwaf::memory
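The new `memory::list` alias follows the same shape as the existing container aliases: a standard container with the allocator parameter fixed. A self-contained sketch of that idea is below, using a hypothetical `counting_allocator` in place of `context_allocator` (whose implementation is not shown in this diff) to make it visible that every list node goes through the custom allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

namespace sketch {

// Shared counter, so that rebound copies of the allocator all update it.
inline std::size_t &live_allocations()
{
    static std::size_t count = 0;
    return count;
}

// Hypothetical allocator for illustration; it only counts and forwards to
// std::allocator. It is not libddwaf's context_allocator.
template <class T> struct counting_allocator {
    using value_type = T;

    counting_allocator() = default;
    // std::list rebinds the allocator to its internal node type, so a
    // converting constructor is required.
    template <class U> counting_allocator(const counting_allocator<U> &) noexcept {}

    T *allocate(std::size_t n)
    {
        ++live_allocations();
        return std::allocator<T>{}.allocate(n);
    }

    void deallocate(T *p, std::size_t n) noexcept
    {
        assert(live_allocations() > 0);
        --live_allocations();
        std::allocator<T>{}.deallocate(p, n);
    }
};

// All instances are interchangeable because they hold no state.
template <class T, class U>
bool operator==(const counting_allocator<T> &, const counting_allocator<U> &) noexcept
{
    return true;
}
template <class T, class U>
bool operator!=(const counting_allocator<T> &, const counting_allocator<U> &) noexcept
{
    return false;
}

// Same shape as the alias added in the diff, with the stand-in allocator.
template <class T> using list = std::list<T, counting_allocator<T>>;

} // namespace sketch
```

With the alias in place, `sketch::list<int>` is used exactly like `std::list<int>`, and each `push_back` results in one call to `allocate`.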
2 changes: 1 addition & 1 deletion src/exclusion/object_filter.cpp
@@ -104,7 +104,7 @@ memory::unordered_set<const ddwaf_object *> object_filter::match(
continue;
}

-const auto *object = store.get_target(target);
+auto *object = store.get_target(target);
if (object == nullptr) {
continue;
}
2 changes: 1 addition & 1 deletion src/expression.cpp
@@ -137,7 +137,7 @@ std::optional<event::match> expression::evaluator::eval_condition(
continue;
}

-const ddwaf_object *object = store.get_target(target.root);
+auto *object = store.get_target(target.root);
if (object == nullptr) {
continue;
}
32 changes: 32 additions & 0 deletions src/generator/base.hpp
@@ -0,0 +1,32 @@
// Unless explicitly stated otherwise all files in this repository are
// dual-licensed under the Apache-2.0 License or BSD-3-Clause License.
//
// This product includes software developed at Datadog (https://www.datadoghq.com/).
// Copyright 2021 Datadog, Inc.

#pragma once

#include <memory>
#include <string>
#include <string_view>
#include <vector>

#include <utils.hpp>

namespace ddwaf::generator {

class base {
public:
using ptr = std::unique_ptr<base>;

base() = default;
virtual ~base() = default;
base(const base &) = default;
base(base &&) = default;
base &operator=(const base &) = default;
base &operator=(base &&) = default;

virtual ddwaf_object generate(const ddwaf_object *input) = 0;
};

} // namespace ddwaf::generator
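`generator::base` is a pure interface: a concrete generator overrides `generate` and is held through `base::ptr`. The sketch below shows one way a derived class might look. It is illustrative only: `object` is a hypothetical stand-in for `ddwaf_object`, and `count_nodes` is an invented generator, not the `extract_schema` generator added by this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical stand-in for ddwaf_object: a tree of keyed nodes.
struct object {
    std::string key;
    std::vector<object> children;
};

// Mirrors the shape of the interface in the diff, using the stand-in type.
class base {
public:
    using ptr = std::unique_ptr<base>;

    base() = default;
    virtual ~base() = default;
    base(const base &) = default;
    base(base &&) = default;
    base &operator=(const base &) = default;
    base &operator=(base &&) = default;

    virtual object generate(const object *input) = 0;
};

// Invented example generator: derives a single node whose key is the number
// of nodes in the input tree ("0" for a null input).
class count_nodes : public base {
public:
    object generate(const object *input) override
    {
        object derived;
        derived.key = std::to_string(input == nullptr ? 0 : count(*input));
        return derived;
    }

private:
    static std::size_t count(const object &node)
    {
        std::size_t total = 1;
        for (const auto &child : node.children) { total += count(child); }
        return total;
    }
};

// Small usage check: the generator is owned and called through base::ptr.
inline void self_check()
{
    object root{"root", {object{"a", {}}, object{"b", {}}}};
    base::ptr generator = std::make_unique<count_nodes>();
    assert(generator->generate(&root).key == "3");
}

} // namespace sketch
```

Returning the derived object by value matches the signature in the diff and leaves ownership of the result with the caller.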