
Kubernetes #103

Merged
merged 136 commits into from
Jan 19, 2022

Conversation

mazhurin
Collaborator

  • Kubernetes deployment
  • Incident detector
  • Saving incident request sets to cloud storage via an S3 interface
  • Streaming challenges directly to Elasticsearch

mazhurin and others added 30 commits June 4, 2020 18:34
Unit tests and linting fixes.
… configuration instead of full Python model names.
…e.training.model refers to the full module path of the Model class (not the Enum key).
Country and Host features. Stratified sampling (parameter 'max_samples_per_host'). Support for nested features in the JSON parser (geoip feature fix).
mkaranasou and others added 29 commits April 17, 2021 14:32
* IPCache
* Two ip caches: passed and pending
* Docker image with unit tests complete.
* The new version of spark-iforest
* Kafka is up and running
* Spark secret. 
* spark encryption configuration added
* spark ssl for ui, standalone and history config.
* self.__cache.count() replaced with head()
* send to kafka by partition id
* bug fix in accumulator reset
* disable broadcast spark.sql.autoBroadcastJoinThreshold
* Sliding window in postprocessing is optional. Set the sliding_window config param to zero to disable it (the default is now also zero).
* Redis merge logging count() before and after
* Postprocessing: rollback to sending challenge with collect() due to performance issues
* append mode for Redis write.
* Fix for default 'challenged' column set to 0.
* Spark worker Dockerfile
* pyspark bump to 2.4.7
* Argo dockerfile moved to dockerfiles folder.
* spark 2.4.6
* S3 support. Alternating cache files instead of renaming.
* No S3 deletion.
* use_storage option for request_set_cache
* Support for using Kafka for sensitive data.
* Support for send_by_partitions for sensitive data.
* Support for GitHub raw configs. Support for storing sensitive data in Kafka.
* Git config support with ssh.
* Kubernetes client mode spark deployment.
* Whitelist URLs from the dashboard URL link.
* Whitelisting IPs in preprocessing
* Whitelisting IPs without UDF
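One commit above makes the postprocessing sliding window optional via the sliding_window config param, with zero disabling it. A minimal pure-Python sketch of that behaviour — the function name and shape are illustrative, not the PR's code:

```python
def apply_sliding_window(records, sliding_window):
    """Keep only the last `sliding_window` records; 0 disables windowing."""
    if sliding_window <= 0:
        # Windowing disabled (the new default): pass everything through.
        return list(records)
    return list(records)[-sliding_window:]
```

The zero-means-disabled convention lets one integer config value cover both "no window" and any window size without a separate boolean flag.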

Co-authored-by: Maria Karanasou <karanasou@gmail.com>
client_mode in the config.
Separate clearing_house connection.
Fix whitelisting of IPs. Switched to a left_anti join.
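Spark's left_anti join keeps rows from the left side that have no match on the right, which is why it suits IP whitelisting (and avoids a per-row UDF). In plain Python terms the equivalent filtering is — illustrative, not the PR's actual code:

```python
def left_anti_filter(requests, whitelist_ips):
    # Equivalent of: requests_df.join(whitelist_df, on="ip", how="left_anti")
    # in Spark: keep left rows whose "ip" has no match in the whitelist.
    wl = set(whitelist_ips)
    return [r for r in requests if r["ip"] not in wl]

requests = [{"ip": "203.0.113.7"}, {"ip": "198.51.100.1"}]
kept = left_anti_filter(requests, ["198.51.100.1"])
```

In Spark the same effect falls out of a single join, so whitelisted traffic is dropped before any expensive per-request processing.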
* Kubernetes deployment. Not finished.
* Jupyter notebook with spark.
* Kafka ACL
* DB reader
* SQL based incident detector
* Attack detection and chunks removed from AttackDetection task.
* Incident detector added.
* Incident Labeler class. Tested in Jupyter notebook.
* Optional scaling in AnomalyModel
* Some fixes in Jupyter notebooks.
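The "SQL based incident detector" above runs queries against the stored request sets. As a self-contained sketch of the idea — the table name, columns, and threshold are assumptions, and SQLite stands in for the real backend:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE request_sets (ip TEXT, target TEXT, score REAL)")
conn.executemany(
    "INSERT INTO request_sets VALUES (?, ?, ?)",
    [("203.0.113.7", "example.com", 0.9),
     ("203.0.113.7", "example.com", 0.8),
     ("198.51.100.1", "example.com", 0.1)],
)
# Flag (ip, target) pairs whose average anomaly score exceeds a threshold.
incidents = conn.execute(
    """
    SELECT ip, target, AVG(score) AS avg_score, COUNT(*) AS n
    FROM request_sets
    GROUP BY ip, target
    HAVING AVG(score) > 0.5
    """
).fetchall()
```

Expressing the detector as SQL keeps the incident criteria declarative, so thresholds and grouping can be tuned without touching pipeline code.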
start in whitelist URLs
fix in sending to Kafka
@mazhurin mazhurin merged commit e48390c into master Jan 19, 2022