Skip to content

Commit

Permalink
Add Python module to accomplish OCSF compliant events (#159)
Browse files Browse the repository at this point in the history
* Adding Python script that receives a continuous json stream over stdin and outputs parquet to Security Lake

* Adding logstash pipeline for python script

* encode_parquet() function fixed to handle lists of dictionaries

* Correct error in encode_parquet()

* Avoid storing the block ending in the output buffer

* Add comments on handling files and streams with pyarrow for future reference

* Add s3 handling reference links

* Write parquet directly to bucket

* Added basics of map_to_ocsf() function

* Minor fixes

* Map alerts to OCSF as they are read

* Add script to convert Wazuh events to OCSF

Also adds a simple test script

* Add OCSF converter + Parquet encoder + test scripts

* Update .gitignore

* Include the contents of the alert under unmapped

* Add support for different OCSF schema versions

* Use custom ocsf module to map alerts

* Modify script to use converter class

* Code polish and fix errors

* Remove unnecessary type declaration from debug flag

* Improved parquet encoding

* Initial commit for test env's docker-compose.yml

* Remove sudo references from docker-compose.yml

* Add operational Python module to transform events to OCSF

* Create minimal Docker environment to test and develop the integration.

* Fix events-generator's Inventory starvation

* Remove files present in #147

* Cleanup

* Add FQDN hostnames to services for certificates creation

* Add S3 Ninja (Mock) (#165)

* Setup certificates in Wazuh Indexer and Logstash containers (#166)

* Add certificate generator service

* Add certificate config to docker compose file

* Use secrets for certificates

* Disable permission handling inside cert's generator entrypoint.sh

* Back to using a bind mount for certs

* Have entrypoint.sh generate certs with 1000:1000 ownership

* Correct certificate permissions and bind mounting

* Add security initialization variable to compose file

* Fix permissions on certs generator entrypoint

* Add cert generator config file

* Remove old cert generator dir

* Set indexer hostname right in pipeline file

* Roll back commented code

---------

Signed-off-by: Álex Ruiz <alejandro.ruiz.becerra@wazuh.com>
Co-authored-by: Álex Ruiz <alejandro.ruiz.becerra@wazuh.com>

* Fix Logstash pipelines

* Remove unused file

* Implement OCSF severity normalize function

---------

Signed-off-by: Álex Ruiz <alejandro.ruiz.becerra@wazuh.com>
Co-authored-by: Fede Tux <federico.galland@wazuh.com>
Co-authored-by: Federico Gustavo Galland <99492720+f-galland@users.noreply.github.com>
  • Loading branch information
3 people committed Sep 9, 2024
1 parent a536560 commit 08c288f
Show file tree
Hide file tree
Showing 26 changed files with 1,044 additions and 12 deletions.
6 changes: 6 additions & 0 deletions integrations/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
elastic
opensearch
splunk
common
config
docker/certs
37 changes: 37 additions & 0 deletions integrations/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
## Wazuh indexer integrations

This folder contains integrations with third-party XDR, SIEM and cybersecurity software.
The goal is to transport Wazuh's analysis to the platform that suits your needs.

### Amazon Security Lake

Amazon Security Lake automatically centralizes security data from AWS environments, SaaS providers,
on premises, and cloud sources into a purpose-built data lake stored in your account. With Security Lake,
you can get a more complete understanding of your security data across your entire organization. You can
also improve the protection of your workloads, applications, and data. Security Lake has adopted the
Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service normalizes
and combines security data from AWS and a broad range of enterprise security data sources.

##### Usage

A demo of the integration can be started using the content of this folder and Docker.

```console
docker compose -f ./docker/amazon-security-lake.yml up -d
```

This docker compose project will bring a *wazuh-indexer* node, a *wazuh-dashboard* node,
a *logstash* node and our event generator. On the one hand, the event generator will push events
constantly to the indexer. On the other hand, logstash will constantly query for new data and
deliver it to the integration Python program, also present in that node. Finally, the integration
module will prepare and send the data to the Amazon Security Lake's S3 bucket.
<!-- TODO continue with S3 credentials setup -->

For production usage, follow the instructions in our documentation page about this matter.
(_when-its-done_)

As a last note, we would like to point out that we also use this Docker environment for development.

### Other integrations

TBD
180 changes: 180 additions & 0 deletions integrations/amazon-security-lake/.dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,180 @@
wazuh-event.ocsf.json
*.parquet
Dockerfile

# Created by https://www.toptal.com/developers/gitignore/api/python
# Edit at https://www.toptal.com/developers/gitignore?templates=python

### Python ###
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

### Python Patch ###
# Poetry local configuration file - https://python-poetry.org/docs/configuration/#local-configuration
poetry.toml

# ruff
.ruff_cache/

# LSP config files
pyrightconfig.json

# End of https://www.toptal.com/developers/gitignore/api/python
Loading

0 comments on commit 08c288f

Please sign in to comment.