
Amazon Security Lake integration - Logstash #135

Closed
Tracked by #128
AlexRuiz7 opened this issue Jan 19, 2024 · 4 comments · Fixed by #143

Comments

@AlexRuiz7
Member

Description

Wazuh's Amazon Security Lake integration, as a source, will use Logstash as a data forwarder. The data has to be forwarded from Wazuh's indices to an Amazon S3 bucket. Logstash provides input and output plugins that allow us to do that.
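
In outline, the pipeline pairs the logstash-input-opensearch input plugin with the logstash-output-s3 output plugin. A minimal skeleton (connection and bucket options omitted here; a fuller example is worked out in the comments below):

input {
  opensearch {
    # Read events from the Wazuh indexer indices.
  }
}

output {
  s3 {
    # Write the events to an Amazon S3 bucket.
  }
}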

Tasks

  • Implement a Logstash pipeline to send events from an index to an S3 bucket.
  • Document the process for reproducibility.
@AlexRuiz7 added the level/task (Task issue) and type/research (Research issue) labels on Jan 19, 2024
@AlexRuiz7 self-assigned this on Jan 19, 2024
@wazuhci moved this to In progress in Release 4.9.0 on Jan 19, 2024
@AlexRuiz7
Member Author

AlexRuiz7 commented Jan 19, 2024

Follow the Wazuh indexer integration using Logstash guide to install Logstash and the logstash-input-opensearch plugin.

RPM: https://www.elastic.co/guide/en/logstash/current/installing-logstash.html#_yum

# Install Logstash
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
sudo tee /etc/yum.repos.d/logstash.repo > /dev/null << 'EOF'
[logstash-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
sudo yum install logstash

# Install plugins (logstash-output-s3 is already installed)
sudo /usr/share/logstash/bin/logstash-plugin install logstash-input-opensearch # logstash-output-s3

# Copy certificates
sudo mkdir -p /etc/logstash/wi-certs/
sudo cp /etc/wazuh-indexer/certs/root-ca.pem /etc/logstash/wi-certs/root-ca.pem
sudo chown logstash:logstash /etc/logstash/wi-certs/root-ca.pem

# Configuring new indexes
SKIP

# Configuring a pipeline

# Keystore
## Prepare keystore
set +o history
echo 'LOGSTASH_KEYSTORE_PASS="123456"' | sudo tee /etc/sysconfig/logstash
export LOGSTASH_KEYSTORE_PASS=123456
set -o history
sudo chown root /etc/sysconfig/logstash
sudo chmod 600 /etc/sysconfig/logstash
sudo systemctl start logstash

## Create keystore
sudo -E /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash create

## Store Wazuh indexer credentials (admin user)
sudo -E /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash add WAZUH_INDEXER_USERNAME
sudo -E /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash add WAZUH_INDEXER_PASSWORD
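## These keys can then be referenced from the pipeline configuration as ${WAZUH_INDEXER_USERNAME} and ${WAZUH_INDEXER_PASSWORD}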

# Pipeline
sudo touch /etc/logstash/conf.d/wazuh-s3.conf
# Replace with cp /vagrant/wazuh-s3.conf /etc/logstash/conf.d/wazuh-s3.conf
sudo systemctl stop logstash
sudo -E /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/wazuh-s3.conf --path.settings /etc/logstash/
    |- Success: `[INFO ][logstash.agent           ] Pipelines running ...`

# Start Logstash
sudo systemctl enable logstash
sudo systemctl start logstash
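
For reference, a minimal sketch of what /etc/logstash/conf.d/wazuh-s3.conf could contain, wired to the certificate and keystore entries prepared above. The indexer host, index pattern, query, bucket and region are illustrative placeholders, not the exact configuration used in this test:

input {
  opensearch {
    hosts => ["https://127.0.0.1:9200"]                  # placeholder Wazuh indexer address
    user => "${WAZUH_INDEXER_USERNAME}"                  # read from the Logstash keystore
    password => "${WAZUH_INDEXER_PASSWORD}"
    ssl => true
    ca_file => "/etc/logstash/wi-certs/root-ca.pem"
    index => "wazuh-alerts-4.x-*"                        # placeholder index pattern
    query => '{ "query": { "range": { "@timestamp": { "gt": "now-1m" } } } }'
    schedule => "* * * * *"                              # poll the index every minute
  }
}

output {
  s3 {
    region => "us-east-1"                                # placeholder region
    bucket => "wazuh-security-lake-poc"                  # placeholder bucket name
    codec => "json_lines"                                # final encoding is still under discussion
  }
}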

Output

[root@rhel7 vagrant]# sudo -E /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/wazuh-s3.conf --path.settings /etc/logstash/
Using bundled JDK: /usr/share/logstash/jdk
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/concurrent-ruby-1.1.9/lib/concurrent-ruby/concurrent/executor/java_thread_pool_executor.rb:13: warning: method redefined; discarding old to_int
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/concurrent-ruby-1.1.9/lib/concurrent-ruby/concurrent/executor/java_thread_pool_executor.rb:13: warning: method redefined; discarding old to_f
Sending Logstash logs to /var/log/logstash which is now configured via log4j2.properties
[2024-01-25T16:45:01,461][INFO ][logstash.runner          ] Log4j configuration path used is: /etc/logstash/log4j2.properties
[2024-01-25T16:45:01,462][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"8.12.0", "jruby.version"=>"jruby 9.4.5.0 (3.1.4) 2023-11-02 1abae2700f OpenJDK 64-Bit Server VM 17.0.9+9 on 17.0.9+9 +indy +jit [x86_64-linux]"}
[2024-01-25T16:45:01,464][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, -Dlogstash.jackson.stream-read-constraints.max-string-length=200000000, -Dlogstash.jackson.stream-read-constraints.max-number-length=10000, -Djruby.regexp.interruptible=true, -Djdk.io.File.enableADS=true, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[2024-01-25T16:45:01,465][INFO ][logstash.runner          ] Jackson default value override `logstash.jackson.stream-read-constraints.max-string-length` configured to `200000000`
[2024-01-25T16:45:01,465][INFO ][logstash.runner          ] Jackson default value override `logstash.jackson.stream-read-constraints.max-number-length` configured to `10000`
[2024-01-25T16:45:01,611][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2024-01-25T16:45:02,107][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2024-01-25T16:45:02,535][INFO ][org.reflections.Reflections] Reflections took 114 ms to scan 1 urls, producing 132 keys and 468 values
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/amazing_print-1.5.0/lib/amazing_print/formatter.rb:37: warning: previous definition of cast was here
/usr/share/logstash/vendor/bundle/jruby/3.1.0/gems/nokogiri-1.15.5-java/lib/nokogiri/xml/node.rb:1007: warning: method redefined; discarding old attr
[2024-01-25T16:45:03,881][INFO ][logstash.codecs.json     ] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[2024-01-25T16:45:03,907][INFO ][logstash.javapipeline    ] Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[2024-01-25T16:45:26,616][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>4, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>500, "pipeline.sources"=>["/etc/logstash/conf.d/wazuh-s3.conf"], :thread=>"#<Thread:0x33eda5f2 /usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2024-01-25T16:45:27,017][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.4}
[2024-01-25T16:45:27,426][INFO ][logstash.inputs.opensearch][main] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[2024-01-25T16:45:27,427][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2024-01-25T16:45:27,439][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}



@AlexRuiz7
Member Author

@wazuhci moved this from In progress to On hold in Release 4.9.0 on Jan 26, 2024
@wazuhci moved this from On hold to In progress in Release 4.9.0 on Jan 30, 2024
@f-galland
Member

We analyzed the option of writing a custom Ruby-based filter for Logstash that would transcode the events to Parquet, but Logstash's S3 output plugin doesn't support Parquet's binary file format.

Just for reference, in order to run Parquet encoding in Ruby, some dependencies are needed under Ubuntu/Debian:

sudo apt update
sudo apt install -y -V ca-certificates lsb-release wget ruby-dev build-essential
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt update
sudo apt install -y -V libarrow-dev # For C++
gem install red-arrow
gem install red-parquet

Parquet output can be generated from a JSON file as follows:

#!/usr/bin/env ruby

require 'arrow'
require 'parquet'

# Load the JSON file into an Arrow table and save it in Parquet format.
table = Arrow::Table.load("test.json", format: :json)

table.save("output.parquet")

@AlexRuiz7
Member Author

Conclusions

We've got a base for the Logstash pipeline and have verified that it works. We'll evolve the pipeline depending on the proposal chosen for transforming the data.

Check #145

@wazuhci moved this from On hold to Done in Release 4.9.0 on Jan 31, 2024