Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mmap buffer, add mmap buffer support to access log and userspace utility (#537) #2273

Merged
merged 120 commits into from
Nov 28, 2024

Conversation

ai-tmpst
Copy link
Contributor

No description provided.

@ai-tmpst ai-tmpst requested a review from krizhanovsky October 30, 2024 11:13
In #537 we need a way to deliver log data to userspace. Introduce a set of
per-cpu ring buffer mapped to userspace.

Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
__alloc_percpu_gfp() is not required, we can use alloc_percpu_gfp().

Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
Map the data area consecutively, allowing seamless handling of cases where
reads or writes cross the buffer's end boundary.

Additionally, code review comments from the previous patch have been
addressed.

Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
This patch introduces support for access log output via mmap buffer.

The `access_log` option in the configuration file can now be set to
`dmesg`, `mmap`, or both. Additionally, a new `mmap_log_buffer_size`
option enables configuration of the mmap buffer size.

Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
@ai-tmpst ai-tmpst force-pushed the ai-537-2 branch 2 times, most recently from b4a80eb to da7dbac Compare October 30, 2024 11:41
This patch introduces a userspace utility that reads access logs from an
mmap buffer, which is mapped to `/dev/tempesta_mmap_log`, and forwards
log messages to a ClickHouse server.

Signed-off-by: Alexander Ivanov <ai@tempesta-tech.com>
Copy link
Contributor

@krizhanovsky krizhanovsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least the user space code must be heavily reworked. Please reuse any code from https://github.com/tempesta-tech/manager/tree/main , which could be useful.

Also please do test the solution on multi-cpu instance (at least 4 CPUs) under a load generator for at least a couple of minutes.

utils/clickhouse.cpp Outdated Show resolved Hide resolved
utils/tfw_logger.cpp Outdated Show resolved Hide resolved
utils/tfw_logger.cpp Outdated Show resolved Hide resolved
utils/Makefile Show resolved Hide resolved
utils/Makefile Outdated
@@ -0,0 +1,20 @@
CXX := g++
LDFLAGS := -lclickhouse-cpp-lib
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's check if the library is installed in the Makefile and install if if not

In my setups (Ubuntu 24, were I built Tempesta FW before) I did

$ apt install ccache nasm lld
$ git clone --recursive https://github.com/ClickHouse/ClickHouse.git
$ git submodule sync --recursive
$ git submodule update --init --recursive
$ cd ClickHouse
$ mkdir build && cd build
$ cmake .. -DENABLE_LIBRARIES=1 -DENABLE_HDFS=0 -DENABLE_ROCKSDB=0
$ make -j$(nproc)

Also on one of the setups I had old rustc, so I also had to update it:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup update stable

...well, after that I had more issues with Rust, OpenSSL and other building issue, so let's just

  1. add Clickhouse library to https://tempesta-tech.com/knowledge-base/Requirements/ as an optional requirement
  2. add build option like NO_CLICKHOUSE, which allows to build Tempesta FW with no ClickHouse

Probably, this isn't the last time we build ClickHouse (at least we need to build it for CI), so it'd be alos useful to add a small subchapter in requirements how to build it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to build ClickHouse server from sources?
Maybe there is a mix-up in the projects? The user space utility requires https://github.com/ClickHouse/clickhouse-cpp.git.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My build fails without -lzstd -llz4 -lcityhash

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remember how I resolved it. I think, I compiled clickhouse-cpp with static libs, but it looks, it would be better to compile dynamically and add -lzstd -llz4 -lcityhash to the Makefile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed Makefile to build with the libraries from contrib directory in the building directory of clickhouse-cpp.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just tried to build this branch under Ubuntu 24 LTS and got this problem:

....
[ 97%] Building CXX object clickhouse/CMakeFiles/clickhouse-cpp-lib.dir/client.cpp.o
[ 98%] Building CXX object clickhouse/CMakeFiles/clickhouse-cpp-lib.dir/query.cpp.o
[100%] Linking CXX static library libclickhouse-cpp-lib.a
make[5]: Leaving directory '/root/tempesta/utils/clickhouse-cpp/build'
[100%] Built target clickhouse-cpp-lib
make[4]: Leaving directory '/root/tempesta/utils/clickhouse-cpp/build'
make[3]: Leaving directory '/root/tempesta/utils/clickhouse-cpp/build'
make[2]: Leaving directory '/root/tempesta/utils'
make tfw_logger
make[2]: Entering directory '/root/tempesta/utils'
g++ -O3 -Wall -Wextra -std=c++23 -DPAGE_SIZE=4096 -DSPDLOG_SHARED_LIB -DSPDLOG_COMPILED_LIB -DSPDLOG_FMT_EXTERNAL  -I clickhouse-cpp -c tfw_logger.cc -o tfw_logger.o
In file included from clickhouse-cpp/clickhouse/columns/column.h:3,
                 from clickhouse-cpp/clickhouse/block.h:3,
                 from clickhouse-cpp/clickhouse/query.h:3,
                 from clickhouse-cpp/clickhouse/client.h:3,
                 from clickhouse-cpp/clickhouse/base/endpoints_iterator.h:3,
                 from clickhouse-cpp/clickhouse/base/socket.h:6,
                 from tfw_logger.cc:35:
clickhouse-cpp/clickhouse/columns/../types/types.h:3:10: fatal error: absl/numeric/int128.h: No such file or directory
    3 | #include "absl/numeric/int128.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [Makefile:46: tfw_logger.o] Error 1
make[2]: Leaving directory '/root/tempesta/utils'
make[1]: *** [Makefile:40: all] Error 2
make[1]: Leaving directory '/root/tempesta/utils'
make: *** [Makefile:168: build] Error 2

Copy link
Contributor Author

@ai-tmpst ai-tmpst Nov 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you install libabsl-dev?
On Ubuntu 22 I had such dependencies:

cmake
ninja-build
libabsl-dev
libboost-program-options1.74-dev
libfmt-dev
libspdlog-dev

utils/tfw_logger.cpp Outdated Show resolved Hide resolved
utils/tfw_logger.cpp Outdated Show resolved Hide resolved
utils/tfw_logger.cpp Outdated Show resolved Hide resolved
utils/tfw_logger.cpp Outdated Show resolved Hide resolved
utils/clickhouse.h Outdated Show resolved Hide resolved
utils/tfw_logger.cpp Outdated Show resolved Hide resolved
fw/mmap_buffer.c Outdated Show resolved Hide resolved
fw/access_log.h Outdated Show resolved Hide resolved
fw/access_log.c Outdated Show resolved Hide resolved
fw/access_log.c Outdated Show resolved Hide resolved
fw/access_log.c Outdated Show resolved Hide resolved
fw/access_log.c Outdated Show resolved Hide resolved
utils/clickhouse.h Outdated Show resolved Hide resolved
utils/mmap_buffer.h Outdated Show resolved Hide resolved
utils/mmap_buffer.h Outdated Show resolved Hide resolved
return
fi

utils/tfw_logger -H "$mmap_host" -l "$mmap_log" -u "$mmap_user" -p "$mmap_password" ||
Copy link
Contributor

@EvgeniiMekhanik EvgeniiMekhanik Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see log file in my system (when I start Tempesta FW). Can you check please.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I don't see any daemon in ps -A output

Copy link
Contributor

@EvgeniiMekhanik EvgeniiMekhanik Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check that daemon was started and work not only code return code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

- Destructor should not throw, but `client_->Insert(table_name_, block_);`
  can throw, so we should use try/catch to handle it.
- We should join threads to prevent programm termination.
- Use const for exceptions.

if (__atomic_load_n(&buf_->is_ready, __ATOMIC_ACQUIRE)) [[likely]] {
is_running_ = true;
r = read();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If read throw thread will be stopped, but other threads still work. Should we stop them also in case of error??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We told about it on the last meeting and decided that other threads should work. It's better to have part of logs than nothing.

@EvgeniiMekhanik
Copy link
Contributor

I catch client_->Insert(table_name_, block_); fails with DB::Exception: Table default.access_log does not exist. In this case thread finished with exception. Is it ok? SHould we skip this exception and try later? Also how to avid this problem.

Copy link
Contributor

@EvgeniiMekhanik EvgeniiMekhanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can merge, but we should implement new task with several TODO:

  1. Implement separate config in standart (may be jsonor so on format) for tfw_logger, not use Tempesta FW config for tfw_logger options.
  2. Devide errors which can occurs during tfw_logger work into two parts. Critical errors when we stop all daemon threads and daemon (now in case of any error we stop only one thread) when it occurs. Not critical error which we log in special file and continue to work.
  3. During loading Tempesta FW check that daemon works not only return code of starting script, because currently we can catch exception if clickhouse is not started one second later.
  4. Not try to start daemon on each Tempesta reload
  5. May be implement simple failower for this daemon to restart it if it crash

Check the daemon PID file after tfw_logger execution. If file doesn't
appear in 3 seconds unload Tempesta modules and print error message.
If we have multiple access_log tags, only the last tag affects the
access_log_type variable. We need take into account all of such tags.
Don't set access_log_type variable to ACCESS_LOG_OFF at every tag
handling.

access_log tag can't be changed in the process of Tempesta FW work
so we don't have issues on reload.
In the next commits we will output errors from another place. Move this
code to a separate function.
Check if Tempesta FW state is not "start" before tfw_logger execution.
During a stop tfw_logger sometimes hangs on block insertion (on a recv()
call). Remove insertion code in TfwClickhouse destructor. Since we commit
block every 100 ms we don't lose many events.
Added reconnection logic for ClickHouse server after non-critical
exceptions. The timeout between reconnection attempts doubles up to
a defined limit and resets upon successful data delivery. In case of
a critical exception, the thread terminates.

Additionally, implemented a mechanism to prevent repeated logging of
the same errors.
@ai-tmpst
Copy link
Contributor Author

@EvgeniiMekhanik for some reason can't reply directly to your comments.

I catch client_->Insert(table_name_, block_); fails with DB::Exception: Table default.access_log does not exist. In this case thread finished with exception. Is it ok? SHould we skip this exception and try later? Also how to avid this problem.

Now such exceptions lead to reconnection. Also access_log table is created in TfwClickhouse constructor.

  1. Decided to leave it as it is.
  2. Done.
  3. Now we are waiting till PID file appears.
  4. Done.
  5. Left it for the future.

We should break loop if there is no connection
to clickhouse server, but `stop_flag` is set.
constexpr std::chrono::milliseconds max_wait(100);
constexpr size_t max_events = 1000;
constexpr char table_creation_querry[] = "CREATE TABLE IF NOT EXISTS "
"access_log (timestamp DateTime64, address IPv6, method UInt8, "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question why method is not string? It seems that number is internal for Tempesta and not useful for administrator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, integer is more lightweight. I described methods/numbers compliance in wiki.

@@ -0,0 +1,3 @@
#!/bin/bash

apt install -y cmake ninja-build libboost-program-options-dev libfmt-dev libspdlog-dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this file be in /scripts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, should it?
I thought it is tfw_logger dependencies and it would be better to store the script in the same directory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's better to keep shell stuff separately. But only if it isn't the last change and it doesn't take long time

@krizhanovsky krizhanovsky merged commit ae73817 into master Nov 28, 2024
1 check was pending
@krizhanovsky krizhanovsky deleted the ai-537-2 branch November 28, 2024 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants