Ability to filter dataset by fields or regex #117
Comments
Out of curiosity - is this functionality and those referenced / related intended for the TUI only? |
Good question, the original thought was to make these filters for the terminal, however, I didn't think much about having them available in the HTML output. Not sure yet how this would work, perhaps allowing the user to set initial filters in the config file or since there are plans to have the HTML output be real-time, have some sort of subset filtering in the client side. Any thoughts? |
A small related note - I think most of the rapid requests that are coming in for additional functionality, aggregation and related UI - would be better grouped in a separate argument. For example:
for Rich-User-Interface. Thereafter and into the future it can be included as part of standard views if it's common to most user expectations, or perhaps adaptively enabled based on the log-file and the scheme therein that matches RUI options. Regarding HTML - if you don't mind using jQuery & DataTables then for the specific purposes of sort / filter I'd recommend: Its a If however you do not wish to have such dependencies - then we have our work cut out :-D |
I am also waiting for this. It would be great to have it. |
@kyberorg you can help if you're that keen. |
@imclean557 I'm sure there's plenty of people willing to help if there was an active branch. Your comment is most unhelpful. |
+1 |
Just to add some usage context and a workaround on this. I have a cluster of web servers, and I run goaccess like this on my central syslog-ng machine:
goaccess /var/log/hosts/*/nginx/*.log \
--log-format='%^:%^:%^:%^: %v %h %^[%d:%t %^] "%r" %s %b %L "%R" "%u"' \
--date-format=%d/%b/%Y --time-format=%T --persist --restore \
--db-path /var/goaccess/db -o /var/goaccess/www/index.html \
-o /var/goaccess/www/report.json
This aggregates all requests of the cluster and produces one global report. The current issue would need to be implemented to be able to select which vhost to see in the main report. As a workaround, I create "per-vhost" logs like this:
VHOSTS="vods.kuon.ch www.kuon.ch"
for f in /var/log/hosts/*/nginx/*.log
do
  for vhost in $VHOSTS
  do
    # Create destination directory
    host=$(basename "$(dirname "$(dirname "$f")")")
    outdir=/var/goaccess/vhosts/$vhost/$host
    mkdir -p "$outdir"
    out=$outdir/$(basename "$f")
    # Filter logs
    # NOTE: $vhost will be matched as regex, you may need escaping
    rg "^\w+ \d+ \d+:\d+:\d+ \S+ \S+ \S+ access: $vhost " "$f" > "$out"
  done
done
# Remove empty logfiles
find /var/goaccess/vhosts -type f -size 0 -delete
# Remove empty dirs
find /var/goaccess/vhosts -type d -empty -delete
for vhost in $VHOSTS
do
  db=/var/goaccess/db_vhosts/$vhost
  mkdir -p "$db"
  out=/var/goaccess/www/$vhost
  mkdir -p "$out"
  goaccess /var/goaccess/vhosts/"$vhost"/*/*.log \
    --log-format='%^:%^:%^:%^: %v %h %^[%d:%t %^] "%r" %s %b %L "%R" "%u"' \
    --date-format=%d/%b/%Y --time-format=%T --persist --restore \
    --db-path "$db" -o "$out/index.html" \
    -o "$out/report.json"
done
It is a bit "quick & dirty" but it works for the time being. |
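The NOTE in the script above points out that $vhost is matched as a regex, so the dots in a name like www.kuon.ch match any character. A minimal sketch of escaping the name before it is interpolated into the rg pattern follows; escape_regex is a hypothetical helper introduced here for illustration, not part of goaccess or ripgrep.

```shell
# Hypothetical helper: backslash-escape regex metacharacters in a literal string.
# The bracket expression lists ] [ \ . * ^ $ + ? ( ) { } |
escape_regex() {
  printf '%s' "$1" | sed 's/[][\.*^$+?(){}|]/\\&/g'
}
```

In the loop above, the pattern would then end in `access: $(escape_regex "$vhost") ` instead of `access: $vhost `.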
To anyone still following this thread, I found it way easier to just use promtail and grafana, instead of reinventing the wheel with goaccess log parsing, storage, etc. |
I beg to differ. I had a setup with grafana and loki but it was very hard to get some particular insight. Sure, you can have one very nice panel with the stats, that you can look at, but it doesn't really tell you anything. With Also, I switched to syslog-ng and it is so much better and easier than all the new fancy solutions like promtail. Don't get me wrong, I get why all those solutions exist (having to route logs through the internet, better scalability...), but for our use at our size, plain log files are just easier. I don't think those tools are exclusive. You can use Finally, this kind of setup depends on many things: the number of servers, the criticality of the mission, the size of the team, the skills of the team... I can only advise trying what fits your situation best. |
My use case is described in #2599 (I persist the database on-disk, so there is currently no way to remove a visitor from the I understand that this issue is trying to be "generic" (i.e. being able to filter based on any field), and real-time (i.e. ability to set a filter from TUI, command-line, or HTML report) - but I feel the scope is too wide to actually be actionable/possible to implement (need to write different filter mechanisms for the TUI/CLI/HTML interfaces...). @allinurl I think it would be good to establish a list of what users actually expect to achieve with this feature. For me, a simple |
It's been a while since I started monitoring this, but I think my main requirement was also to exclude certain fixed IPs from my monitoring host from appearing on the list. |
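One way to approximate this for logs that have not been parsed yet is to drop the unwanted lines before goaccess reads them. The sketch below is only an illustration: exclude_ips is a hypothetical helper, and it assumes the client IP is the first whitespace-separated field of each line, which is not the case for every log format discussed in this thread. It does nothing for data already stored with --persist.

```shell
# Hypothetical helper: drop lines whose first field is one of the listed IPs.
# Assumes the client IP is the first whitespace-separated field of each line.
# $1 is a space-separated list of IPs; log lines are read from stdin.
exclude_ips() {
  awk -v ips="$1" '
    BEGIN { n = split(ips, a, " "); for (i = 1; i <= n; i++) skip[a[i]] = 1 }
    !($1 in skip)
  '
}
```

For example, `exclude_ips "10.0.0.5 10.0.0.6" < access.log` prints every line except those from the two listed addresses; the result can then be piped into goaccess reading from stdin.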
@nodiscc, good observation. As I previously explained, it's currently not practical to implement a direct "exclude-ip-from-report" functionality when retrieving data from the persisted store because, at that stage, the data has already been processed. To introduce this feature, we need to restructure how data is stored, making it a bit more complex than a straightforward filtering process. Although there are challenges, progress is being made and the feature will be out sooner rather than later. @Hufschmidt, you can achieve exclusion using |
I have been waiting for a date filter for a long time. |
@bear0330 hard at work on this feature! wait won't be in vain ;) |
I am also eagerly awaiting this enhancement. I wish I could filter the HTML report at least by date (range), to be able to show stats for a specific date (I use persistence and my reports include several days/months). |
Add the ability to filter the results within the UI (Terminal & HTML) - e.g. filter by fields such as host, request, etc. then display only data matching that filter criteria, or enter a regex to match in the request and restrict display to only those matching entries.
Ideally this would spin up a new thread so multiple datasets can be analyzed at the same time. Each dataset should live on its own dashboard.
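Until such a filter exists in the UI, a similar effect can be approximated by filtering the log before goaccess reads it. The following is a rough sketch, not a goaccess feature: filter_by_field is a hypothetical helper that keeps only the lines whose Nth whitespace-separated field matches a regex. The field numbers depend entirely on the log format; in the common combined format, field 7 is the request path and field 9 is the status code.

```shell
# Hypothetical helper: keep lines whose Nth whitespace-separated field matches a regex.
# $1 = field number, $2 = awk extended regex; log lines are read from stdin.
# Note: awk processes backslash escapes in -v values, so keep patterns simple.
filter_by_field() {
  awk -v n="$1" -v re="$2" '$n ~ re'
}
```

For example, `filter_by_field 9 '^404$' < access.log` keeps only the requests that returned 404, and the output can be piped to goaccess reading from stdin.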