Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F/unset value #1108

Merged
merged 53 commits into from
Jul 16, 2016
Merged

F/unset value #1108

merged 53 commits into from
Jul 16, 2016

Conversation

bazsi
Copy link
Collaborator

@bazsi bazsi commented Jul 4, 2016

This branch implements the unset() rewrite operation and at the same time incorporates a number of cleanups on the serialization code.

I also have another 20 patches to further the cleanup and to introduce a number of performance improvements of the same, but I've figured it is easier to break them in to to facilitate review.

@juhaszviktor as the original author of the serialization code, could you pls help me in reviewing this?

Attn: @csabamajor

@bazsi
Copy link
Collaborator Author

bazsi commented Jul 4, 2016

The mentioned followup patches can be found here:

https://github.com/balabit/syslog-ng/tree/f/logmsg-serialize-performance

I am submitting a pull request once this goes in.

@juhaszviktor
Copy link
Contributor

In short, LGTM 👍
I really liked reading patches and I think the result is very good, Thanks!
I've just some (sub)minor notes, but in generally it get my +1.

We have to investigate why the travis job fails, but at first look, it is independent from your modifications

return FALSE;
/* entry as a whole is inside the allocated NVTable */
if (GPOINTER_TO_UINT(entry) + entry->alloc_len > GPOINTER_TO_UINT(nvtable) + nvtable->size)
return FALSE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these checks, validations should be in separated static inline functions, with good names and in that case we have not need for the comments.
But this is a very minor note

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with this one, however these checks are all interdependent, so I felt this one is easier to follow than to extract conditions in separate functions.

We basically check if the pointer points below or after the designated area.

@lbudai
Copy link
Collaborator

lbudai commented Jul 6, 2016

On tomorrow I'll will also review it.
@bazsi: will you need some help from the team to make Travis happy?

@bkil
Copy link

bkil commented Jul 9, 2016

@bazsi
grep 'error' in copyright log:

error: mismatch lib/logmsg/tests/test_nvtable.c expected:LGPLv2.1+_SSL detected:GPLv2+_SSL
error: mismatch lib/logmsg/tests/test_tags.c expected:LGPLv2.1+_SSL detected:GPLv2+_SSL
error: mismatch lib/value-pairs/tests/test_value_pairs.c expected:LGPLv2.1+_SSL detected:GPLv2+_SSL
error: mismatch lib/value-pairs/tests/test_value_pairs_walk.c expected:LGPLv2.1+_SSL detected:GPLv2+_SSL

I opted to increase the verbosity variable recently, but if this is confusing, we can reduce it again so it only grep's errors.

grep -i ' error ' in distcheck:

../lib/logmsg/nvtable.c: In function ‘nv_table_get_entry_slow’:
../lib/logmsg/nvtable.c:215:17: error: declaration of ‘index’ shadows a global declaration [-Werror=shadow]
../lib/logmsg/nvtable.c: In function ‘nv_table_reserve_table_entry’:
../lib/logmsg/nvtable.c:259:21: error: declaration of ‘index’ shadows a global declaration [-Werror=shadow]
../lib/logmsg/nvtable.c: In function ‘nv_table_foreach_entry’:
../lib/logmsg/nvtable.c:588:17: error: declaration of ‘index’ shadows a global declaration [-Werror=shadow]

In file included from ../lib/logmsg/nvtable-serialize.c:28:0:
../lib/logmsg/nvtable-serialize-endianutils.h: In function ‘nv_table_data_swap_bytes’:
../lib/logmsg/nvtable-serialize-endianutils.h:72:17: error: declaration of ‘index’ shadows a global declaration [-Werror=shadow]
../lib/logmsg/nvtable-serialize-endianutils.h: In function ‘nv_table_struct_swap_bytes’:
../lib/logmsg/nvtable-serialize-endianutils.h:99:17: error: declaration of ‘index’ shadows a global declaration [-Werror=shadow]
../lib/logmsg/nvtable-serialize.c: In function ‘_deserialize_dynamic_entries’:
../lib/logmsg/nvtable-serialize.c:311:17: error: declaration of ‘index’ shadows a global declaration [-Werror=shadow]
../lib/logmsg/nvtable-serialize.c: In function ‘_write_struct’:
../lib/logmsg/nvtable-serialize.c:474:17: error: declaration of ‘index’ shadows a global declaration [-Werror=shadow]

grep ' error' in cmake:

Building C object modules/native/CMakeFiles/syslog-ng-native-connector.dir/native-parser.c.o
../lib/libsyslog-ng.so.3.8.alpha0�[32mBuilding C object modules/native/CMakeFiles/syslog-ng-native-connector.dir/parser.c.o
: undefined reference to `log_rewrite_unset_new'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [syslog-ng-ctl/syslog-ng-ctl] Error 1
make[1]: *** [syslog-ng-ctl/CMakeFiles/syslog-ng-ctl.dir/all] Error 2

../lib/libsyslog-ng.so.3.8.alpha0: undefined reference to `log_rewrite_unset_new'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

We should change make -j install to make -j install || make install for cmake to make errors more readible.

The make distcheck log is not clear because the return codes got mixed up, that's why it does not retry without make -j. Basically we need to either put the make V=1; inside the $S=42 branch above, or switch the return code generation branches inside exec_prop_check(). Actually, I originally intended it to behave like the latter solution, i.e., code 42 = propagation error, code not in [0,42] = build failure return code.

I can submit PRs next week if nobody else volunteers until then.

@bkil
Copy link

bkil commented Jul 9, 2016

@bkil-syslogng We need to || cat test-suite.log @ distcheck similar to bkil-syslogng#23

@ihrwein
Copy link
Contributor

ihrwein commented Jul 9, 2016

The CMake related error is fixed by this commit: 15d5e2a

@bkil-syslogng
Copy link
Contributor

Clarification of build log errors in Travis added here: #1116

@lbudai
Copy link
Collaborator

lbudai commented Jul 12, 2016

@bkil-syslogng : that patch should not be part of a MacOSX support branch
Would it be possible to send a separate PR?
(and a note to the travis.yaml: it started to have a smell... )

@bkil-syslogng
Copy link
Contributor

@lbudai It was also mentioned by @bazsi last week that we should start to save Travis scripts to separate scripts. I agree if that's what you mean. I think we have too many unprocessed pull requests already, but if you insist I could create more.

@bazsi
Copy link
Collaborator Author

bazsi commented Jul 15, 2016

I have now pushed an update to this branch to fix the warnings in travis. For some reason I don't get these shadowing warnings even with -Wshadow enabled. I am not sure why though, maybe a change in gcc versions?

gcc (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010

... hmm.. it seems that gcc 4.8 changed the meaning of -Wshadow:

The option -Wshadow no longer warns if a declaration shadows a function declaration, unless the former declares a function or pointer to function, because this is a common and valid case in real-world code.

This seems to be a lot more practical. Travis is stuck on an earlier gcc?

@bkil-syslogng
Copy link
Contributor

@bazsi I noticed this earlier as well. Updated gcc variants are available even for container Travis, so I propose that we should eventually upgrade.

I originally planned to add new matrix elements, so that we compile on both earlier and newer compilers, but if you think that we can throw out older compilers to ease Travis' resource usage, I'm all in for it.

I've separated the clarification of the build log from Mac OS X support, so we can leave that one floating for some more months:
#1122

@bazsi
Copy link
Collaborator Author

bazsi commented Jul 15, 2016

for warning checks yeah. we still need to support old versions to compile
though.

Bazsi

On Fri, Jul 15, 2016 at 11:13 AM, Tamas Nagy notifications@github.com
wrote:

@bazsi https://github.com/bazsi I noticed this earlier as well. Updated
gcc variants are available even for container Travis, so I propose that we
should eventually upgrade.

I originally planned to add new matrix elements, so that we compile on
both earlier and newer compilers, but if you think that we can throw out
older compilers to ease Travis' resource usage, I'm all in for it.

I've separated the clarification of the build log from Mac OS X support,
so we can leave that one floating for some more months:
#1122 #1122


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1108 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AArldpcPABJgTQCiB7sxdc0DoPilaX1xks5qV08zgaJpZM4JEL3e
.

bazsi added 8 commits July 15, 2016 18:32
This function allows the user to go back to the beginning of the
archive, by resetting pos to 0. It is going to be used by a test program.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
There's a handy stopwatch function in testutils that displays the amount
of time a code needed to execute.

This patch adds a parameter "iterations" to tell the function the
number of loops we were going through, so that it can report the
iterations per second in addition to the runtime.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
This test program is relevent to logmsg only, so move from the "global"
unit test directory to the logmsg one.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
The sdata-serialize module was not used in any way, rather some functions
perform this functionality in logmsg-serialize. Since sdata is only
an array of guint32 values, there does not seem to be a need to create
a separate module for this.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Instead of relying on local includes in the same directory, include
stuff using full path to make the context more obvious.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
bazsi and others added 25 commits July 15, 2016 18:32
Instead of poking into the dynamic value table of the NVTable during the
update operation, do that in a separate memory area.  This makes it possible
to do handle updates incrementally, using the same loop.

Earlier, we had to do two stages during update: first we updated referenced
handles and SDATA entries and then with another loop the dynamic entry
table. The reason was that this update operation broke the sorted nature
of the dynamic entry table, which also broke handle based lookup.

This change creates a separate NVDynValue array where we put our updates,
then we sort and move it in if everything goes as planned.

Also makes the code easier to follow and allows a number of optimizations
in the future.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
This also changes the for loop to be a memcpy().

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
The only role this function has now is to update the SDATA array, simplify
parameters and rename to better reflect its purpose.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Instead of allocating arrays on the heap, use local variables for
performance reasons.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
…tter

An NVDynValue is confusing in a number of ways:
  * NVTable is storing a list of name-value pairs, so NVDynValue is what,
    especially as NVEntry is the struct that describes a specific
    name-value pair
  * the concept of dynamic vs. static entries is not always clear as we
    are talking about an unsorted list of name-value pairs
  * we sometimes use dyn_value in other cases dyn_entries when referring
    to the same thing.

So this patch attempts to clarify naming as a first step by renaming
NVDynValue to NVIndexEntry, as it is the struct that is one element in
the lookup table that is used to look up name-value pairs by handle.

So in essence, the dynamic-entries array is an index that facilitates
handle based lookup.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
This patch makes that function a bit easier to read.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
…in_indirect_entry()

The other calls in _fixup_entry() all have their predicates outside,
and moving them inside would make the code more difficult to read,
so unify calls within fixup_entry() so that all predicates get called
outside.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
…ry()

This patch splits _fixup_handle_in_index_entry() to two parts:
  - one that fixes up the index entry
  - one that fixups the entry in the SDATA array

The latter becomes a separate call from _fixup_entry(), so that the steps
needed to fix up a deserialized log message becomes clearer.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
In order to fix up the index in the NVTable instance, we are shadowing the
"index" (used to be called dynamic values) and update the values there.

Earlier this code simply assumed that we get the entries the order of the
index (e.g. sorted by handle value), and simply used a counter to allocate the
new index entries.

Instead of that, this patch converts this solution to just calculate the
index slot, which makes the cur_index_entry superficial.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
In order to make it possible to call this from serialization code.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Instead of allocating the base struct and then realloc()ing it later, read
the size first into a temp variable and do the large allocation in one go
for performance reasons.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
This patch adds validation to the size, index_size and used members and
prevents syslog-ng from reading a too-large NVTable by limiting its size to
NV_TABLE_MAX_SIZE.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
This patch implements an additional validation on the NVEntry instances
within NVTable.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Instead of poking into the LogMessage->payload field in various places,
use a specific field for this purpose.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Unset values are either those that have not yet been set in the message or
those that have explicitly been unset.

Previously, unset values simply returned the empty string, making it impossible
to find out if a name-value was explicitly set to the empty string.

This patch introduces a two functions at the NVTable level:

  - nv_table_unset()
  - nv_table_get_value_if_set()

nv_table_get() will continue to return an empty string for unset values.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
…nset" bit

Earlier syslog-ng versions didn't have a support for the unset operation,
thus they might have set that bit to garbage. This patch changes
the deserialization code, so that based on the NVTable header, it
clears the unset bit (and the rest of the NVEntry bits as well) in case
the current NVTable is read from an earlier version.

Although this is a change in the on-disk format, this is NOT a version bump,
for the following reasons:

1) old syslog-ng writes, new one reads:

   The NVTable header bit NVT_SUPPORTS_UNSET is going to be zero, we clear
   the unset bit on all NVEntry instances. All NVEntries retain their
   value.

2) new syslog-ng writes, old one reads:

   The NVTable heeader bit NVT_SUPPORTS_UNSET is non-zero, which the
   old syslog-ng ignores. The unset bit on NVEntry instances also
   remains set, but that is also ignored by the old syslog-ng instance.

   Whenever the new syslog-ng actually unset()s a value, that'll at the
   same time set that value to the empty string, so unset() values
   will not trickle into old syslog-ng versions either.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
This patch adds a new rewrite operation to unset a specific name-value pair.

Syntax:
  rewrite { unset(value('foobar')); };

It supports condition() option as well.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
The global symbol index() is defined on some of our platforms which
causes -Wshadow to trigger a warning. This patch renames
those local variables to index_table instead.

Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Tibor Benke <ihrwein@gmail.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
Signed-off-by: Balazs Scheidler <balazs.scheidler@balabit.com>
@bazsi
Copy link
Collaborator Author

bazsi commented Jul 15, 2016

I think this patchset is now set. I have now rebased it to master, and once it is green by travis, I am going ahead and merge it, based on the 👍 by @juhaszviktor above.

Let me know if you have objections.

@bazsi bazsi merged commit 9e4ea5f into master Jul 16, 2016
@ihrwein ihrwein deleted the f/unset-value branch July 19, 2016 06:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants