Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LMDB - fix integration, restoring ability of use lmdb with nginx-modsecurity #2688

Merged
merged 6 commits into from
Apr 29, 2022

Conversation

ziollek
Copy link
Contributor

@ziollek ziollek commented Mar 6, 2022

Hi, I would like expand works started in issue: #2601. It turns out that these fixes cause significant problems with the ModSecurity-nginx stack.
Long story short - You cannot use readonly transactions in a child process if the lmdb environment (MDB_env) has been opened in a process which is forked afterwards (Use an MDB_env* in the process which opened it, without fork()ing ).

I prepared changes which improve that behavior by delaying opening the environment until the first use of any collection. Additionally, there should be only one open environment per process - that's why I introduced dedicated singleton wrapper responsible for creating the lmdb environment and providing it to each lmdb collection.

@ziollek ziollek mentioned this pull request Mar 6, 2022
@martinhsv
Copy link
Contributor

Thanks, @ziollek , for the submission (and others who have reviewed it).

I'll add this to v3.0.7 project.

@ziollek
Copy link
Contributor Author

ziollek commented Mar 19, 2022

Thanks @martinhsv, I saw that static-check fails because of some warning for modsecurity.cc.
It turns out that this warning was suppressed earlier but for a specific line number.
I updated the line number for this suppression in the last commit so rerun ci pls.


MDBEnvProvider::MDBEnvProvider() :
m_env(NULL), initialized(false) {
pthread_mutex_init(&m_lock, NULL);
Copy link
Contributor

@martinhsv martinhsv Apr 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering if the call to pthread_mutex_init might be better inside the if-block at line 538 in init().

As it is, it looks like it might at least be theoretically possible to have a thread-safety issue here where pthread_mutex_init could get called on an already-initialized m_lock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, m_lock mutex is responsible for controlling access to flag: initialized and preventing calling mdb_env_create multiple times. If we moved it into if-block it wouldn't fulfill its aim. It's worth noticing - that the init() function is designed to be called from multiple threads simultaneously.

To my mind, it's not possible to call multiple pthread_mutex_init on the same structure via current implementation (but of course, I'm not an expert ;) ). Even if we create a few new instances directly through the constructor: new MDBEnvProvider(), there will be an independent m_lock structure in each instance, am I right?

Copy link
Contributor

@martinhsv martinhsv Apr 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. My earlier comments weren't fully thought through.

However, I do still wonder about a sequence of events like this:
Thread 1:

  • calls GetInstance and runs to line 508
  • line 508 is executed, the boolean expression is evaluated to true
  • execution advances to the beginning of 509.

Thread 2:

  • calls GetInstance and runs to line 508
  • line 508 is executed, the boolean expression is again evaluated to true
  • execution advances all the way to the end of GetInstance(); an MDBEnvProvider has been constructed -- let's call it MDBEnvProvider object A
  • MDBEnvProvider object A is returned to the caller of GetInstance

Thread 1:

  • completes executing GetInstance; a different new MDBEnvProvider is constructed ('MDBEnvProvider object B')
  • MDBEnvProvider object B is returned to the caller of GetInstance

And we have two objects instead of 1. (And now both of these separate MDBEnvProvider objects can have init() called. That would include having mdb_dbi_open called both times -- and this possibility is part of what prompted the original #2601 fix.)

It's been a while since I've done anything with Singletons, so I've had to do a bit of review.

Your mechanism (static member pointer variable, check for NULL, then create the object via new) is the Gang-of-Four mechanism, but it seems not fully thread-safe.

Since the code base is firmly in C++11, might a Meyer's Singleton be a better implementation?

(Apologies if I've mucked something up, as I said, it's been a while since I've used a this Design Pattern.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your detailed clarification, I agree entirely. I've made a silent assumption that GetInstance() is called only by the main thread. I've been biased by focusing on nginx-connector use case - where this assumption is fulfilled indeed.

I'll replace this implementation with one that is thread-safe, as You suggested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinhsv I reimplemented the singleton pattern to a thread-safe version (Meyer's Singleton). Thanks a lot for all suggestions.

Copy link
Contributor

@martinhsv martinhsv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could probably get merged as it stands, but let me know what you think about my two obsevations -- especially the one at line 522. Thx.

void MDBEnvProvider::init() {
pthread_mutex_lock(&m_lock);
if (!initialized) {
MDB_txn *txn;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have guaranteed that only one MDBEnvProvider can be constructed in a given process, is it perhaps worth considering moving the contents of this initialization block to the MDBEnvProvider constructor?

There's nothing unsafe about the structure right now, but making that change (I think) would allow us to omit using m_lock every time GetEnv and GetDBI are called. (Indeed we could get rid of the init() function entirely then. And, I think, get rid of the m_lock member variable entirely.)

(I'm mostly interested in the compute cycles saved by omitting this locking/unlocking on each of those calls if it's not truly necessary. It's not a huge savings, but still.)

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree. It could be significantly simplified and any locking is not necessary.

pthread_mutex_unlock(&m_lock);
}

void MDBEnvProvider::close() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This close function isn't currently called anywhere.

I'm thinking we'd want an MDBEnvProvider destructor to call this?

(I suppose in practice it doesn't matter that much since the object won't be destructed until the process shuts down.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, now it should be moved to destructor. I will fix that

@ziollek
Copy link
Contributor Author

ziollek commented Apr 29, 2022

@martinhsv I've adjusted the code to optimizations that You suggested.

@martinhsv
Copy link
Contributor

Looks, I think. I'll merge it today. Thanks for the work on this.

@martinhsv martinhsv merged commit 83c302e into owasp-modsecurity:v3/master Apr 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants