Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

firewalld modules reloads firewalld excessively #61

Closed
csschwe opened this issue Jul 12, 2016 · 6 comments
Closed

firewalld modules reloads firewalld excessively #61

csschwe opened this issue Jul 12, 2016 · 6 comments
Labels
enhancement New feature or request
Milestone

Comments

@csschwe
Copy link
Contributor

csschwe commented Jul 12, 2016

After some in depth testing I have found that if firewalld is restarted over and over again in a short period of time weird things start happening. The easiest to see issue is that DNS lookups start failing.

eg.. run 'firewall-cmd --reload' in a loop and also run 'puppet agent -t' in a loop.

while true;do firewall-cmd --reload; done
while true; do puppet agent -t; sleep 1; done

You will get errors like
Error: Could not send report: getaddrinfo: Name or service not known

I think the best solution will be to update all of the code doing reloads like custom_service and firewalld_service to common reloads reduce the number of restarts.

    exec { 'firewalld::custom_service::reload':
      path        =>'/usr/bin:/bin',
      command     => 'firewall-cmd --reload',
      refreshonly => true,
    }

    exec { 'firewalld::reload':
      path        =>'/usr/bin:/bin',
      command     => 'firewall-cmd --reload',
      refreshonly => true,
      require     => Exec['firewalld::custom_service::reload'],
    }

    Exec['firewalld::custom_service::reload'] -> Firewalld_service <| |>
    Service['firewalld'] -> Firewalld_zone <| |> ~> Exec['firewalld::reload']
    Service['firewalld'] -> Firewalld_rich_rule <| |> ~> Exec['firewalld::reload']
    Service['firewalld'] -> Firewalld_service <| |> ~> Exec['firewalld::reload']
    Service['firewalld'] -> Firewalld_port <| |> ~> Exec['firewalld::reload']
@crayfishx
Copy link
Contributor

I don't think this is to do with the number of times it's restarted per-se, but what your test does is make firewall-cmd --reload commands happen at the time that Puppet is trying to send the report to the master. Running the first while loop in the background will probably cause a lot of failures, not just puppet. Reducing the number of reloads per run may reduce the risk, but maybe a better aproach would be a sleep after the reload to give firewalld a chance to finish reloading before puppet tries to move on?

Are you seeing problems just on a normal puppet run or do you have to go the extreme (two while loops) in order to simulate the problem?

@csschwe
Copy link
Contributor Author

csschwe commented Jul 13, 2016

This problem occurs during normal puppet runs if the firewall is being modified. The actual puppet errors are not usually an issue for my catalogs because puppet will just run again resources will apply once DNS lookups are working.

If you do a sleep 5 between the firewalld reloads the same thing will still happen, it just doesn't happen nearly as often. The strange thing is that sometimes it takes 20 - 30 seconds before DNS lookups start working agin.

The issue I am having right now is all of the custom services are created and then the services get added to the zone ~4 - 5 at the same time which will trigger 8 - 10 firewalld reloads within a few seconds just like in my example. Adding the sleep didn't actually seem to help much because the problem can still impact background processes (which is a bigger issue) even though most of the actual puppet errors went away.

The biggest issue seen so far is that if tcserver happens to be starting at this same time and some of the applications it is starting requires one or more DNS lookups. This causes the application to not start correctly even though tcsever believes it is running correctly. Currently the only way I see to resolve this 100% of the time would be to add some puppet code to make sure tcserver starts after firewalld and possibly validate DNS lookups are working first.

Reducing the number of restarts to the minimum required does help significantly reduce this occurrence.

I am going to also open a ticket to one of the Linux vendors to determine if there is an actual issues with firewalld.

@csschwe
Copy link
Contributor Author

csschwe commented Jul 19, 2016

Here is the response I received. I believe the module needs to reduce the number of times that --reload is called to a minimum based on this information.

A firewall reload involves reloading all configuration files and recreating the whole firewall configuration. While reloading, the policy for built-in chains is set to DROP for security reasons and is then reset to ACCEPT at the end. Service disruption is therefore possible during the reload.

There is an enhancement coming (targeted for RHEL 7.3) to shorten the window of time where the policy is set to DROP. This should help, but won’t eliminate the window completely.

t-woerner/firewalld@c91b2fa

@crayfishx
Copy link
Contributor

The firewall should not be being reloaded more than once in a run unless you are managing a lot of services - can you post some --debug output showing when and how many times --reload is being called?

@csschwe
Copy link
Contributor Author

csschwe commented Aug 14, 2016

I can work on the debug output. But here is the sample code (leaving out unnecessary stuff) for the example:
Basically everytime we add a custom service there are 2 reloads. In my testing with firewalld every single reload has a possible impact.

1) custom_service { 'monitoring_app' }  -> exec{ "firewalld::custom_service::reload-monitoring_app":}
2) firewalld_service { 'allow monitoring_app'} -> Function (lib/puppet/provider/firewalld.rb) reloads firewalld
3) custom_service { 'primary_app' }  -> exec{ "firewalld::custom_service::reload-primary_app":}
4) firewalld_service { 'allow primary_app' } -> Function (lib/puppet/provider/firewalld.rb) reloads firewalld
5) custom_service { 'managment_app' }  -> exec{ "firewalld::custom_service::reload-managment_app":} -> Function (lib/puppet/provider/firewalld.rb) reloads firewalld
6) firewalld_service { 'allow managment_app' } -> Function (lib/puppet/provider/firewalld.rb) reloads firewalld

My current work around it to run puppet using tags that apply the firewall configuration before doing any other work.

@crayfishx
Copy link
Contributor

I think the best solution will be to update all of the code doing reloads like custom_service and firewalld_service to common reloads reduce the number of restarts.

Custom services, followed by any other kind of resource will cause two reloads - I believe this was by design since if you add a custom service then try and use it later on in the same puppet run, it will fail because the custom service doesn't exist until after a reload (the same with firewalld_service resources. Apart from service and custom_service everything should notify the Exec['firewalld::reload'] resource - it may get notified many times but it will only execute once.

What we can probably do quite easily is move the custom_service::reload exec to the base class, so there is only one of them, and execute it from there. That should improve things - I'll throw together a PR later for you to test...

@crayfishx crayfishx added enhancement New feature or request accepted labels Aug 16, 2016
@crayfishx crayfishx modified the milestones: 3.1.1, 3.1.2, 3.1.3 Aug 16, 2016
@csschwe csschwe closed this as completed Jul 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants