Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for infinite DHCP leases #1176

Closed
wants to merge 1 commit into from

Conversation

cybertron
Copy link
Contributor

The proper way to do this would be to set the lease length in the
libvirt network config, but unfortunately the version of libvirt we
have in CentOS 8 right now doesn't support that. This adds a script
that manually modifies the libvirt dnsmasq configuration to make the
leases infinite. It is started in the background, and killed when
make clean is run. It can be enabled by setting INFINITE_LEASES=1.

02_configure_host.sh Outdated Show resolved Hide resolved
@cybertron
Copy link
Contributor Author

/retest

The error message includes approximately all the errors, so let's see if we can roll the dice and get a more useful one this time... :-P

@hardys
Copy link

hardys commented Jan 15, 2021

/approve

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hardys

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 15, 2021
Copy link
Collaborator

@derekhiggins derekhiggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some notes inline, also wondering if you can mention briefly why its needed.

# do it properly.
if [ ${INFINITE_LEASES:-0} -eq 1 ]
then
./infinite-leases.sh &
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder should we redirect io to somewhere, I think its holding ./02_configure_host.sh from finishing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using this locally and it hasn't caused any problems. Are you seeing something different?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I was mistaken when I mentioned 02_configure_host.sh , what I'm seeing is that after the deployment infinite-leases.sh is still running. If for some reason the loop to edit the leases file is executed then the terminal where I had run the make command from gets the output
"Added infinite leases to /var/lib/libvirt/dnsmasq/ostestbm.hostsfile"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I guess I've never had the script trigger after make was complete. I could probably remove the log message entirely. It was mostly there to let me know the script was working, but it isn't particularly necessary.


pid=$(ps aux | grep dnsmasq | grep "$NETWORK_NAME" | grep -v root | awk '{print $2}')
sudo kill -s SIGHUP $pid
echo "Added infinite leases to $FILENAME"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is triggered after make completes, as is it is output to where ever make was run from

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really designed to be run after deployment. The problem the script was designed to solve is the fact that the hosts file gets rewritten multiple times during the deployment, so if you manually added the leases they might get removed again. After deployment that's no longer a problem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So would it be ok to just kill the script at the end of the deployment?

do
if grep -q -v infinite $FILENAME
then
sudo perl -pi -e 's/(.*?)(,infinite|)$/\1,infinite/' $FILENAME
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding this to the hostsfile means that we cant restart the libvirt network (I guess the lease-init script can't parse it), maybe we don't care but I thought it was worth pointing out.

[derekh@u07 dev-scripts]$ sudo virsh net-start ostestbm
error: Failed to start network ostestbm
error: internal error: Child process (VIR_BRIDGE_NAME=ostestbm /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/ostestbm.conf --leasefile-ro --dhcp-script=/usr/libexec/libvirt_leaseshelper) unexpected exit status 11: 
dnsmasq: lease-init script returned exit code 1
[derekh@u07 dev-scripts]$ kill: not enough arguments
Added infinite leases to /var/lib/libvirt/dnsmasq/ostestbm.hostsfile

Also the scripts that generates the clouds.yaml no longer works, as it can't get the host IP addresses with virsh

[derekh@u07 dev-scripts]$ virsh net-dhcp-leases ostestbm
error: failed to get network 'ostestbm'
error: Network not found: no network with matching name 'ostestbm'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. There was a bug opened for it: https://bugzilla.redhat.com/show_bug.cgi?id=1908053

It was just a problem with libvirt's handling of infinite leases in general. Even setting them properly through the libvirt xml was failing. I guess this will be fixed when we get to a newer version of libvirt, at which point we should be able to drop this hack.

It is a bit of an issue though since one of the reasons I wanted to get this in dev-scripts was so we could use it for ci, which won't work if the post-deploy steps are failing. Still useful for development, but if we do get a ci job configured before getting a fixed libvirt I'll have to find some workaround.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if its not on by default then its not a big problem, and it'll have to be sorted somehow before getting switched on.


# Hack to force use of infinite leases in versions of libvirt that don't
# support setting the lease time in the network config.
# This should be removed when we have a new enough version of libvirt to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add what that version number would be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll see if I can find it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done.

Copy link
Contributor Author

@cybertron cybertron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to support a feature where we convert infinite DHCP leases to static configuration on the node. It's sort of a workaround for the fact that OpenShift doesn't have good support for static node networking configuration. The feature has already merged and this script is what I used while I was working on it.


# Hack to force use of infinite leases in versions of libvirt that don't
# support setting the lease time in the network config.
# This should be removed when we have a new enough version of libvirt to
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll see if I can find it.

# do it properly.
if [ ${INFINITE_LEASES:-0} -eq 1 ]
then
./infinite-leases.sh &
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been using this locally and it hasn't caused any problems. Are you seeing something different?

do
if grep -q -v infinite $FILENAME
then
sudo perl -pi -e 's/(.*?)(,infinite|)$/\1,infinite/' $FILENAME
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. There was a bug opened for it: https://bugzilla.redhat.com/show_bug.cgi?id=1908053

It was just a problem with libvirt's handling of infinite leases in general. Even setting them properly through the libvirt xml was failing. I guess this will be fixed when we get to a newer version of libvirt, at which point we should be able to drop this hack.

It is a bit of an issue though since one of the reasons I wanted to get this in dev-scripts was so we could use it for ci, which won't work if the post-deploy steps are failing. Still useful for development, but if we do get a ci job configured before getting a fixed libvirt I'll have to find some workaround.


pid=$(ps aux | grep dnsmasq | grep "$NETWORK_NAME" | grep -v root | awk '{print $2}')
sudo kill -s SIGHUP $pid
echo "Added infinite leases to $FILENAME"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really designed to be run after deployment. The problem the script was designed to solve is the fact that the hosts file gets rewritten multiple times during the deployment, so if you manually added the leases they might get removed again. After deployment that's no longer a problem.

The proper way to do this would be to set the lease length in the
libvirt network config, but unfortunately the version of libvirt we
have in CentOS 8 right now doesn't support that. This adds a script
that manually modifies the libvirt dnsmasq configuration to make the
leases infinite. It is started in the background, and killed when
make clean is run. It can be enabled by setting INFINITE_LEASES=1.
Currently, due to a bug with libvirt's handling of infinite leases,
this will cause make to fail. The cluster will deploy fine, but
some of the post-deploy steps don't work. This is "normal" and
should be fixed in future versions of libvirt.

Background: This is to support testing and dev of a feature that
converts infinite DHCP leases to static configuration so a node no
longer has a dependency on the DHCP server.
@cybertron
Copy link
Contributor Author

It looks like we do have new enough libvirt on centos 8 now if we use the advanced-virtualization repo. That's probably better than this method. New PR pushed at #1191

@cybertron cybertron closed this Feb 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants