
Default to reading cluster information from librados #42

Merged: 2 commits into cernceph:master on Sep 25, 2024

Conversation

davidturner-sie (Contributor)

Several times I've had issues with reading cluster information via subprocess.getoutput. These changes have worked for me locally: if librados doesn't connect, the script falls back to getting the output from the bash CLI. It may make sense to add a set of CLI options to dictate how to connect to the Ceph cluster rather than just assuming defaults.
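
For reference, a minimal sketch of that fallback approach, assuming the python-rados bindings (rados.Rados / mon_command); the function name and the 'osd dump' example are illustrative, not the PR's exact code:

import json
import subprocess

import rados


def get_json(prefix, conffile='/etc/ceph/ceph.conf'):
    """Fetch mon command output as JSON via librados, falling back to the CLI."""
    try:
        cluster = rados.Rados(conffile=conffile)
        cluster.connect()
        try:
            cmd = json.dumps({'prefix': prefix, 'format': 'json'})
            ret, outbuf, outs = cluster.mon_command(cmd, b'')
            if ret == 0:
                return json.loads(outbuf)
        finally:
            cluster.shutdown()
    except (rados.Error, ValueError):
        pass
    # Vanilla behavior: shell out to the ceph CLI.
    return json.loads(subprocess.getoutput('ceph ' + prefix + ' -f json'))


osd_dump = get_json('osd dump')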

except ValueError:
    eprint('Error loading remapped pgs')
    sys.exit(1)

# nautilus compat
@davidturner-sie (Contributor, Author)

Nautilus is 5 major versions ago, so there's no need to mention when the change to the JSON output happened. Also moved it inside the initial try block.

@davidturner-sie (Contributor, Author)

Assuming this gets merged, I'm also working on some modifications to add a --run option to natively run the upmap commands using librados. I would like to loop through twice internally: the first time doing the rm upmaps, then back through to create all of the proper upmaps. That way the script would only need to be run once. The default behavior of displaying all of the upmaps and sleeps would remain unchanged.
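
A hypothetical sketch of that two-pass --run flow (not part of this PR); the command dicts here stand in for the same mon commands the script currently prints as CLI lines:

import json


def run_upmaps(cluster, rm_cmds, upmap_cmds):
    # cluster is a connected rados.Rados handle; rm_cmds and upmap_cmds are
    # lists of mon-command dicts. Remove the existing upmaps first, then
    # create the new ones, so a single invocation covers both passes.
    for cmd in list(rm_cmds) + list(upmap_cmds):
        ret, _, outs = cluster.mon_command(json.dumps(cmd), b'')
        if ret != 0:
            eprint('mon_command failed: %s (%s)' % (cmd, outs))  # eprint: the script's stderr helper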

@drakonstein (Contributor)

@dvanders Are you still poking around this repo? Or did you leave a successor when you moved on?

@thmour (Collaborator) commented Sep 5, 2024

Hello @davidturner-sie, can you give us an example of the output from mon_command versus bash? What is the difference? I have the vanilla script running without issues on v15 and v18 with Ubuntu 18.04 and 22.04.

Also, sometimes the script needs to run 2-3 times to do the job properly; twice is not always enough.

@drakonstein (Contributor)

BTW, I'm @davidturner-sie; this is my personal account. I commented from this account because I've had previous interactions here and wanted to maintain that relationship.

I was running into the same issue as #41. The output was just the exceptions for getting the OSD information or upmap information. Over the last year it's been getting more and more buggy with those, but it would eventually output the proper commands after erroring out several times. Last week I left it in a loop to get the upmap commands and it failed continuously for 20 minutes:
while ! ./upmap-remapped > upmaps.sh; do sleep 2; done
After making this modification to use librados, it has worked every time. The output of the script when using mon_command is identical to the successful output of the vanilla script. My environment runs on Gentoo, which might be a factor in failing to get the output with the vanilla script; it is also odd that it worked sometimes and failed the rest of the time.

I didn't want to rip out the vanilla workflow in case the cluster connection doesn't work using librados. I also know that many companies use this script in their processes, so any future changes shouldn't modify the default behavior of outputting the commands to be run.

Regarding the future idea of running one loop of rm's and then one loop creating all of the upmaps: that approach has gotten me to an ideal state in all of my clusters for many years now. Before doing the rm's and then creating, I would run the script twice on replica clusters, or many times on erasure-coded clusters, then eventually run the rm's, run it one last time, and it would be as close as it gets. The problem with erasure coding is that it might move an OSD from one shard to another, but this implementation of upmap doesn't work with that. That's more theorizing about a future implementation and not addressed in this PR; I just wanted to start a conversation around it.

@thmour (Collaborator) commented Sep 5, 2024

I was asking if you could catch an instance where the jq-based command fails and show us the exact output, to identify which part of the output changed, so we can find the source of the problem. Running the ceph command uses rados to connect to the cluster anyway. If the mon_command output is enough and we can ditch the jq dependency, that is good. I would like you to rewrite the script to use just the mon command and read the default config or a user-provided one; this is what those ceph commands do anyway in the subprocess. That is, if @dvanders agrees as well.

@drakonstein (Contributor)

Gotcha. I can't currently get that cluster to show the same behavior. The cluster had 20k+ remapped PGs, and I was thinking it might be a buffer issue. I've only ever had the issues in prod and never in our preprod deployments, which also leads me to think it's something related to the size of the output. That said, as soon as I switched to librados rather than the bash CLI, I haven't had a single problem gathering the cluster information in the script.

I like the idea of setting it up to just use the rados library. I figure we'll need to allow for a config file, keyring, and user id to roughly match the options of the ceph command. It should also read the corresponding environment variables, if they exist, for anything not specified.
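
A rough sketch of those connection options, assuming flag names that mirror the ceph CLI and the CEPH_CONF / CEPH_KEYRING environment variables (the flag and variable choices are assumptions, not settled API):

import argparse
import os

import rados

parser = argparse.ArgumentParser()
parser.add_argument('--conf', default=os.environ.get('CEPH_CONF', '/etc/ceph/ceph.conf'),
                    help='path to ceph.conf')
parser.add_argument('--keyring', default=os.environ.get('CEPH_KEYRING'),
                    help='path to a keyring file')
parser.add_argument('--id', dest='client_id', default='admin',
                    help='client id, as with ceph --id')
args = parser.parse_args()

# Pass anything beyond the conffile as explicit config options.
conf = {'keyring': args.keyring} if args.keyring else {}
cluster = rados.Rados(conffile=args.conf, name='client.' + args.client_id, conf=conf)
cluster.connect()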

Dan mentioned at Ceph Days NYC 2024 that he has an idea to build this into the balancer. I don't know how that impacts any plans for potential work here.

@dvanders (Collaborator)

@drakonstein thanks for this. We see those json parsing problems all the time, so this will be valuable.
@thmour the main issue is that the ceph json output often has unparseable words such as Inf or DBL_MAX or things like that.
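
For illustration (not taken from the script itself): a bare token like DBL_MAX makes the output invalid JSON, so strict parsers such as jq or json.loads reject it:

import json

try:
    json.loads('{"weight": DBL_MAX}')  # not valid JSON
except json.JSONDecodeError as err:
    print('parse error:', err)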

@dvanders merged commit c140921 into cernceph:master on Sep 25, 2024