
Use a temporary file for df output to avoid PIPE limits. #1294

Merged 1 commit on Jan 14, 2015

Conversation

conorbranagan
Contributor

subprocess.PIPE is limited in size because the data is buffered in
memory. When running the Agent on a host that has many containers,
the number of mountpoints can be very large. Because of this buffer limit,
the shell call can hang for these huge outputs.

To work around this, we write the output to a temporary file and
read it from there.

This is a "quick" fix for a specific case and there might be a better
general approach. Either way, we still need to handle the case of
large outputs from our subprocess commands; otherwise the whole Agent
collector simply hangs.

refs:

cc @remh @LeoCavaille
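
The approach described above can be sketched as follows. This is a simplified stand-in for the Agent's `_get_subprocess_output`, not the exact patch; the function name and return shape here are illustrative:

```python
import subprocess as sp
import sys
import tempfile

def get_subprocess_output(command):
    """Run command, writing stdout to a temp file instead of sp.PIPE.

    A pipe's kernel buffer is small (64 KB by default on Linux), so a
    child producing a very large listing -- like df on a host with many
    container mountpoints -- can block on write() and never exit. A
    regular file has no such size limit.
    """
    stdout_f = tempfile.TemporaryFile()
    proc = sp.Popen(command, stdout=stdout_f, stderr=sp.PIPE, close_fds=True)
    retcode = proc.wait()
    # The child wrote to the file; rewind and read it back.
    stdout_f.seek(0)
    output = stdout_f.read()
    stdout_f.close()
    return output, retcode

# A child that prints far more than a pipe buffer holds: no hang.
out, rc = get_subprocess_output([sys.executable, "-c", "print('x' * 1000000)"])
```

Note that stderr still goes through a pipe here, mirroring the original code; a command that floods stderr would hit the same buffer limit.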

@remh

remh commented Jan 14, 2015

Thanks @conorbranagan, nice debugging session!

I'm more in favor of rewriting the disk check to use psutil and stop using subprocess. The disk checks are pretty small and it should be pretty straightforward.

But I don't want to guarantee that it would be part of the 5.2 release, and would rather aim at doing this for 5.3.

If it's a critical fix we could merge it as is for 5.2 and rewrite the disk check for 5.3?

I'm open to more suggestions though.
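
A psutil-based disk check along the lines suggested here might look roughly like this. A sketch only, not the Agent's actual check: psutil is a third-party package (`pip install psutil`), and `collect_disk_usage` is an illustrative name:

```python
import psutil

def collect_disk_usage():
    # psutil reads mount and usage info directly from the OS, so there
    # is no `df` subprocess and therefore no pipe buffer to fill, no
    # matter how many mountpoints the host has.
    metrics = []
    for part in psutil.disk_partitions(all=False):
        try:
            usage = psutil.disk_usage(part.mountpoint)
        except OSError:
            continue  # e.g. permission denied on some mountpoints
        metrics.append((part.mountpoint, usage.total, usage.used, usage.free))
    return metrics

for mountpoint, total, used, free in collect_disk_usage():
    print(mountpoint, total, used, free)
```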

@conorbranagan
Contributor Author

OK, if we're going to go the psutil direction, then I'd prefer to get this small change into 5.2, since it's critical for a particular customer: it would get their Agents running again going forward, without them having to keep a patched version around.

What do you think?

@@ -906,12 +907,15 @@ def _get_subprocess_output(command, log):
     Run the given subprocess command and return it's output. Raise an Exception
     if an error occurs.
     """
-    proc = sp.Popen(command, stdout=sp.PIPE, close_fds=True, stderr=sp.PIPE)
+    stdout_f = tempfile.TemporaryFile('rw')
Nitpick but could you embed that in a "with" statement so won't have to wait for the garbage collection for the file to be deleted ?
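
That suggestion, sketched: wrapping the temporary file in a `with` statement closes (and so deletes) it as soon as the output has been read, rather than whenever the file object happens to be garbage-collected. The helper name here is illustrative:

```python
import subprocess as sp
import sys
import tempfile

def get_subprocess_output(command):
    # The with block closes the temp file as soon as the output has
    # been read, so its deletion doesn't wait on garbage collection.
    with tempfile.TemporaryFile() as stdout_f:
        proc = sp.Popen(command, stdout=stdout_f, stderr=sp.PIPE, close_fds=True)
        proc.wait()
        stdout_f.seek(0)
        return stdout_f.read()

out = get_subprocess_output([sys.executable, "-c", "print('ok')"])
```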

ref: http://thraxil.org/users/anders/posts/2008/03/13/Subprocess-Hanging-PIPE-is-your-enemy/
@conorbranagan conorbranagan force-pushed the conor/subprocess-pipe-limit branch from b2dbb87 to f298832 Compare January 14, 2015 20:15
remh pushed a commit that referenced this pull request Jan 14, 2015
Use a temporary file for df output to avoid PIPE limits.
@remh merged commit 012550c into master on Jan 14, 2015
@remh deleted the conor/subprocess-pipe-limit branch on March 23, 2015
@coveralls

Coverage Status

Changes Unknown when pulling f298832 on conor/subprocess-pipe-limit into master.


3 participants