Apply command does not report failure on large terraform projects #93

Open
benweibel opened this issue Dec 30, 2020 · 1 comment

@benweibel

What I am seeing is

  • I call your tool's apply function
  • Terraform begins applying
  • Terraform fails -- Note that I am able to see the terraform failure in my python logs
  • Logs stop flowing. Your python code does not return

Here are two guesses of mine based on the underlying tooling you are using in this area of your code:
1. No timeout is set for Popen.communicate
On the line linked below, you do not follow the recommended procedure for checking the status of a subprocess when using Popen.communicate. I understand the timeout argument was added in Python 3.3 (subprocess ships with the standard library, so its version follows the interpreter). I do not see a Python version pinned anywhere in your repo, so I have no idea what you are running.

ret_code = p.returncode

Please add a timeout and use the try/except block recommended in the subprocess documentation. This is that block, copied directly from there.

proc = subprocess.Popen(...)
try:
    outs, errs = proc.communicate(timeout=15)
except TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()
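To make the suggestion concrete, here is a minimal sketch of how the documented pattern could be wrapped so the caller always gets a return code back. It is not this project's code: `run_with_timeout` and `timeout_s` are hypothetical names, and a small Python child process stands in for a failing terraform run.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (returncode, stdout, stderr).

    Hypothetical helper: if the child does not finish within timeout_s
    seconds it is killed, and whatever output it produced is collected.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    try:
        outs, errs = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()
        # A second communicate() reaps the killed child and drains the pipes.
        outs, errs = proc.communicate()
    return proc.returncode, outs, errs


# Stand-in for a failing command: writes to stderr and exits with status 3.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
code, out, err = run_with_timeout(failing, timeout_s=15)
if code != 0:
    print(f"command failed with exit status {code}: {err}")
```

The timeout value is a judgment call. A real apply can legitimately run for a long time, so a caller would probably want to pass it in rather than hard-code 15 seconds.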

2. Popen.communicate does not have an infinite buffer
Terraform may simply be overflowing the buffer used by subprocess. The subprocess documentation calls this out explicitly:

Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

Perhaps consider using one of the other functions that directly set the return code of the process:
Popen.returncode

The child return code, set by poll() and wait() (and indirectly by communicate()). A None value indicates that the process hasn’t terminated yet.

A negative value -N indicates that the child was terminated by signal N (POSIX only).

I think that cutting off the end of the information returned in out and err by calling https://docs.python.org/3/library/subprocess.html#subprocess.Popen.wait is not a perfect solution, but it is definitely better than not returning due to a silent buffer overflow.
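One alternative that avoids holding all output in memory is to read the pipe line by line and only then call wait(). The sketch below is an assumption about how that could look, not the project's implementation: `run_streaming` and `on_line` are hypothetical names, stderr is merged into stdout so only one pipe has to be drained, and a Python child process stands in for a noisy terraform run.

```python
import subprocess
import sys


def run_streaming(cmd, on_line=print):
    """Run cmd, pass each output line to on_line, and return the exit status."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so a single pipe is drained
        text=True,
        bufsize=1,  # line-buffered on the reading side
    )
    with proc.stdout:
        for line in proc.stdout:
            on_line(line.rstrip("\n"))
    # The pipe is fully drained by now, so wait() cannot block on a full pipe.
    return proc.wait()


# Stand-in for a noisy command that prints many lines and then fails.
lines = []
noisy = [
    sys.executable,
    "-c",
    "for i in range(1000): print('line', i)\nraise SystemExit(2)",
]
exit_code = run_streaming(noisy, on_line=lines.append)
print(f"collected {len(lines)} lines, exit status {exit_code}")
```

Draining the pipe before calling wait() matters: the subprocess documentation warns that wait() can deadlock when the child fills an unread pipe.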

Please let me know if you have questions or if I can help in any way!

Thanks,
Ben Weibel

@Utkarsh-Kickdrum

Is there any workaround?
What I need is this: if tf.apply() fails, or if my terraform script hits an error during execution, can I catch that error and stop further execution of my python script?
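As a generic illustration of the "catch the error and stop" behaviour asked about here, the sketch below turns a non-zero exit status into an exception. It deliberately calls subprocess directly, because this thread does not confirm what tf.apply() returns or raises. `CommandFailed` and `run_or_raise` are hypothetical names, and a Python child process stands in for the terraform command.

```python
import subprocess
import sys


class CommandFailed(RuntimeError):
    """Hypothetical exception raised when a command exits non-zero."""


def run_or_raise(cmd):
    """Run cmd and return its stdout, or raise CommandFailed on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise CommandFailed(
            f"exit status {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Stand-in for a failing command.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
try:
    run_or_raise(failing)
    stopped = False
except CommandFailed as exc:
    # The caller decides here whether to abort the rest of the script.
    stopped = True
    print(f"stopping: {exc}")
```

Note that this only helps once the underlying call actually returns. If the process hangs as described in the original report, a timeout is still needed.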
