-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
helpers: try to get exact lines from dmesg that appeared during a test #76
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your inconvenience here, but the proposed solution can't work on CI. See my comment below. I'm not happy with the current DmesgOopsFinder
implementation, it calls dmesg
twice and copies the full buffer between two vms to get very short piece of system log. I proposed to use journalctl
to get very precise system log part, see point 4 in #56
It tries to cut everything before line with substring "Start test" in it. But none of the tests I tried, push such line to the kernel buffer.
What was the tests that doesn't push that line? We need to fix that.
helpers/dmesg.py
Outdated
@@ -3,6 +3,8 @@ | |||
from __future__ import print_function | |||
import re | |||
from . import remote | |||
import subprocess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're still use python2 for the tests, and official documentation encourage to use the backported subprocess32
module instead. I've also run into some troubles with original subprocess
module, like lack of required timeout
argument for Popen
call. Please use subprocess32
and fallback to default if not found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turned out, subprocess
module is not required here, so it's remove.
helpers/dmesg.py
Outdated
@@ -37,15 +39,29 @@ class DmesgOopsFinder(object): | |||
|
|||
def __init__(self): | |||
self.node = remote.tempesta | |||
self.proc = subprocess.Popen(['dmesg', '-w'], stdout=subprocess.PIPE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You create subprocess directly. The test framework originally was designed to use several virtual (or physical) machines at once. One of the main reasons was to make test framework tolerant to Tempesta crashes. If both Tempesta and the framework are started on the same vm, Tempesta crash will lose the tests progress. If framework is started on a separate vm, it's more tolerant to Tempesta crashes: if crash happens, the framework waits Tempesta vm to reboot and continues with the next tests.
However during development, we start the framework and Tempesta on the same vm just for simplicity. So the proposed change works for you. But it won't work for CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Moved back to self.node.run_cmd()
. Now it runs journalctl
to get a slice of the kernel buffer.
7ef2aa0
to
2d3d84c
Compare
helpers/dmesg.py
Outdated
@@ -37,15 +38,11 @@ class DmesgOopsFinder(object): | |||
|
|||
def __init__(self): | |||
self.node = remote.tempesta | |||
self.start_time = time.time() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tempesta can be started on other vm, not the same that was used for framework itself. So time between systems may differ, and we need to get current time value on Tempesta node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to date +%s.%N
running on the remote host.
Yep, new patch version uses
("Start test" was not printed to a kernel log until |
2d3d84c
to
228e1f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Please mark p.4 of #56 as resolved
It turned out the current version of
DmesgOopsFinder
is not precise enough. It tries to cut everything before line with substring "Start test" in it. But none of the tests I tried, push such line to the kernel buffer. In that case the whole output ofdmesg
was used. That makes a single line with "ERROR" substring to mark all subsequent tests as failed.We can remove that fallback, but then there is no way to determine which parts of the log to take based only on the log contents anyway. However, we know that we need only the part that appeared after the moment
DmesgOopsFinder
is constructed.The code in the patch polls output of
dmesg -w
to extract only lines we are interested in.