[WIP] n3fit unit testing #590
I think adding tests for the different pieces of The one I will add myself is the integration test at the fit level as that one is a bit more tricky. |
Can this run at the CI? |
Also, I have to say that a big reason for merging the libnnpdf, nnpdfcpp and validphys repos was to avoid dealing with repeated build infrastructure such as conda recipes, test scripts, CI configurations and the like. That suggests that it wouldn't be the worst idea to move n3fit into the validphys namespace, as discussed at some point. |
Oops, forgot. I was very happy because none of the commits failed to pass. |
Good. This seems to be running something now. |
Btw, given the embarrassing experience with #587, being able to run the whole new machinery in the CI would be quite useful (although I am not sure it is really possible with apfel). |
Apfel is not possible, but apfel is not part of the whole thing in this case (thank god!) |
This looks very good and useful.
However, it would be good if the various functions had some docstrings. |
I've added docstrings to the auxiliary functions. |
@Zaharid @scarrazza have a look when you have time to the At the moment I am considering the The pipeline for travis is:
Then we want to have some kind of flag so that, if you know something is supposed to change, you can regenerate the result or part of the result. What I don't know is whether this should be regenerated by Travis (the results depend on many random seeds coming from different places, so I am guessing that even if you run under conda, Mac and Linux won't get the exact same results). Note: for now it is not running because it won't pass, as I am not sure how to tell Travis to create regression data. Maybe it needs to be created, uploaded to the nnpdf server and then downloaded back for the next iterations? |
Force-pushed from 0540b51 to 19fd450.
n3fit/src/n3fit/tests/test_fit.py (outdated)
import subprocess as sp

log = logging.getLogger(__name__)
REGRESSION_FOLDER = pathlib.Path().absolute() / "regressions"
I wonder if we can use importlib.resources instead? Not sure it is good for anything other than being the new recommended way. It does have the theoretical advantage that you can put the package in a zip file and have it working.
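A minimal sketch of the `importlib.resources` alternative, assuming Python 3.9+ (where `importlib.resources.files` exists) and that the regression data ships inside an importable package. The helper name `regression_file` is hypothetical, and the standard-library `json` package is only a stand-in for the real test package so that the snippet runs anywhere.

```python
from importlib import resources


def regression_file(package, *parts):
    """Return a Traversable for a data file inside ``package``.

    Hypothetical helper: ``package`` is any importable package name and
    ``parts`` are path components relative to the package root.
    """
    resource = resources.files(package)
    for part in parts:
        resource = resource.joinpath(part)
    return resource


# Stand-in usage: locate a file known to exist inside the stdlib ``json`` package.
init_file = regression_file("json", "__init__.py")
print(init_file.is_file())
```

Unlike a path built from `__file__`, the returned object can also be read when the package is imported from a zip archive, which is the theoretical advantage mentioned above.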
@Zaharid question. To first approximation I see I can shamelessly do
for instance. But is this the way to do it? Want to know for sure before cluttering the server with random tests. |
By which I mean before keeping on cluttering the server. I'll remove what I already uploaded. |
I imagine I can also create a fit called regression and upload the full fit and keep overwriting that one. Don't know. |
We keep the existing test regression data in the git repo itself. The idea is that you compute it and you compare with the existing local file anywhere. I'd say that's doable for a fit if you don't look at things like the replica files, and would be preferred because it is simpler. If not, there is a way to more or less transparently download a validphys output file. nnpdf/validphys2/src/validphys/loader.py Line 462 in 99f3e50
nnpdf/validphys2/src/validphys/loader.py Line 814 in 99f3e50
|
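The "compute it and compare with the existing local file" approach, together with the regeneration flag asked for earlier, could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the helper `check_regression`, the JSON layout and the key names are all hypothetical, and only the 1e-6 threshold is taken from the test file.

```python
import json
import math
import pathlib
import tempfile


def check_regression(new_values, reference_path, threshold=1e-6, regenerate=False):
    """Compare a flat dict of floats with a stored JSON reference file.

    With ``regenerate=True`` the reference is overwritten instead of checked.
    Returns the sorted list of keys that are missing on either side or that
    differ by more than ``threshold`` in absolute value.
    """
    reference_path = pathlib.Path(reference_path)
    if regenerate:
        reference_path.write_text(json.dumps(new_values, indent=2, sort_keys=True))
        return []
    reference = json.loads(reference_path.read_text())
    return [
        key
        for key in sorted(set(reference) | set(new_values))
        if key not in reference
        or key not in new_values
        or not math.isclose(
            reference[key], new_values[key], abs_tol=threshold, rel_tol=0.0
        )
    ]


# Usage with made-up numbers: write a reference, then compare two new runs.
with tempfile.TemporaryDirectory() as tmp:
    ref = pathlib.Path(tmp) / "quickcard.json"
    check_regression({"chi2": 1.25, "epochs": 100.0}, ref, regenerate=True)
    print(check_regression({"chi2": 1.25 + 1e-9, "epochs": 100.0}, ref))
    print(check_regression({"chi2": 1.30, "epochs": 100.0}, ref))
```

Keeping the reference as a small text file makes it easy to store in the git repo and to review when it is regenerated.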
You mean uploaded by the user? The problem is that I don't think the Mac and Linux regression tests, for instance, will produce the same results. Or the user and Travis, for that matter, since we are depending on several random seeds. |
Force-pushed from 19fd450 to 5f40fa3.
Apparently there is something like that: https://stackoverflow.com/a/57094699/1007990 While it is a surprisingly difficult problem (see e.g. #77), it seems to me that it is something that should have been considered rather early in the design (it seems clear tf is not really for scientists). In any case you cannot claim to have any handling of the random seeds (as per the sphinx docs) if it turns out it's going to lead to different results across the board. |
Can't manage to reproduce results between system 1 with gcc 9 and pip-installed tf and system 2 with gcc 8 and conda-installed tf (other than that the two systems should be equivalent). I'll study a bit more. I don't know if I cannot use the tf seeding of that stackoverflow thread because I'm letting Keras manage the initialization or because I am using the v1 compatibility features (edit: or because I am doing something wrong, which is ofc entirely possible). If I don't manage, I'll fall back to making Travis upload the fit. |
Force-pushed from 6b74d25 to 2de9067.
Force-pushed from c04479d to 7986c62.
The test runs correctly in Linux and it does work in OS X with the Asking for community input: should I set the flag such that it works in OS X (the test is single threaded so the fact that two OMP libs are linked should make no difference) or should we try to find out whether the problem is in the build/run script so that this test also checks whether the OpenMP installation is correct? edit: I'll push the change anyway so I can be sure it works without me touching anything... |
Other than the environment thing, which I think needs to be changed, this is good to go as far as I am concerned. The other things are minor but of course wouldn't hurt.
from numpy.testing import assert_almost_equal

log = logging.getLogger(__name__)
REGRESSION_FOLDER = pathlib.Path(__file__).with_name("regressions")
I'd prefer using importlib.resources here, but don't mind it particularly.
THRESHOLD = 1e-6

# Helper functions
def generate_input_had(flavs=3, xsize=2, ndata=4, n_combinations=None):
This would look really nice with hypothesis.
Just tried, and it takes too long unless I set a very strict min/max limit, which sort of defeats the purpose :(
Hello, this is @Zaharid's automated QA script. Please note it is highly experimental. I ran pylint on your changes and found some new issues. |
Pylint makes a good point that the test should fail if the process returns a non-zero exit status. |
On Wed, Nov 6, 2019 at 11:17 PM Juacrumar ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In n3fit/src/n3fit/tests/test_fit.py
<#590 (comment)>:
> +
+
+def test_performfit():
+ # read up the old info file
+ old_fitinfo = load_data(REGRESSION_FOLDER / f"{QUICKNAME}.fitinfo")
+ # create a /tmp folder
+ tmp_name = tempfile.mkdtemp(prefix="nnpdf-")
+ tmp_path = pathlib.Path(tmp_name)
+ # cp runcard to tmp folder
+ shutil.copy(QUICKPATH, tmp_path)
+ os.chdir(tmp_path)
+ # run the fit
+ os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'
+ # The flag KMP_DUPLICATE_LIB_OK is necessary to avoid some errors
+ # related to the linking of OMP in travis when running under MacOS
+ sp.run(f"{EXE} {QUICKCARD} {REPLICA}", shell=True)
Maybe I am reading it wrong, but env sets a new environment, while I would
rather inherit the current environment with one extra variable.
With respect to the list, they always use a string when using shell=True.
Maybe I should begin by doing a quick test without shell=True.
Well
subprocess.run(['python', '-c', 'import os; print(os.environ)'],
env={'foo':'bar', **os.environ})
environ({'foo': 'bar', 'CINNAMON_VERSION': '4.2.4', 'COLORTERM':
'truecolor', 'CONDA_EXE': '/home/zah/anaconda3/bin/conda',
'CONDA_PYTHON_EXE': '/home/zah/anaconda3/bin/python', 'CONDA_SHLVL': '0',
'DBUS_SESSION_BUS_ADDRESS':
'unix:abstract=/tmp/dbus-nToXLHHW06,guid=d272461ced73abc5d55daf455dc2a173',
'DEFAULTS_PATH': '/usr/share/gconf/cinnamon.default.path',
'DESKTOP_SESSION': 'cinnamon', 'DISPLAY': ':0', 'GDMSESSION': 'cinnamon',
'GDM_LANG': 'en_US', 'GJS_DEBUG_OUTPUT': 'stderr', 'GJS_DEBUG_TOPICS': 'JS
ERROR;JS LOG', 'GNOME_DESKTOP_SESSION_ID': 'this-is-deprecated',
'GNOME_TERMINAL_SCREEN':
'/org/gnome/Terminal/screen/41042373_850e_482b_95d9_bb51c1fa6c72',
'GNOME_TERMINAL_SERVICE': ':1.74', 'GPG_AGENT_INFO':
'/run/user/1000/gnupg/S.gpg-agent:0:1', 'GTK_MODULES': 'gail:atk-bridge',
'GTK_OVERLAY_SCROLLING': '1', 'HOME': '/home/zah', 'LANG': 'en_US.UTF-8',
'LANGUAGE': 'en_US', 'LC_ADDRESS': 'es_ES.UTF-8', 'LC_IDENTIFICATION':
'es_ES.UTF-8', 'LC_MEASUREMENT': 'es_ES.UTF-8', 'LC_MONETARY':
'es_ES.UTF-8', 'LC_NAME': 'es_ES.UTF-8', 'LC_NUMERIC': 'es_ES.UTF-8',
'LC_PAPER': 'es_ES.UTF-8', 'LC_TELEPHONE': 'es_ES.UTF-8', 'LOGNAME': 'zah',
'MANDATORY_PATH': '/usr/share/gconf/cinnamon.mandatory.path',
'MATHEMATICA_HOME': '/usr/local/Wolfram/Mathematica/11.1', 'PATH':
'/home/zah/anaconda3/condabin:/home/zah/anaconda3/bin/:/home/zah/.cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin',
'PWD': '/home/zah', 'QT_ACCESSIBILITY': '1', 'SESSION_MANAGER':
'local/zah-XPS13-9333:@/tmp/.ICE-unix/1614,unix/zah-XPS13-9333:/tmp/.ICE-unix/1614',
'SHELL': '/usr/bin/fish', 'SHLVL': '1', 'SSH_AGENT_PID': '1694',
'SSH_AUTH_SOCK': '/run/user/1000/keyring/ssh', 'TERM': 'xterm-256color',
'USER': 'zah', 'VTE_VERSION': '5202', 'XAUTHORITY':
'/home/zah/.Xauthority', 'XDG_CONFIG_DIRS':
'/etc/xdg/xdg-cinnamon:/etc/xdg', 'XDG_CURRENT_DESKTOP': 'X-Cinnamon',
'XDG_DATA_DIRS':
'/usr/share/cinnamon:/usr/share/gnome:/home/zah/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share:/var/lib/snapd/desktop:/var/lib/snapd/desktop',
'XDG_GREETER_DATA_DIR': '/var/lib/lightdm-data/zah', 'XDG_RUNTIME_DIR':
'/run/user/1000', 'XDG_SEAT': 'seat0', 'XDG_SEAT_PATH':
'/org/freedesktop/DisplayManager/Seat0', 'XDG_SESSION_DESKTOP': 'cinnamon',
'XDG_SESSION_ID': 'c2', 'XDG_SESSION_PATH':
'/org/freedesktop/DisplayManager/Session0', 'XDG_SESSION_TYPE': 'x11',
'XDG_VTNR': '7'})
Out[6]: CompletedProcess(args=['python', '-c', 'import os;
print(os.environ)'], returncode=0)
|
So I did understand it correctly. But ok, I'll pass the whole of |
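The point of the example above is that `env` replaces the child's environment entirely, so the usual way to inherit everything plus one variable is to merge a copy of `os.environ` with the extra entries. A small sketch; the helper name `run_with_extra_env` is hypothetical, and `sys.executable` stands in for the real fit executable.

```python
import os
import subprocess
import sys


def run_with_extra_env(args, **extra_env):
    """Run ``args`` with the parent's environment plus ``extra_env``.

    A new dict is built, so ``os.environ`` of the calling process is
    left untouched (unlike assigning to ``os.environ[...]`` directly).
    """
    child_env = {**os.environ, **extra_env}
    return subprocess.run(
        args, env=child_env, capture_output=True, text=True, check=True
    )


# The child prints the value it sees for the extra variable.
proc = run_with_extra_env(
    [sys.executable, "-c", "import os; print(os.environ['KMP_DUPLICATE_LIB_OK'])"],
    KMP_DUPLICATE_LIB_OK="TRUE",
)
flag_seen_by_child = proc.stdout.strip()
print(flag_seen_by_child)
```

Compared with setting `os.environ['KMP_DUPLICATE_LIB_OK']` in the test, this keeps the flag local to the subprocess and does not leak into other tests running in the same interpreter.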
Force-pushed from 17959db to 9b4a7c9.
n3fit/src/n3fit/tests/test_fit.py (outdated)
# The flag KMP_DUPLICATE_LIB_OK is necessary to avoid some errors
# related to the linking of OMP in travis when running under MacOS
sp.run(f"{EXE} {QUICKCARD} {REPLICA}", shell=True)
proc = sp.run(f"{EXE} {QUICKCARD} {REPLICA}", shell=True, env=new_environment, cwd = tmp_path)
assert proc.returncode == 0
Just add check=True to subprocess.run.
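For reference, a short sketch of what `check=True` does: `subprocess.run` raises `subprocess.CalledProcessError` when the child exits with a non-zero status, so a separate assert on `returncode` is not needed. The child command here is a stand-in that simply exits with status 3, and the function name is made up for the illustration.

```python
import subprocess
import sys


def failing_child_raises():
    """Return True if a child exiting with status 3 raises under check=True."""
    try:
        subprocess.run(
            [sys.executable, "-c", "import sys; sys.exit(3)"], check=True
        )
    except subprocess.CalledProcessError as err:
        # The exception carries the exit status of the child.
        return err.returncode == 3
    return False


print(failing_child_raises())
```

Inside a pytest test the exception would propagate and fail the test, which is the behaviour pylint was asking for.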
------------------------------
With respect to the list, they always use a string when using shell=True.
Maybe I should begin by doing a quick test without shell=True.
The recommendation is not to use shell=True, because passing a list of
arguments gives you proper escaping independent of the shell and platform,
which avoids accidental or intentional problems with quotes and pipes. Not a
huge problem here because the call is simple enough and static, but the best
practice is to pass a list of arguments and shell=False.
|
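To illustrate the escaping argument, here is a sketch in which an argument containing a space and a semicolon reaches the child unchanged when passed as a list element, because no shell ever parses it. The file name is invented for the example, and `sys.executable` again stands in for the real executable.

```python
import subprocess
import sys

# A made-up argument that a shell would split and partly execute.
awkward = "quick card; echo oops.yml"

# With a list and the default shell=False, each element is passed to the
# child as exactly one argument, with no quoting needed.
proc = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", awkward],
    capture_output=True,
    text=True,
    check=True,
)
received = proc.stdout.rstrip("\n")
print(received == awkward)
```

With `shell=True` and an f-string, the same value would have to be quoted by hand (for example with `shlex.quote` on POSIX shells) to get the same guarantee.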
I know, I was trying to see whether it worked in OS X but Travis kicked me out :( |
Ok, ready, I think I did not forget anything? |
This PR adds test files for n3fit. Whenever possible (mostly the backend) these are unit tests, while for the fit it will rather be a regression test. I will edit this PR with the things that are tested (so I can copy it verbatim to the documentation at the end).
Unit tests