
[WIP] n3fit unit testing #590

Merged
merged 19 commits into from
Nov 7, 2019

Conversation

scarlehoff
Member

@scarlehoff scarlehoff commented Oct 10, 2019

This PR adds test files for n3fit: unit tests wherever possible (mostly for the backend), while for the fit it will rather be a regression test.

I will edit this PR with the things that are tested (so I can copy it verbatim to the documentation at the end).

Unit tests

  • Tests that all backend operations do exactly what you would expect numpy to do (a minimal sketch of this kind of check is shown below).
  • Tests that loss functions are doing what is expected from them.
  • Tests for the DIS and hadronic convolutions.
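As an illustration (not the actual test code), the backend checks boil down to comparisons of this kind; this sketch is written directly against the Keras backend so it is self-contained:

import numpy as np
import tensorflow.keras.backend as K
from numpy.testing import assert_allclose


def test_backend_multiply_matches_numpy():
    """The backend multiplication should reproduce the numpy result
    up to floating point tolerance."""
    a = np.random.rand(4, 3)
    b = np.random.rand(4, 3)
    # Run the operation through the backend and bring the result back to numpy
    backend_result = K.eval(K.constant(a) * K.constant(b))
    assert_allclose(backend_result, a * b, rtol=1e-6)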

@scarlehoff scarlehoff added the n3fit Issues and PRs related to n3fit label Oct 10, 2019
@scarlehoff scarlehoff self-assigned this Oct 10, 2019
@scarlehoff scarlehoff added the good first issue Good for newcomers label Oct 11, 2019
@scarlehoff
Member Author

scarlehoff commented Oct 11, 2019

I think adding tests for the different pieces of n3fit could be a good issue for getting to learn the code. The ones that are already there can be used as a starting point.

The one I will add myself is the integration test at the fit level, as that one is a bit trickier.

@Zaharid
Contributor

Zaharid commented Oct 11, 2019

Can this run on the CI?

@Zaharid
Contributor

Zaharid commented Oct 11, 2019

Also I have to say that a big reason for merging the libnnpdf, nnpdfcpp and validphys repos was not to have to deal with repeated build infrastructure such as conda recipes, test scripts, CI configurations and the like.

That suggests that it wouldn't be the worst idea to move n3fit into the validphys namespace as discussed at some point.

@scarlehoff
Member Author

Oops, forgot. I was very happy because none of the commits failed to pass.

@Zaharid
Contributor

Zaharid commented Oct 11, 2019

Good. This seems to be running something now.

@Zaharid
Contributor

Zaharid commented Oct 11, 2019

Btw, given the embarrassing experience with #587, being able to run the whole new machinery in the CI would be quite useful (although I am not sure it is really possible with apfel).

@scarlehoff
Member Author

Apfel is not possible, but apfel is not part of the whole thing in this case (thank god!)

Contributor

@Zaharid Zaharid left a comment


This looks very good and useful.

@Zaharid
Contributor

Zaharid commented Oct 15, 2019

However, it would be good if the various functions had some docstrings.

@scarlehoff
Member Author

However, it would be good if the various functions had some docstrings.

I've added docstrings to the auxiliary functions.
For the rest I am not sure what the "protocol" is, in the sense that the docstring for a test_function (or test method) is basically "tests function"?
Let me know.

@scarlehoff
Member Author

scarlehoff commented Oct 17, 2019

@Zaharid @scarrazza have a look at the test_fit.py file when you have time; it contains one example of what can be used to test a fit.

At the moment I am considering the .fitinfo file because it contains all the chi2, the arclengths, the epoch at which the fit stopped and whether the positivity passed (adding all other files is trivial ofc, this is just a first prototype).

The pipeline for travis is:

  • create a tmp folder
  • move there a quick runcard and run it
  • check the result hasn't changed with respect to the regression folder

Then we want to have some kind of flag so that if you know something is supposed to change you can regenerate the result or part of it. What I don't know is whether this should be regenerated by Travis (the results depend on many random seeds coming from different places, so I am guessing that even when running under conda, Mac and Linux won't get exactly the same results).

Note: for now it is not running because it won't pass as I am not sure how to tell Travis to create regression data. Maybe it needs to be created, uploaded to the nnpdf server and then downloaded back for the next iterations?
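In rough pseudo-test form, the pipeline above amounts to something like this (the runcard name, the output layout and the n3fit invocation are placeholders rather than the actual test_fit.py):

import pathlib
import shutil
import subprocess as sp

REGRESSION_FOLDER = pathlib.Path(__file__).with_name("regressions")
QUICKCARD = "quickcard.yml"  # placeholder name for a small, fast runcard
REPLICA = 1


def test_performfit(tmp_path):
    """Run a quick fit in a tmp folder and compare the produced .fitinfo
    with the stored regression file."""
    shutil.copy(REGRESSION_FOLDER / QUICKCARD, tmp_path)
    sp.run(f"n3fit {QUICKCARD} {REPLICA}", shell=True, check=True, cwd=tmp_path)
    # The output path below is illustrative only
    new_fitinfo = (tmp_path / "nnfit" / f"replica_{REPLICA}" / "quickcard.fitinfo").read_text()
    old_fitinfo = (REGRESSION_FOLDER / "quickcard.fitinfo").read_text()
    assert new_fitinfo.split() == old_fitinfo.split()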

import logging
import pathlib
import subprocess as sp

log = logging.getLogger(__name__)
REGRESSION_FOLDER = pathlib.Path().absolute() / "regressions"
Contributor


I wonder if we can use importlib.resources instead? Not sure it is good for anything other than being the new recommended way. It does have the theoretical advantage that you can put the package in a zip file and have it working.
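For illustration, the importlib.resources variant could look roughly like this (using the files() API from Python 3.9+ or the importlib_resources backport; the package name n3fit.tests is an assumption):

from importlib.resources import files  # Python 3.9+; older versions can use the importlib_resources backport

# Resolve the regressions folder through the package metadata rather than the
# filesystem path, so it would also work if the package were shipped as a zip.
REGRESSION_FOLDER = files("n3fit.tests") / "regressions"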

@scarlehoff
Member Author

scarlehoff commented Oct 31, 2019

@Zaharid question.
The regression "data" should be generated by Travis and uploaded to the vp server?
If so, how do I do it?

To first approximation I see I can shamelessly do

from validphys.uploadutils import FileUploader
uploader = FileUploader()
uploader.target_dir = 'WEB/thisisatesttodelete/'
uploader.root_url = 'https://vp.nnpdf.science/'
with uploader.upload_context('.'):
    pass

for instance. But is this the way to do it? Want to know for sure before cluttering the server with random tests.

@scarlehoff
Member Author

scarlehoff commented Oct 31, 2019

Want to know for sure before cluttering the server with random tests.

By which I mean before keeping on cluttering the server. I'll remove what I already uploaded.

@scarlehoff
Member Author

scarlehoff commented Oct 31, 2019

I imagine I can also create a fit called regression and upload the full fit and keep overwriting that one. Don't know.
To download things later I was planning on using loader.download_file btw.

@Zaharid
Contributor

Zaharid commented Oct 31, 2019

We keep the existing test regression data in the git repo itself. The idea is that you compute it and you compare with the existing local file anywhere. I'd say that's doable for a fit if you don't look at things like the replica files, and would be preferred because it is simpler.

If not, there is a way to more or less transparently download a validphys output file.

def check_vp_output_file(self, filename, extra_paths=('.',)):

def download_vp_output_file(self, filename, **kwargs):

@scarlehoff
Member Author

We keep the existing test regression data in the git repo itself.

You mean uploaded by the user? The problem is that I don't think the Mac and Linux regression tests, for instance, will produce the same results. Or the user and Travis for that matter, since we are depending on several random seeds.

@Zaharid
Contributor

Zaharid commented Oct 31, 2019

Apparently there is something like that:

https://stackoverflow.com/a/57094699/1007990

While it is a surprisingly difficult problem (see e.g. #77), it seems to me that it is something that should have been considered rather early in the design (seems clear tf is not really for scientists). In any case you cannot claim to have any handling of the random seeds (as per the sphinx docs) if it turns out it's going to lead to different results across the board.
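For reference, the linked answer boils down to pinning down every RNG that feeds into the fit, roughly along these lines (whether that is enough to get identical results across platforms, compilers and TF builds is exactly the open question here):

import os
import random

import numpy as np
import tensorflow as tf


def set_all_seeds(seed=7):
    """Seed every source of randomness that could influence the result.
    Note this alone does not guarantee bitwise-identical results across
    different platforms or TF builds."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    # In TF 2 this is tf.random.set_seed; with the v1 compatibility layer
    # the equivalent call is tf.compat.v1.set_random_seed(seed)
    tf.random.set_seed(seed)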

@scarlehoff
Member Author

scarlehoff commented Oct 31, 2019

Can't manage to reproduce results between system 1 (gcc 9, pip-installed tf) and system 2 (gcc 8, conda-installed tf); other than that the two systems should be equivalent.

I'll study a bit more. I don't know whether I cannot use the tf seeding from that stackoverflow thread because I'm letting Keras manage the initialization or because I am using the v1 compatibility features (edit: or because I am doing something wrong, which is ofc entirely possible).

If I don't manage, I'll fall back to making Travis upload the fit.

@scarlehoff
Member Author

scarlehoff commented Nov 4, 2019

The test runs correctly on Linux and it does work on OS X with the KMP_DUPLICATE_LIB_OK flag. I don't know whether the error is on Travis' side or in the run script (I don't have a Mac to test outside of Travis' debug environment).

Asking for community input: should I set the flag so that it works on OS X (the test is single threaded, so the fact that two OMP libs are linked should make no difference), or should we try to find out whether the problem is in the build/run script, so that this test also checks whether the OpenMP installation is correct?

edit: I'll push the change anyway so I can be sure it works without me touching anything...

Contributor

@Zaharid Zaharid left a comment


Other than the environment thing, which I think needs to be changed, this is good to go as far as I am concerned. The other things are minor but of course wouldn't hurt.

import logging
import pathlib

from numpy.testing import assert_almost_equal

log = logging.getLogger(__name__)
REGRESSION_FOLDER = pathlib.Path(__file__).with_name("regressions")
Contributor


I'd prefer using importlib.resources here, but don't mind it particularly.

THRESHOLD = 1e-6

# Helper functions
def generate_input_had(flavs=3, xsize=2, ndata=4, n_combinations=None):
Contributor


This would look really nice with hypothesis.
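For context, the suggestion amounts to letting hypothesis generate the inputs, along these lines (backend_multiply is a toy stand-in for the real backend call, and the array shapes and bounds are made up):

import numpy as np
from hypothesis import given, settings
from hypothesis import strategies as st
from hypothesis.extra.numpy import arrays
from numpy.testing import assert_allclose


def backend_multiply(a, b):
    """Toy stand-in for the backend operation being tested."""
    return a * b


@settings(deadline=None, max_examples=20)
@given(
    arrays(np.float64, (4, 3), elements=st.floats(0.0, 1.0)),
    arrays(np.float64, (4, 3), elements=st.floats(0.0, 1.0)),
)
def test_multiply_matches_numpy(a, b):
    # hypothesis drives the input generation instead of fixed random arrays
    assert_allclose(backend_multiply(a, b), a * b)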

Member Author


Just tried it and it takes too long unless I put very strict min/max limits, which sort of defeats the purpose :(

@Zaharid
Contributor

Zaharid commented Nov 6, 2019

Hello, this is @Zaharid's automated QA script. Please note it is highly experimental. I ran pylint on your changes and found some new issues.

On n3fit/src/n3fit/backends/keras_backend/internal_state.py, pylint has reported the following new issues:

  • Line 31: Module 'tensorflow._api.v1.random' has no 'set_seed' member; maybe 'get_seed'?

On n3fit/src/n3fit/fit.py, pylint has reported the following new issues:

  • Line 15: Too many statements (104/50)
  • Line 15: Too many branches (23/12)

On n3fit/src/n3fit/tests/test_fit.py, pylint has reported the following new issues:

  • Line 69: Using subprocess.run without explicitly set check is not recommended.

@Zaharid
Contributor

Zaharid commented Nov 6, 2019

Pylint makes a good point that the test should fail if the process returns a non-zero exit status.

@Zaharid
Contributor

Zaharid commented Nov 7, 2019 via email

@scarlehoff
Member Author

env={'foo':'bar', **os.environ})

So I did understand it correctly. But ok, I'll pass the whole of os.environ to sp. I suppose it is preferable in order to isolate the extra variable.

# The flag KMP_DUPLICATE_LIB_OK is necessary to avoid some errors
# related to the linking of OMP in travis when running under MacOS
sp.run(f"{EXE} {QUICKCARD} {REPLICA}", shell=True)
proc = sp.run(f"{EXE} {QUICKCARD} {REPLICA}", shell=True, env=new_environment, cwd=tmp_path)
assert proc.returncode == 0
Contributor


Just add check=True to subprocess.run.
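i.e., in terms of the snippet above (EXE, QUICKCARD, REPLICA and tmp_path as defined there), something like:

import os
import subprocess as sp

# Inherit the full environment and add only the extra flag needed on MacOS
new_environment = {**os.environ, "KMP_DUPLICATE_LIB_OK": "TRUE"}
proc = sp.run(
    f"{EXE} {QUICKCARD} {REPLICA}",
    shell=True,
    check=True,  # raises CalledProcessError on a non-zero exit status
    env=new_environment,
    cwd=tmp_path,
)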

@Zaharid
Contributor

Zaharid commented Nov 7, 2019 via email

@scarlehoff
Member Author

The recommendation is not to use shell=True

I know, I was trying to see whether it worked in OS X but Travis kicked me out :(

@scarlehoff
Member Author

Ok, ready. I think I did not forget anything?

@Zaharid Zaharid merged commit d35489b into master Nov 7, 2019
@scarrazza scarrazza deleted the n3fit-unitTesting branch April 22, 2020 15:34