MAINT: Report download time and size #11635

Merged
merged 7 commits into from
Apr 18, 2023

Conversation

larsoner
Member

CircleCI has timed out a couple of times lately due to our 3h limit. Increasing the limit costs more $$$, and I think 3h is pretty reasonable anyway, so let's try to make the build faster. About 2h is spent in the doc build, mostly on examples/tutorials (which we should certainly continue to optimize), but recently 45min has been spent downloading datasets. This PR adds a diagnostic download time/size logger to our fetching, which should help us see the cost in the logs and might be nice for users.
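As a rough illustration of the kind of time/size reporting described above, here is a minimal sketch. It is not the code added in this PR: `sizeof_fmt`, `timed_fetch`, and the message format are hypothetical names chosen for the example, and `fetch` is assumed to be any callable that returns a list of local file paths.

```python
import os
import time


def sizeof_fmt(num_bytes):
    """Format a byte count with binary units (hypothetical helper)."""
    size = float(num_bytes)
    for unit in ("bytes", "kB", "MB", "GB"):
        # Stop at the first unit where the value is small enough,
        # or at the largest unit we know about.
        if size < 1024 or unit == "GB":
            break
        size /= 1024
    return f"{size:.1f} {unit}"


def timed_fetch(fetch, *args, **kwargs):
    """Run fetch(*args, **kwargs) and summarize elapsed time and total size.

    Assumes fetch returns a list of paths to files that exist on disk.
    Returns (paths, message).
    """
    t0 = time.perf_counter()
    paths = fetch(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    total = sum(os.path.getsize(p) for p in paths)
    n_files = len(paths)
    plural = "" if n_files == 1 else "s"
    msg = (
        f"Download complete in {elapsed:.1f}s "
        f"({sizeof_fmt(total)}, {n_files} file{plural})"
    )
    return paths, msg
```

Wrapping each dataset fetcher this way would put one line per download in the CI log, which makes slow mirrors easy to spot.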

@larsoner larsoner added this to the 1.4 milestone Apr 18, 2023
@larsoner
Member Author

This looks like one culprit. I checked locally after watching CircleCI hit a 10-minute timeout trying to download it:

hf_sef_evoked.tar.gz
https://zenodo.org/record/3523071/files/hf_sef_evoked.tar.gz
731 KB/s - 124 MB of 731 MB, 14 mins left

We already have a mirror on osf.io, so I'll update the file there and push.

@larsoner larsoner marked this pull request as ready for review April 18, 2023 20:43
@larsoner
Member Author

Okay @drammock, this one is ready for review. Mirroring the data on OSF.io brought the download time from ~45 min down to ~15 min, as you can see here (along with the timings for all downloads; I killed the build once that step was done):

https://app.circleci.com/pipelines/github/mne-tools/mne-python/18689/workflows/29272f16-bc51-4ce8-ae40-c7ef59c3795f/jobs/53962

Member

@drammock drammock left a comment

this is a nice improvement. CIs are not happy though. Cirrus fails with

mne/datasets/sleep_physionet/tests/test_physionet.py:132: in test_sleep_physionet_age
    paths = age.fetch_data(subjects=[0], recording=[1], path=physionet_tmpdir)
<decorator-gen-588>:12: in fetch_data
    ???
mne/datasets/sleep_physionet/age.py:129: in fetch_data
    sz += os.path.getsize(psg_fname)
/opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/genericpath.py:50: in getsize
    return os.stat(filename).st_size
E   FileNotFoundError: [Errno 2] No such file or directory: '/private/var/folders/76/zy5ktkns50v6gt5g8r0sf6sc0000gn/T/pytest-of-admin/pytest-0/physionet_files0/physionet-sleep-data/SC4001E0-PSG.edf'

which looks related / legit; haven't looked at any other failures
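The traceback above shows `os.path.getsize` raising because the file was never written. One possible way to make a size tally tolerate that (for instance, if a test replaces the real download with a stub) is sketched below. This is an assumption about the cause and not necessarily the fix that was merged; `total_size` is a hypothetical helper name.

```python
import os


def total_size(fnames):
    """Sum the sizes of the files that exist, skipping missing ones."""
    total = 0
    for fname in fnames:
        try:
            total += os.path.getsize(fname)
        except FileNotFoundError:
            # The file was never written (e.g. the download was stubbed
            # out), so it contributes nothing to the reported size.
            continue
    return total
```

Catching the exception rather than checking `os.path.exists` first avoids a second filesystem call and a race between the check and the size lookup.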

@drammock
Member

there are also

 _______________________ test_manifest_check_download[1] ________________________
mne/datasets/tests/test_datasets.py:225: in test_manifest_check_download
    _manifest_check_download(manifest_path, destination, url, hash_)
mne/datasets/utils.py:570: in _manifest_check_download
    pooch.retrieve(
E   TypeError: _fake_zip_fetch() got an unexpected keyword argument 'downloader'
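The `TypeError` above says the test's stand-in function is now being called with a `downloader` keyword it does not declare. A minimal sketch of a test double that accepts it is shown below. Only the `downloader` keyword comes from the traceback; the function name `fake_zip_fetch` and the other parameter names are assumptions made for illustration, not the project's actual test code.

```python
import os
import zipfile


def fake_zip_fetch(url, path, fname, known_hash, downloader=None, **kwargs):
    """Stand-in for a retrieve-style call: write an empty zip, no network.

    downloader and any further keywords are accepted and ignored so the
    stub keeps working when the caller starts passing extra options.
    """
    os.makedirs(path, exist_ok=True)
    out = os.path.join(path, fname)
    # Opening in write mode and closing immediately yields a valid,
    # empty zip archive.
    with zipfile.ZipFile(out, "w"):
        pass
    return out
```

Accepting `**kwargs` in the stub is a trade-off: it keeps tests from breaking on every new keyword, but it can also hide a misspelled argument, so listing the expected keywords explicitly may be preferable.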

@larsoner larsoner enabled auto-merge (squash) April 18, 2023 22:39
@larsoner larsoner merged commit 8fc3d07 into mne-tools:main Apr 18, 2023
@larsoner larsoner deleted the dl branch April 19, 2023 11:15
@drammock
Member

I think this PR broke circle:

WARNING: /home/circleci/project/tutorials/preprocessing/40_artifact_correction_ica.py failed to execute correctly: Traceback (most recent call last):
  File "/home/circleci/project/tutorials/preprocessing/40_artifact_correction_ica.py", line 539, in <module>
    fname = mne.datasets.eegbci.load_data(subj + 1, runs=[3])[0]
IndexError: list index out of range

larsoner added a commit to cbrnr/mne-python that referenced this pull request Apr 21, 2023
* upstream/main: (50 commits)
  BUG: Fix bug with paths (mne-tools#11639)
  MAINT: Report download time and size (mne-tools#11635)
  MRG: Allow retrieval of channel names via make_1020_channel_selections() (mne-tools#11632)
  Fix index name in to_data_frame()'s docstring (mne-tools#11457)
  MAINT: Use VTK prerelease wheels in pre jobs (mne-tools#11629)
  ENH: Allow gradient compensated data in maxwell_filter (mne-tools#10554)
  make test compatible with future pandas (mne-tools#11625)
  Display SVG figures correctly in Report (mne-tools#11623)
  API: Port ieeg gui over to mne-gui-addons and add tfr gui example (mne-tools#11616)
  MAINT: Add token [ci skip] (mne-tools#11622)
  API: One cycle of backward compat (mne-tools#11621)
  MAINT: Use git rather than zipball (mne-tools#11620)
  ENH: Speed up code a bit (mne-tools#11614)
  [BUG, MRG] Don't modify info in place for transform points (mne-tools#11612)
  [BUG, MRG] Fix topomap extra plot generated, add util to check a range (mne-tools#11607)
  ENH: Add mne-bids-pipeline to mne sys_info (mne-tools#11606)
  MAINT: `coding: utf-8` is implicit in Python 3 (mne-tools#11599)
  ENH: Read eyetracking data (Eyelink) (Fork of mne-tools#10855 ) (mne-tools#11152)
  MAINT: In Python 3, do not prefix literals with `u` (mne-tools#11604)
  MAINT: object is an implicit base for all classes (mne-tools#11601)
  ...
larsoner added a commit to georgeoneill/mne-python that referenced this pull request Apr 21, 2023
* upstream/main:
  BUG: Fix bug with paths (mne-tools#11639)
  MAINT: Report download time and size (mne-tools#11635)
  MRG: Allow retrieval of channel names via make_1020_channel_selections() (mne-tools#11632)
  Fix index name in to_data_frame()'s docstring (mne-tools#11457)
  MAINT: Use VTK prerelease wheels in pre jobs (mne-tools#11629)
  ENH: Allow gradient compensated data in maxwell_filter (mne-tools#10554)
  make test compatible with future pandas (mne-tools#11625)
larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 24, 2023
larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 25, 2023
* upstream/main: (152 commits)
  FIX: missing channels/fiducials can be np.nan (mne-tools#11634)
  use py3.10 in precommit config (mne-tools#11648)
  MAINT: Unify GH Actions pytest (mne-tools#11644)
  MRG: Rename "Discourse" link in top navigation to "Forum" [ci skip] (mne-tools#11649)
  ENH: Add support for Harmonic Field correction (mne-tools#11536)
  Add pre-commit (mne-tools#11541)
  BUG: Fix bug with paths (mne-tools#11639)
  MAINT: Report download time and size (mne-tools#11635)
  MRG: Allow retrieval of channel names via make_1020_channel_selections() (mne-tools#11632)
  Fix index name in to_data_frame()'s docstring (mne-tools#11457)
  MAINT: Use VTK prerelease wheels in pre jobs (mne-tools#11629)
  ENH: Allow gradient compensated data in maxwell_filter (mne-tools#10554)
  make test compatible with future pandas (mne-tools#11625)
  Display SVG figures correctly in Report (mne-tools#11623)
  API: Port ieeg gui over to mne-gui-addons and add tfr gui example (mne-tools#11616)
  MAINT: Add token [ci skip] (mne-tools#11622)
  API: One cycle of backward compat (mne-tools#11621)
  MAINT: Use git rather than zipball (mne-tools#11620)
  ENH: Speed up code a bit (mne-tools#11614)
  [BUG, MRG] Don't modify info in place for transform points (mne-tools#11612)
  ...
larsoner added a commit to larsoner/mne-python that referenced this pull request Apr 25, 2023
* upstream/main: (117 commits)
  FIX: missing channels/fiducials can be np.nan (mne-tools#11634)
  use py3.10 in precommit config (mne-tools#11648)
  MAINT: Unify GH Actions pytest (mne-tools#11644)
  MRG: Rename "Discourse" link in top navigation to "Forum" [ci skip] (mne-tools#11649)
  ENH: Add support for Harmonic Field correction (mne-tools#11536)
  Add pre-commit (mne-tools#11541)
  BUG: Fix bug with paths (mne-tools#11639)
  MAINT: Report download time and size (mne-tools#11635)
  MRG: Allow retrieval of channel names via make_1020_channel_selections() (mne-tools#11632)
  Fix index name in to_data_frame()'s docstring (mne-tools#11457)
  MAINT: Use VTK prerelease wheels in pre jobs (mne-tools#11629)
  ENH: Allow gradient compensated data in maxwell_filter (mne-tools#10554)
  make test compatible with future pandas (mne-tools#11625)
  Display SVG figures correctly in Report (mne-tools#11623)
  API: Port ieeg gui over to mne-gui-addons and add tfr gui example (mne-tools#11616)
  MAINT: Add token [ci skip] (mne-tools#11622)
  API: One cycle of backward compat (mne-tools#11621)
  MAINT: Use git rather than zipball (mne-tools#11620)
  ENH: Speed up code a bit (mne-tools#11614)
  [BUG, MRG] Don't modify info in place for transform points (mne-tools#11612)
  ...

2 participants