Hierarchical upload API optimized for folders & collections. #5220

Merged
merged 19 commits into from
Mar 9, 2018

Commits on Mar 8, 2018

  1. ebac0fd
  2. Refactor shed's CompressedFile abstraction into galaxy.util.compression_util.
    
    I need to use this from upload code.
    jmchilton committed Mar 8, 2018
    94cc4df
  3. 306dded
  4. a45cbfe
  5. Hierarchical upload API optimized for folders & collections.

    Allows describing hierarchical data in JSON or inferring structure from archives or directories.
    
    Datasets or archive sources can be specified via uploads, URLs, paths (if admin && allow_path_paste), library_import_dir/user_library_import_dir, and/or FTP imports. Unlike existing API endpoints, a mix of these on a per-file basis is allowed, and they work seamlessly between libraries and histories.
    
    Supported "archives" include gzip, zip, bagit directories, and bagit archives (with fetching and validation of downloads).
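
One way to picture "inferring structure from archives or directories" is the minimal sketch below, which groups relative file paths into a nested JSON description. The key names (`name`, `elements`, `src`, `path`) and the helper `infer_elements` are illustrative assumptions, not the request schema defined by this PR.

```python
import json


def infer_elements(paths):
    """Group relative file paths into a nested list of element dicts.

    Folders become {"name", "elements"} nodes and files become leaf
    nodes. All key names here are hypothetical, chosen for illustration.
    """
    root = {"name": "", "elements": []}
    for path in sorted(paths):
        parts = path.strip("/").split("/")
        node = root
        for folder in parts[:-1]:
            # Reuse an existing folder node at this level, or create one.
            child = next(
                (e for e in node["elements"]
                 if e["name"] == folder and "elements" in e),
                None,
            )
            if child is None:
                child = {"name": folder, "elements": []}
                node["elements"].append(child)
            node = child
        # Leaf node: a single dataset with its (hypothetical) source type.
        node["elements"].append({"name": parts[-1], "src": "path", "path": path})
    return root["elements"]


tree = infer_elements([
    "samples/a/r1.fastq",
    "samples/a/r2.fastq",
    "samples/b/r1.fastq",
    "readme.txt",
])
print(json.dumps(tree, indent=2))
```

Because each leaf carries its own source description, a hand-written description could mix URL, path, and FTP entries in the same tree, which is the per-file flexibility described above.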
    
    The existing upload API endpoint is quite rough to work with, both in terms of adding parameters (e.g. the file type and dbkey handling in 4563 was difficult to implement, terribly hacky, and should seemingly have been trivial) and in terms of building requests (one needs to build a tool form, not describe sensible inputs in JSON). This API is built to be intelligible from an API standpoint instead of being constrained to the older style tool form. Additionally, it is built with hierarchical data in mind, in a way that would not be at all easy to achieve by enhancing tool form components we don't even render.
    
    This implements 5159, though much simpler YAML descriptions of data libraries should be possible, essentially mirroring the API descriptions. We can replace the data library script in Ephemeris https://github.com/galaxyproject/ephemeris/blob/master/ephemeris/setup_data_libraries.py with one that converts a simple YAML file into an API call and allows many new options for free.
    
    In future PRs I'll add filtering options to this and it will serve as the backend to 4733.
    jmchilton committed Mar 8, 2018
    1c6cc02
  6. ce2170a
  7. Do not allow workflows to run tools that are not workflow-compatible.

    In the case of data-fetch, extra validation is performed, so this is somewhat important.
    jmchilton committed Mar 8, 2018
    b6f4bff
  8. Don't purge library path pastes and such in upload.py during testing.

    It is concerning that they will sometimes get deleted in production with default settings; see 5361.
    jmchilton committed Mar 8, 2018
    347f1e1
  9. 3bf5990
  10. 1720354
  11. 5c5dbd2
  12. 54ee573
  13. Precreate certain outputs for upload 2.0 API.

    This tries to improve the user experience of the rule-based uploader by placing HDAs and HDCAs in the history at the outset, so that the history panel can poll them and we can turn them red if the upload fails.
    
    From Marius' PR review:
    
    > I can see that a job launched in my logs, but it failed and there were no visual indications of this in the UI
    
    Not every HDA can be pre-created; for example, HDAs read from a zip file are still only discovered on the backend. Likewise, HDCAs that don't define a collection type up front cannot be pre-created (for instance, if the type is inferred from a folder structure). Library items aren't pre-created at all in this commit. There is room to pre-create more, but I think this is an atomic commit as it is now, and it will hopefully improve the user experience for the rule-based uploader considerably.
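
The pre-creation rules in this message can be summarized in a small decision function. This is only a sketch: the target dictionary and its keys (`destination_type`, `collection_type`, `elements_from`) are hypothetical stand-ins, not the structures used in the commit.

```python
def can_precreate(target):
    """Return True if an output can be placed in the history up front.

    `target` is a hypothetical dict describing one upload target.
    """
    dest = target.get("destination_type")
    if dest == "library":
        # Library items are not pre-created at all in this commit.
        return False
    if dest == "hdca":
        # A collection needs its type known before the job runs; a type
        # inferred later from a folder structure rules out pre-creation.
        return target.get("collection_type") is not None
    if dest == "hdas":
        # Datasets discovered inside an archive only exist on the backend.
        return target.get("elements_from") is None
    return False


examples = [
    {"destination_type": "hdca", "collection_type": "list"},
    {"destination_type": "hdca"},
    {"destination_type": "hdas", "elements_from": "archive"},
    {"destination_type": "library"},
]
decisions = [can_precreate(t) for t in examples]
print(decisions)
```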
    jmchilton committed Mar 8, 2018
    60f632b
  14. d783fc3
  15. Cleanup hierarchical upload commit based on PR comments from @bgruening.

    - Remove seemingly unneeded hack in upload_common.
    - Remove stray debug statement.
    - Add more comments in the output collection code related to different destination types.
    - Restructure if/else in data_fetch to avoid assertion with constant.
    jmchilton committed Mar 8, 2018
    3314183
  16. ca28000
  17. Consistent sniffing regardless of in_place.

    Previously sniffing would happen on the original file (before carriage returns and tabular spaces were converted) if in_place was false and on the converted file if it was true.
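
The intended behavior can be sketched as follows: convert first, then always sniff the converted file, whether the conversion overwrote the original or wrote a copy. The helpers `convert`, `sniff`, and `upload` are simplified, hypothetical stand-ins and not Galaxy's actual sniffing code; the toy sniffer only distinguishes "tabular" from "txt".

```python
import os
import re
import tempfile


def convert(path, in_place):
    """Normalize carriage returns and turn whitespace runs into tabs."""
    with open(path, newline="") as fh:
        text = fh.read()
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = "\n".join(re.sub(r"[ \t]+", "\t", line) for line in text.split("\n"))
    if in_place:
        target = path
    else:
        # Leave the original untouched and write a converted copy.
        fd, target = tempfile.mkstemp()
        os.close(fd)
    with open(target, "w", newline="") as fh:
        fh.write(text)
    return target


def sniff(path):
    """Toy sniffer: 'tabular' if every non-empty line has the same tab count."""
    with open(path, newline="") as fh:
        lines = [line for line in fh.read().split("\n") if line]
    counts = {line.count("\t") for line in lines}
    return "tabular" if lines and len(counts) == 1 and counts != {0} else "txt"


def upload(path, in_place):
    converted = convert(path, in_place)
    # Sniff the converted file in both modes, never the original.
    return sniff(converted), converted


def demo():
    results = {}
    for in_place in (True, False):
        fd, path = tempfile.mkstemp()
        with os.fdopen(fd, "w", newline="") as fh:
            fh.write("a b\r\nc d\r\n")
        ext, converted = upload(path, in_place)
        results[in_place] = ext
        if converted != path:
            os.remove(converted)
        os.remove(path)
    return results


results = demo()
print(results)
```

Sniffing the unconverted input in this toy setup would report "txt", since the original lines contain spaces rather than tabs, which is the kind of inconsistency the commit removes.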
    jmchilton committed Mar 8, 2018
    495d125
  18. d651348
  19. bca2c3c