Skip to content

holoviz-dev/pyct

Repository files navigation

pyct

A utility package that includes:

  1. pyct.cmd: Makes various commands available to other packages. (Currently no sophisticated plugin system, just a try import/except in the other packages.) The same commands are available from within python. Can either add new subcommands to an existing argparse based command if the module has an existing command, or create the entire command if the module has no existing command. Currently, there are commands for copying examples and fetching data. See

  2. pyct.build: Provides various commands to help package building, primarily as a convenience for project maintainers.

pyct.cmd

To install pyct with the dependencies required for pyct.cmd: pip install pyct[cmd] or conda install -c pyviz pyct.

An example of how to use in a project: https://github.com/holoviz/geoviews/blob/main/geoviews/__main__.py

Once added, users can copy the examples of a package and download the required data with the examples command:

$ datashader examples --help
usage: datashader examples [-h] [--path PATH] [-v] [--force] [--use-test-data]

optional arguments:
  -h, --help       show this help message and exit
  --path PATH      location to place examples and data
  -v, --verbose
  --force          if PATH already exists, force overwrite existing examples
                   if older than source examples. ALSO force any existing data
                   files to be replaced
  --use-test-data  Use data's test files, if any, instead of fetching full
                   data. If test file not in '.data_stubs', fall back to
                   fetching full data.

To copy the examples of e.g. datashader but not download the data, there's a copy-examples command:

usage: datashader copy-examples [-h] [--path PATH] [-v] [--force]

optional arguments:
  -h, --help     show this help message and exit
  --path PATH    where to copy examples
  -v, --verbose
  --force        if PATH already exists, force overwrite existing files if
                 older than source files

And to download the data only, the fetch-data command:

usage: datashader fetch-data [-h] [--path PATH] [--datasets DATASETS] [-v]
                        [--force] [--use-test-data]

optional arguments:
  -h, --help           show this help message and exit
  --path PATH          where to put data
  --datasets DATASETS  *name* of datasets file; must exist either in path
                       specified by --path or in package/examples/
  -v, --verbose
  --force              Force any existing data files to be replaced
  --use-test-data      Use data's test files, if any, instead of fetching full
                       data. If test file not in '.data_stubs', fall back to
                       fetching full data.

Can specify different 'datasets' file:

$ cat earthsim-examples/test.yml
---

data:

  - url: http://s3.amazonaws.com/datashader-data/Chesapeake_and_Delaware_Bays.zip
    title: 'Depth data for the Chesapeake and Delaware Bay region of the USA'
    files:
      - Chesapeake_and_Delaware_Bays.3dm

$ earthsim fetch-data --path earthsim-examples --datasets-filename test.yml
Downloading data defined in /tmp/earthsim-examples/test.yml to /tmp/earthsim-examples/data
Skipping Depth data for the Chesapeake and Delaware Bay region of the USA

Can use smaller files instead of large ones by using the --use-test-data flag and placing a small file with the same name in examples/data/.data_stubs:

$ tree examples/data -a
examples/data
├── .data_stubs
│   └── nyc_taxi_wide.parq
└── diamonds.csv

$ cat examples/dataset.yml
data:

  - url: http://s3.amazonaws.com/datashader-data/nyc_taxi_wide.parq
    title: 'NYC Taxi Data'
    files:
      - nyc_taxi_wide.parq

  - url: http://s3.amazonaws.com/datashader-data/maccdc2012_graph.zip
    title: 'National CyberWatch Mid-Atlantic Collegiate Cyber Defense Competition'
    files:
      - maccdc2012_nodes.parq
      - maccdc2012_edges.parq
      - maccdc2012_full_nodes.parq
      - maccdc2012_full_edges.parq

$ pyviz fetch-data --path=examples --use-test-data
Fetching data defined in /tmp/pyviz/examples/datasets.yml and placing in /tmp/pyviz/examples/data
Copying test data file '/tmp/pyviz/examples/data/.data_stubs/nyc_taxi_wide.parq' to '/tmp/pyviz/examples/data/nyc_taxi_wide.parq'
No test file found for: /tmp/pyviz/examples/data/.data_stubs/maccdc2012_nodes.parq. Using regular file instead
Downloading National CyberWatch Mid-Atlantic Collegiate Cyber Defense Competition 1 of 1
[################################] 59/59 - 00:00:00

To clean up any potential test files masquerading as real data use clean-data:

usage: pyviz clean-data [-h] [--path PATH]

optional arguments:
  -h, --help   show this help message and exit
  --path PATH  where to clean data

pyct.build

Currently provides a way to package examples with a project, by copying an examples folder into the package directory whenever setup.py is run. The way this works is likely to change in the near future, but is provided here as the first step towards unifying/simplifying the maintenance of a number of pyviz projects.

pyct report

Provides a way to check the package versions in the current environment using:

  1. A console script (entry point): pyct report [packages], or
  2. A python function: import pyct; pyct.report(packages)

The python function can be particularly useful for e.g. jupyter notebook users, since it is the packages in the current kernel that we usually care about (not those in the environment from which jupyter notebook server/lab was launched).

Note that packages above can include the name of any Python package (returning the __version__), along with the special cases python or conda (returning the version of the command-line tool) or system (returning the OS version).