Data access api README/Docs update #744

Open
wants to merge 44 commits into
base: dev
f9ef107
Doc changes
anamanica Jul 26, 2024
72f1924
Merge branch 'dev' of github.com:IMAP-Science-Operations-Center/imap_…
anamanica Jul 26, 2024
d415846
Updating
anamanica Jul 26, 2024
26135b0
Fixing
anamanica Jul 26, 2024
4376bd7
Fixing look
anamanica Jul 26, 2024
9e06ca6
Playing with format
anamanica Jul 26, 2024
5b9f4c6
Fixing again
anamanica Jul 26, 2024
1deb7aa
Fixing again
anamanica Jul 26, 2024
713f0dd
Fixing again
anamanica Jul 26, 2024
8f3218a
Fixing again
anamanica Jul 26, 2024
923d74b
Fixing again
anamanica Jul 26, 2024
41952cd
Fixing again
anamanica Jul 26, 2024
7c70cea
Updating Branch
anamanica Jul 30, 2024
ecf5ca8
Small change
anamanica Jul 30, 2024
73c5dce
More formatting
anamanica Jul 30, 2024
3121b00
Updating branch
anamanica Aug 7, 2024
e9e3ced
PR updates
anamanica Aug 7, 2024
8d2a93a
I changed my mind on format
anamanica Aug 7, 2024
3211b72
Inital Changes
anamanica Aug 7, 2024
dc3f937
Some small changes
anamanica Aug 7, 2024
d46c20a
Little changes
anamanica Aug 7, 2024
37a9685
Fixing Merge
anamanica Aug 7, 2024
304e779
Fixing docs error
anamanica Aug 7, 2024
ac3274a
Updating Branch
anamanica Aug 8, 2024
047ad5c
Updating Branch
anamanica Aug 9, 2024
55a4681
Updating Branch.
anamanica Aug 9, 2024
2af4df7
Order changes
anamanica Aug 12, 2024
a4e15e0
Order changes
anamanica Aug 12, 2024
8952a5a
Pre-commit
anamanica Aug 12, 2024
1182258
Small changes
anamanica Aug 14, 2024
c9cf41b
Updating Branch.
anamanica Aug 15, 2024
510f551
More PR comments
anamanica Aug 15, 2024
e0858a5
Adding descriptions
anamanica Aug 16, 2024
1b76a68
Hopefully fixing my mistake
anamanica Aug 16, 2024
56e626f
More small changes
anamanica Aug 16, 2024
22fc75a
Updating branch.
anamanica Aug 29, 2024
8598f0b
Testing ref links
anamanica Aug 29, 2024
93295a6
Testing ref links AGAIN
anamanica Aug 29, 2024
b1f13ec
More ref testing
anamanica Aug 29, 2024
33be899
Checking formatting
anamanica Aug 29, 2024
67600f1
Small changes
anamanica Aug 29, 2024
51bba5b
Small change
anamanica Aug 29, 2024
a42e4f9
Final comment changes
anamanica Aug 29, 2024
9e0bba6
Final touches
anamanica Aug 29, 2024
226 changes: 204 additions & 22 deletions docs/source/data-access-api/index.rst

Since this document is quite lengthy now and has several sections and subsections, I suggest adding a table of contents somewhere towards the top. (Could be done in a future PR if desired.)


Never mind, Ana pointed out to me that there is a built-in table of contents on ReadTheDocs, which I totally missed!

@@ -4,33 +4,224 @@ Data Access API
===============

The `imap-data-access <https://github.com/IMAP-Science-Operations-Center/imap-data-access>`_
repository provides programmatic access and a command-line utility for
interacting with the API. It is the preferred way to use the API.
repository provides a command-line utility and Python package for
interacting with the API programmatically. It is the preferred way to use the API.

The SDC provides a REST API that allows users to upload and download files, as
well as query for file metadata. The following documentation describes the
various endpoints that are supported and how to use them.
Users may also download, upload, and query via the REST API directly through the browser or with ``curl`` commands.
The `REST API Specification`_ section describes the supported endpoints and how to use them.

*Note: Several sections and links begin with* [WIP]. *As development on the API is ongoing, this indicates
that the full implementation of the functionality is yet to be completed.*

The API can be accessed from the following URL [WIP]: https://api.dev.imap-mission.com

.. openapi:: openapi.yml
:group:
:include: /upload
Command Line Utility
--------------------
To Install
^^^^^^^^^^

Run the following command to install the API CLI:

.. code-block:: bash

    pip install imap-data-access

Base Command Arguments
^^^^^^^^^^^^^^^^^^^^^^

The following are base command arguments for the CLI:

.. code-block:: bash

    imap-data-access -h # or
    imap-data-access query # or
    imap-data-access download # or
    imap-data-access upload

Add the ``-h`` flag to any base command for more information on its use and functionality.

Query
^^^^^

When uploading files to the API, ensure these files are stored properly in a ``data`` directory. Then,
ensure your working directory is one level above the ``data`` directory in order to properly upload files.
To query for files, you can use several parameters: ``--instrument``, ``--data-level``, ``--descriptor``, etc.

Further information is available in the ``query -h`` menu. You can use parameters alone or in combination.

**Example Usage:**

.. code-block:: bash

    imap-data-access query --start-date 20240101 --end-date 20241231 --output-format json
    # The following line is returned:
    [{'file_path': 'imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_v001.pkts', 'instrument': 'swe',
    'data_level': 'l0', 'descriptor': 'sci', 'start_date': '20240105', 'version': 'v001', 'extension': 'pkts'},
    {'file_path': 'imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_v001.pkts', 'instrument': 'swe',
    'data_level': 'l0', 'descriptor': 'sci', 'start_date': '20240105', 'version': 'v001', 'extension': 'pkts'}]

Download
^^^^^^^^

To download files using the CLI tool, use the command ``download``. The downloaded files will be placed in a ``data`` directory.

Note that your current working directory is used as the default location: the ``data``
directory is created inside it automatically. Choose your working directory
accordingly.

When downloading a file from the API, different folders within the ``data`` directory will be made to better
organize the downloaded files. See the example path: ``data/imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_20240105_v00-01.pkts``.
The ``data`` directory and its structure are described further in the `Data Directory`_ section.

**Example Usage:**

.. code-block:: bash

    imap-data-access download imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_v001.pkts


Upload
^^^^^^

Similarly, files can be uploaded to the API using the command ``upload``.

When uploading files to the API, ensure these files are stored properly in a ``data`` directory (see the `Data Directory`_ section below for more information). Then,
ensure your working directory is one level above ``data`` in order to properly upload files.

[WIP] Certain ancillary files can also be uploaded to the API. For more specific information regarding these files, visit
`Ancillary Files <https://imap-processing.readthedocs.io/en/latest/data-access-api/calibration-files.html>`_

**Example Usage:**

.. code-block:: bash

    imap-data-access upload /imap/swe/l1a/2024/01/imap_swe_l1a_sci_20240105_v001.cdf

Importing as a package
----------------------
``imap-data-access`` can also be imported and used as a Python package.

**Example Usage:**

.. code-block:: python

    import imap_data_access

    # Search for files
    results = imap_data_access.query(instrument="mag", data_level="l0")
    # results is a list of dictionaries, for example:
    # [{'file_path': 'imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_v001.pkts', 'instrument': 'swe',
    #   'data_level': 'l0', 'descriptor': 'sci', 'start_date': '20240105', 'version': 'v001', 'extension': 'pkts'},
    #  {'file_path': 'imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_v001.pkts', 'instrument': 'swe',
    #   'data_level': 'l0', 'descriptor': 'sci', 'start_date': '20240105', 'version': 'v001', 'extension': 'pkts'}]

    # Download a file that was returned from the search
    imap_data_access.download("imap/mag/l0/2024/01/imap_mag_l0_raw_20240101_v001.pkts")

    # Upload a file that exists locally
    imap_data_access.upload("imap/swe/l1a/2024/01/imap_swe_l1a_sci_20240105_v001.cdf")

Configuration
--------------

.. _data-directory:

Data Directory
^^^^^^^^^^^^^^

curl -X GET -H "Accept: application/json" https://api.dev.imap-mission.com/upload/imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_20240105_v00-01.pkts
The folder structure for data files within the IMAP SDC is rigidly defined, so the data access API mimics that structure to ensure all data is stored in the same hierarchy as the SDC. This enables a seamless transition between a user's local system and the SDC. This is only used for downloads.

A user's root data location can be specified with the environment variable ``IMAP_DATA_DIR`` or through the configuration dictionary within the package itself (``imap_data_access.config["DATA_DIR"]``). If ``IMAP_DATA_DIR`` is not set, the program defaults to ``data/`` inside the user's current working directory.

The IMAP SDC uses the following directory structure:

.. code-block:: bash

    <IMAP_DATA_DIR>/
      imap/
        <instrument>/
          <data_level>/
            <year>/
              <month>/
                <filename>

For example, with ``IMAP_DATA_DIR=/data``:

.. code-block:: bash

    /data/
      imap/
        swe/
          l0/
            2024/
              01/
                imap_swe_l0_sci_20240105_v001.pkts
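
The mapping from a filename to its place in this tree can be sketched in plain Python. This is only an illustration of the layout above, not the package's own code: ``local_path_for`` is a hypothetical helper, and it assumes filenames of the form ``imap_<instrument>_<data_level>_<descriptor>_<YYYYMMDD>_<version>.<extension>``, as in the example. Filenames with a different number of underscore-separated fields raise ``ValueError``.

```python
import os
from pathlib import Path


def local_path_for(filename, data_dir=None):
    """Return where `filename` would sit under the local data directory."""
    # Root directory: explicit argument, then IMAP_DATA_DIR, then <cwd>/data.
    if data_dir is None:
        data_dir = os.environ.get("IMAP_DATA_DIR", Path.cwd() / "data")
    root = Path(data_dir)

    # Assumed layout: imap_<instrument>_<data_level>_<descriptor>_<YYYYMMDD>_<version>.<ext>
    _mission, instrument, data_level, _descriptor, start_date, _version = filename.split("_")
    year, month = start_date[:4], start_date[4:6]
    return root / "imap" / instrument / data_level / year / month / filename


print(local_path_for("imap_swe_l0_sci_20240105_v001.pkts", data_dir="/data"))
```

Passing ``data_dir`` explicitly mirrors setting ``IMAP_DATA_DIR``; leaving it out falls back to ``data/`` under the current working directory, as described above.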

Data Access URL
^^^^^^^^^^^^^^^

To change the default URL that the package accesses, set the environment variable ``IMAP_DATA_ACCESS_URL`` or, within the package, ``imap_data_access.config["DATA_ACCESS_URL"]``. The default is the development server (``https://api.dev.imap-mission.com``).

File Validation
---------------

This package validates filenames and paths to check they follow our standards, as defined by the `filename conventions <https://imap-processing.readthedocs.io/en/latest/development-guide/style-guide/naming-conventions.html>`_. There is also a class available for use by other packages to create filepaths and filenames that follow the IMAP SDC conventions.
To use this class, use ``imap_data_access.ScienceFilePath``.

Usage:

.. code-block:: python

    import imap_data_access

    science_file = imap_data_access.ScienceFilePath("imap_swe_l0_sci_20240101_v001.pkts")

    # Filepath = /imap/swe/l0/2024/01/imap_swe_l0_sci_20240101_v001.pkts
    filepath = science_file.construct_path()
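
To give a rough idea of what filename validation involves, here is a standalone sketch using a regular expression. It is not the logic used by ``ScienceFilePath``: the pattern is an assumption inferred from the example filenames on this page, ``parse_filename`` is a hypothetical name, and the extension check follows the rule quoted in the upload error responses (``pkts`` for ``l0``, ``cdf`` for higher levels).

```python
import re

# Assumed pattern, inferred from the example filenames on this page:
# imap_<instrument>_<data_level>_<descriptor>_<YYYYMMDD>_v<NNN>.<extension>
FILENAME_PATTERN = re.compile(
    r"^imap_(?P<instrument>[a-z0-9-]+)_(?P<data_level>l[0-9][a-z]?)_"
    r"(?P<descriptor>[a-z0-9-]+)_(?P<start_date>\d{8})_"
    r"(?P<version>v\d{3})\.(?P<extension>pkts|cdf)$"
)


def parse_filename(filename):
    """Split a filename into its fields, or raise ValueError if it does not match."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Filename does not follow the assumed convention: {filename}")
    fields = match.groupdict()
    # Extension rule: pkts for data level l0, cdf for anything above l0.
    expected = "pkts" if fields["data_level"] == "l0" else "cdf"
    if fields["extension"] != expected:
        raise ValueError(f"Expected .{expected} for data level {fields['data_level']}")
    return fields


print(parse_filename("imap_swe_l0_sci_20240105_v001.pkts"))
```

For real use, prefer ``imap_data_access.ScienceFilePath``, which follows the official naming conventions linked above.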

Troubleshooting
---------------
Network Issues
^^^^^^^^^^^^^^

**SSL**

You may encounter SSL errors similar to the following:

.. code-block:: bash

    urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>

This generally means the Python environment you're using is not finding your system's root certificates. Tell Python how to find those certificates using one of the following potential solutions.

#. Upgrade the ``certifi`` package:

   .. code-block:: bash

       pip install --upgrade certifi

#. Install system certificates. Depending on the Python version you installed the program with, the command will look something like this:

   .. code-block:: bash

       /Applications/Python\ 3.10/Install\ Certificates.command

**HTTP Error 502: Bad Gateway**

This could mean that the service is temporarily down. If you continue to encounter this, reach out to the IMAP SDC at imap-sdc@lasp.colorado.edu.

FileNotFoundError
^^^^^^^^^^^^^^^^^

This could mean that the local data directory is not set up with the same paths as the SDC. See the `Data Directory`_ section for an example of how to set this up.

.. _rest-api-specification:

REST API Specification
----------------------
.. openapi:: openapi.yml
    :group:
    :include: /upload

**Example Usage:**

.. code-block:: bash

    curl -X GET -H "Accept: application/json" https://api.dev.imap-mission.com/upload/imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_20240105_v00-01.pkts

**Possible Responses:**

@@ -47,18 +238,10 @@ ensure your working directory is one level above the ``data`` directory in order
{"statusCode": 400, "body": "Invalid extension. Extension should be pkts for data level l0 and cdf for data level higher than l0"}
{"statusCode": 409, "body": "https://sds-data-<aws_account_number>.s3.amazon.com/imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_20240105_v00-01.pkts already exists."}


.. openapi:: openapi.yml
    :group:
    :include: /download

It is important to note that your working directory will be established as the default directory. I.e, the ``data``
directory--which files are downloaded to--will automatically be placed in this file path. Choose your working directory
accordingly to suit your desires.

When downloading a file from the API, different folders within the ``data`` directory will be made to better
organize the files. See the example file path: ``data/imap/swe/l0/2024/01/imap_swe_l0_sci_20240105_20240105_v00-01.pkts``

**Example Usage:**

.. code-block:: bash
@@ -73,7 +256,6 @@ organize the files. See the example file path: ``data/imap/swe/l0/2024/01/imap_s
{"statusCode": 400, "body": "No file requested for download. Please provide a filename in the path. Eg. /download/path/to/file/filename.pkts"}
{"statusCode": 404, "body": "File not found, make sure you include the full path to the file in the request, e.g. /download/path/to/file/filename.pkts"}


.. openapi:: openapi.yml
    :group:
    :include: /query
@@ -92,7 +274,7 @@ organize the files. See the example file path: ``data/imap/swe/l0/2024/01/imap_s
{"statusCode": 400, "headers": {"Content-Type": "application/json", "Access-Control-Allow-Origin": "*"}, "body": "<param> is not a valid query parameter. Valid query parameters are: ['file_path', 'instrument', 'data_level', 'descriptor', 'start_date', 'end_date', 'version', 'extension']"}
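
As a hedged illustration of the parameter check in the error message above, the sketch below builds a ``/query`` URL and rejects unknown parameter names before any request is sent. It assumes the endpoint accepts these names as URL query-string parameters on the development server; ``build_query_url`` is a hypothetical helper, not part of ``imap-data-access``.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.dev.imap-mission.com"

# Parameter names copied from the 400 response shown above.
VALID_QUERY_PARAMS = (
    "file_path", "instrument", "data_level", "descriptor",
    "start_date", "end_date", "version", "extension",
)


def build_query_url(base_url=BASE_URL, **params):
    """Return a /query URL, raising ValueError for unknown parameter names."""
    unknown = sorted(set(params) - set(VALID_QUERY_PARAMS))
    if unknown:
        raise ValueError(f"Not valid query parameters: {unknown}")
    # urlencode keeps the keyword order and escapes the values.
    return f"{base_url}/query?{urlencode(params)}"


print(build_query_url(instrument="swe", data_level="l0"))
```

The resulting URL can be opened in a browser or passed to ``curl``; the package's ``query`` function remains the preferred route.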

Other pages
===========
-----------

.. toctree::
:maxdepth: 1