telecon notes
https://github.com/hapi-server/tasks
- discuss SuperMAG status
- add recent presentations to GitHub repo
- News:
- summary of PyHC meeting: new levels coming for PyHC projects: bronze, silver, gold, each with increasing requirements for things like testing, documentation, interoperability, etc.
- there also was an Open Science Data Systems Workshop; summary: there will be some papers on best practices, common terminology, and examples of science data systems; Earth science has hySDS, an open-source and highly scalable system for managing the largest Earth science missions (75 TB/day plus surge capacity)
- status of FAIR pull request:
- https://github.com/hapi-server/data-specification/pull/224
- we had a large discussion on provenance
- summary: just go with a very simple free-text `provenance` attribute at the dataset level for now; need to talk to use-case owners (Baptiste) about anything more detailed; the linkages endpoint could support provenance info based on the time range of the request - this is linked to the addition of a files endpoint, which could support fine-grained provenance for file-based data
- https://github.com/hapi-server/data-specification/issues/218
- Wed. 1pm meeting to close out the FAIR pull request (by adding the agreed-upon `provenance` attribute and cleaning up and finalizing FAIR language)
- during AGU, we'd like to have a 6-hour block of time for an in-person HAPI meeting, probably Mon or Wed
- Discuss http://radiojove.net/archive.html
- no meeting next week - because of holiday for some and PyHC meeting
- proposed spec changes to support FAIR principles are ready for some review:
- `license` keyword will be a short phrase from SPDX.org
- `provenance` info – need something simple for now; will eventually link to file listings
- upcoming talks:
- HAPI at PyHC next week: I’d like comments on my slides (unbelievably, a draft is ready now for quick review)
- HAPI at the Open Science Data System Workshop (Nov 14-15); ideas welcome
- planning for the future – we need to prioritize our current efforts and have some more intensive meeting episodes
- work on caching is ramping up ahead of AGU
- web page work (by Bob's student): start with current content and revamp CSS; presentation list to be auto-linked to github repo listing presentations
Here’s a list of some things we have going right now that need to be organized / prioritized:
- central location to find known HAPI servers and datasets
- adding of JSON-LD to central landing page of known HAPI datasets
- centrally track any outages of known HAPI servers
- adding ability to link HAPI datasets together – especially for different cadences
- HAPI caching mechanism
- HAPI amalgamation / local data serving tool
- generic plotting client via adapting of KNMI tool (Eelco Doornbos)
- pursuing standard designation via IHDEA / PyHC
- mapping of HAPI metadata to SPASE
- standard templates for specific kinds of HAPI output
- file listings
- event lists
- data availability info
- pursuing many other data centers for addition of HAPI:
- U Calgary site with lots of ground-based data
- Madrigal – the 600 lb gorilla of the CEDAR community (NSF)
- ESAC is moving to a centralized Heliophysics data system for all its missions that will include HAPI
- integration within HelioCloud – using HAPI for cloud-to-cloud data transfers
Discuss https://docs.google.com/document/d/1j8lMkvwFuJI8pdFK5K42XQL0Ua21aO3UkH8JtZSGqdQ/edit?tab=t.0
Action items:
- Jon and Bob to meet Tue 4pm for finishing FAIR updates to spec
- HAPI Browse Tool needs official name
- Bob to add status info to HAPI Browse Tool page (info for each server)
- DASH outcomes and upcoming PyHC meeting on standards
- Bob: focus on search could be done by making JSON-LD for every HAPI dataset; the focus here is creating rudimentary search mechanism based on the 10,000 HAPI records that now exist; moving to JSON-LD too, so that search engines (Google, etc) can find it; SPASE allows faceted search, which is hard for general Heliophysics - specific communities often have their own search GUIs
- Bob's student has a new web site for the HAPI home page - info density is a little low for people's liking right now; Bob will work with him on this; Bobby is using the "astro" framework for new static page generation
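The JSON-LD idea above can be sketched concretely. A minimal, hypothetical schema.org Dataset record for one HAPI dataset might look like the following; the dataset id, URLs, and endpoint layout are illustrative placeholders, not real records or a settled design:

```python
# Hypothetical schema.org JSON-LD for a single HAPI dataset, so that search
# engines (Google Dataset Search, etc.) can index it. Dataset id, name, and
# server URL are illustrative placeholders.
import json

def dataset_jsonld(dataset_id, name, server_url):
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "identifier": dataset_id,
        "distribution": {
            "@type": "DataDownload",
            "encodingFormat": "text/csv",
            # a HAPI-style data endpoint for this dataset (placeholder layout)
            "contentUrl": server_url + "/data?dataset=" + dataset_id,
        },
    }

record = dataset_jsonld("AC_H2_MFI", "Example 1-hour magnetic field data",
                        "https://example.org/hapi")
print(json.dumps(record, indent=2))
```

One record like this per known HAPI dataset, emitted on a central landing page, would be enough for rudimentary search-engine discovery.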
Topics:
- addressing HAPI uptime - Bob has a HAPI server that captures uptime; possibly write client code to leverage this info so other client writers can use it easily (and report server down time as needed)
- Potential other HAPI servers: Joey at SwRI for TRACERS (would be a C/C++ based implementation)
- SOAR server at ESAC - this will become the main server for Heliophysics data at ESAC
- Sandy: HAPI would benefit from more outreach to scientists
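The uptime-tracking idea above could be sketched as a small client helper. The status check follows the HAPI convention that a successful response carries status code 1200 ("OK"); the base URL and function names here are our own placeholders, not an existing client API:

```python
# Sketch of client code for tracking server uptime: request /capabilities and
# treat HAPI status code 1200 ("OK") as up. The base URL is a placeholder and
# the function names are hypothetical, not part of an existing client.
import json
from urllib.request import urlopen

def is_server_up(response_json):
    """HAPI responses report success with status code 1200."""
    return response_json.get("status", {}).get("code") == 1200

def check_server(base_url, timeout=10):
    """Fetch <base_url>/capabilities and report whether the server answered OK."""
    try:
        with urlopen(base_url.rstrip("/") + "/capabilities", timeout=timeout) as resp:
            return is_server_up(json.load(resp))
    except OSError:
        return False  # connection failures and HTTP errors count as downtime
```

Other client writers could reuse the status check against any endpoint response and report downtime as needed.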
Actions:
- Jon and Sandy meet to talk about outreach to scientists; Eric: replicate a paper - use a specific MMS one that SPEDAS has done before; see webinars for MMS examples for SPEDAS - how to load and plot data from a specific study; https://www.science.org/doi/10.1126/science.aaf2939 lots of examples are created for or signed off by instrument teams; use data from specific events on Oct 6, 2015; try these: FPI and FGM and plasma beta calculations (that use FPI and FGM)
- Jon and Bob - prep for next week - finalize the FAIR pull request; go over recent surge-work on linkages
Agenda: Highlights from IHDEA:
- potential push to add HAPI as an officially recommended standard by IHDEA, which would first require IHDEA to create a standards promotion process, and HAPI could be the first project to go through this
- new HAPI servers: Ralf Kiel has one in Germany for internal use; SciQLop (via SpEasy tool by Alexi J.; SpEasy will become a HAPI client first, then get used in a server to expose data that is not otherwise available via HAPI)
- consider supporting Japanese effort by making a pass-through HAPI server over this data:
- At IHDEA, Emoto Masahiko presented on Data Repository of Aurora Imaging Spectroscopy (DRAIS) and has an Open Data Server with an API for access to their time series data: HySCAI data from Japanese spectral observations; see https://projects.nifs.ac.jp/aurora/en/repository.html (can't get this site to respond)
We need to start dealing with the fact that we keep on saying "Search and Discovery is for a different project" but such a project has not emerged. (Prompted by Speasy discussion).
Follow up with Ryan
Agenda:
- Review of the FAIR updates from last week (I updated that branch with my version of Rebecca’s comments)
- A review of some things from the HAPI focus week (which was 2 weeks ago):
- an idea for linkages using a separate endpoint
- there is a need to capture a schema (even an informal one) for file listings, event lists, and availability info
- A preview of DASH and IHDEA talks about HAPI
- Allow unitsSchema to apply to only certain parameters? Will know more if needed after CDAWeb units discussion.
- Discuss https://iswat-cospar.org/clusters_teams
- 2025 Space Weather Workshop March 17-21, 2025 Boulder, CO - Save The Date!
- Discuss Julie's/Alex's email about tutorials
Notes from today:
- Some links from the Zoom chat:
- From Bernie: https://citeas.org/api
- From Jon: HAPI spec mods for FAIR: https://github.com/hapi-server/data-specification/blob/217-needed-elements-for-fair/hapi-dev/HAPI-data-access-spec-dev.md#86-fair
- From Rebecca: https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/subject/#subject
- From Bobby: https://github.com/IHDE-Alliance/ISTP_metadata/tree/main/v1.0.0
- From Rebecca: example of qualified ref: https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/relatedidentifier/#b-relationtype
- Jon's notes about Rebecca's answers to our FAIR questions (these have also been injected into section 8.6 FAIR on Bob's branch and the PR for the relevant issue):
Notes about FAIR and HAPI:
on this branch:
https://github.com/hapi-server/data-specification/blob/217-needed-elements-for-fair/hapi-dev/HAPI-data-access-spec-dev.md
add 8.6 FAIR to the TOC
list of known hapi servers could be mentioned in 8.6
CDAWeb references several thousand datasets
more description in point 4 under Accessible
(HAPI is a service)
HPDE.io has SPASE metadata registries for Helio data, for example.
Interoperable
mention that the language is JSON
indicate where JSON can be found:
1. info
2. catalog?include=all
3. or with the data
links to the document!
and mention that we have a schema
Interoperable:
fix spelling of "acessiable" (should be "accessible")
Rebecca: what is definition of 'shared' and 'broadly applicable'?
ans: a lot of people use this!
2. (Meta)data use vocabularies that follow FAIR principles
Rebecca: If we wanted to use vocabularies, what would it look like?
To use FAIR vocab, you need to use something like:
https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/subject/#subject
note: this is mostly for the values of the attributes
subjectScheme: name of vocab
schemeURI:
valueURI: MMS
classificationCode:
This is mostly for being able to link keywords about a dataset (not datasets names) so that datasets can be linked more easily.
Since it is focused on discovery, it's not as relevant for HAPI as a service focused on access.
FAIR vocabulary is more about search-related terms
really it's a scope decision - does HAPI really need to use this?
best example of how to use this is how the Subject
HAPI uses some controlled vocab, but not really in the formal way required by DataCite:
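As a rough sketch of the DataCite subject mechanism discussed above; the field names follow the DataCite 4.5 schema, while all the values here are made-up placeholders:

```python
# Illustrative DataCite-style subject entry with the fields from the notes.
# Field names follow the DataCite 4.5 schema; all values are placeholders.
subject = {
    "subject": "magnetospheric physics",
    "subjectScheme": "example heliophysics vocabulary",  # name of the vocab
    "schemeURI": "https://example.org/vocab",            # where the vocab lives
    "valueURI": "https://example.org/vocab/MMS",         # IRI for this term
    "classificationCode": "EX-001",                      # optional code within the scheme
}
```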
(Meta)data include qualified references to other (meta)data
Other metadata can be referenced using additionalMetadata. For units, an external schema can be referenced using unitsSchema. Rebecca: what does "qualified" mean?
Ans:
from DataCite:
qualified ref. means that you choose the relationship type:
this id is related to this other id by a defined relationship
https://datacite-metadata-schema.readthedocs.io/en/4.5/properties/relatedidentifier/#b-relationtype
Provenance block could mimic the "related dataset" relationship mechanism
lots of provenance is taken care of by this
the list of allowed relationships can constrain the metadata
just use "derivedFrom" (other files, for example)
versus including cameFrom (..mission..) which would be harder to implement
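A qualified reference of the kind described above might be sketched as follows; "IsDerivedFrom" is one of DataCite's defined relationTypes, while the identifier itself is a placeholder:

```python
# Sketch of a DataCite qualified reference: an identifier plus a declared
# relationship. "IsDerivedFrom" is a defined DataCite relationType; the DOI
# below is a placeholder, not a real dataset.
related_identifier = {
    "relatedIdentifier": "10.1234/example-source-dataset",  # placeholder DOI
    "relatedIdentifierType": "DOI",
    "relationType": "IsDerivedFrom",  # this dataset was derived from that one
}
```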
for #4:
(Meta)data meet domain-relevant community standards
HAPI is a community standard.
Rebecca: yes, but what other standards are you using, leveraging
Bob: can we map this to something like OpenAPI mechanisms
Bob: there is not another standard for timeseries in JSON,
Point to specific pieces: it's a REST API; it uses JSON with schemas; we use HTTP; we use ISO 8601; we follow what the community was already using for data server access
We do these things to support FAIR, and here is where our scope ends. We are a data access service, and most of the FAIR aspects that we don't cover are covered by the formal repositories.
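The specific pieces listed above (REST, JSON, HTTP, ISO 8601) show up directly in a HAPI request. A minimal sketch of building a data-request URL, assuming a placeholder server and dataset id (parameter names follow HAPI 3):

```python
# Building a HAPI data request from the pieces named above: an HTTP GET with
# ISO 8601 start/stop times and a CSV- or JSON-formatted response. The server
# and dataset id are placeholders; parameter names follow the HAPI 3 style.
from urllib.parse import urlencode

def data_url(server, dataset, start, stop, fmt="csv"):
    query = urlencode({
        "dataset": dataset,
        "start": start,   # ISO 8601, e.g. 2015-10-06T00:00:00Z
        "stop": stop,
        "format": fmt,
    })
    return server + "/data?" + query

url = data_url("https://example.org/hapi", "AC_H2_MFI",
               "2015-10-06T00:00:00Z", "2015-10-07T00:00:00Z")
```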
working meetings this week:
- Mon 9am to 11am (Jon, Sandy, Jeremy, Bob)
- Tues 9am to noon (Jon, Jeremy, Bob)
- Wed 9am to noon (Jon, Bob, not much Jeremy)
- Thu 9am to 11am - Jon and Jeremy
- Fri 9am to noon? wait and see if needed
Topics for this week:
- settle presentations for DASH and IHDEA
- finish off the section on FAIR and how HAPI metadata is related
- CDAWeb server (Jeremy and Bob)
- SuperMAG server (Sandy and Jon): https://github.com/hapi-server/tasks/issues/9 and https://github.com/hapi-server/tasks/issues/20
- KNMI timeline viewer was already doing SuperMAG as a pass-through
- dev ops and releases (Jon): https://github.com/hapi-server/tasks/issues/22 also issue 3 and also 4
- urgent issues: point releases; patches; Zenodo IDs; larger issue is how to manage releases
- closure on servers.all (Bob and Jeremy): https://github.com/hapi-server/tasks/issues/11
- recommendations / clarification on how to deal with: cadences, file listings, availability in the spec for now (without solving the full, general linkages problem)
- for consideration (but risky as quagmire...):
- linkages between dataset related by cadence; how the linkages are expressed will affect how to communicate that there are other types of things available for a dataset, such as images, file listings, data availability, semantic types of science data, etc: https://github.com/hapi-server/tasks/issues/19
For HAPI meeting at noon:
- can SPDF or HDRL help manage one-off (pass-through) HAPI servers?
- HDRL might be willing to float specific, small requests for service sustainment (of HAPI servers)
- justifications would be needed: how much usage? what are costs, and why is it running? NASA dataset or personal? (NASA gets priority) Is it hosted some other way (if it is nowhere else right now, HDRL more interested)? How often used by community (metrics tbd)?
- heliophysics.net or helioanalytics.io could possibly host this
- 100,000 requests in a month with 50GB downloaded is $5/month, and this is a very reasonable level
- (it's better for the archives to pull in services so that they become responsible for them working, especially when breaking changes get made - the other services need to be part of the test process after changes!)
- Rebecca again visits at noon to talk about FAIR w.r.t HAPI
- Jon gives updates on completed tasks
- Bob summarize his meeting with Rebecca and https://github.com/hapi-server/data-specification/pull/224
- Rebecca will join meeting at 12:30 pm to answer questions
- Sandy update on SuperMAG and also using HelioCloud for DataShop, etc.
Notes: Rebecca joined at 12:30pm; lots of discussion about details of FAIR as it applies to services versus data. There are at least two kinds of identifiers, for example: a persistent one, like a DOI, and HAPI should not use those as dataset identifiers. A persistent identifier should instead be provided in the optional `resourceID` field in the `info` response. Some FAIR principles apply to just this identifier (i.e., there should be one). Other principles related to data identifiers apply across both the persistent id and the local HAPI server id for the dataset.
For provenance, Rebecca suggested just going with something simple: Dataset Version and also HAPI version.
The approach we are taking for a HAPI server to be FAIR:
- make a few changes to the spec to allow for FAIR items not currently present; main ones seem to be: add `dataLicense`, add `provenanceDetails`
- for HAPI to be FAIR, the underlying data must also already be FAIR, and HAPI can't really make up the difference if this is not true
- Bob is nearly done with an Appendix to describe how to express FAIR data using HAPI
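As an illustration only: the proposed FAIR additions might sit in a dataset info response roughly like this. `dataLicense` and `provenanceDetails` are the names under discussion in these notes, not released spec keywords, and all the values are placeholders:

```python
# Hypothetical /hapi/info fragment showing where the proposed FAIR additions
# could sit. "dataLicense" and "provenanceDetails" are names under discussion
# in these notes (not released spec keywords); "resourceID" is the existing
# optional field for a persistent identifier. All values are placeholders.
info = {
    "HAPI": "3.2",
    "status": {"code": 1200, "message": "OK"},
    "resourceID": "spase://Example/NumericalData/Demo",
    "dataLicense": "CC0-1.0",  # short SPDX identifier
    "provenanceDetails": "Derived from level-2 files; see mission documentation.",
    "startDate": "2015-10-06T00:00:00Z",
    "stopDate": "2015-10-07T00:00:00Z",
    "parameters": [
        {"name": "Time", "type": "isotime", "units": "UTC", "length": 24, "fill": None}
    ],
}
```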
Action items:
- Bob works on appendix and we go over it next time with Rebecca starting at noon eastern
- Jon to email Jeremy and Bob about potential longer meetings next week for making further progress on specific issues
- Bob to meet with Rebecca about FAIR request
- merge in Bob's changes for notes and warnings messages.
- Discuss https://github.com/hapi-server/data-specification/pull/223
- From Julie: "Any updates you’d like to present for HAPI at the upcoming PyHC Fall Meeting (looking for it on Day 1 (Nov 11th)? Purely updates, no overview." Jon will respond to her email.
- From Rebecca: We discussed a few weeks ago some proposed changes to the HAPI metadata schema. I recall those changes being received well. Do you have any idea what the timeline looks like for a new version of the HAPI metadata schema to be released with those changes? Bob emailed Rebecca about a meeting this week to clarify some points.
- Can we get SuperMAG up? (We were a month from having it running in 2020, and it would be good to have this finished finally!)
- When new cdaweb server is up, all spase records need to have ProductKey for HAPI server updated. Some will change.
- Discuss https://ivoa.net/documents/VOUnits/20231215/REC-VOUnits-1.1.html#tth_sEc2.4
- Link in https://hapi-server.github.io/docs/2021_COSPAR.pdf is broken. But text is correct. Problem is with their publication.
- Bob will ask student about creating GitHub runner to generate list of presentations.
How are SPASE records synced with master CDFs? Automated process?
Discuss prep things needed for 3-day meeting.
Mandate upload at least 2 presentations/abstracts to https://github.com/hapi-server/presentations
- Zach to present on verification script that goes through SPASE and tests each mention of HAPI
- name of repo for HAPI tools (since `hapitools` is already taken); decided on `hapiutils` for PyPI. Won't change tools-python to utils-python; tools-python may not be a single package, and we may want "tools-matlab", so we won't try to keep the repo name as similar as possible to the package name.
- confirm DASH / IHDEA submissions
- Discuss Simon's email. No updates. Bob sent clarification about interpretation of 1404 and 1405
Notes:
- Zach presented on his FAIR analysis script, which also checks HAPI Servers from NASA's SPASE records
- his code is available in this repo: https://github.com/Kurokio/HDRL-internship-2024
- next working meeting: this Fri 9am - 11am Eastern
- instead of `hapitools` (which is taken) we will use `hapiutils` for the package name (different than the repo name)
- Bob suggests having development of the HAPI amalgamation tool go through some code review / technique analysis to ensure it is re-using code from previously done solutions to common problems; have Calley reach out when starting a new problem? have a discussion first on the interface for any new capability (similar to what we did for caching)
- consider looking at SpacePy since those developers have solutions for similar issues
- consider a larger group discussion to facilitate ideas for functionality ideas
- DASH/IHDEA talks:
- DASH: Jon - talk on forward looking view of HAPI; plans and needs to enhance interoperability
- DASH: Jeremy: poster / talk about two servers in development: SPDF and ESAC (same new Java server) and NOAA SWPC (own from scratch); also the new ones for Mag data: SuperMAG and WDC
- DASH: Sandy and Calley: Poster for HAPI Amalgamator basics (need poster representative)
- DASH: poster by Nobes? or extra slide in Jeremy's talk?
- IHDEA: Jon HAPI status update (short)
- IHDEA: Bob covers more general topic about metadata
- Consider Week of Sep 23 as in-person HAPI dev meeting; need venue in remote B&B with good wifi...
- today at 1pm: status of WDC server (Adam and Oli); they've been working on this for a month on their own, integrating with their existing system
TODO:
- Jon: email Zach and Rebecca about next week
Agenda items:
- Schedule meeting with WDC developers at 1 pm Eastern, July 29 (extend the regular Monday meeting). They have a running HAPI server and want feedback.
- talks at AGU (Dec 9-13) and DASH/IHDEA (Oct 14-16; IHDEA Oct 17-18 both in Madrid, Spain - with some remote participation):
- focus at AGU - networking with people who could use HAPI (NOAA AI center folks, etc); set up several 1 hr meetings
- Jon: AGU: HAPI overview in one of these:
- I'd like to present more than the overview / update to also cover applications / analysis and instantiation of HAPI servers
- IN039 Big Data and Open Sci. in Helio and Planetary https://agu.confex.com/agu/agu24/prelim.cgi/Session/226235
- P042 Machine Learning and Data Science Methods for Planetary Science https://agu.confex.com/agu/agu24/prelim.cgi/Session/226555
- R2O2R Space Weather Session SH034 https://agu.confex.com/agu/agu24/prelim.cgi/Session/227464
- Jon: DASH and IHDEA options
- forward looking talk: what could take HAPI to the next level? problems we've encountered; how to enable search; servers going up and down (true for any distributed system with an API approach); people building visualizations - need more of that!! and this helps you future-proof your app - your own viz. tools may age, but if you are accessible via a standard, other people's viz. apps might work; also looking at making things more FAIR; managing multiple server versions (handle automatically in clients that we distribute); metadata problem - caching of metadata, and fixing / patching of metadata with a possibly centralized overlay of metadata fixes; compelling story of things we can build on from here - what do we focus on next now that reading is becoming more standardized? Tools for data amalgamation are in a fledgling state - next step is a more feature-rich set of HAPI-based data manipulations
- Getting HAPI up and running on your Data Center (no specific session)
- Current Reach and Potential of HAPI
- Automated Testing of HAPI servers
- Using HAPI Metadata to Populate other Metadata content (would require some work to have something to say)
- Bob: AGU
- Bob: DASH - latest capabilities: new data centers / sources and some of the new tools (relatively simple poster)
- Bob: IHDEA - creation of updated HAPI metadata; has revealed opportunities for improvement in the metadata world; related to SPASE directions; what often gets missed and implications; the process is similar to making code - needs evolution and lots of checks, esp. overcoming human factors; summary of HAPI experience and recommendations; have Rebecca on this too (for use with her search interface); search is only useful if it really has access to everything, and it also needs to not be broken!!, and needs to be more compelling than Google! Also mention the FAIR aspects of HAPI
- Nobes / Jeremy: IN-018 Data Deluge (UCAR and HDF) https://agu.confex.com/agu/agu24/prelim.cgi/Session/226709 Caching presentation for DASH; generic capability that can assist clients in any language; also separate presentation at DASH
- Calley with Sandy? - for DASH only (since deadline is not until Aug 10)?
- Jeremy: AGU - IN039 Accessing PDS3 and PDS4 in Autoplot, plus some about HAPI
- Jeremy: DASH
- for later:
- Sandy / Calley: AGU or DASH or IHDEA: HAPI Amalgamation
- FYI for AGU - Jon also doing HelioCloud poster in SM033 Dayside Magnetosphere Interactions
Action items:
- Need to revisit (came up with WDC folks): OpenAPI is catching on and allows auto-generation of clients in any language; but it only deals in JSON, and does pagination for long responses, so an OpenAPI client might be good for interacting with metadata, and we'd still need custom code to deal with our long JSON responses and especially for binary and CSV
Notes:
- web page is updated; looks much better; we could eventually hire a professional web developer (see the HDF Group page for one example: https://www.hdfgroup.org)
Presentation by Rebecca on FAIR:
- specific citation for the data (not the instrument paper, but a data citation) should be a structured element that includes 4 key / required fields: Author, Publisher, Year, Dataset Name, optional: Description, Data Version number
- consider adding License for data usage; could be per dataset?
- suggested license: Creative Commons Zero v1.0
- https://spdx.org/licenses/
- For `license` content, she suggests using a link like this: https://spdx.org/licenses/CC0-1.0.html
- Three parts for HAPI team:
- update schema to reflect ways to capture all FAIR info
- update verifier to allow for check for FAIR
- work with existing servers to add updated info for licensing, extra data
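The structured citation element described above might be sketched like this; the key names are hypothetical (the notes fix only the required fields: Author, Publisher, Year, Dataset Name):

```python
# Hypothetical structured citation element with the four required fields from
# the presentation (Author, Publisher, Year, Dataset Name) plus the optional
# ones. The key names are placeholders; the notes do not fix them.
citation = {
    "author": "Example Instrument Team",
    "publisher": "Example Data Center",
    "year": 2015,
    "datasetName": "Example 1-hour magnetic field data",
    # optional fields
    "description": "Merged, calibrated field measurements.",
    "dataVersion": "v2.1",
}
```

Using the same structure in both the `about` and dataset `info` responses (as the action item below proposes) would keep clients simple.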
Action Items:
- create ticket to update citation element (probably use the same element structure for citation in both `about` and dataset `info`)
- add to DevOps page about new releases: the way to add a new PDF version to the Zenodo HAPI entry
Agenda:
- Presentation by Rebecca: FAIR principles
- Sandy: talk about HAPI web page at github.io via astro or jekyll
- To discuss: conferences
- 2024 Open Source Science Data Repositories Workshop (see email sent to Jon and Bob); sponsored by NASA’s Science Mission Directorate, Caltech in Pasadena, California, Sep 25-27 (Wed to Fri)
- DASH (Oct 14-18)
- AGU (Dec 9-13)
Notes:
- discussion with Zach and Rebecca; main presentation put off to Aug 5 telecon; Zach working in own repo for now; we will consider linking, using or moving to another repo under HAPI later once it's more mature; current location is: https://github.com/Kurokio/HDRL-Internship-2024
- Bob talked to WDC people about their data and API; they have coarse data going back to 1900; their new API is close but still very programmer-focused; they are open to adding other endpoints and so will likely implement it themselves; Bob gave them the start of a command line program for them to flesh out
- HAPI needs provenance info - see ticket 186; longer discussion needed; including provenance in the HAPI info response makes that response time-dependent, which can mess up caching; also, not every dataset has files, so listing files might not be appropriate for every dataset
- Sandy - we need a repo location for one-off, small tools (Python only); proposal is to create a `hapitools` package where people could integrate elements as part of the package or maybe make a sub-package; Jeremy: call it `tools-python` to follow conventions: the Python client is in `client-python` and the import is `import hapiclient`, so the tools import will be `import hapitools`
- DASH - could we get a 60 to 90 minute tutorial deep dive into HAPI, possibly in another room; Sandy will email the committee about demo rooms
- DASH - main HAPI presentation ideas: process that we go through to get people up and running (social side of it); some of the challenges (provenance, tracking users - should these be in HAPI or SPASE); good place to ask and discuss about cache maintenance / behavior - get other people's perspectives - get feedback on what they have implemented
- IHDEA - community and standards and challenges (provenance, interoperability)
Action items:
- create list of presentations you want to make at DASH and AGU
- Jon to contact SWPC people about HAPI server status
- Sandy and Bob to meet on web site stuff
- Jeremy, Bob and Bobby to meet on CDAWeb HAPI server
- Nobes and Jeremy to meet on caching
- To discuss: 2024 Open Source Science Data Repositories Workshop (see email sent to Jon and Bob)
- WDC update;
- https://wdc.bgs.ac.uk/dataportal/webservice_doc.html
- they have data to the 1800's; some differences from SuperMAG and INTERMAGNET (these seem to start in 1991, so after the 1989 Quebec storm)
- Bob will meet with WDC on Wed 9:30 - others can attend; they are revising their API (far along already)
- Rebecca on July 8 - also with intern;
- WDC group on July 15th? Will decide this Wednesday if they want to meet more and pursue HAPI
- where is the global list of ground magnetometer stations and the place(s) to get data from them?
- JAXA / ARASE in Japan has some ground stations (some with elec. field data)
TODO:
- Jon to email Rebecca to confirm July 8
- Bob to send invite to HAPI developers for WDC meeting
- Sandy to investigate ways to update HAPI web page; better HTML exists - need way to get this through the markdown-heavy approach at github.io using Jekyll (seems complex) or Astro (used by SPDF web developer)
- SuperMAG metadata notes
- Rebecca's email about metadata
- SPASE units
No meeting held.
- Jeremy is working on the new CDAWeb server which is to replace Nand's server
- Jon, Jeremy, Baptiste and Eelco to talk about "relations"
- multi-resolution
- other associations to files and other things
Bob - report on Simon's question related to https://github.com/hapi-server/data-specification/issues/105
Bob - report on INTERMAGNET updates
Bob - discuss: servers should accept a blank list of parameters, as in `parameters=""` in the request URL (which is like requesting all parameters)
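A sketch of the proposed server behavior for a blank parameters value; the helper name is hypothetical, and the full parameter list would come from the dataset's info response:

```python
# Sketch of the proposed behavior: an empty parameters value in the request
# URL is treated as a request for all parameters. The helper name is
# hypothetical; all_names would come from the dataset's info response.
def resolve_parameters(requested, all_names):
    if requested is None or requested.strip() == "":
        return list(all_names)  # blank list means "all parameters"
    return [name.strip() for name in requested.split(",")]
```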
- Discuss https://github.com/hapi-server/data-specification/pulls
- Remaining 3.2 tasks verifier | schema
- Add https://spdf.gsfc.nasa.gov/pub/catalogs/spdf-plotwalk-catalog.json to template repo?
- Update on https://github.com/hapi-server/data-specification/issues/176
Notes:
TOPCAT presentation by Mark T.
Questions: Jeremy: all versions? Mark - yes!
Action items:
- Keep Mark in the loop on new versions so he can keep TOPCAT up to date!
We are now going to track our TODO items as tickets in a separate repo; here are the ones due for next week:
Assignments
- We all should look through this page and find things that need to be followed up on or that were not finished.
- Discuss Rebecca's email about ESIP Schema.org Cluster
- Discuss how to track to-do assignments
Discussion:
- talked about SPASE generation tool redo and it could also spit out HAPI metadata; meeting Apr 2 at noon about this
- put off talking about ESIP time series folks
- to track to-do items: make new repo for "api-tasks" and everyone look over the last few meeting notes in the wiki and move your items into api-tasks as an issue (assigned to you and with an expected due date, etc)
- looking over issues for 3.3 and beyond; some are easy; deciding about categorizing all the others
- nominal target date for 3.3 release is July 1
- for a next step, look at all issues related to making an "association" or linkage between HAPI datasets;
- TOPCAT person Mark can present - maybe at next week's HAPI dev meeting
- SPASE asks if HAPI has a service description; answer is no, but they can make one if they want to
- Bob reached out to Das2 HAPI Server person Chris P. who will get it up and running again at UIowa
- two low hanging fruit servers: DataShop and SuperMAG
- discussion of next priorities
- briefly discussed possibility of a bundle with defined characteristics that allow it to be served from HAPI; needs more development with concepts like the URI template and a HAPI-JSON from CDF mechanism; could be an HTM proposal
- Bob - magneto-telluric data (as presented at PyHC last week) is something he uses a lot; the EarthScope project involves several universities and a larger data effort; Bob would prefer others to pursue this to keep scientific distance; we make a sample data server to get them interested and then there are options: 1) they like it and pick it up themselves; 2) they like it and propose it (as leads) on a proposal with us to help develop it; 3) or we propose it with them as helpers; FYI there is already a Java API for getting to the data
- go through list of next features for HAPI 3.3 or 4.x (Bob led this)
Action items
- Jon to release PDF of 3.2 to Zenodo
- Jon to add Zenodo push to release process
- Bob to check on verifier status - does it handle 3.2 yet? Jeremy will try his 3.2 server with the verifier
- Jon to work with Sandy to convince Jesper to pull the trigger on HAPI support
- Jon to see if Nobes can update DataShop server
- Jon can ask about HelioCloud offering HAPI-ready environment for people with data; or an aspect of HelioCloud could be to provide a HAPI service for data that meets certain ingest criteria
- Jeremy to poke USGS about stream height data (lots of these)
- Jon and Bob to find a way to create a separate TODO list (separate from Issues related to the spec)
- Jon to delete issue 168 after moving it to new TODO list
- another issue (157) needs to go on project TODO list
- Bob to send email to have people vote on issues for 3.3 - we should pick top 3 or 4 to focus on (decided this did not make sense; will discuss)
- decide if we want to do larger releases or small releases
Jon's list of things that HAPI can serve that could be described semantically:
- file list
- event list
- availability data
- different cadence of other dataset
- geographically co-located list of something (either fixed or moving)
Maybe just have elements within a dataset that identify themselves as being specific types of things. Ryan M and Rebecca R and Baptiste C have already looked at semantic linkages
- release of 3.2 ready to go - we can merge the PR if no objections
- discussion with SWPC developer(s) on HAPI server options
- location in GitHub for HAPI presentations (SunPy has this)
- FYI - caching meetings on pause for this week (and next?)
- list out new features of interest for next release:
- multi-resolution links to different datasets
- federation of HAPI servers or better way to track / report known HAPI servers
- availability info for a dataset
- lists of files (as a provenance mechanism)
- find a way to include provenance in the `info` response; address how HAPI communicates provenance for underlying data that is changing; make sure references follow conventions (some now recommend not using URLs but paper titles to go into a search engine) - possibly a more generic way of semantic communication of HAPI content
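The simple free-text provenance idea discussed elsewhere in these notes could look like the following minimal sketch. Note this is an assumption for illustration: the `provenance` keyword, its free-text value, and its dataset-level placement are a proposal under discussion, not part of the released HAPI spec.

```python
# Hypothetical sketch: a dataset-level free-text "provenance" attribute in a
# HAPI /info response. The "provenance" keyword is a proposal, not released spec.
import json

info = {
    "HAPI": "3.2",
    "status": {"code": 1200, "message": "OK"},
    "startDate": "2000-01-01T00:00:00Z",
    "stopDate": "2020-01-01T00:00:00Z",
    "parameters": [
        {"name": "Time", "type": "isotime", "units": "UTC", "length": 24, "fill": None}
    ],
    # free-text provenance at the dataset level (proposed, not final)
    "provenance": "Derived from daily CDF files at the source archive; see dataset landing page for versions.",
}

print(json.dumps(info, indent=2))
```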
Actions:
- check with Bob about Verifier status for version 3.2
- Jon to add repo for talks and presentations
- Sandy and Jeremy to look at Python server (load capacity / multi-threading); also advertise this more!
- eventually have Sandy present about HAPI amalgamation
- also consider Brent or Darren present about using HAPI for model output (spacecraft fly-throughs)
- release of 3.2 is almost ready; there's a branch for the new 3.2 directory
- one clarification needs to go in 3.2: Bob created a ticket and will update the text as usual; Jon will get this into the 3.2 release files
- next week's meeting will start at 1pm to avoid colliding with the PyHC spring meeting (online only; 9am-11am eastern M-Th)
- next week we will vote on the release of HAPI 3.2
- Nobes and Jeremy to meet this week on caching items
- CDAWeb server has trouble with voyager dataset; Bernie looked and it has an empty directory; this Voyager dataset seems to have problems with the current HAPI server (so it's not a great dataset for testing); best idea is to move to the new Java server that ESAC and NOAA/SWPC are using
- TOPCAT is now reading HAPI data; Sandy replied to Tess about this and asked to talk about HAPI as a potential VO standard with her sometime
- Bob will contact the TOPCAT developer to see how HAPI interacts with these services; we talked about TAP servers; actually it is EPN-TAP that is mostly what Heliophysics / planetary sites use in Europe
- Jon to send around list of upcoming data and software-related meetings to hapi-dev group
- Review pull requests.
- Jon to integrate change log info into spec document
- Bob to maybe finish TestData3.2 with FITS.
- Discuss Workshop for Collaborative and Open-Source Science Data Systems April 29 – May 1, 2024 at the Laboratory for Atmospheric and Space Physics in Boulder, CO. (Greg Lucas and others)
- Follow-up meeting at U of Iowa: Aug 12-14 is Open Source SDC, and Aug 15-16 is Rebecca's meeting (TWSC funded, if it wins)
- Also in Boulder on May 29-31, 2024: Innovations in Open Science (IOS) Planning Workshop: Community Expectations for a Geoscience Data Commons; https://www2.cisl.ucar.edu/events/innovations-open-science-ios-planning-workshop-community-expectations-geoscience-data
- Discussion about the all.json contents and related federated info that is now being collected in the "servers" Github project; eventually we may want another layer of services on top of this that presents it more as a service and less of a raw set of Github files, but for now we are collecting HAPI server data and can add services later
- Talk about different portals: Heliophysics Data Portal (SPASE driven) at https://heliophysicsdata.gsfc.nasa.gov/websearch/dispatcher, which is what Aaron started and runs at SPDF; a newer portal by Rebecca R. aims to include more solar datasets, which are not necessarily in SPASE; earthdata.nasa.gov is for non-experts to find and learn about what is available (theme-based: atm., ocean, etc); it leads to hand-selected data products suitable for non-experts and entry into the field; NASA HQ wants other divisions to have a similar thing; the key task is metadata creation - it will take significant effort to get solar data into SPASE; people are working on this, but it's taking a while and it's being done by hand; the difficulty with this is that hand-edited metadata becomes obsolete and so not useful. Daniel (at GSFC) works on this under HDRL - progress is slow-ish; nothing public yet; the earthdata one looks to have had a lot of financial support over multiple years
- Focus for us for HAPI data - how best to make it searchable in a federated mechanism that knows about all the HAPI resources relevant for a project; might be possible to make SPASE records from HAPI sources (keep it automated so we can adjust to changes in SPASE)
- action item - (Bob) talk to Brian about getting more SPASE from HAPI or other ideas about SPASE evolution
Agenda for Thur 3-5pm (Eastern) meeting on HAPI 3.2 release progress:
- any comments on the `stringType` wording updates
- flesh out changelog entries for 3.0, 3.1, 3.2
- check_array unit tests - are they right and complete?
- more test cases for 1) schema validation 2) test data servers (esp. for 3.2 and 3.3 test data)
- discuss what goes in `all.json` and how to arrange it
- clarification of tasks at upcoming PyHC events: spring meeting in March online only - Jon to provide HAPI updates; summer school May 20-24 in person - Bob to be there in person for tutorial and roulette/tasking game;
- Mark from NOAA / Colorado U - using the server-java implementation with Eelco's timeline viewer; he's got data going through by creating products by hand, and using server-java as a stand-alone mechanism and providing it data; he does have to make the JSON info for the catalog and info responses
- check up on various other projects - specific meetings scheduled this week (Wed for caching and Fri for 3.2 release push)
Agenda:
- PyHC summer school (May 20-24 in Boulder, CO) needs some HAPI support: "Combining PyHC packages examples"; HAPI adapters seem appropriate, either Kamodo (except there are two versions now - someone made a fork, and that's what you get with `pip install kamodo`!) or maybe better is SpacePy or SunPy (a bit of a stretch); need someone to help out with this (Jon to find someone - maybe Jon N.)
- COSPAR (in July) presentation by Sandy "HAPI in Analysis Codes" with themes of data amalgamation, TOPS, PyHC, ML, serverless ops; abstract due end of this week
- UCAR / NCAR workshop on data and standards May 29-31
- SWPC is going to make a HAPI Server and use Eelco's tool
- action item run down
Discussion and items that still need action:
- Bob still needs to know the few things that need to be added to 3.2, so Jeremy to get a short list of things to be added to 3.1 and also 3.2
- updates to check_array.js: major task is to move semantic checking (that can't be done with JSON schema) into a package separate from the verifier; this would give people the ability to fully check the JSON from a server without running the full verifier; the verifier then can use this separate capability inside; Jon to look at the code to see if it has all the semantic checks that are needed
- Jon - Add section to hapi-dev for 3.2 change list (with links to associated issues!)
- Jon - Check that all 3.1 changes actually ended up in the 3.1 change list
- From last week: Jon: remove comment about constraining strings to enums
- From last week: Jon: create pull request for adding error 1412 "unsupported depth value" (comments about this also in the uber ticket above)
- From last week: Jeremy: come up with schema for all.txt
News:
- ESAC server for Solar Orbiter coming along
- SPDF server (having memory issues?) down on weekend due to memory leaks
- for next week: update on the Java server being worked for ESA that will eventually be used at GSFC
Action items for getting 3.2 ready:
- Jeremy: look over 3.1 schema and let Bob know of anything that needs to be added still (do this manually, and put the result in the ticket); Jeremy has schema-sorting code that may help; Bob's formatting standardizer has made comparisons harder - we need to move to this longer term, but for now, we have just 20 lines of JSON to add into 3.1 (and then 3.2); see this ticket: https://github.com/hapi-server/data-specification-schema/issues/1
- Jeremy: also do this for 3.2 (the current 3.2 is a copy, possibly lightly edited, of 3.1)
- Jon and Jeremy: review the test for units and labels (see the same ticket for a description); look at verifier-nodes / lib / checkArray_test.js and make sure the test cases (units strings and array sizes) are consistent
- Jon: more reviewing and checking of schema for general errors
- Jon (and see if Nobes wants to help): make lots of examples in data-specification-schema/test/3.0 (and other versions) - create examples of info responses that flesh out the core aspects and also some corner cases of the spec; remove scripts that are mixed in with test cases
- Jeremy(?): some clients (TopCat person for example) use our test datasets, so we need to make sure to capture new features of a given version of the spec so people can have confidence that their client can handle everything, even the corner cases
- Bob and Jeremy: Jeremy to ask Chris P to send / show Bob the Python code for plotting time-varying bins, and Bob can see about adding that
- Jon: remove comment about constraining strings to enums
- Bob: move issue about error 1412 from schema repo to specification schema. Done: https://github.com/hapi-server/data-specification/issues/187
- Jon: create pull request for adding error 1412 "unsupported depth value" (comments about this also in the uber ticket above)
- Someday: now that the catalog can also list all the info MD for each catalog, we need a way to manage the schema objects jointly; to the outside world, there needs to be one schema for each JSON response (one per endpoint), and this will look like there is a lot of copy/paste going on, since the entire info response can be inside parts of the catalog response. So for inside (the JSON schemas that we maintain), there needs to be no copy/paste, and as little code as possible, but somehow (are there JSON #include options?) incorporate certain schemas within others; one option is to have our own way of storing things, and then a set of scripts to create all the endpoint-specific schemas out of our own lean content (but then we have to maintain those scripts)
- Bob and Jeremy: make a JSON schema for the all.txt (shouldn't take long - maybe 30 min), deprecate the all.txt, and encourage people to use the JSON version; there was discussion about having a service to harvest about/ info and present a cached version of it to people, since if a server goes down, you won't even be able to get to its about page; if a server does not have an about/ response (or it's weak), then we can offer to write an about/ kind of info for any server below 3.2 (and then our cache may or may not have harvested recent info from other servers' about/ endpoints)
- Nobes: come up with list of parameters for data caching - options for interrogating the cache and changing settings; Jeremy suggests a no-op implementation that has the right API to the cache to flesh that out; Sandy's three use cases are a good starting point: don't use the cache (always get live data); only use the cache (don't refresh anything; should a cache miss then be a failure?); use cache but check every time so that I'm always using the latest data; Nobes will send meeting invite to Jeremy and Bob for brainstorming ideas (mostly the parameters / arguments) on Tue or Wed
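As a rough illustration of the server-list schema item above, here is a hand-rolled sketch of the kind of checks such a schema would encode. The field names (`id`, `title`, `url`) are assumptions about what a server-list entry contains, and a real implementation would use a JSON Schema validator rather than custom code.

```python
# Sketch of structural checks for one entry in a hypothetical all.json server
# list. Field names are assumptions; a production version would be a JSON
# Schema document run through a validator.

REQUIRED = {"id": str, "title": str, "url": str}

def check_server_entry(entry: dict) -> list:
    """Return a list of problems found in one server-list entry."""
    problems = []
    for key, typ in REQUIRED.items():
        if key not in entry:
            problems.append(f"missing required key: {key}")
        elif not isinstance(entry[key], typ):
            problems.append(f"{key} must be {typ.__name__}")
    if "url" in entry and not entry["url"].startswith(("http://", "https://")):
        problems.append("url must be an http(s) URL")
    return problems

entry = {"id": "KNMI", "title": "KNMI Space Weather",
         "url": "https://data.spaceweather.knmi.nl/hapi"}
assert check_server_entry(entry) == []
assert "missing required key: url" in check_server_entry({"id": "x", "title": "y"})
```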
Bob and Jon met with Nga Chung at JPL. She leads the https://sdap.apache.org/ effort and will also be working on GRACE follow-on. They've looked into OGC, which has a draft standard; it is complex, and not much software has been written for it. She thinks HAPI is a good candidate. Will report back to us in 3 months.
Discussion with James G of OPeNDAP
Can OPeNDAP support the HAPI-style of "dataset"?
What libraries are available for use on servers, and where are there plug-in points? (I just need to know if it's possible and where to point people to start looking for plug-in interfaces and options).
Evaluator has the ability to add server-side functions, e.g.: this string is an ISO8601 time; give me the time values between A and B.
But the Sequence data model is not used much.
Other systems could use the server layer as the base layer and define their own domain-specific structures and constraints.
So consider use of pydap (or any OPeNDAP server): every dataset is a table; time is always the first column. Pydap has 2 sub-libraries: one is about the client side and working with data; the other is about providing data servers.
Server side is modular - reads from different things based on plugins (including SQL databases, meaning it can work with tables).
Sequence started with JGOFS (ocean measurements, soundings, etc, with ancillary files)
Pydap code is in GitHub in its own project: https://github.com/pydap
Also of interest: Hyrax used to support FITS and CDF (support was dropped, though it's still in the C++ code), and Hyrax can do tabular data, but it's not been given lots of attention. Hyrax is a bigger beast, but now has a Docker container method for installing (with the right mount points for data). None of these servers require making new catalogs - they just look at the data. Hyrax links:
- https://github.com/OPENDAP/cdf_handler
- https://github.com/OPENDAP/bes
- https://opendap.github.io/documentation/
- https://opendap.github.io/hyrax_guide/Master_Hyrax_Guide.html
- This is about using NCML for aggregation (Unidata created the NCML piece):
- https://opendap.github.io/hyrax_guide/Master_Hyrax_Guide.html#_aggregation
Action items:
- Bob to email Jack I. about VSO HAPI server
- Bob - add KNMI HAPI server to official list! woo hoo! (note: no landing page yet - it's not required) https://data.spaceweather.knmi.nl/hapi
- Jon - send around abstract for Jan 11 presentation at SWPC (12:30pm Eastern via Zoom or equivalent)
- Jon - send new meeting notice for next year (need to end at 1:30pm for Bob if we keep to this time slot)
- Jon - contact Ale Pacini and Rob R to keep things moving
- Jon - to meet with James Gallagher to explore ways to connect OPeNDAP and HAPI, maybe using their pydap server if it can actually handle a sequence that is a dataset in the HAPI sense (and not in the OPeNDAP sense, where a dataset is a file)
Future action:
- attend US Space Weather Week; Eelco will likely attend this year too
AGU contacts / outcomes
- Bob - Jack Ireland has HAPI server at VSO for image files
- Bob - spoke some with seismic folks, but mostly about data / quality issues; best not to confuse this with inquiries about HAPI data access
- Jon - Ale P. from NOAA-NCEI is interested in HAPI for their data and Rob R.
- Jon - seismic data; Earth and moon SEED and mini-SEED and SAC; fsdn.org and Rob Casey from EarthScope; and also magnetotelluric data for electric and magnetic field; gives surface impedance and conductivity layers under the surface
- Jon - OPeNDAP and pydap server - might be useful for us if it is modular in the right way
- Jon - OPeNDAP and community outreach: James Gallagher, Dave Fulker, Arika Virapongse - they have a paper coming out about how to make a technical community work long-term (hint: don't stick with the PI-led model since it's too PI-dependent)
- Jon - Baptiste - HAPI needs DOIs and provenance
Points from Eelco:
- features API for OGC; https://ogcapi.ogc.org and some have looked at this and it seems overly complicated
- new project for ionospheric project with data coming into HAPI; looking for extension to current HAPI effort
- Jeff Johnson is SWFO ground segment lead at NOAA and has talked to Eelco about using the KNMI timeline viewer, which means the SWFO would also be using a HAPI server
Comments about generic HAPI Server - we DO have one! The node.js server has been around for a while.
We discussed a general way to make libraries for servers, and then maybe a lightweight application that is a server, but the libraries and little re-usable pieces are more what people are likely to use, since people are hesitant to run other folks' servers on their own networks. If we do make a generic server in a specific language (Python, Java), then we need to make it out of the same set of re-usable library elements. Also, we could look at the pydap server that OPeNDAP has, since it already has a plugin architecture pretty much just like the one we have been talking about (according to James).
Other notes:
- think about making a Zenodo archive with a HAPI-packaged set of files that could then be read via a HAPI server based on the DOI - how cool is that!
Baptiste C. - discussion about HAPI at IVOA standards meeting (he will send his notes):
- Mark Taylor, TOPCAT developer, would like HAPI client inside; he is going to pursue this on his own; also uses Java, will likely roll the HAPI TOPCAT capability from scratch
- told the IVOA folks that HAPI does not want to change (it's very simple, and that's part of the magic!); question is how to make it more VO-compliant
- Q: how to read HAPI data into VO tools? Need a mapping of HAPI output to VOTable; possibly map binary HAPI output into binary VO format?
- Server-side Operations for Data Access (SODA) lists availability of services; possibly do another one-time mapping of hapi/capabilities and hapi/info from HAPI into SODA
- the way parameters are set in a query differs between HAPI and VO; VO folks want change, but B. recommends not changing HAPI and again using an adapter.
- also: how to declare a HAPI service in the VO registry service; needed: a way to specify a HAPI URI in VO language (similar to what's being done in SPASE)
Suggested practical next steps:
- write HAPI to VOTable adapter
- possibly SODA (but the main question is if it is useful - depends if there are datasets that each community needs from the other; maybe X-ray missions need to know about particle environment)
other TODO options:
- put option in the /info response to include a UCD (unified content descriptor) - parameter description without units - just general measurement label; similar to SPASE MeasurementType; google "IVOA UCD" for more info
- another approach is a way to describe a measurement (target, body, type of parameter, semantic constraints): the Research Data Alliance (RDA) i-adopt; see this: https://i-adopt.github.io
- the i-adopt folks have workshops to help explain it and get feedback from the community.
Resource Description Framework (RDF) is how these things are built outside of SPASE and IVOA. They are all using RDF and semantic linked data. Doug: we are using RDF triples as a store; it's just another way to represent the data, and doesn't affect the complexity of the schema; it allows easier connections to other communities.
Provenance Discussion - ways to capture what data is behind a HAPI stream:
- list the files that went into a stream
- list the DOIs of the underlying data source(s)
- list the series and their versions that were used
We talked about ways to think about a potential permanent archive of HAPI results - have one specialized archiving service that runs on top of any HAPI service; it would take a request and get the data from the right HAPI server, and then cache it somewhere in a permanent way, behind a DOI even; this is closely related to the caching service (currently being worked)
Discussion on the JSON schema effort for 3.2
- still need to go through change log (or documentation, at this point) for 2.1 to 3.0 to make sure the schema is not missing anything
Agenda:
- Talk to Carrie G. about data representation issues she's had
- discussion about schemas - Jeremy has some updates
- upcoming meetings: ADASS, AGU, AMS?
Discussion with Carrie
Case 1 - time issues with older data format that allows the start time to be adjusted by a small amount (milliseconds?) for each parameter in a dataset
time | param0 | param1 | param2 | param3
--------------------------------------------
t0 | p0_0 | p1_0 | p2_0 | p3_0
t1 | p0_1 | p1_1 | p2_1 | p3_1
t2 | p0_2 | p1_2 | p2_2 | p3_2
t3 | p0_3 | p1_3 | p2_3 | p3_3
Each parameter is at a slightly later time due to the way it was collected. Each parameter is a separate measurement (not part of a spectrum or something.) Could capture this with a custom (user-defined) parameter, using an "x_" prefix.
"x_timeOffsetInMillisForEachParameter": [ 0, 0, 33, 41 ]
This gives the time offset needed to add to the given time for each parameter's specific measurement time.
Questions: How consequential is this for the analysis? (What is the time between records?) Ans: not sure - maybe not too much (esp. since it's housekeeping data.)
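The custom `x_` keyword above could be applied client-side along these lines. This is a sketch: the `x_timeOffsetInMillisForEachParameter` keyword comes from the notes above, but the helper function is hypothetical.

```python
# Sketch: apply the custom (x_-prefixed) per-parameter time-offset keyword
# from Case 1 to recover each parameter's actual measurement time.
from datetime import datetime, timedelta

offsets_ms = [0, 0, 33, 41]  # x_timeOffsetInMillisForEachParameter

def effective_times(record_time: str, offsets_ms):
    """Per-parameter measurement times for one record (hypothetical helper)."""
    t0 = datetime.fromisoformat(record_time.replace("Z", "+00:00"))
    return [t0 + timedelta(milliseconds=ms) for ms in offsets_ms]

times = effective_times("2023-01-01T00:00:00Z", offsets_ms)
assert times[2] - times[0] == timedelta(milliseconds=33)
```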
Case 2 - For particle measurements, with each element in a spectrum having a different start time, in CDF each element in the spectrum gets its own EPOCH variable, so for a 16 element spectrum, you end up with 16 time columns.
This could be handled by a same-sized array of time durations, either a static one that can be in the metadata for a parameter, or else the name of a column with a time-varying set of values (same structure, just different on each row).
Note that we actually don't have any mechanism for indicating the time width of a measurement. There could be multiple ways of doing this: one `measurementWidth` value for an entire dataset (seems like it could be uncommon?); a parameter that represents a duration and serves as a `measurementWindow` for all the parameters of the record (note that the value could change every record, but it applies to everything in the record); and a parameter-specific measurement window value, either as a static value in the metadata, or as a pointer to another parameter that serves as the time-varying measurement window for that parameter (and so each parameter could specify its own time duration or measurement window).
Note: giving measurements a time width has an impact on which data records are returned by a HAPI request!
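The measurement-width options above could be sketched as hypothetical metadata snippets. To be clear, none of these keywords (`measurementWidth`, `measurementWindow`) are in the HAPI spec; they are stand-ins for the options under discussion.

```python
# The three measurement-window options, as hypothetical metadata sketches.
# Keyword names are assumptions, not part of the HAPI specification.

# Option 1: one static width for the whole dataset
dataset_level = {"measurementWidth": "PT1S"}

# Option 2: a duration parameter that applies to every parameter in a record
# (value may change every record, but applies to the whole record)
record_level = {
    "name": "window",
    "type": "double",
    "units": "s",
    "description": "measurement window applying to all parameters in this record",
}

# Option 3: per-parameter, either a static value in the metadata or a pointer
# to another (time-varying) parameter by name
per_parameter_static = {"name": "flux", "type": "double", "measurementWindow": "PT0.5S"}
per_parameter_varying = {"name": "flux", "type": "double", "measurementWindow": "window"}

# the pointer form names the duration parameter defined above
assert per_parameter_varying["measurementWindow"] == record_level["name"]
```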
- Carrie G. comments on HAPI:
- can't represent non-contiguous time bins; solution - we need time bins just like we have DEPEND_1 bins! The time bins could be both static and time-varying, same as we have for the current `bins` object
- HAPI can't handle mode changes (changes in number of bins within a dataset - common in low energy particles and plasma wave data)
- does she want time shifts within a record?
- need to schedule a meeting to learn more and make sure we understand
- one person had an issue with HAPI - sounded like server implementation issue, maybe not with HAPI spec
- FUTURE Meeting topic: from Doug's session: some comments on HAPI about IVOA standard
- FUTURE Meeting topic: how does TAP relate to HAPI (Jon's take: each HAPI server could be one row in a TAP-accessible
Action items:
- Jon to invite C. Gonzales to HAPI telecon, Oct 30
- Discuss last two commits at https://github.com/hapi-server/data-specification/commits/hapi-3.1.1
- Discuss B.16 Heliophysics Artificial Intelligence/Machine Learning-Ready Data (Step-1 proposals are now due January 18, 2024, and Step-2 proposals are now due April 18, 2024. Also, Sections 6 and 8 now mention the expectation that individual award values will not exceed $150K. New text is in bold.) (On or about August 23, 2023, this Amendment to the NASA Research Announcement "Research Opportunities in Space and Earth Sciences (ROSES) 2023" (NNH23ZDA001N) will be posted on the NASA research opportunity homepage at https://solicitation.nasaprs.com/ROSES2023 and will appear on SARA's ROSES blog at: https://science.nasa.gov/researchers/sara/grant-solicitations/roses-2023/)
Here are the talks we envision for the meetings:
DASH meeting:
- Bob - technical overview talk in Doug's session on Data Access Interfaces and Services; also talk about clients in here too
- Jeremy: poster on accessing Autoplot from Python
- IDL client - Scott - has been working on IDL client some, but not enough for poster
- HAPI Adapters - Sandy - just presented at AGU, and no changes since then, so nothing to report
IHDEA:
- Jon: HAPI overview and relationship to metadata / SPASE
- Bob - talk about coordinate system representations
ADASS (astronomy meeting early Nov):
- Jon: HAPI overview and also learn about astronomy time series standards
European Space Weather Week (late Nov):
- Bob - HAPI overview
- Discuss DASH (Oct 9-11) submissions
- Jeremy - poster on calling Python from Java - Making Autoplot code available in Python (using jpype)
- Jeremy - URI Templates - use as example for using Java in Python!
- Sandy - poster on popup HAPI server (if awarded); possibly also APL intern Nathan; Travis for HAPI NN poster?
- Bob - coordinate transformation working group;
- HAPI overview - The HAPI Ecosystem; whole process for getting a server set up: good docs / spec, example servers (pass-through, Python, Java), verifier, client, JS reference client (hapi-server.org/servers); very developer friendly
- Jon to contact Eelco, Simon, Daniel Garcia Briseno (PHP solar person), and other HAPI speakers - see if they are coming to DASH and want to present!
- HAPI adapters project - Jon N (will be there) or Steve M (attending?)
- Jon to ask Rebecca about HAPI for model data
- JSON schema ticket - Jon and Jeremy to meet Wed 2:30
- Jeremy's report on CDAWeb HAPI server;
- Discuss IHDEA (Oct 12-13) presentations:
- Jon with HAPI overview
- everyone - talk to interested people (CSA - Darren, Madrigal - Bill and Katherine, ESA - Beatrix and team) about implementations and support
- IDL client updates from Scott - example wasn't working (Jon's demo server gone)- that's fixed; lots of updates; chunking of large requests - support is in progress for this; also will collect subroutines into single file; Bob recommends using the test data server (there are several for each HAPI version) and looping through all the datasets to see if a client can handle a good variety of data (and also client versions)
- Note: In the Python client, there's an option to chunk in serial or parallel (and set how many simultaneous parallel requests); the Matlab client does not have chunking.
- Discussion about CDAWeb server - in progress with updates to init process
- for the future - recommend all clients at minimum advertise their ordering strategy, and support re-ordering of parameters to make sure all HAPI requests are valid (and communicate properly back to users what the ordering is in any returned data structures)
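The re-ordering recommendation above (clients sorting requested parameters into the server's declared order so the request is always valid) could be sketched as follows; the function name is hypothetical.

```python
# Sketch: sort a user's requested parameters into the dataset's /info order,
# so the resulting HAPI request lists parameters in a valid order.

def order_parameters(requested, info_order):
    """Return requested parameters sorted into the server's declared order."""
    rank = {name: i for i, name in enumerate(info_order)}
    unknown = [p for p in requested if p not in rank]
    if unknown:
        raise ValueError(f"unknown parameters: {unknown}")
    return sorted(set(requested), key=rank.__getitem__)

info_order = ["Time", "B_x", "B_y", "B_z", "density"]
assert order_parameters(["density", "B_x"], info_order) == ["B_x", "density"]
```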
- next meeting: Aug 22 - check in on DASH abstracts run by Jeremy
- ADASS Meeting (Nov 5-9) - Jon to talk about HAPI (and HelioCloud) and also Coordinate Frame stuff too
- Jeremy: HAPI for open science paper to SH-001 Open Science (McGranaghan, Ringuette)
- Sandy(?): HAPI Python tools (clients, servers, HAPI-NN, plus adapters in different libraries) in the SH-012 Python Helio Session
- Jon: HAPI for FAIR Open Science, in IN-040 Open Science and Big Data for the planets and the Heliosphere (Cecconi, Masson)
- Include in other papers: mention HAPI and SPASE - we have an initial vision for simplified, uniform access to data repositories
- Discuss ESWW 2023. Deadline for oral presentation is June 29th; deadline for posters is September 10th. Last year's program did not have an applicable session for this kind of discussion. However, there were many AI/Machine Learning sessions, so perhaps present https://github.com/hapi-server/application-neuralnetwork-python? They also had GIC-related sessions, so Bob may want to attend to present his research and possibly the Python application. Would need help in putting together the Python application presentation or poster (I could be second and presenting author).
TO DO items:
- for 3.1, add max request duration
- need ISO 8601 duration schema check
- make sure we have a semantic check for `length` being present if and only if `type` is `string` or `isotime`
- make sure we have a semantic check for `size` (already an issue)
- need to make test cases in
- in the spec, can point to the schema validation examples; put pointers to examples in the changelog (we will need lots of examples - more of them for bins)
- ideal case would be to have all examples from the specification document in the test suite
- next step is to make up lots of examples, including some with references in lots of places!
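The ISO 8601 duration check in the TO DO list above could look like the following. This is a sketch covering the common `PnYnMnDTnHnMnS` form (weeks, `PnW`, are not handled); the regular expression could be used either in a JSON schema `pattern` or in code.

```python
# Sketch: validate ISO 8601 duration strings (common P...T... form only).
import re

ISO8601_DURATION = re.compile(
    r"^P(?!$)(\d+Y)?(\d+M)?(\d+D)?"       # date part: years, months, days
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"  # time part: requires a digit after T
)

def is_iso8601_duration(s: str) -> bool:
    return bool(ISO8601_DURATION.match(s))

assert is_iso8601_duration("PT1M")
assert is_iso8601_duration("P1DT2H30M")
assert not is_iso8601_duration("P")    # bare P has no fields
assert not is_iso8601_duration("1H")   # missing leading P
```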
three types of tests:
- schema check
- semantic check (extra logic where semantics in the schema can't describe things, like dependencies or things like "same size as" requirements)
- server responses (API actuation test)
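An example of the "semantic check" category above, which JSON schema alone cannot express: verifying that a parameter referenced by name exists and satisfies a "same size as" requirement. The simplified `info` structure and function name here are illustrative assumptions.

```python
# Sketch of a semantic check: a referenced parameter must exist and have the
# same size as the data parameter (e.g. time-varying bin centers).

def check_same_size(info, param_name, ref_name):
    """Return None if the referenced parameter exists with a matching size,
    else a short error string."""
    params = {p["name"]: p for p in info["parameters"]}
    if ref_name not in params:
        return f"{ref_name}: referenced parameter not found"
    if params[ref_name].get("size") != params[param_name].get("size"):
        return f"{ref_name}: size does not match {param_name}"
    return None

info = {"parameters": [
    {"name": "spectrum", "type": "double", "size": [16]},
    {"name": "energy_centers", "type": "double", "size": [16]},
]}
assert check_same_size(info, "spectrum", "energy_centers") is None
assert check_same_size(info, "spectrum", "missing") is not None
```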
- Fall AGU - Bob will attend
- AMS in Baltimore in January. Possible sessions Space Weather and Advances in Modeling and Analysis Using the Programming Languages of Open Science
- ESWW (https://esww2023.org/) in Toulouse, France (20-24 November 2023); We would like to draw your attention to the community driven session: 100CD-07 - SPACE WEATHER DATA INFRASTRUCTURE: STANDARDS AND FAIR APPROACH. We encourage you all to submit an abstract by 29th of June (submissions at: https://esww2023.org/submit-an-abstract). Kind regards, Marco, Veronique, Baptiste
- Potential community engagement workshop for generic model representation (Ringuette) - maybe this fall?
Topics:
- brief report-out on schema work; metadata validation being captured in JSON schema project; currently, you have to have a server up and running to test metadata; separating metadata validation would enable validation of static JSON metadata before a server is set up; note: schema serves as definitive documentation
- June 26 - Jon out at CEDAR; will try to connect to Madrigal developers
- HAPI Community forum - need to restart; use regular time slot?
- other future ones: Ralf, Brent, PHP solar person, Rebecca Ringuette for access to model output - eventually; HAPI in the cloud; revisit of model / data comparisons
- list of software that is using HAPI - put on web page to help connect
- HAPI output from Kamodo
- HAPI for data model comparison (also Kamodo)
- ML interest in HAPI data - time to leverage this
- HAPI reader in Paraview for model ingestion! (Bob started on this with a student! Trajectory colored by magnetosphere region)
- Needed - HAPI web site update (APL - but also need help! Sandy, maybe leveraging APL person)
- Sandy will create a new repo and get the website working there; then we learn how to migrate the established URL; look into using Jekyll (allows GitHub content to be served as a web page); otherwise better to make a custom page (regular HTML + CSS) since it's easier to test / view / debug / maintain
Actions:
- (Jon / Sandy at the IHDEA Cloud working group meeting) Have Baptiste give EPN-TAP demo?
- (Jon) CSA folks give demo on July 17 - Darren C., Emma S., Eric D.
- ESA Space Weather presentation - still password protected; hard to find services; no uniform API; lots to do still to make it a useful way for computer access
- OpenAPI - no more details yet; not super compatible with HAPI, as far as we can tell
- Python library development talk by Bob - stay tuned
- IHDEA registry working group - Bob attended; relevant for HAPI if people want to find data across HAPI Servers, or more broadly across access types; EPN-TAP is a possibility -- search terms are based on TAP (an astronomy std) but are different than SPASE, and there are lots of astronomy concepts in there too;
- URI templates at CDAWeb (Andre Koval, curation scientist, deals with magnetometer data, ISTP Guidelines, SPASE, etc), and these should be in SPASE! But SPASE doesn't have a slot for this yet; Bobby - put in a ticket about adding a way to provide a URI template for a SPASE dataset: https://github.com/spase-group/spase-base-model/issues/25
- schema: Bob made some more sample datasets for testing; best to test schema with static file using StackBlitz site (needs account, but has lots of capability); node.js can always be used to try the schema on various JSON samples
- HAPI server tester (by Santiago last summer) is online here now: https://github.com/jbfaden/HAPI-Server-Tester and needs moving to the HAPI project on GitHub; could be fused with some stuff Bob has, or possibly redone
- Bob working on verifier hitting the `about/` endpoint
- Bob also looking at upgrading verifier to do correct check for time varying bins; can't be done in schema - needs programming logic to see if the string in the `bins` elements is a valid parameter of the right size (same as data) and type (should be double); spec allows a bin dimension to not have bins; make sure to not allow array elements of `centers` to be `null`
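The bins check described above can't live in the schema; here is a sketch of what such programmatic verifier logic might look like (the sample info layout and field handling are assumptions for illustration, not the verifier's actual code):

```python
# Sketch of the verifier logic described above: when a bins object's
# "centers" entry is a string, it must name another parameter in the
# same dataset whose type is double and whose size matches the binned
# dimension; centers arrays must not contain null. The sample info
# below is hypothetical, not from a real server.
def check_time_varying_bins(info):
    errors = []
    by_name = {p["name"]: p for p in info["parameters"]}
    for p in info["parameters"]:
        for i, b in enumerate(p.get("bins", [])):
            centers = b.get("centers")  # spec allows a dimension with no bins
            if centers is None:
                continue
            if isinstance(centers, str):
                ref = by_name.get(centers)
                if ref is None:
                    errors.append(f"{p['name']}: centers '{centers}' is not a parameter")
                elif ref.get("type") != "double":
                    errors.append(f"{centers}: time-varying centers must be type double")
                elif ref.get("size", [None])[-1] != p["size"][i]:
                    errors.append(f"{centers}: size does not match bin dimension {i}")
            elif isinstance(centers, list) and None in centers:
                errors.append(f"{p['name']}: centers array must not contain null")
    return errors

info = {"parameters": [
    {"name": "Time", "type": "isotime"},
    {"name": "flux", "type": "double", "size": [8],
     "bins": [{"name": "energy", "centers": "energy_centers"}]},
    {"name": "energy_centers", "type": "double", "size": [8]}]}
print(check_time_varying_bins(info))  # [] means the sample is consistent
```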
- Discuss ESA Space Weather Services thread on hapi-dev.io
- Continue discussion of Bernie's interface and SPASE
- Discuss PySat Madrigal
- Summary of discussion with OpenAPI person at GMU / GSFC
- Bob's interest in presenting to PyHC about separable software libraries
SPASE and HAPI Discussion
- can HAPI be used to create SPASE records? (we could use SPASE for semantic descriptions of data that would enable Science Data Interfaces - part of our idea for NSF proposal to bring HAPI to more ground-based observations)
- SPASE and virtual variables -- SPASE includes virtual vars, but the plain CDFs may not have these in them!
- Bob and Jeremy working on improving Nand's metadata by using SPASE and the data itself
Actions:
- (Jon) get Bob connected to IHDEA working group on federated search
- (Jon) Post or link to talk for PyHC on hapi-server.org?
- (Bob) working on connecting / updating code generation mechanism linked from Bernie's HAPI dataset landing page at HDP
- (Doug, with Bob) Can LISIRD HAPI server show documentation links for each dataset? (HAPI allows a URL for further info - CDAWeb has this, ISWA)
- (Jeremy) Keep pushing CCA folks with the HAPI server at ESA
- (Bob) update doc links for SSCWeb pass-through HAPI server (at hapi-server.org/servers) so they point to a more specific page, not just top SSCWeb page
- (Jeremy / Sandy) Where is Santiago's HAPI checker script? (which repo?)
- Jon - Community forum invite to Ralf Keil; they have a HAPI server, but it requires login; API credentialing is complex and requires an account and additional software (OpenAM).
- Jon and Jeremy - Schema updates
- Jon - look into mini GEM for Fall AGU?
- Bob - finish 3.0 verifier and server action items listed at last 3.2 issue
- Jeremy - find datasets with time varying bins. How does Nand handle them now? (He does not handle them; example: FESA uses a time-varying DEPEND_1.)
- All - List of ideas for proposals
- Future follow-ups:
  - Bob - Follow-up about status of PHP HAPI server after 3.1 verifier is complete (Daniel Garcia Briseno)
  - Real-time stream of HAPI data email list question
  - Get update from Doug
- Jeremy - find datasets with time varying bins. How does Nand handle them now?
- Bob - finish 3.0 verifier and server action items listed at last 3.2 issue
- Finalize plan for PyHC (Jon and Bob), CDAR (Jon), GEM (none), SHINE (none), Fall AGU (Jon & Bob)
- Follow-ups: 1. Status of PHP HAPI server 2. Real-time stream of HAPI data
- Discuss assignments for last 3.2 issue
Action items to complete before next meeting:
- Bob - Move schema from verifier repo to data-specification-schema
- Jon - Create pull request for Issue 131 so we can close in next telecon
- Finish Ping issue and merge the pull request.
- Plan for PyHC 2023 meeting?
- Schedule meeting with Eugene Yu
- vote / approve PR for issue #116
- OpenAPI and HAPI – looking for overlap (Bob talked to Eugene Yu about this)
- Issue #70 for availability files – define this using a schema concept; also could do this with file listings and image listings; and eventually for science data types
- Other outstanding issues – which ones to target for a May 1 release of 3.2
The PR for issue #116 about adding images via URI strings is ready to approve. https://github.com/hapi-server/data-specification/pull/166
The branch of the spec with the changes is here: https://github.com/hapi-server/data-specification/blob/issue-116/hapi-dev/HAPI-data-access-spec-dev.md
Action items:
- Jeremy: tweak appendix entry about HAPI robots: https://github.com/hapi-server/data-specification/blob/hapi-robots-statement-bug-148/hapi-dev/HAPI-data-access-spec-dev.md#85-robot-clients-should-identify-themselves
- Jon: move all 3.x milestones to 3.3
- Scott to provide example of complex number for defining issue #112
Agenda:
- any DASH sessions on HAPI or data services in general (including fido, EPN-TAP, etc)? Not yet... so could Doug push for this?
- issue #116 - ISO times are just string, so should this just be another specialty string type, or is ISO time special enough that we keep it separate?
- HAPI interpolator using Kamodo code: https://www.youtube.com/watch?v=p5HlAGRZGuc
- availability info discussion - issue #70
- separate meeting for refining 116
Action items:
- Doug to seek inspiration for HAPI session at DASH
- new ticket for additional types (or does one exist?)
- new ticket for moving `timeStampLocation` into a specific time parameter
- Thu 3pm, Jon and Jeremy to talk about issue #116
talk about HAPI presence at GEM / SHINE / CEDAR / PyHC
Ready for pull request: issue #131
easier ones: schema reference,
Action items:
- Jeremy to add to pull request for images / URIs about an optional `base` attribute in the `stringType` for `uri`. The base should be a directory (not filename fragment) and should include a trailing slash so that base + uri string is exactly the URI to get (no need for clients to insert a "/" in between); consult Eelco on this to see if he would use it or likes it
- Bob to update pull request for catalog listing of `info` content; maybe have different catalog levels? consider what to put in the `capabilities` endpoint
- Jon and Bob to coordinate on who will attend the PyHC, CEDAR, GEM, SHINE meetings.
- Jon to send Wed 4pm meeting invite to work on open pull requests
- HAPI community forum - OK to move to 2nd Tuesday; topics to consider: Brent and GAMERA HAPI; Simon and Intermagnet; new HAPI features for 3.2; HAPI JavaScript plotting client; the new PHP server
- Jon to set up meeting with Rob R
- Jon to set up meeting with SunPy / fido developers to talk about image / uri serving
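The proposed `base` convention for URI strings (a directory ending in a trailing slash, so clients build the full URI by plain concatenation without inserting a "/") can be sketched as follows; the attribute semantics follow the action item above and are tentative, not in the spec:

```python
# Sketch of the proposed 'base' attribute for uri-valued strings: the
# base is a directory URL ending in "/", so a client resolves the full
# URI by simple concatenation and never has to guess about separators.
# Attribute semantics per the discussion above; names are tentative.
def resolve_uri(base, uri):
    if not base.endswith("/"):
        raise ValueError("base must be a directory with a trailing slash")
    return base + uri

print(resolve_uri("https://example.org/images/", "2023/01/img_001.fits"))
# https://example.org/images/2023/01/img_001.fits
```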
Agenda:
- Issue #116 (images and URIs) - pull request is there - are we ready to commit? what about adding a `base` option for the `uri` stringType? (Jeremy has test server)
- issue #153 (catalog lists all info) - pull request is there too
- Discuss Julie's email for PyHC meeting (May 16-18, Boulder, CO and Zoom); she wants a HAPI status summary (since it's a core project)
- HAPI Community Forum: move to 2nd Tuesday of the month? potential topics:
- March -- Brent Smith
- April -- Simon and Intermagnet
- May - Presentation of HAPI 3.2 with images capability (preview of PyHC meeting material)
- Go over the list of outstanding issues for 3.2 and 3.3; are we moving fast enough to get to 3.2?
- a meeting on Wed at 3pm seems to work well for making progress; I suggest we pick an issue and try to close it on Wed.
- if time, these items:
- PHP server - is that integrated yet?
- Request for real-time stream of HAPI data - continuous connection to data accumulating in real-time; there are already APIs for this - we should look at those; needs another ticket
- engage with NOAA and Rob R - who wants an invite?
Discussed Issue #97 and approved and merged Pull Request #163 https://github.com/hapi-server/data-specification/issues/97
Talked about Issue #153 https://github.com/hapi-server/data-specification/issues/153 Needs something added to capabilities endpoint to indicate whether servers can list everything in the catalog.
Dove into details of implementing Issue #116 https://github.com/hapi-server/data-specification/issues/116 See the ticket for what we decided. (Hint: we won't add a new type, but add an optional `stringInfo` block to indicate a string is a URI.)
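Assuming the optional block ends up roughly as the ticket describes, a client might detect URI-valued string parameters like this; the exact fields inside `stringInfo` were still being decided, so the `"format": "uri"` layout here is an assumption:

```python
# Sketch of a client spotting URI-valued string parameters once the
# optional stringInfo block from issue #116 lands; the field layout
# inside stringInfo is an assumption, not final spec language.
def uri_parameters(info):
    return [p["name"] for p in info["parameters"]
            if p.get("type") == "string"
            and p.get("stringInfo", {}).get("format") == "uri"]

info = {"parameters": [
    {"name": "Time", "type": "isotime"},
    {"name": "image_url", "type": "string", "length": 64,
     "stringInfo": {"format": "uri"}}]}
print(uri_parameters(info))  # ['image_url']
```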
Simon let us know that the Intermagnet HAPI server is now live. Pending a last check with the verifier, we will add this to the list of production servers!
Also, Simon will talk at next week's meeting to describe his experience implementing HAPI for Intermagnet.
Action Items: Jon to discuss with Doug about attending Space Weather Week with a promotion / marketing hat on to see where else HAPI can help. Potential future HAPI community forum on the uses of HAPI by scientists - showcasing what data is available.
Agenda
- Intermagnet update
- HAPI servers and certificates
- PHP HAPI server - has anyone looked at it yet? list with other servers?
- pull request for `hapidump`
- survey results for time range error code
- outstanding issues
3.1.1: https://github.com/hapi-server/data-specification/issues/148 (HAPI robots identity and behavior)
3.1.1: https://github.com/hapi-server/data-specification/issues/135 (Way to ping server)
Schedule Simon
Email about Zenodo, changelog, and releases
https://github.com/hapi-server/data-specification/issues/134 (HAPI responses should include the JSON schema reference)
https://github.com/hapi-server/data-specification/issues/57 (should HAPI metadata include details on numeric precision?)
https://github.com/hapi-server/data-specification/issues/112 (how to best represent complex numbers and possibly other larger structures)
https://github.com/hapi-server/data-specification/issues/130
3.2: https://github.com/hapi-server/data-specification/issues/155 (Clarifications on response formats). Should this be put in 3.1.1?
3.1.1: https://github.com/hapi-server/data-specification/issues/97 (clarify error states for time-related error codes 1402 through 1405)
Discuss
Keeping a list of people who we have met with and where discussion left off. We should revisit list to determine if a follow-up is needed.
My Priority list for 3.2 (with the rule that if it isn't finished by June 1st, it gets moved to 3.3):
https://github.com/hapi-server/data-specification/issues/131 (Citation required aka Terms of Use)
https://github.com/hapi-server/data-specification/issues/153 (allow catalog to optionally return all the info data at once)
https://github.com/hapi-server/data-specification/issues/118 (associate related datasets that have different sampling modes over time)
https://github.com/hapi-server/data-specification/issues/98 (how to handle parallel requests and advertise server capability) https://github.com/hapi-server/data-specification/issues/134 (HAPI responses should include the JSON schema reference)
My Priority list for 3.3:
https://github.com/hapi-server/data-specification/issues/70 (Proposal for availability files)
https://github.com/hapi-server/data-specification/issues/116 (can HAPI also serve a time series of FITS images or other remote sensing data). This is important but will require coordination with groups that would use it. I'd prefer not to have this issue blocking the main thread until a sub-committee has had a chance to study it.
Agenda:
- Continue with priority list of issues for 3.2 (don't get too ambitious; set hard deadline of June 1)
- Check up on issue #148 and issue #130
Action items: all - consider issue list and select things to focus on by June 1 for 3.2
Topics: start priority list!
The next HAPI community forum on Jan 10, 11am: Brent Smith and HAPI use in the Cloud for data - model comparisons.
Bob and I can report on our conversation this morning with Simon from the Intermagnet project. He’s been brought out of retirement for some specific development, one item of which is adding HAPI, so that’s great.
Usage statistics for attribution are going to be key for Intermagnet and others too, likely. One solution is to have the cache still hit the server which just says that the cache is up to date, but the server can then note the request and log it. This is consistent with what we are doing now where it's up to servers to do this.
One idea (Bernie is doing this already): use the `If-Modified-Since` request header.
See these: https://github.com/hapi-server/data-specification/issues/68 https://github.com/hapi-server/data-specification/wiki/cache-specification
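A sketch of the conditional-request idea above: the cache still contacts the server (so the request can be logged for usage statistics), but a 304 reply means the cached copy is current and no data is re-transferred. The URL handling is generic, not specific to any HAPI server:

```python
import urllib.request
import urllib.error
from email.utils import formatdate

# Sketch of the If-Modified-Since flow: send the cached copy's
# modification time; a 304 response means reuse the cache, while a
# 200 response carries fresh data. The URL is a generic placeholder.
def conditional_request(url, cached_mtime_epoch):
    req = urllib.request.Request(url)
    req.add_header("If-Modified-Since",
                   formatdate(cached_mtime_epoch, usegmt=True))
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.read()           # 200: server sent fresh data
    except urllib.error.HTTPError as e:
        if e.code == 304:
            return None                  # 304: reuse the cached copy
        raise

# The header value is a standard HTTP-date, e.g.:
print(formatdate(0, usegmt=True))  # Thu, 01 Jan 1970 00:00:00 GMT
```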
Image serving ideas I had at the AGU: People are definitely going to do this, so we need to provide guidance. Rob Redmon at NOAA said his team doesn’t want to use HAPI since it doesn’t support images, so I think, since Eelco has already figured out a way to support space weather imagery, we should capture that, make it official, and emphasize it to Rob, so that the NOAA missions will be able to leverage HAPI.
Topics for today - looking ahead to next year:
- lots of discussion about how to manage complexity of metadata additions (like cadence linkages); the cadence stuff is close, and could best be decided on if we had a good straw man to analyze (this will be next year).
- need a prioritization of existing issues for 3.2 - a mix of hard and easy, so we don't spin wheels on all hard things!
- allow a new way to add capabilities using `x_new_endpoint` that lets us (and others!) try new things that build on HAPI
- think about new HAPI-focused APIs that are related, like a file finder service, or an image serving service (need lots of input from VSO, since they actually have this already - maybe see if they want it generified)
Action items:
- Jon - PDF on Zenodo
- Bob - JSON schema for 3.1 (should this go on Zenodo?)
agenda items for today:
- short review of where we are on issue #78 https://github.com/hapi-server/data-specification/issues/78
- Need PDF of 3.1 spec on Zenodo – I can do that if no one else wants to
- AGU plans – let’s meet at the posters!
- List of HAPI posters at AGU:
- SH42E-2339: Increasing Heliophysics Python Library Interoperability Through Datamodel Adapters
- Sandy Antunes
- Poster Hall‚ Hall - A - McCormick Place, Thu, Dec 15, 9:00am - 12:30pm (Central)
- SH52A-66: Making Heliophysics Data Easier to Use: Updates on the HAPI Specification
- Jon Vandegriff
- Screen 0066‚ Poster Hall‚ Digital Poster Monitor Zone 4 - McCormick Place, Fri, Dec 16, 9:00am - 12:30pm (Central)
- SH52A-72: Accessing Data at PDS/PPI Using the Heliophysics Application Programmer’s Interface (HAPI)
- Steve Joy
- Screen 0072‚ Poster Hall‚ Digital Poster Monitor Zone 4 - McCormick Place, Fri, Dec 16, 9:00am - 12:30pm (Central)
- SH41C-02: HAPI-NN: Neural Network Training and Testing Package for HAPI Users
- Travis Hammond
- Online Only, SH41C-02, Thu, Dec 15, 8:10am - 8:20am (Central)
- SH45B-02: Taming the Monster: GAMERA model access using Heliophysics Standards
- Brent Smith
- McCormick Place - S403a, Thu, Dec 15, 2:55pm - 3:05pm (Central)
- HAPI 3.1 now released
- Linking datasets by cadence
- Future of open tickets - need to go through and prioritize; small group do this first (probably January)
Discussion Topics:
- DevOps - ways to streamline releases: (a) make sure all future checking is tied to issues and squash-merged PRs; (b) version numbers replaceable by REGEX
- cadences
Agenda:
- pick meeting time once or twice per month for European participation; choose some of these times, and then we can ask for preferences from our friends across the pond
- any topics for next HAPI community forum? Maybe pass since AGU just happened? But we did not have one in November.
- HAPI 3.1 release still not actually done - mired in changelog cleanup
- new rigor being applied for all new changes to the spec: significant changes must come through a PR tied to an issue, and the merge of that PR should be done to squash the commits to a single commit on the master branch; there is a page for instructions:
- does anyone want to work on a proposal to extend and standardize Eelco's tool (assuming he wants to do this)?
- continue discussion of linking datasets by cadence; potential `semantics` endpoint
- new topic: federated HAPI servers: add another endpoint so servers can advertise the other HAPI servers they know about
Outcomes:
- about the alt. meeting time: Jon will make Doodle poll to find an alternate, European-focused time for the HAPI telecon
- no December HAPI Community forum; will shoot for a HAPI focused model-data comparison presentation in January; Jon and Sandy to meet with Brent and Eric about their uses of HAPI for GAMERA output; have Rebecca R. and maybe others from CCMC (Darren?) also attend for discussion on how best to use HAPI for model output; recall there are at least 2 ways HAPI can be used for presenting model output:
- have HAPI serve the simulated measurement data along a spacecraft trajectory (to be compared with actual data measured along that trajectory)
- have HAPI serve out model data at each time step; the density parameter is a huge variable with the density for every point in the grid; the grid would need to be non-time-varying for HAPI to serve it sensibly (or at least easily)
- Jon to release 3.1 with regularized changelog
- people agreed about using PRs with collapsed commits to document all spec changes; spelling changes (i.e., inconsequential changes) should still be done as needed - there are ways to merge these to other branches easily enough
- Bob may work on this, possibly after the break
- lot of discussion - see below
- only mentioned at the end; requiring a PR on `all.txt` is not terrible. Having another endpoint `otherservers` would be simple enough.
About linking datasets via cadence:
Still went back and forth about having a very basic cadence connection in the `info`, but because this introduces two-way links, it gets messy fast.
There are difficulties with the `semantics` endpoint:
- someone has to maintain this because of potential broken links, assuming we allow links to datasets on other servers
- for large data providers, there could potentially be a huge number of entries in the `semantics` endpoint
- who will make this metadata?
- Jon and Jeremy to meet this week to make progress on a trial approach that we can discuss next week
Other actions:
- Jeremy to see about adding a PNG with the most recent test results from servers in `all.txt` to the README.md page for the `servers` repo
- that same README.md needs instructions on how to add your own HAPI server to the `all.txt` file via a PR
Agenda:
- Review of discussion with KNMI about linking datasets via cadence, and possibly by other criteria (but that starts to be a larger-than-HAPI issue.)
- For coordinate systems and vectors, we still don’t have a way to link scalars that are separate parameters but together represent a single vector. I think we should add keywords for this to 3.1.
- Anyone want to push some proposals? HAPI amalgamator, client development to bolster / improve the KNMI open source tool and make it a stand-alone capability for any HAPI server, other ideas?
Wow - lots of discussion today on all kinds of ideas for managing linkages between datasets and parameters.
Solar imagery could be served by HAPI, since the VSO has a nearly identical request interface: t1, t2, dataset.
But note that for solar images, there is at least one other key request parameter: wavelength. This is so fundamental that maybe a server could advertise this as a separate `capabilities` option. But then really each of those different wavelengths probably has different time stamps, so isn't each cadence really a different dataset? LASP treats solar image datasets this way (each dataset contains just one wavelength set of images).
Also, if we allow for additional request parameters, this opens the can of worms for other non-standard request parameters, so it would need to be done very carefully, i.e., there should be a very strict way to do it that ensures that leaving those extra parameters off means you still have a fully functional HAPI server. I (Jon V) am actually against adding any kind of optional request parameters, since it really messes up the interoperability of HAPI, and that is the whole point.
If a dataset really does have just one time column and different wavelengths in different columns, then there could be advertised `predefinedSubsets` of a dataset that return just the items relevant for that subset. This is kind of generic, in that you might want just the "magnetometer data" subset, or just the "ephemeris" subset, or just the "wavelength=XYZ microns" subset.
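A toy sketch of how a client might consume such a `predefinedSubsets` advertisement; everything here (the keyword, the subset names, the parameters) is hypothetical, since this was only a floated idea:

```python
# Hypothetical sketch of the predefinedSubsets idea: a dataset
# advertises named subsets, and a client keeps only the parameters
# belonging to one subset. All names below are invented.
info = {
    "parameters": [
        {"name": "Time", "type": "isotime"},
        {"name": "Bx", "type": "double"},
        {"name": "By", "type": "double"},
        {"name": "X_GSE", "type": "double"},
    ],
    "predefinedSubsets": {
        "magnetometer": ["Time", "Bx", "By"],
        "ephemeris": ["Time", "X_GSE"],
    },
}

def subset_parameters(info, subset):
    wanted = set(info["predefinedSubsets"][subset])
    return [p for p in info["parameters"] if p["name"] in wanted]

print([p["name"] for p in subset_parameters(info, "ephemeris")])  # ['Time', 'X_GSE']
```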
For the discussion about linking datasets of different cadences, here's my partial summary, starting with a high level list of possible approaches:
- datasets can be linked via a post-fix on the dataset name using an ISO duration; this works now, but is too fragile / difficult to enforce, so no one wants to do just this; we still think datasets should be named this way for clarity
- advertising within the `info` response of a single dataset that there are other versions available; this would involve some kind of `otherCadences` block that lists the other datasets; complications quickly arise if the other cadence datasets
- we could also link individual parameters in a similar way: this parameter has another cadence version available in this other dataset (and it also has another name in that dataset, and by the way, here is what the other cadence is)
- we could expose different cadences as possible discrete filters on a standardized averaging filter that is exposed in the `capabilities` response; averaging is so fundamental for time series data that allowing servers to optionally support averaging could be reasonable
- there was interest by several folks (Jeremy, Doug, and Eelco's software used this) to have averaged datasets also offer min/max/std_dev, etc; so there was interest in being able to express that certain parameters are actually statistical quantities resulting from other parameters in another dataset
If datasets are related by different cadences, then in the `info` block for the high-res DSNAME dataset:
```json
"otherCadences": [
  { "server": "URL", "dataset": "DSNAME_PT1M" },
  { "server": "URL", "dataset": "DSNAME_P1D" }
]
```
Note that you should not list the cadence of the other dataset, since that is available in the `info` response for that dataset. There are other problems with this approach: the averaged datasets will have extra parameters (avg, min, max, std_dev, maybe some uncertainties). This block has to be replicated in multiple `info` responses.
What we realized eventually is that these linkages really do not belong in the `info` response for a particular dataset, since they are introducing dependencies into that info response from outside. The `info` response should only be about the data it is describing. The linkages belong at a higher level, perhaps through another endpoint. This `semantics` endpoint would be responsible for capturing the meanings of datasets and parameters, as well as connections between datasets, both at the full dataset level (different cadences with the same exact parameters), and potentially also at the parameter level (these parameters are statistical values from some higher time resolution parameters). This would need some thought. It might be possible to also start including our science data interfaces concepts at this level: this collection of parameters identifies a magnetic field vector; this set represents an energetic particle spectrum; this set is a plasma wave spectrum; this set is a spacecraft ephemeris.
Things we don't want to forget:
- we need a way to indicate that a dataset has a fixed, regular cadence (actually, why do we need this? clients should always code defensively against "phase shifts" in supposedly regular data. I think this is an analysis issue and not a data serving issue)
- we already have a `cadence` keyword, so we could add a `fixedCadence` keyword, or a `minCadence` and `maxCadence`, and if those are the same, then clients can assume that the cadence is fixed
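A sketch of how a client could use the proposed `minCadence`/`maxCadence` keywords (not in the spec; just the idea above):

```python
# Sketch of the idea above: if a dataset's info carried the proposed
# minCadence and maxCadence keywords (not in the spec yet) and they
# are equal, a client may treat the cadence as fixed. Clients should
# still code defensively against gaps in "regular" data.
def has_fixed_cadence(info):
    lo, hi = info.get("minCadence"), info.get("maxCadence")
    return lo is not None and lo == hi

print(has_fixed_cadence({"cadence": "PT1M",
                         "minCadence": "PT1M", "maxCadence": "PT1M"}))  # True
print(has_fixed_cadence({"cadence": "PT1M"}))  # False
```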
Here is the next thing to try: What would this extra endpoint look like for a known case: ground magnetometer data at three cadences: 1sec, 1 min, 1 hour
http://server.org/hapi/semantics
```javascript
"relationships": [
  { "alternateCadences": {
      "highestResolution": { "server": "URL", "dataset": "DSNAME_PT1S" },
      "otherCadences": [
        { "server": "URL", "dataset": "DSNAME_PT1M" },
        { "server": "URL", "dataset": "DSNAME_P1D" }
      ],
      "parameterLinkages": {
        // maybe have ways to indicate that parameters in this dataset are averages of the highest resolution?
      }
    }
  }
]
```
Or just have a way to indicate that `parameters` in one dataset are linked to other datasets through various enumerated relationships, like `isAverage`, `isMin`, `isMax`, or `isStdDev`.
```json
"relationships": [
  { "isAverage": { "source": { "server": "URL", "dataset": "DSNAME" }, "derived": { "server": "URL", "dataset": "DSNAME_PT1M" } } },
  { "isAverage": { "source": { "server": "URL", "dataset": "DSNAME" }, "derived": { "server": "URL", "dataset": "DSNAME_P1D" } } }
]
```
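A client-side sketch of walking that enumerated-relationship structure to find the datasets derived from a given source; the JSON shape mirrors the proposal above and is still only a proposal:

```python
# Sketch of a client walking the enumerated-relationship form above
# to find every dataset derived from a given source. The layout
# mirrors the proposed example and is not part of the spec.
relationships = [
    {"isAverage": {"source": {"server": "URL", "dataset": "DSNAME"},
                   "derived": {"server": "URL", "dataset": "DSNAME_PT1M"}}},
    {"isAverage": {"source": {"server": "URL", "dataset": "DSNAME"},
                   "derived": {"server": "URL", "dataset": "DSNAME_P1D"}}},
]

def derived_from(relationships, source_dataset, kind="isAverage"):
    return [r[kind]["derived"]["dataset"] for r in relationships
            if kind in r and r[kind]["source"]["dataset"] == source_dataset]

print(derived_from(relationships, "DSNAME"))  # ['DSNAME_PT1M', 'DSNAME_P1D']
```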
Agenda:
The coord frames branch is now merged into the main branch, and the list of changes for 3.1 is added, so we are ready to approve 3.1 for release. We will need a new directory for 3.1, and we need to update the web site.
Questions about release:
- ok to edit TOC with new changelog element (or does Bob need to do that?)
- created wiki page for new release steps (needs review)
Also, we can assemble our topics for the discussion Thursday with Eelco and others about future HAPI features. Here’s what I recall:
- linking datasets on a server based on different cadences
- how to use HAPI for images and other items that have a duration
- managing dataset-specific options – is there a way to do this that does not mess up the standard?
- can the nice display client used at KNMI be isolated as a separate open-source mechanism for the display of HAPI data?
Finally, we can talk about future proposals people are contemplating.
Action items remaining: Jon: new YouTube account, talk to Julie about PyHC hack-a-thon topics (HAPI looking doubtful)
Discussion:
LATiS has some internal merging capabilities, server-side, where you could (eventually) merge different streams via a post request
no meeting
Topics:
-
I’d like to set up a meeting time with Eelco about some of the ideas he has on standards for different cadences. This needs to be earlier in the European day! (Oct 27 Thu at 9am, alternate is Oct 26 Wed, 9am; both times are Eastern)
Topics:
very nice timeseries display client (can that be separated?)
different cadences linked in the catalog somehow (naming convention); display of images; how to manage stop times and intervals
-
I can give a brief report on the outcome of the IHDEA meeting. There’s a working group meeting on registries, and we need that. I think a federated approach is best, where there is a central copy of all HAPI servers, but every HAPI server keeps a copy of this list, and thus can know about all the others. This is sort of what we are doing now, but we don’t really have mechanisms for letting people browse or find multiple HAPI servers, since we’ve stayed away from the search problem and focused on access.
-
At some point, we need to go over a list of things to focus on for the next release.
-
Discuss PyHC stuff: Nov. Hackathon email from Julie and the movement of HAPI into the PyHC core
ask Jeremy about small parts of the URI templates or the HAPI server elements
HAPI amalgamation - combine two HAPI streams; start with summer school examples (it was one of the HAPI tutorial problems - compare ephemeris or measurement data from two places; model-data comparison; put two data sources onto same cadence)
See this example page:
https://github.com/hapi-server/tutorial-python/blob/main/HAPI_04.ipynb -
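One simple way to put two HAPI streams onto the same cadence, as in the amalgamation idea above, is to average both into common time bins. This sketch uses numeric times and invented sample data rather than real HAPI responses (which use ISO 8601 time strings):

```python
from statistics import mean

# Sketch of the amalgamation idea: put two time series onto the same
# cadence by averaging samples into common bins. Times are seconds
# since an epoch for simplicity; the sample values are invented.
def rebin(times, values, cadence):
    bins = {}
    for t, v in zip(times, values):
        bins.setdefault(int(t // cadence) * cadence, []).append(v)
    return {t: mean(vs) for t, vs in sorted(bins.items())}

def amalgamate(series_a, series_b, cadence):
    a, b = rebin(*series_a, cadence), rebin(*series_b, cadence)
    common = sorted(set(a) & set(b))  # keep only bins both streams cover
    return [(t, a[t], b[t]) for t in common]

sc1 = ([0, 10, 20, 30, 70], [1.0, 2.0, 3.0, 4.0, 9.0])
sc2 = ([5, 15, 65], [10.0, 20.0, 30.0])
print(amalgamate(sc1, sc2, 60))  # [(0, 2.5, 15.0), (60, 9.0, 30.0)]
```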
move the meeting into practical tasks involved in getting 3.1 finalized
Registry ideas:
Doug: at LASP, linked open data (ontology-based approach) using DCAT; Google Data now understands DCAT (as well as schema.org)
SPARQL endpoint that can be queried (for any RDF data)
ability to cross-walk from DCAT to SPASE schema; RDF is good for this since inference is possible
SPASE does have a search mechanism (not explored in the telecon): https://spase-group.org/tools/registry/index.html
Done - HAPI moved to PyHC core packages!
List of actions: Jon: send Julie hackathon note Jon: migrate YouTube account; HAPI YouTube account needs re-doing since current account no longer accessible. Jon: finalize 3.1 and release
agenda for next time:
Discuss Nov. Hackathon email from Julie
At today's telecon, we decided to move forward with the standard list of vectorComponents, and delay inclusion of customAngles pending further external review. The customAngles aspect can easily be added later without changes to the baseline coord frame and vector components parts of the spec, and we don't want to have two versions of this floating around.
There is an effort to build a coordinate transform capability, a service, and the creator (who Bob knows) wants it to take advantage of HAPI. We talked for 20 minutes about ways we could do this. We really need to talk to the service owner for definitive information. Here's what we know from Bob. The service will compute things like L* and pitch angles for specific spacecraft given an input of time values and position values. (We assume these are tied to real spacecraft, i.e., ones that have ephemeris data in SSCWeb, for example.) Since a lot of inputs are needed, this is very non-HAPI-like in terms of making a request, but the result is just a time series table: the times (same as input), probably the positions too (from the input), and then whatever variables the service computed as more columns in the table.
There was interesting discussion about using a temporary HAPI server as both an input source and an output source. For input, give the service a URL to a HAPI server, and that call results in a table that has time and positions for the spacecraft of interest. Stated differently, that HAPI call generates the desired input values of time and spacecraft position. Constraints here are that the time values are whatever that HAPI service returns. Could you make that totally custom, where you submit data to a kind of "instant HAPI" server that wraps the input with a full HAPI server that is then available for 5 minutes or so? That would be an interesting kind of HAPI server that takes data and instantly re-serves it via HAPI!
If users were content with an enumerated set of cadences (1 sec, 5 sec, 1 min, 5 min, 10 min, 30 min, 1 hour), then you could have that kind of ephemeris data for every spacecraft of interest. (This still assumes you have a set of spacecraft - maybe the user wants to give an arbitrary path through space that was not from a mission? Need to check about what the intent is.) Then it really is a HAPI server, since the outputs at all those cadences are sort of fully known ahead of time, and you don't have to take any non-HAPI-like inputs, i.e., no arbitrary set of time values.
Another option would be to publish the result of the service as a kind of instant-and-temporary-HAPI-server. The service responds with a link to the temporary HAPI server, and then the catalog of that server is empty until the data request is done, and then it populates the catalog.
- Discuss Eelco's hapi-dev email
- everyone approved the idea of HAPI being pulled into PyHC as a core package. Bob to report back to Shaun about that.
- Doug Lindholm is interested in making LASP's LaTiS3 server (which is still in development) a generic HAPI server. It is very close to this, so we hope to do an HTM proposal on it. Doug to pursue this with Bob and Jeremy.
- Jeremy can share with Doug his upcoming HAPI server overview (to be given at the IHDEA meeting in a few weeks)
- Bob has made extensive comments / revisions on Jon's coord frame pull request. Jon to review.
- Jeremy pinged Arnoud about the ESAC HAPI server. Hoping still for more action there.
- lots of discussion on listing URLs via HAPI. Comments captured (again) in ticket https://github.com/hapi-server/data-specification/issues/116
- hopefully vote on 3.1 next week!
- short meeting; mostly just touching base since Jon's just back and Bob not present; coord frame stuff still needs final approval before 3.1 is approved.
- Server update best practices document?
- Any update on fix of time length in CDAWeb server
- Close out https://github.com/hapi-server/data-specification/issues/148 based on discussion at last telecon.
- Bob made commits to https://github.com/hapi-server/data-specification/pull/132/commits. Several items need discussion. Will wait until Jon is back on.
- HAPI as core PyHC package
- Jeremy - any updates on Santiago's code?
Still Pending:
- Jon's meeting with Aaron
- Jon: attend SPASE meeting on 8/4 and request DOI for individual releases; mention broken links in docs about coordinate frames - those should all be DOIs, also there is inconsistent referencing
- Jon: merge pull request for issue #117
- Bob: create issue for disallowing `00:00:00.Z` and add to spec
- Bob: Send request for recent presentations
- Bob: Respond to H-API email
- New issue: https://github.com/hapi-server/data-specification/issues/148
- Review fix of time length in CDAWeb server (Jeremy will look into it and also the issue of the server locking up every day when catalog is updated.)
- Migrating a HAPI server best practices: https://groups.io/g/hapi-dev/message/289 (Doug said he would wait to update current HAPI server until new server has same data.)
Updates:
- Jeremy: Any updates from ESA? (Update: They are still working on it.)
- Bob made commits to https://github.com/hapi-server/data-specification/discussions/147. Several items need discussion. Will wait until Jon is back on.
Still Pending:
- Jon: attend SPASE meeting on 8/4 and request DOI for individual releases; mention broken links in docs about coordinate frames - those should all be DOIs, also there is inconsistent referencing
- Jon: merge pull request for issue #117
- Bob: create issue for disallowing `00:00:00.Z` and add to spec
- Bob: Send request for recent presentations
- Bob: Respond to H-API email
News:
- Jeremy: heard from Mark Taylor, developer of TopCat, who has heard of HAPI but not implemented it yet; check about this via Baptiste at IHDEA meeting in October
- Bob: explored Kepler data via FITS (which supports time series tables)
- Santiago: number of tic marks on LISIRD data is way too large (Action item Post issue to https://github.com/hapi-server/plot-python)
- Action item (Bob) Close loop on https://github.com/hapi-server/data-specification/discussions/147
Questions / Discussion:
- if other time systems or time units are present, HAPI could show those as parameter columns that are time values with different units; the units is the place to capture the time system; other time-related metadata could be made optional
- if you could go BJD to TT, then CDFlib could go TT to UTC
- should HAPI support a URL type to support links? maybe not use the `data` endpoint; maybe use a `list` endpoint? (that could list files such as events, event lists, image links); helpful for generating lists of input files
- could HAPI capture images so that pixels are registered?
- FITS references:
- https://fits.gsfc.nasa.gov/fits_wcs.html
- https://www.aanda.org/articles/aa/pdf/2015/02/aa24653-14.pdf
- https://www.aanda.org/articles/aa/full_html/2015/02/aa24653-14/aa24653-14.html
- ISTP Guidelines TIME_BASE, TIME_SCALE, REFERENCE_POSITION, UNITS, BIN_LOCATION
- https://spdf.gsfc.nasa.gov/istp_guide/vattributes.html
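The first idea above (exposing other time systems as ordinary parameter columns whose `units` names the time scale) might look like this in an `info` response. The `BJD_TDB` parameter and the helper heuristic are our own illustration, not spec text:

```python
# Hypothetical info response: a second time column carried as a plain double
# parameter, with the time system (TDB-based Barycentric Julian Date) in units.
info = {
    "HAPI": "3.0",
    "parameters": [
        {"name": "Time", "type": "isotime", "units": "UTC", "length": 24},
        {"name": "BJD_TDB", "type": "double", "units": "BJD(TDB) days"},
    ],
}

def time_like_parameters(info):
    """Illustrative heuristic: find columns that carry time in some scale."""
    scales = ("UTC", "TT", "TDB", "TAI")
    return [p["name"] for p in info["parameters"]
            if p["type"] == "isotime"
            or any(s in p.get("units", "") for s in scales)]
```

Other time-related metadata (epoch, reference position, etc.) could then stay optional, as discussed.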
Agenda for next time:
- review fix of time length in CDAWeb server: https://git.mysmce.com/spdf/hapi-nand/-/issues/10
- start advertising about HAPI on Helionauts
- Bob to talk about "H-API" for use of HAPI with FITS
Agenda:
- demo of HAPI aliveness tester
- pass through of HAPI Community Forum presentation on HAPI-NN mechanism
- approve pull request for issue #115 if it's not done over email
- Discuss https://www.nsf.gov/geo/geo-ci/index.jsp
Action items:
- (Jon and Bob) final tweaks to coord frame stuff (schema rework and then a few edits)
- (Jon and Bob) make change log (hopefully as collection of pull requests)
- (Jeremy) look at TopCat - can a HAPI reader be added to this application? (TopCat is still under active development!)
- (Bob) how hard would a pass-through server be for reading VOTable input and transforming it into HAPI
Action items:
- Jon: attend SPASE meeting on 8/4 and request DOI for individual releases; mention broken links in docs about coordinate frames - those should all be DOIs, also there is inconsistent referencing
- Jon: merge pull request for issue #117
- Jon: finalize issue #115
- Bob: create issue for disallowing `00:00:00.Z` and add to spec
- routinely put on future agendas (3x per year?): updates to recent presentations
- Bob: will start by requesting presentation links now from people
Discussions:
- P. O'Brien and also B. Smith both working on using HAPI for model output runs - let's keep them coordinated
- P. O'Brien and OpenAPI - he will try and work on it; OpenAPI explorations by Bob - looks like we could map HAPI to this! Bob will consult with him on this
- NOAA in R. Redmon's group; working on API for REST services;
Agenda:
- Finish issue #115
- Do we all agree to disallow `T00:00:00.Z`?
- List out AGU presentations: Jon: 1. HAPI Spec update 2. DataAdapters Project
Action items from last week:
- Jon: DataShop down. Remove from production server list; dumps full stack trace - security issue? Eventually, DataShop will move to a new server at AWS (different deployment scheme from APL)
- Jon: Completion of issue #117 pull request (last week mentioned that he found a typo but had issues with committing)
- Jon - Next community forum plan?
Updates
- Jon - outcome of SPASE meeting July 21 about DOI to reference for coord frames
- Jon - ISWAT update
New Issues if we have time
- Look into finding catalog of time series; maybe consider adding HAPI as an input format? http://www.star.bris.ac.uk/~mbt/topcat/
- All - Thoughts about server-issues repo for issue tracking
- Sandy - New web page overview. Were SEO tags and mobile device friendliness lost? Some incremental improvements, but it still looks like it was designed by a scientist; my recollection is that the objective of the revision was to make it look like it was not designed by a scientist ...
- All - Posting presentations and news on web page
- Bob - Discuss https://github.com/hapi-server/server-issues/
- Posting links to presentations on web page. Better to post a search link to NASA ADS so list won't go out-of-date?
- Bob - Discuss O'Brien's use of HAPI for CCMC runs (proposal): OpenAPI version of HAPI for use in micro services
- Bob - Discuss O'Brien's thoughts about extensions for subsetting
Notes:
15 minutes on updates
- COSPAR update; several folks are mentioning HAPI; ESA Space Weather Service Network (Alexi Glover and Ralf Keil) has a HAPI server, but it requires login; API credentialing is complex and requires an account and additional software (OpenAM). Bob suggested that this system could leverage what Arnaud's team is doing, allowing for open access (special permission was given for the ESAC team to use HAPI with no login). However, because some of the data is real-time, this may not be possible.
- Radio Science talk by Marco Molinaro who is using HAPI; Bob will follow up and invite Marco to a HAPI telecon
- Jeremy heard from ESAC
Action items:
- Jeremy to follow up with Beatriz in mid-August; ESA has permission to use HAPI on their server and intends to do so "by August"
- DataShop dumps full stack trace - security issue? Eventually DataShop will move to a new server at AWS (different deployment scheme from APL)
Agenda:
Focus on the coord frame issue.
Updates
- Jon: Completion of issue #117 pull request (last week mentioned that he found a typo but had issues with committing)
- All - Post presentations and news on web page
- Jon: ISWAT update
- Jon: Need to deal with DataShop. Has been down for a month; move out of production server list?
- People's thoughts about server-bugs repo for issue tracking
- Jeremy - CAIO server (Cluster server at ESAC)
- Jon - outcome of SPASE meeting July 21 about DOI to reference for coord frames
- Bob - COSPAR updates - ESA and Trieste Solar Radio System
- Sandy - New web page overview. Were SEO tags and mobile device friendliness lost? Some incremental improvements, but it still looks like it was designed by a scientist; my recollection is that the objective of the revision was to make it look like it was not designed by a scientist ...
New Issues if we have time:
- Bob - Discuss https://github.com/hapi-server/server-bugs/
- Posting links to presentations on web page. Better to post a search link to NASA ADS so list won't go out-of-date?
- Bob - Discuss O'Brien's use of HAPI for CCMC runs (proposal)
- Bob - Discuss O'Brien's thoughts about extensions for subsetting
Priority Issues (last 3.1 issue)
- All - Final review of issue #117 pull request
- All - Discuss issue #115
- Jon - Community forum plan for August
Updates:
- Jeremy - CAIO server (Cluster server at ESAC)
- Sandy - Web page with Brent Smith
- Jon - update on SPASE issue (https://github.com/spase-group/spase-base-model/issues). Should we make a copy and create our own DOI if no action in another month?
Issues not covered in last meeting
- Bob - have finalized tutorial posted. Discuss having one of us present at monthly session. https://github.com/hapi-server/tutorial-python
- Bob - issue of needing AWS and more permanent hosting of some HAPI servers and services
- Bob - `T00:00:00.Z` question; no libraries seem to accept this (especially JavaScript in Bob's test suite; Python code also fails on it); libraries all want `T00:00:00.0Z` or `T00:00:00Z`; even though it seems like a legitimate value, since no libraries accept it, we should modify our own regex to not allow this one as legit
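Since the fix is just a regex tweak, here is one way (our assumption, matching the behavior described above, and covering only the time-of-day portion) to require at least one digit after the fractional-seconds dot:

```python
# Require digits after the decimal point in the seconds field:
# "SS" and "SS.d..." are accepted, a bare "SS." before Z is not.
import re

TIME_RE = re.compile(r"^\d{2}:\d{2}:\d{2}(\.\d+)?Z$")

assert TIME_RE.match("00:00:00Z")       # fine: no fractional seconds
assert TIME_RE.match("00:00:00.0Z")     # fine: dot followed by a digit
assert not TIME_RE.match("00:00:00.Z")  # rejected: dot with no digits
```

The full isotime pattern in the spec would of course also carry the date portion; only the `(\.\d+)?` group matters for this change.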
Notes:
- Pull request for #117 - final tweaks on tick marks (a few are one char too early), then good to go
- CAIO server - need to check with ESAC still
- HAPI Community Forum, 11am Eastern : August 9: HAPI NN presentation by Travis Hammond; Question about putting things on common time grid? Could Travis show that too?
- HAPI Community Forum for September (Jon out that week): Java server by Jeremy; short blurb from Jeremy within a few weeks
- Action item: APL set up HAPI server and give Bob W account; multiple angles for doing this -- possibly use HelioCloud, either the GSFC instance or the APL instance!
- Related: is there place to get ephemeris data and coord transform matrices (see also SSCWeb Coord calculator and SPICE/NAIF WebGeoCalc); these are not HAPI servers; this could be a great Heliophysics Tools and Methods proposal, or SPDF could do this.
- locations of WebGeoCalc: https://wgc2.jpl.nasa.gov:8443/webgeocalc/documents/api-info.html
- location of SSCWeb coord calculator: https://sscweb.gsfc.nasa.gov/cgi-bin/CoordCalculator.cgi
- Generic need is for users who would want to put up a HAPI Server - can we support that? (a pre-configured HAPI-focused set of capabilities)
- Jeremy's heretical question: do we want to drop Day-of-Year acceptance for time values? No. Jeremy will find libraries that don't support it
- Discussion of server-bugs; do we want to keep using Bob's repo for server issues? half turn out to be client issues, but some are server things that we need to track outside of email
Priority Issues
- Bob - Discuss pull request for #136. Closed and merged.
- Jon - Discuss issue #117. Will close after final review next meeting.
- All - Discussed issue #115
Other action items from last meeting
- Sandy - Go over new website. No updates on web site, but he uploaded logos to https://github.com/hapi-server/hapi-server.github.io/tree/master/logos
None of the following were covered
Issues not covered in last meeting
- Bob - have finalized tutorial posted. Discuss having one of us present at monthly session.
- Bob - issue of needing AWS and more permanent hosting of some HAPI servers and services
- Bob - `T00:00:00.Z` question
- Jon - update on SPASE issue (https://github.com/spase-group/spase-base-model/issues). Should we make a copy and create our own DOI if no action in another month?
New Issues:
- Bob - Discuss https://github.com/hapi-server/server-bugs/
- Posting links to presentations on web page. Better to post a search link to NASA ADS so list won't go out-of-date?
- Bob - Discuss O'Brien's use of HAPI for CCMC runs (proposal)
- Bob - Discuss O'Brien's thoughts about extensions for subsetting
Priority action items:
- Bob - provide draft for #136. Attempt to close issue at this meeting. Bob will send email with link to pull request asking for comment.
- All - Pick next issue to close out. We added issue #115 to the 3.1 list. Jon will send a draft pull request out before the next meeting.
Other issues:
- Bob - update on updates of hapi-server.org/servers that he said would be complete by this meeting. See the new checkboxes under Options. New options include showing requests made, a link to the verifier when a dataset is selected, and a link to http://hapi-server.org/urlwatcher when a server is selected.
- Bob - have finalized tutorial posted. Discuss having one of us present at monthly session. Jon will make a decision for next meeting soon.
- Bob - issue of needing AWS and more permanent hosting of some HAPI servers and services
- Bob - `T00:00:00.Z` question
- Sandy - update on logo revisions. Sandy will post logo to the hapiserver.org.github.io repository.
- Sandy - update on website revisions. We decided to move away from Jekyll. Sandy will develop new page in personal repo for us to review.
- Sandy - report on GitHub discussions option for general communication with users. Sandy has activated. Bob will put link on https://hapi-server.org/ page.
- Jon - update on SPASE issue (https://github.com/spase-group/spase-base-model/issues). Should we make a copy and create our own DOI if no action in another month? We discussed, but I don't recall what we concluded.
- Jeremy - show web app that uses HAPI data. Jeremy demonstrated it; it uses a POST URL.
- Jon and Sandy - discussion of GAMERA HAPI server. They are creating a HAPI server. Bob's student Eric Winters is working on it. We discussed how it would work.
Canceled
We won't be able to get to all of these and can push to the next telecon or a splinter meeting as needed.
- Update on issues needed to get to the 3.1 release. Assign action items. Put a report on each action item in the agenda for the next telecon.
Updates to get to the 3.1 release:
- 3.1 release: Items #117 and #136 required for 3.1. All other items downgraded to medium priority and saved for next release.
- If folks have something 'done', speak now if you want it in 3.1
- Next week, #117 and #136 will report out: #117 = 'way to generically expose any existing complex metadata associated with a dataset', #136 = 'way for each dataset to indicate the max allowed time span for a request'
- After next week, 1 week for everyone to review, then release 3.1 spec.
- Discuss how we avoid having things "fall off" like the discussion of the web page revisions and the logo.
- Jon: "One thing the server bug on CDAWeb makes me realize is that people will need a place to report things like that for any server. Servers can have contact info, but not all do. Plus, the summer school shows that people might use a persistent Slack channel for HAPI as a way to get quick help / pointers. Anything that can help people get “unstuck” quickly when they try something new will probably be very useful in aiding adoption."
(2 & 3) Use of Slack or similar for internal and/or external use: Sandy will look into GitHub’s “Discussion” and Wiki features and report back next week
- CDAWeb issue. Bernie says Jenn is fixing the software for that. No additional discussion needed.
- Discuss logo. Shrink ‘API’ to 2/3rd size and nestle closer to the ‘H’. Go to 2-color: API as dark blue. Sandy will email designers.
- Jeremy's crawler code and relationship to solution to problems given in summer school. Jeremy will look at summer school code that Bob will post soon.
- Bob's updates to hapi-server.org/servers - now has links for verifying and viewing server ping tests.
This telecon was to hear from the VirES implementors about their experience adding HAPI as a service to their data system.
VirES server can send gzip as requested by the client (via the right HTTP header) ==> we need to check our clients – do they properly trigger compression request?
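On the server side, the gzip decision reduces to a header test; the sketch below is ours (not VirES code) and just shows the mechanics a client check would exercise: compress only when the client's `Accept-Encoding` header advertises gzip, and say so via `Content-Encoding`.

```python
# Illustrative server-side content negotiation for gzip.
import gzip

def maybe_compress(body: bytes, accept_encoding: str = ""):
    """Return (payload, extra_headers) for a response body."""
    if "gzip" in accept_encoding.lower():
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}

payload, headers = maybe_compress(b"2020-01-01T00:00:00Z,1.0\n", "gzip, deflate")
assert gzip.decompress(payload) == b"2020-01-01T00:00:00Z,1.0\n"
assert headers == {"Content-Encoding": "gzip"}
```

Checking our clients then means verifying that each one actually sends `Accept-Encoding: gzip` (or lets its HTTP library do so) on data requests.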
HAPI spec needs to address special floating point values - in the JSON output, the special values `NaN`, `Infinity`, `-Infinity` could be used instead.
The issue is that CSV would need to have the string 'NaN', and binary would have an IEEE NaN value
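The mismatch between the three output formats can be seen in a few lines; this is our illustration of the behavior described above (Python's `json` module happens to emit the non-strict bare `NaN` token by default):

```python
# NaN in the three HAPI output formats: JSON token, CSV string, IEEE 754 bits.
import json, math, struct

x = float("nan")

json_text = json.dumps({"value": x})   # non-strict JSON: emits a bare NaN token
csv_field = "NaN" if math.isnan(x) else repr(x)
binary = struct.pack("<d", x)          # 8-byte little-endian IEEE 754 double

assert "NaN" in json_text
assert csv_field == "NaN"
assert math.isnan(struct.unpack("<d", binary)[0])
```

Strict JSON parsers reject the bare `NaN` token, which is exactly why the spec needs to take a position here.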
The spec currently says nothing about text encoding. We have a ticket on this and it is almost done. We are using UTF-8 with null-terminated characters.
HAPI 1201 error is hard to implement – the header is sent before the data! Note that an HTTP 1.1 chunked response can be interrupted to tell the client something is wrong. (The main benefit of chunking is an integrity check – it gives you a way to tell the client that something went wrong.)
Question: Are you doing your own chunking? Ans: Django is relied upon for chunking – pass it an iterator and it does the rest (same with gzipping)
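The iterator pattern mentioned above can be sketched without any Django specifics (this is our hedged illustration, not VirES code): the framework chunk-encodes whatever a generator yields, so an exception mid-iteration truncates the chunked stream, and the client's integrity check fails instead of it silently receiving bad data.

```python
# Generator-based streaming: an error mid-stream aborts the chunked response.
def csv_stream(rows):
    for row in rows:
        if row is None:  # stand-in for an internal error while streaming
            raise RuntimeError("data source failed mid-request")
        yield ",".join(map(str, row)) + "\n"

# Happy path: chunks stream out one row at a time.
chunks = list(csv_stream([(1, 2), (3, 4)]))
assert chunks == ["1,2\n", "3,4\n"]
```

A framework that consumes this iterator (as Django does) handles the chunk framing and, per the discussion, gzipping the same way.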
Traceability – what if the data is updated; how to indicate this to clients? Currently, this is hard since HAPI does not report what went into the data. We could consider a new endpoint: hapi/provenance
We will add a link to the VirES HAPI server GitHub project on our list of known implementations.
https://github.com/ESA-VirES/VirES-Server/tree/staging/vires/vires/hapi
R and Julia client: Pete Riley from Predictive Science.
Need to look at OpenAPI and get HAPI registered there.
need to look at OGC/EDR - a standard in Earth Science for delivering data.
from Bobby: Rich Baldwin at NOAA is asking about a comparison between OGC EDR and HAPI.
- OGC EDR is at https://ogcapi.ogc.org/edr/
- API docs for it are here: https://developer.ogc.org/api/edr/edr_api.html
- and here: https://github.com/opengeospatial/ogcapi-environmental-data-retrieval
- New HAPI logo coming -- Jon and Sandy are leading this.
- Sandy will be presenting upcoming SuperMag HAPI server
- Jon will be presenting the HAPI server and its 3.1 features
TOC mechanism: Bob has a nice solution for this:
`cd /tmp; pip install pyppeteer; git clone https://gitlab.com/rweigel/biedit; git clone https://github.com/hapi-server/data-specification; cd data-specification/hapi-dev; /tmp/biedit/biedit -p 9000 -o`
Note that `pyppeteer` is optional (PDF output only). Also note that if you are working on a different branch, you would need to switch to that branch - the above command puts you on the master branch.
(Usage note: don't leave the TOC checkbox checked all the time - it triggers per-keystroke updates.)
SPASE at the Heliophysics Data Portal now lists HAPI URLs. Bob could update the hapi-server.org/servers page so that the CDAWeb info link points to the SPASE URL.
Lots of work on the `additionalMetadata` item in the `info` response. See that ticket for the latest.
Overview of results from the IHDEA meeting:
- once per month, we will have a HAPI telecon devoted to IHDEA members as part of our role representing Working Group 5 (devoted to HAPI) within IHDEA. This will be the first Tuesday of the month at 9am Eastern time, which is more internationally friendly, at least for Europeans, and it's at least not the middle of sleeping time for Japan.
- Jon is now coordinating IHDEA WG 3 on coordinate frames, and that group will try to come up with a schema and instances of coordinate frame definitions; there will be several meetings throughout the year; Jim Lewis already asked for access to all master CDFs at CDAWeb to get started cataloging the frame names in use
- there was some talk about adding images to HAPI - this is complex; IHDEA folks expressed interest in keeping HAPI simple, emphasizing that this is one of its main strengths; there are the EPN-TAP protocol and the CSO interface, which can serve images, so maybe those are enough; IHDEA folks also suggested maybe an IHAPI interface - something separate for images
Update on the HAPI paper - Bob needs edits by Friday; Jon to add COSPAR recommended standard language and reference to COSPAR Space Weather Panel Resolution on Data Access (from 2018): https://www.iswat-cospar.org/sites/default/files/2019-08/PSW_2018_DataAccess_Resolution_V2.pdf
Referee also wanted update on ability to deal with image data. Bob to just say this is a possibility but is not in there right now.
Discussion of image handling in HAPI
- if we don't return numbers, this should be via another endpoint ('hapi/references' or similar? this needs thought)
- could possibly also serve event lists - but those can have repeat events at the same time - this is at odds with HAPI time series data
- we need to try this and see if it's worth it
For coordinate frames, and to support the full machine interpretability of vector data in HAPI, the following changes will be added to the spec:
- add a 'coordinateSchema' optional entry so that each dataset can specify a machine readable schema for interpreting coordinate frame names
- add an optional item to a parameter to indicate that it is a vector quantity, and this element will indicate the coordinate name (to be interpreted according to the 'coordinateSchema') and also a 'componentType' which is an enum of 'cartesian', 'cylindrical' or 'spherical'
Discussion on custom parameters: possibly add a section to the `hapi/capabilities` response:

```
"capabilities": {
  "optionalParameters": [
    { "name": "c_avg",
      "restriction": { "type": "double", "range": [0, inf] },
      "default": 0,
      "description": "average according to the number of seconds given for the value of this parameter" },
    { "name": "x_subtractBackground",
      "restriction": { "type": "string", "enumeration": ["yes", "no"] },
      "default": "yes",
      "description": "subtract the background from the data?" },
    { "name": "c_qualityFlagFilter",
      "restriction": { "type": "int", "range": [0, 4] },
      "default": 0,
      "description": "quality level to accept, with 0=best quality, 4=worst" }
  ]
}
```
Discussion on SuperMAG - might be good to get something working soon, rather than fit SuperMAG intricacies into HAPI mechanisms
demo of test site for SuperMAG HAPI interface by Sandy Antunes; several options for SuperMAG data (baseline subtraction or not, etc); different ways to handle this - possibly use different prefixes, but danger is proliferation of HAPI Server URLs with confusion about which one is for what data; alternately could be done with additional request parameters, possibly non-standard ones
There is likely a need for HAPI to support additional request parameters.
There are two types of new request parameters that might be needed.
- parameters that any server might want to support, but do require some effort to implement; examples: time averaging filter; spike removal filter; possibly a parameter value constraint option, although this is getting really complex! These parameters would have a prefix to indicate that they are optional, additional parameters, but if servers want to implement them, they should use the existing names and syntax and meaning (all time averaging should use the same request parameter and should behave the same on the server)
- parameters that are truly custom to one server or even one dataset; these have a prefix of x_
There should be a way to convey the presence of and also the meaning of any additional (standard or custom) parameters in the capabilities file.
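The two classes of request parameters split cleanly on prefix; a tiny illustrative helper (the `c_` prefix for common optional parameters is an assumption taken from the draft capabilities example in these notes, not settled spec language):

```python
# Split request parameter names into reserved-optional ("c_", assumed),
# server-custom ("x_", per the notes), and core spec parameters.
def classify(params):
    common, custom, core = [], [], []
    for name in params:
        if name.startswith("x_"):
            custom.append(name)   # truly custom to one server or dataset
        elif name.startswith("c_"):
            common.append(name)   # optional but standardized name/meaning
        else:
            core.append(name)     # dataset, parameters, start, stop, ...
    return common, custom, core

common, custom, core = classify(["dataset", "start", "c_avg", "x_subtractBackground"])
assert common == ["c_avg"]
assert custom == ["x_subtractBackground"]
assert core == ["dataset", "start"]
```

A server seeing an unknown `c_`-prefixed parameter would know it is a standardized extension it simply doesn't implement, versus an unknown `x_` parameter, which it can safely ignore or reject.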
We talked about serving images - see today's entry for issue #116 for more discussion.
Sandy showed what he's doing for SuperMAG to add a HAPI server there. He's created additional prefix elements in the URL before the /hapi/ part of the server URL to allow for the combinatorics of options for SuperMAG data. There are many, but the two options discussed were:
- baseline = daily, yearly or none
- give_data_as_offset_from_baseline = true, false (I think Sandy called this "delta")
In this case, some of the combinatorics for data options might be avoided if SuperMAG would also allow its baseline data to be released as a separate dataset. We can ask about this, but it would still be worth it to look at ways to support extensions to the standard HAPI query parameters.
Extensions to HAPI query parameters could be described in the `capabilities` endpoint. They would have to be simple and fully described. You could envision enumerated options as in `option1={A,B,C}` or numeric options like `0.0 <= option2 <= 10.0` (expressed in JSON syntax in the capabilities). There would need to be a description for each element, and units for any numeric quantities.
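Validating a supplied option against such a description is straightforward; the field names here (`enumeration`, `range`) follow the draft JSON elsewhere in these notes and are not settled, so this is a sketch of the idea rather than spec behavior:

```python
# Check one query-parameter value against an enumerated or numeric-range
# restriction description of the kind envisioned for the capabilities response.
def validate_option(value, restriction):
    if "enumeration" in restriction:
        return value in restriction["enumeration"]
    if "range" in restriction:
        lo, hi = restriction["range"]
        try:
            v = float(value)
        except ValueError:
            return False  # numeric restriction, non-numeric value
        return (lo is None or v >= lo) and (hi is None or v <= hi)
    return True  # no restriction given

assert validate_option("B", {"enumeration": ["A", "B", "C"]})
assert validate_option("5.0", {"range": [0.0, 10.0]})
assert not validate_option("-1", {"range": [0.0, 10.0]})
```

Either the client or the server could run such a check; the server must, since it cannot trust the request.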
Sandy will make his existing server publicly available for testing, and Jeremy will try it out. Bob will make his Intermagnet development server available too, and we can compare them next week to see how to proceed.
For issue 115 ( https://github.com/hapi-server/data-specification/issues/115 ) the SPASE group and the IHDEA group are planning to come up with a way to identify coordinate frames in a standard way. There has actually been an IHDEA working group on this for a year already, but nothing has been done yet. There is a potential leveraging of SPICE-based techniques, but SPICE does not actually have a naming convention for frames. Several folks have their own conventions, but none have gotten wide adoption. There are a few standard papers on coordinate frames people use as the basis of their conventions.
We talked about allowing non-standard schemas for 'units' and also for 'coordinateFrameName' elements. As long as there is also a listed reference to the specification, we thought it would be OK for people to specify a schema that we did not explicitly list. There was talk of listing all custom schemas in the 'about' endpoint, but then Bernie suggested and we agreed that it really belongs in the 'info' response (close to the use of the new schema name), which is where you need it anyway. A ticket has been opened for this.
https://github.com/hapi-server/data-specification/issues/129
Next week, Sandy will present progress towards a SuperMAG HAPI server. SuperMAG present some challenges since it currently presents data in a way somewhat orthogonal to HAPI (each station is not a dataset, but HAPI tends to think of them that way), and also there are lots of options or flags that the SuperMAG native access mechanism exposes.
We briefly talked about how HAPI could possibly be used in a cloud-based setting. This is being explored for SuperMAG, and then also for possibly model output data. Model data may have variable grid structure, so the data structures are changing shape at each time point. HAPI does not currently support this.
Finally we talked about using HAPI for images. See ticket #116. Bob and Sandy are both interested in this, so we can talk about it next time too.
Agenda for next meeting:
- quick status update on HAPI paper and any HAPI presentations
- Sandy to present on SuperMAG (15-20 minutes, plus discussion)
- overview of a sample coordinate frame schema (written by JonV and based on CDAWeb info) - this is just a toy version of a real schema, but we can reference it and it shows people the basics of what would be needed for a more full-scoped coordinate frame naming standard.
- talk about HAPI for images
- review of outstanding, high-priority tickets
talk about 3.1 issues and priorities
no meeting
To discuss:
Bob - will make webinar on client usage of HAPI for science users after the paper comes out
Eric: "target specific users; scientists versus data providers"
This is ready to be closed after confirmation by Aaron and others at CDAWeb: https://github.com/hapi-server/data-specification/issues/56 Closed the ticket on user identity management.
These are some older tickets with no champion, so we reviewed those to see if anyone wants to revive them, or else they should be closed.
https://github.com/hapi-server/data-specification/issues/101 (servers emit HTML) -- agreed to close with comments about other solution
https://github.com/hapi-server/data-specification/issues/102 (use self-documenting REST style) -- agreed to keep open with low priority for now
Wed 1pm Eastern is next ticket review meeting
Next regular meeting will be June 14.
- HAPI 3.0.0 on zenodo: communities are SPA and PyHC
- HAPI paper submitted to JGR
- Jeremy: what about clients if servers go to 3.0.0? How do clients negotiate with the server which version to use?
What about a capabilities object that identifies other versions of the spec:

```
"otherVersions": [
  { "version": "2.1", "url": "http://server.org/data/v2.0/hapi" },
  { "version": "3.0", "url": "http://server.org/data/v3.0/hapi" }
]
```
The approach above is possibly non-standard - Bob found this reference describing 4 methods.
https://www.xmatters.com/blog/devops/blog-four-rest-api-versioning-strategies/
Need an issue for this - not for before 3.1
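The client side of negotiating against the proposed (and possibly non-standard) `otherVersions` list would be simple: pick the highest spec version both sides support. The function name and selection rule here are ours, a sketch only:

```python
# Pick the best mutually supported server endpoint from an otherVersions list.
def pick_server(other_versions, client_supported):
    """other_versions: list of {"version": ..., "url": ...} dicts."""
    usable = [v for v in other_versions if v["version"] in client_supported]
    if not usable:
        return None
    # Highest version wins; compare numerically, not as strings.
    return max(usable, key=lambda v: tuple(int(p) for p in v["version"].split(".")))

servers = [{"version": "2.1", "url": "http://server.org/data/v2.0/hapi"},
           {"version": "3.0", "url": "http://server.org/data/v3.0/hapi"}]
best = pick_server(servers, {"2.1", "3.0"})
assert best["version"] == "3.0"
```

An older client supporting only 2.1 would land on the 2.x endpoint; a client supporting neither gets `None` and must fall back to whatever version the base URL serves.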
Note: some things in 3.0 that will be deprecated.
Clients: Python and Matlab: not tested yet against 3.0; SPEDAS - not yet at 3.0; Eric will mention to SPEDAS team
Sample 3.0 data: http://hapi-server.org/servers-dev/#server=TestData3.0&dataset=dataset1
Discussion
- ok to close issue #107 (it's attached to the 3.0 release); create new ticket for longer term web site updates
- wording discussion for issue #77 on keyword normalization: best to use the latest language in the spec, but include the older style keywords to indicate that they are deprecated; have a block at the start of the 3.0 spec indicating the big changes
- how to improve landing pages so that people who don't know anything about HAPI can get started.
- COSPAR drafts / updates due Dec 7.
Landing page improvement ideas:
a. better hapi-server.org page (The content at hapi-server.org comes from the README.md checked in to the project associated with hapi-server.github.io. The actual server is at Amazon, but the landing page gets pulled / re-served from GitHub.)
b. improve the user interface of the "HAPI Server Explorer" (or whatever Bob wants to call it), running at hapi-server.org/servers. Ideas: add verb to dropdown menus; add intro paragraph; include name (HAPI Server Explorer, or equivalent) at top of page; create a set of slides to explain the usage, or maybe a video (use Camtasia for making videos, or maybe OBS);
At the Heliophysics Data Portal, maybe have two links for a HAPI-accessible data set: use HAPI data, and info about how to use HAPI. Aaron emphasized the need to let people know that this service is available and how to use it.
Action items
- Jon - close issue #107 (after changing link to hapi-server.org)
- Jon - create new ticket for better hapi-server.org intro site - a "Getting Started" page for brand new HAPI users.
- Jon - look into updates for the hapi-server.org main page
- Bob - continue with spec updates for issue #77
- Bob - revisit / restart the HAPI paper; submit to JGR, Space Science Reviews, Advances in Space Research
- Jeremy - keep thinking about sub-group for server mods
- Jon - check with Masha on COSPAR group status
- Bobby - maybe include HAPI (and SPASE) on COSPAR presentation
Next meeting is: Dec 7 - this is during AGU, but it keeps us checking up on action items.
went through all issues to categorize by milestone (3.0 or 3.0+)
Action items:
- Jon to review pull request for issue #94
- Jon to check with Eric on MMS units status
- change units pull request to only have units specs with good online info
- Bob to write up spec changes for keyword normalization
- all to read related issues #82 #83 and #87 for discussion next week
- Jon to ask Jeremy about starting server extensions working group
- How to describe access to a dataset via HAPI in SPASE via the `<AccessURL>` element?
There was a lot of discussion about this - the solution we proposed last time was debated much more than I anticipated. SPASE is not clear about the intent of <AccessURL>, so we debated about responding using HTML versus JSON.
Two tickets were opened after the discussion: 101 (use HTML) and 102 (add links to make HAPI more truly RESTful)
- Jon gave a summary of the meeting with Beatriz Martinez at ESAC; HAPI server coming for Cluster Science Archive (CSA), but they are re-doing the server and will be done in January - no changes to metadata schema, so Bob and Jeremy will look at their metadata to see how it maps to SPASE; they are interested in using Bob's server initially as a living (actually used) example and will then create their own implementation eventually; they were interested in any Java components we might be able to offer to help
- discussion about how best to incorporate HAPI info in the AccessURL of a SPASE record. The info response is not the right thing, since that is very computer-centric, so the current thought is to use the top level URL for the HAPI server and then reference the dataset ID in the ProductKey element of AccessURL.
- Nand asked us if we were really making a difference and suggested it is time to zoom out and ask larger questions about impact and relevance. We need to push adoption more by getting tutorials / quick starts out there. We need to finish the HAPI paper.
- IHDEA meeting is 19-22 Oct; agenda still being formed - people need to submit talks now since that's how the agenda will be formed; see this link: https://indico.obspm.fr/event/427/
Actions:
- Bob to review CDA metadata with Jeremy ahead of Dec 2 meeting with Beatriz
- Jon to look at adding quick start link to HAPI server home page
- Jon to look through SPASE records to see which ones would need HAPI access info
- Jon to add HAPI access info to agenda of next SPASE meeting (likely to be on Thursday Oct 15)
- all: submit your IHDEA talk now
- for Issue 94 (server info page), we decided to use the example.com/hapi/about endpoint, and added an optional
publisherCitation
element. This is ready to go into the spec. Additional items which are more dynamic belong on a different endpoint, discussion of which belongs in another issue.
- broken links now fixed on hapi-server.github.io
- updates from Bob: generic HAPI server being updated to work on a Windows server. Supported operating systems for this generic server are: Unix, Mac, Windows, Raspberry Pi, Docker
- HAPI client being upgraded to chunk up requests for longer time ranges; discussion of caching, which is linked to this chunking capability
Action: Bob will update spec with new "about" endpoint; others will review his pull request and can make suggestions
Action: Jon to incorporate time-varying changes into spec (different pull request) ahead of IHDEA meeting in October
- opened new issue #98 regarding how a server can indicate the ability to handle parallel requests from the same client; note that sometimes it's hard to tell how many requests are from a single client if you have multiple servers behind a load balancer
- AMDA server is using HAPI now based on Bob's node-js front end; no public API could be found yet at AMDA (ask about this!); a new version of the node-js server is about to be released; the AMDA folks asked about a "HAPI inside" label or logo. Several will look into this (the PyHC group has a logo design process underway). Several responded to Genot's questions.
- need to finalize and close issue about HAPI error codes for time ranges
- no meeting next week
- closed issue #95 (should stop=None be interpreted by server to be the stopDate for the dataset?) since this can be handled in client code
- action item for all: comment on issue 97 about time error code clarification
- Bob presented the case for more complete info about each HAPI server (all.txt is very plain right now). We need a schema for the server list, and it would also be great to have an endpoint whereby servers can emit their own info (presumably using the same schema element). Bob has already created issue #94, and he will use that to come up with a schema for server details.
- Bob to update web page on the HAPI main page - hopefully he can talk about recent hapi-server.org web page updates next week
- Jeremy gave demo of Sparkline (https://en.wikipedia.org/wiki/Sparkline) capability he has added to his Autoplot-based HAPI server, which can be added to other servers if they want a quick visualization capability
Agenda
- release 2.1.1 is pretty much ready to go; procedural questions:
- retain a running changelog versus just changes for most recent version (maybe each major release has running changelog);
- the use of the pull request mechanism works as long as people work on the same branch for modifications
- need to copy in the contents to the 2.1.1 directory
- decide which changes are key for resolving and including in 3.0; see this list: https://github.com/hapi-server/data-specification/milestone/4
- meeting this Wednesday 9-11:30am with ESAC about HAPI server
- GEM meeting this week
- AGU abstracts due July 29; https://agu.confex.com/agu/fm20/prelim.cgi/Home/0 ; some interesting sessions:
- PyHC session: https://agu.confex.com/agu/fm20/prelim.cgi/Session/102187
- Tools, analytics, etc for Solar and Planetary: https://agu.confex.com/agu/fm20/prelim.cgi/Session/101191
- Helio HackWeek - maybe a short presentation about uniform data access to participants?
- Aug 25-27
- https://heliohackweek.github.io/
- ISWAT sign up - still needs doing: https://www.iswat-cospar.org
- Bob - updating the IDL client - needs installation instructions to be added by Scott; also making https://hapi-server.github.io/ more user friendly; better landing page for list of active servers; schema for server list; Bob to report / demo next week.
Agenda:
- release discussion: 3 issues (small, clarifications) to go into 2.1.1; then stamp this out; then copy to hapi-dev to begin work for 3.0; make sure there are no version 3.0 issues contaminating the 2.1.1 release.
- ISWAT cluster sub team? https://www.iswat-cospar.org/ General agreement to join. Separate paper for COSPAR meeting (different from Bob's current draft); include Baptiste Cecconi and Arnaud Masson
- HAPI web site improvements needed http://hapi-server.org/
- Other examples:
- https://sunpy.org/
- https://hpde.gsfc.nasa.gov/
- paper: target journal is Space Weather (the SPASE paper is in here), or maybe Space Science Reviews (same as the SPEDAS paper)?
- Heliophysics 2050 workshop - looking for white papers; need one for Data Environment:
- papers due in September 2020
- https://solarnews.nso.edu/heliophysics-2050-workshop-new-date-and-call-for-white-papers
- around the room - action item updates; new servers:
- Arnaud asking for time to meet for ESAC server; Bob's generic server an option
- no news from InSook; PPI node needs more support or other project leadership
Remember to do for 2.1.1 release: add leading comments about key clarifications.
Actions
- read Bob's draft paper - see his email dated 2020-06-22 https://docs.google.com/document/d/1CyA_rFHv8nEMZLjCCexaVl7g5B3xYQ-vI10t2ZtbJlg/edit
- see previous action items
- optional: read the recent SunPy paper: https://iopscience.iop.org/article/10.3847/1538-4357/ab4f7a
- side issue: move URI template wiki and effort to separate GitHub project; make more useful by implementing in more languages -- translate JavaCC file to antlr, then to other languages? Summer of Code project? find JavaCC code for URI template parsing (Jon)
Today's call was a review of tasks needed to focus on pushing HAPI forward. Here is what we came up with, in order of importance.
- finish spec updates
- get SuperMAG HAPI server up and running
- get PDS HAPI server up and running
- finish and publish Bob's paper on HAPI - the spec and its uses, showing wide adoption in Heliophysics and also at the PDS/PPI node
- status dashboard for existing servers
- integration with SpacePy
- continued and even more coordination with PyHC projects
For the next few months, we will try monthly telecons of 30 minutes to keep momentum going. The link to these notes will be included in the weekly telecon notice.
Action items from today:
- Bob to email Rob about SuperMAG (done)
- Bob to send link to paper (done)
- Jeremy to contact InSook
- Jon - get busy with those spec updates!
- walk-through of Jeremy's stand-alone Java client (no dependency on Autoplot); currently offers low level access to an iterator of records as they stream in; server oriented methods are static methods; very low level options expose the JSON content of the response; higher level methods allow conceptual access that isolates you from the actual JSON content (which also insulates users from potential changes in the JSON)
- discussion of issue:77 on some naming changes to remove a few quirks - see the issue for details
- Looked over issues
- talked briefly about availability info; some tickets related to this already
- Jon will present at the Python meeting on Wednesday.
- Jeremy will present Java client next time: May 11
- presentation / discussion of SuperMAG HAPI interface targeted for June 1.
- Jeremy is starting a Java client, mostly coded and looking for people he could work with.
- talked a bit about client identities, to support the SuperMAG server Bob Weigel is working on.
- "Time Series Data" vs "Time Ordered Data". Shing says that Time Series Data implies F(T) where F is an array of scalars.
- Chris L from LASP presented their HAPI server, and plans for the next version.
- meeting was planned but was cancelled.
- Jeremy has been working with In-Sook on the UCLA server.
- Bob Weigel has been working to have the verifier check on version numbers.
- in 2.1.0, need an update now about labels and units so the verifier can be updated
- we need a 2.1.0 official release (at the right time point - to allow proper differencing)
- changelog entries need to link to individual commits or diffs
- examples need to be added to clarify units and labels:
The spec doesn't describe very well what to do for the label (or for the units) of multi-dimensional arrays.
For a 1-D array, it seems clear enough: the label (or units) can be a scalar (that applies to all elements
in the array) or an array of values with one string per data element (and the length must match the size
of the 1-D array).
"name": "Hplus_velocity",
"description": "velocity vector for H+ plasma",
"type" :"double",
"size": [3],
"label": "plasma_velocity"
"units" :"km/s"
For a two-dimensional (or higher) array, we should allow for the units and the label to be a scalar that
can then apply to the entire multi-dimensional object:
"name": "velocities",
"description": "two velocity vectors for different plasma species",
"type" :"double",
"size": [2,3],
"label": "plasma_velocity"
"units" :"km/s"
The idea of having an array parameter is that all these elements have a strong "sameness" about them,
so expecting the units to be the same is reasonable.
Note: for an array of two vector velocities, the size should be [2,3] instead of [3,2] since the
fastest changing index is at the end of the array.
Note that the ordering is not ambiguous for things like a [2,2] because the spec indicates that the
later size elements are the fastest changing.
See this:
You could also give a label for each dimension:
"name": "velocities",
"description": "two velocity vectors measured from different look directions",
"type" :"double",
"size": [2,3],
"label": [ ["species index"], ["vector component"]],
"units" :"km/s"
Each label in this case applies to the entire dimension.
But the values still all have the same units. It's hard to think of a case where the units would be
different - otherwise, why is it an array?
You could also label each vector component:
"name": "velocities",
"description": "two velocity vectors measured from different look directions",
"type" :"double",
"size": [2,3],
"label": [["species index"], ["Vx", "Vy", "Vz"] ],
"units" :"km/s"
Or, you might want to label everything:
"name": "velocities",
"description": "two velocity vectors measured from different look directions",
"type" :"double",
"size": [2,3],
"label": [["H+ velocity ", "O+ velocity"], ["Vx", "Vy", "Vz"] ],
"units" :"km/s"
The units behave in a similar way, in that a scalar unit string is broadcast to all elements in its dimension, but an array of string values is applied element by element. One use case for this would be vectors specified as R, theta, phi values.
add example with different units (R, theta, phi, for example)
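A sketch of what that example might look like (parameter name and values are illustrative, assuming a units array follows the same per-element rules as a label array):

```json
"name": "position",
"description": "spacecraft position in spherical coordinates",
"type": "double",
"size": [3],
"label": ["R", "theta", "phi"],
"units": ["km", "degrees", "degrees"]
```

Here each component gets its own unit string, since the radial distance and the two angles cannot share one unit.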
meeting agenda
- brief report on Jon's UCLA visit - I will tag-up with In Sook a few more times via phone call over the next few months; she had some fairly complex data and had some issues fitting it into HAPI; the PDS group there needs help converting PDS3 data into CDF
- going through outstanding issues identified by Bob: PDF problem, nulls in bin ranges, null in labels
- next telecon meeting: Feb. 12, 1pm
Notes: for UCLA HAPI server: what about using VOTable tools already used; leverage Eric on different floor! also some PDS3 to CDF tools at CDAWeb! Jeremy to work with In Sook - possibly loop Jon in for a few discussions
MAVEN SWEA (Solar Wind Electron Analyzer) data is in CDAWeb too; the elevations vary for the first 8 energies only, and then are fixed for the remaining 56 energies; HAPI could capture these elevations as a separate variable in the header.
Action: Jon to find out how the team views this data; PI at LASP is listed in metadata
Action: explore how image pixel references could also be captured using these bins
Action: remake HAPI 2.1.0 PDF and see if it fixed the Github renderer; ask around - is this broken?
Action: add phrase about bins content that having both centers and ranges is also OK.
Action: need more bins examples since this is one of the most complex parts of the spec
Action: Bob to write up description about allowing null ranges if there are bins with centers but no ranges (for just some bins - if there are no ranges at all then just don't have a ranges object); Jon will review the writeup
Action: for integral channels, explain that you still need to put a (very high, but just high enough) bin boundary
Action: add-write-up for time-varying bins and for header references
Action: Jeremy to meet with In Sook Moon at UCLA to help along the HAPI effort there; see above too
Action: fix time string length for datashop Cassini MAG dataset (Bob noticed this)
Action: Doodle poll on new meeting time (and alternate meeting week of Feb 10)
- Discussion about CDAWeb server's approach to ordering of parameters in the request; can CDAWeb be 2.1 compliant? Nand to consider this soon.
- In looking at CHANGELOG: need to clarify changes in CHANGELOG: add numbers; categorize as to effect on servers
- AGU updates: Amazon lambdas could be useful
Plans for this year:
- get spec ready for 3.0 release; Jon to come up with a reading list for the "what to put in 3.0" discussion on Jan 27
- check up on PDS server - Jon to visit UCLA in January - Jeremy participating via telecon?
- Python client - spruce up docs and packaging; make sure other libraries use this as lower layer elements
- training for scientists - tutorials at meetings; tutorial telecon captured as video and crib sheets - borrow this technique from Eric Grimes (Jon to ask for assistance)!
- status and continuity of LASP server
- paper out on the 3.0 spec; Bob has an early draft he will send around
Discussion about data and DOIs; CDAWeb will acquire DOIs via SPASE; can retrieve data now using DOI or CDAWeb ID or SPASE ID; waiting for missions to coordinate DOI assignment; should HAPI offer a more generic data query capability for other IDs? Question about versioning and provenance? HAPI will use this standard when it becomes available.
Next meeting: Jan 27 to decide about issues to include for HAPI 3.0
- IHDEA meeting update: the verifier is very popular; the new ability to handle time-varying bins was presented; HAPI is now accepted as the interoperable way to deliver time series data; ESDC (which stands for ESAC Science Data Center, where ESAC stands for European Space Astronomy Center) is planning to adopt HAPI - they are waiting for a lull in activity - we will coordinate with them starting around the new calendar year; CDPP is also planning an implementation
- hapi-server.org is having problems in some browsers because of its certificate and https issues
Action items:
- Jeremy to fix the hapi-server.org certificate issue
- Jeremy to prepare a demo of Autoplot using SAMP for something other than granule access. SAMP can deliver das2 endpoints, and could similarly expose HAPI endpoints (either at the dataset level or probably also at the whole server level)
- Bob to give a demo of the generic server capability (it is now installable via pip)
- Jon to update the spec with all recently (conceptually) approved changes, including: time-varying bins, references in the header, a cleaning up of the usage of id versus dataset, etc; changing time.min and time.max to start and stop in the request interface (keep but deprecate the older terms)
- PyHC meeting in two weeks - Jon and Aaron to attend; others will participate online; ensuring a sensible, common data access mechanism within the emerging library is of particular interest to the HAPI crowd
- next telecon on Nov 18
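The request-interface renaming in the action items above (time.min/time.max becoming start and stop, and, per the id-versus-dataset cleanup, id becoming dataset) might look like this in practice; the hostname, dataset name, and times are illustrative:

```
# older form (kept but deprecated)
https://example.org/hapi/data?id=alpha&time.min=2019-11-01T00:00:00Z&time.max=2019-11-02T00:00:00Z

# newer form
https://example.org/hapi/data?dataset=alpha&start=2019-11-01T00:00:00Z&stop=2019-11-02T00:00:00Z
```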
- The two new features (time-varying bins and references in the info header) have both been tried on live demo servers, and seem to be working well. See Ticket #83. These are ready to be written up in version 3 of the spec.
- units - For HAPI 3.0, we would also like to add an optional "unitsSchema" as a Dataset Attribute. This would allow data providers to specify what convention is to be used for interpreting the units strings in the metadata (i.e., info header). As mentioned in Ticket #81, which is about this topic, conventions like UDUNITS2 are suitable for this, and they satisfy case 1 and case 2 that are described in Ticket #83. There also needs to be a way to specify which version of the schema is in play, and we decided to start with a rough version identifier, such as "udunits2" rather than something very specific like "udunits2.2.26", since that would be harder for clients to manage when there are minor version changes. The other example is the units from AstroPy, which are part of the core AstroPy package, now at version 3.2.1, so that using AstroPy-compliant units in HAPI metadata could be indicated using a "unitsSchema" of "astropy3". Rather than force people to choose a units schema from a list, we will describe the ones commonly in use and provide recommendations for how to come up with the appropriate schema name. If clients do not recognize the unitsSchema, they will just ignore it. Note that each dataset specifies its own unitsSchema (not individual parameters).
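A sketch of how this could appear in an info header, assuming the attribute sits at the dataset level as discussed (the header is abbreviated and the parameter details are illustrative):

```json
{
  "HAPI": "3.0",
  "unitsSchema": "udunits2",
  "parameters": [
    { "name": "Time", "type": "isotime", "length": 24, "units": "UTC", "fill": null },
    { "name": "density", "type": "double", "units": "cm-3", "fill": "-1.0e31" }
  ]
}
```

A client that understands UDUNITS2 syntax can then parse "cm-3" programmatically; one that does not simply treats the units as opaque strings.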
- other news: WHPI (Whole Heliosphere and Planetary Interactions, see https://whpi.hao.ucar.edu) is attempting to make plasma data from all relevant Heliophysics missions and models accessible. There's a meeting next September and ideas are floating around now to help make this happen. HAO has money to work on this. This would be a great chance to point out that HAPI was designed exactly for this problem, and try to get some traction with and support from this group.
- other news: Cluster data is going to be mirrored at CDAWeb, where the default option is to present it in its converted CDF form (ISTP-compliant) and serve it via the usual CDAWeb conventions, including HAPI.
- upcoming meetings:
- Aaron headed to Big Data meeting for NAS - he is looking for ideas and slides
- IHDEA - Jon will present the latest updates to HAPI
- PyHC meeting; Aaron and Jon to attend; Bobby attending remotely and has ideas he wants advanced
- AGU - relevant sessions are Monday (IN11E - Tools and Databases in Solar and Planetary Big Data) and Thursday (SH41C - Python for Solar and Space Physics)
This call was to give a quick status update from the sub-group working on references and time-varying parameters. A few suggestions were logged in Ticket 82.
Aaron also commented about maintaining a focus on implementations, and having something ready for people who want to implement a HAPI server but want to just drop in a pre-existing, generic server that can distribute their data using HAPI. We also talked about future connections to NSF efforts, such as those we hope to bolster using the SuperMAG effort that is underway. Madrigal would also be a useful connection to make.
- Eric will send Bob a note to update the HAPI main web page about the IDL client in SPEDAS.
- Time-varying bins virtual hack-a-thon is this Thursday; iron out spec changes and implications
- upcoming meetings: AGU (PySPEDAS poster will mention HAPI, Jeremy is in Python session, Jon has 2 abstracts on HAPI), also the IHDEA meeting in October - present time-varying bins update to ESA contingent
agenda: Jeremy's presentation on Das2 server options
Das2 servers have flags for individual datasets that grew out of the original use for Das2 servers, which was as a somewhat internal protocol between a client and server written by one developer, who understood what all the "secret" options were and could use them to optimize the data transfer for what the client needed. Jeremy advised against this kind of behind-the-scenes options proliferation.
Because the ensuing discussion led to significant interest in adding optional capabilities to HAPI servers, the bulk of the content for this telecon is captured in Issue:79
See that issue for details about adding server processing options.
Action items:
- Bob to present about the FunTech server
- Jon to follow up on server implementers
- need examples of capabilities modifications to support binning, interpolation, and spike removal
agenda:
- discuss release notes
- organizing efforts for the next release - which issues to work on and who
- plan for getting servers updated to the latest spec
- getting the word out: documentation, posters at meetings, training sessions at meetings, training videos
- news: Python meeting tomorrow in Boulder and via telecon: http://lasp.colorado.edu/home/pyhc-meeting/
- Python meeting agenda: https://docs.google.com/spreadsheets/d/1npXsRis0lD_mtM94E3W0TkJfuCe0na60rzoYqH_mmhA/edit#gid=1845898482
Focus for the future:
- more complete example package showing people how to access typical dataset using multiple clients
- paper describing HAPI - Bob W. to send around draft; options: JGR, Space Phys. Rev
Actions:
- Bob: send around client test suggestions
- all test clients per Bob's directions
- Jon: check with Nand and Doug about server status
- Jeremy: prepare demo of Das2 dataset options management (only retrieve finest resolution, etc)
- Jon: straw-man examples of binning, interpolation and de-spiking
- all: bring servers up to new spec
We decided to proceed with the release of 2.1.0. The one thing left to do is update the spec to reflect the resolution of Issue #69 about how to handle a user request for parameters that are not in the same order as what is in the metadata. We are also adding an additional error code (1411 - out of order or duplicate parameters).
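Since HAPI reports errors via the status object in the JSON header, the new code would presumably surface something like this (the message wording is illustrative until the spec text is final):

```json
{
  "HAPI": "2.1",
  "status": {
    "code": 1411,
    "message": "Bad request - out of order or duplicate parameters"
  }
}
```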
Suggestions - update the "server" nomenclature in the spec to reflect intent: this is the full server and URL prefix of the top server location / entry point. After the first example (http://server/hapi/data?id=alpha&time.min=2016-07-13) clarify that "server" includes the hostname and possible prefix path to the HAPI service.
Lots of questions about prescribing the order of returned parameters - Nand: this can add confusion when there is no header in the response (then you have to consult the info to see what you got in the response). The differences are focused on client expectations (return what I ask for) versus a data-centric perspective (the data exists and will be returned with as little changes as possible - no re-orderings). Jon will discuss with Bob and Jeremy and bring a suggestion to the next telecon.
- Duration (issue #75, now closed) is tied to time-varying bins, so the explanation is not in the spec document but on a separate implementation page until the time-varying bins are figured out.
- Need one last look at all changes since last release: https://github.com/hapi-server/data-specification/compare/c5b82826f427e71dafddc708ea112d4e0927ca97..4702968b13439af684d43416b442c534bf569f6c
- Need to make a changelog with diffs for key updates; roll up typos, etc, into one item
- Bob looked at timeseries data from earthquakes (including electric field values). He said their standard seems pretty easy to map to HAPI - we have similar elements and a similar approach (of course the details differ); he will send some links. See http://service.iris.edu/irisws/. The timeseries link is the one for data.
- normalizing ids and labels for version 3.0 (see discussion below)
- coordinating efforts on Python HDEE proposals
For parameters:
- id - machine-readable ID with limited characters (no spaces or odd characters), e.g. "BX_GSE"
- label - short, human-readable version of the ID; spaces OK, e.g. "Bx in GSE coordinates"
- description - up to a paragraph of information about the parameter or dataset; think figure caption; same level of info as in a SPASE record
SPASE analogs are: parameter key, name, description (the main thing is to have them correspond one-to-one with SPASE, and maybe others?)
Relationship to resourceURL? If this is present, then 'description' is obtainable there.
For catalog entries (each entry is a dataset): currently, each dataset has: id (required), title (optional). Suggestions for 3.0:
A. each dataset has: id (required); optional: label, description, start, stop, cadence
B. have a verbose flag on the catalog request that generates a full, parameter-level catalog of all datasets, like this: http://datashop.elasticbeanstalk.com/hapi/catalog?all=true ; if present, advertise catalog verbosity in capabilities
Does this make HAPI too much of a registry? Original idea was to let discovery focus be outside HAPI. It makes HAPI usable in other contexts.
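For concreteness, suggestion A above might render a catalog entry like this (all fields beyond id and title are the proposed optionals, and the values are invented for illustration):

```json
{
  "HAPI": "3.0",
  "status": { "code": 1200, "message": "OK" },
  "catalog": [
    {
      "id": "alpha",
      "title": "Alpha spacecraft magnetometer",
      "description": "1-minute averaged magnetic field vectors",
      "start": "1998-01-01T00:00:00Z",
      "stop": "2019-12-31T23:59:59Z",
      "cadence": "PT1M"
    }
  ]
}
```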
Discussion about the generic server Bob W. is creating:
The server has multiple installation methods, one of which is a Docker image. This option has a drawback, since it's hard to edit files inside Docker (you have to ssh into the Docker VM and then use whatever primitive OS tools are available, like vi or nano). So after someone configures their server, they could build a Docker image, but it might not be too useful like this as a delivery mechanism. Unless you could have the server config file be external, and then tell the Docker image about it at startup. Bob will look into allowing the run option for the Docker image to take a URL argument pointing to an external config file.
Volunteers are needed to try out Bob's method and see how easy or hard it is to build the back end components to feed the HAPI front end.
What is also needed is a GUI mechanism for building that back end. This could be a separate open source project to build this part.
NOAA Space weather week is coming up; this is a good time to connect with both the science and operations / developer side of the house at NOAA. Also, the archiving side (NGDC) and the realtime side (NOAA Space Weather Prediction Center) will both be there, and they have separate mandates that don't mix often. Jon will contact Larry P. and Bob S. to see about connecting with NOAA people about using HAPI for their archive and real-time data
Specification updates
Jon is planning to put some revamped TIMED/GUVI data behind a HAPI server, and one issue is that each measurement needs to be correlated with a lat/lon on the Earth. We need a way to associate data columns with support info columns, like lat/lon. Also, sometimes the lat/lon may be fixed, or partly fixed, i.e., changes every few years (when the ground magnetometer station is moved). Options are:
- just have a column that repeats the same value (this is the default now, and probably until HAPI V3)
- the header could list all the options for a slowly varying quantity, and also provide labels for each value, and then the data column could reference the label and only repeat that instead of the entire set of values; this is a kind of built-in compression
We should look at how Earth science organizes data products that need lat/lon registration.
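Purely as a strawman for the second option above (none of these keywords exist in the spec; every name here is hypothetical), the header could enumerate the labeled values and the data column would then carry just the label:

```json
"name": "station_location",
"type": "double",
"size": [2],
"units": "degrees",
"labeledValues": {
  "site1998": [68.35, 18.82],
  "site2003": [68.36, 18.80]
}
```

A data record would then contain "site1998" instead of repeating the lat/lon pair, giving the built-in compression mentioned above.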
Next steps for HAPI - better on-boarding process for people who want to adopt HAPI. Groups so far that have done this are CCMC and Fundamental Technologies (PDS/PPI sub node in Kansas).
We need to make our documentation have more of a flow or be more organized and cookbook oriented.
There are still a lot of outstanding open issues on the spec document. These need to be cleaned out. Most are documentation clarifications, but two are larger issues. The biggest one is handling "mode changes" (bin values that change with time). This is issue 71. Jeremy, Jon and Bob need to meet separately to try their latest approach as outlined in the issue.
attendees: Jon, Jeremy, Todd, Chris, Eric
- Happy New Year everyone; we are missing our NASA colleagues and hoping they can get back in there soon
- is this meeting time OK for the upcoming year? will do a poll later to see if this time is OK
- EGU - abstracts due Thursday; Tom is going from LASP; no session identified for data environment topics; no one else likely to go
- iSWA HAPI Server is up and Jeremy reports that it performs well; Jon to ask CCMC to advertise it more on their main page
- Masha from the CCMC mentioned at the AGU that HAPI was approved by COSPAR and that we should form a group about it before the next COSPAR meeting in March; Jon to follow up with her about this, since it was a hurried conversation in the poster hall; the COSPAR approval of SPASE is still in process pending some clarifications, possibly related to how SPASE and HAPI interact
- is the URI template mechanism a part of SPASE? Todd thinks it can be listed in the AccessURL
- discussion about creating a "drop-in server"; we need to first define this more clearly; some kind of ready-to-run mechanism to support the use case where a provider does not already have a server that can be modified; definitely it should provide proper HAPI parameter parsing and a secure environment; maybe these parts could be done in multiple languages (NodeJS, Python, Java) to give people options. Bob's server is coming along nicely and could be made into something installable via NPM (installer / repository specific to JavaScript); maybe we can start a group project for this effort; there are some datasets at APL to which we could try applying the generic server: TIMED data (time series of atmospheric retrievals and images) and also SuperMAG (which has some strict user registration and data usage acknowledgement requirements)
- client work - need to keep bolstering the Python client to make sure it is industrial strength Python; Bob is working this - does he need / want help? this will hopefully end up in the Heliophysics Python library
- next meeting: Jan. 28 (since 21 is Federal holiday)
Post-AGU meeting discussion about AGU - discussion with Bob, Jeremy, and Larry Brown:
1. Bob wants more issues closed, especially bug ones; the ambiguity of cadence is a key one.
2. Other AGU news: charter in the works for IHDEA (International Heliophysics Data Environment Alliance).
- meeting reports from various events: IVOA - Jon V.; very short - astronomers have preliminary interest in HAPI; contact is Ada Nebot; ADASS - anyone go to this?; EarthCube RCN - Jon V.; HelioPython - Aaron, Bob, others
- Update on time-varying bins - not much news yet
- server status check, including LASP; development so far on Github at LASP site
- AGU Plans
Meeting updates: IVOA - interest in HAPI and our experience; only a preliminary connection - further dialog needed; interest in re-using existing standards, such as Apache AVRO
Meeting updates: Python meeting - presentations from contributing libraries and other existing libraries in terms of practices and structure; possible HDEE call for exploring e.g., library governance; Bob and Aaron met with NGDC (Eric Keane and Rob Redmond), who have their own APIs (spider, and 2 others since then, now another); API is mostly for internal use within web-page plotting and for access to their own database; DSCOVR and GOES products; most of their products already in CDAWeb; SWPC real-time data is separate, and they only expose files for security reasons - thus would need a wrapper; question: what is latency with iSWA at CCMC? if low, then probably good enough; could ask CCMC to cover more products; group at ONERA (French radiation belt group, Sebastien Bourdarie) also building a HAPI server - eventually using Python Django - would they be willing to contribute it as open source?
Overview of LASP HAPI server from Chris Lindholm; it will be generic as a LaTiS server - if users can set up their data to fit into the LaTiS framework, then the data can be served via HAPI. More at AGU, including public HAPI server. Functional programming (Scala) being used.
Next meeting (after the AGU): January 7, 2019
- report on International Heliophysics Data Environment Alliance (IHDEA) - meeting at ESAC (archive for all ESA missions); Arnaud Masson; enabling cross-agency interoperability; public site is at ihdea.net with dev and info mailing lists
- upcoming meetings: NASA HQ Data and Computing across all SMD, IVOA (Nov 7-9), Python Meeting at LASP, ADASS, EarthCube
- connection with NOAA being sought (Bob Weigel working this with Aaron); need to prime the discussion with the right NOAA people before Space Weather Week (April 1-5, 2019 in Boulder)
- Update from LASP (Doug Lindholm) - code for a Scala-based, somewhat modularized HAPI server available at https://github.com/lasp/hapi-server which might be demonstrated next time
- Update on Python client - able to push data directly to Autoplot; lots of other features for a demo next time
- next telecon - Nov. 19; topics include more on Python client and possibly some on the LASP HAPI server
Upcoming meetings:
- Python meeting in Boulder: Aaron and Bob attending; Aaron (with Alex DeWolfe) is coordinating Python library development for Heliophysics
- Astrophysics Data Analysis conference in College Park, MD
- NSF EarthCube RCN meeting at NJIT: Jon going; let Aaron know if you want to be invited to attend
Python client: Bob has a basic package installer working and a Jupyter notebook;
Specification updates: Jeremy and Jon presented ideas for dealing with issue:71 about constants in the header and about time-varying header elements; suggestion from Todd and Bob: use native JSON reference capability; possibly also have our own reference syntax when using a parameter value as time-varying bin values
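The native-JSON-reference suggestion from Todd and Bob can be sketched as follows; this is illustrative only, since the spec has not adopted a layout yet, and the `definitions` section here is invented:

```python
import json

def resolve_refs(node, root):
    """Replace {"$ref": "#/a/b"} objects with the value at that path in the
    same document (a tiny subset of JSON Reference, dict keys only)."""
    if isinstance(node, dict):
        if set(node) == {"$ref"}:
            target = root
            for key in node["$ref"].lstrip("#/").split("/"):
                target = target[key]
            return target
        return {k: resolve_refs(v, root) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_refs(v, root) for v in node]
    return node

# Hypothetical info header: bin centers defined once and referenced by a
# parameter (the "definitions" section is invented, not in the spec).
header = json.loads("""
{"definitions": {"energy_centers": [10.0, 20.0, 40.0]},
 "parameters": [
   {"name": "Time", "type": "isotime"},
   {"name": "flux",
    "bins": [{"name": "energy",
              "centers": {"$ref": "#/definitions/energy_centers"}}]}]}
""")
resolved = resolve_refs(header, header)
```

This keeps the header valid JSON while letting a parameter point at shared (or eventually time-varying) bin values defined elsewhere in the document.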
Action items:
- Jon and Jeremy - revise the suggestion for issue 71 to use native JSON refs
- Bob and everyone - find more Python helpers
Linking parameters in the header: this relates to issue 71; there has not been much work on this yet; issue 71 now has a write-up of some options; Jeremy will explore some options in the next few days
Email lists: for now, we will just use the hapi-dev list for most communications; we can use hapi-news occasionally, but that should include instructions for getting on the hapi-dev list, since that is still going to be the priority list for a while
NOAA data: would it make sense to have NOAA data via a pass-through HAPI server (written outside of NOAA)? we should interact with NOAA some, especially at next year's Space Weather Week, when developers and scientists are all available
server updates:
APL: JUNO data going to be put behind a HAPI server
Iowa: Autoplot bug fixes; das2 server codebase is shared with hapi server codebase, and a setting determines if the das2 server is also a hapi server; decided dataset by dataset within a das2 server
LASP: development underway for HAPI server, which will be part of the LATiS version 3 effort; work is all being done on Github and so the codebase will be usable by others interested in serving data via HAPI or LATiS
PDS/PPI: server is up and running; CAPS data available - more testing needed; any dataset in PDS4 can be easily added to the HAPI server
CDAWeb: Nand's server still running OK; saw some accesses from APL; problematic variables being removed
client updates:
Nand is working on Java client - this could be coordinated with Jeremy and Larry Brown
VisualBasic client for MS Excel is going slowly at APL; high school intern will continue this fall
Todd - send Jon and Aaron the email addresses on the hapi-news and hapi-dev distribution lists
Jon - send something to HAPI-news occasionally to keep people up to date on development
Jon - test the hapi-dev list using the WebEx meeting setup tool to see if everyone will get the WebEx invite
Jon - email Alex DeWolfe about adding more data formatting discussion to this Friday's Python telecon
Jon - work with Aaron to touch base with the CCMC people for a status update on their server and we're especially interested in any feedback they have regarding the specification
Jeremy - work on implementing something for linking variables and/or header items
Jeremy and Bob - remove time library dependence from Python client; look into Jupyter notebook as a demo for how to use Python client to interact with a HAPI server
AGU sessions - planning for multiple sessions; SPEDAS training after Mini-GEM (and poster in Cecconi's session)
Oct 2,3,4 Python for Space Physics at LASP; presentations on existing capabilities; architecture discussion and layout; Alex DeWolfe coordinating; she also has mailing list and telecons every other Friday
Actions:
Jon - send Alex D. a note about Python integration of HAPI; jump in on upcoming Python telecon
Jon - write up summary of discussion on reference variables and include in issue 71, then notify everyone
next telecon: August 20
SPASE and HAPI put forward in resolutions recommending their use as standards
Jeremy will submit to this session by Baptiste Cecconi:
IN044: Interoperable tools and databases in Planetary Sciences and Heliophysics
https://agu.confex.com/agu/fm18/prelim.cgi/Session/46558
Bob is thinking about this session:
IN007: ASCII Data for Public Access
https://agu.confex.com/agu/fm18/prelim.cgi/Session/49978
Jon will put a HAPI specification poster in this session:
IN042: Integrating Data and Services in the Earth, Space and Environmental Sciences across Community, National and International Boundaries
https://agu.confex.com/agu/fm18/prelim.cgi/Session/50270
Bobby and Bernie will not create a HAPI-specific poster, but can support a CDAWeb description on another HAPI poster, which should also include Nand.
Doug will present the HAPI-fied version of LaTiS at the AGU as well, session is still TBD.
Next telecon will be July 30
topics to include:
issue 71: https://github.com/hapi-server/data-specification/issues/71
updates from various servers (CDAWeb, PDS/PPI, GMU, UIowa, APL, LASP, and maybe the CCMC developers)
Note: next telecon is in one week (July 23) in order to have a short tag-up on AGU abstract submissions.
A. peruse the AGU session list and think about what HAPI abstracts we can submit. There are multiple options:
- multiple posters: a poster on the Spec, one on clients, one on servers
- one poster on all of these (spec, clients, servers)
- other permutations: one on the spec and servers; then one more for clients
There's a session by Baptiste Cecconi: https://agu.confex.com/agu/fm18/prelim.cgi/Session/46558
There's also a Heliophysics Python session by Alex DeWolfe: https://agu.confex.com/agu/fm18/prelim.cgi/Session/46412
B. take a look at issue 71 - it's about how to handle constant parameters or references in the header and in the data.
URL is: https://github.com/hapi-server/data-specification/issues/71
Be ready to talk about this at the next telecon
Server updates: The CDAWeb HAPI server is going to use Nand's approach for the foreseeable future: https://cdaweb.gsfc.nasa.gov/hapi (We did not talk about this, but it uses https (encrypted), which has to be considered when mixing with regular http (non-encrypted) sites.)
CCMC - Aaron can check with them soon to see how they're doing
PDS - Todd not on the call (COSPAR); will get an update next time
LASP - Doug says funding all set up and work is starting / progressing
Client updates
- Autoplot and the MIDL HAPI client were presented at the MOP meeting last week. 20+ people attended the tutorial. A few scientists are starting to see the value of having one access method across data centers.
- SPEDAS tutorial held at GEM meeting. Another planned for Sunday evening after mini-GEM at AGU. A part of this will be about HAPI, so Eric was fine with a HAPI representative helping out with or being present for that part of the annual SPEDAS tutorial. We hope to have CCMC and PDS and maybe LASP online with HAPI by then!
- At APL, some interns are going to attempt a fully Excel-based HAPI client, or at least some mechanism that can produce more regularized CSV files that can be opened easily in Excel.
AGENDA
- news from Bob on updates to the verifier
hapi-server.org/verify is the link to the new verifier; it is just a pass-through to his own site at GMU
- also from Bob - update on the generic server: a few tweaks to the docs and it is ready to start advertising in 2 weeks
- transitioning hapi-server.org to actually serve HAPI content
Jeremy: change documentation so that it points to working examples on hapi-server.org
- feedback about Jeremy's proposal for constant parameters
Jeremy's proposal for constant elements in the header or data: https://github.com/hapi-server/data-specification/issues/71
Lots of discussion about exactly how to arrange references in the header. Should there be a more generic way to link variables - i.e., treat even the constant elements as a kind of parameter, and then just have them linked in the header, like CDF does. Or should we keep header variables different than time-varying data parameters?
Today's discussion: PDS HAPI server is up and running. Send issues to Todd. The rest of today's discussion was mostly about how to handle data with unusual bins, such as 3D data that, in addition to a regular grid of bins along each dimension, also has a separate grid of bin values that applies to a specific slice or face of the data. This is a MAVEN dataset, and Todd will be sending around more info about it for next week.
We will have another telecon next week, May 14, and then take off the week of the 21st, since that is the TESS meeting week.
- upcoming meetings:
- EGU is this week; HAPI poster is on Wednesday, presented by Baptiste
- CCMC meeting (Friday session) is devoted to comparing interoperability mechanisms and has international participation
- TESS meeting - no updates - registration is open
- in applying HAPI to Cassini data, scientists wanted to be able to manipulate and combine the data, doing more than just presenting what is in the file; MIDL does this because it knows what type of science data it is dealing with - effectively, it has more metadata, so it can make a particle data object (with look directions, or pitch angles, etc.); the way to have HAPI support this with the current spec would be to add custom metadata extensions (allowed by the spec) that would let a client know more about a dataset; this led to a discussion about using HAPI to capture more of this metadata
- status updates: incremental progress on server development
- Jeremy and Eric are working on supporting caching using the If-Modified-Since HTTP request header mechanism; Jeremy has a draft document out about how to do this; Autoplot can already do caching, and adding the If-Modified-Since to Jeremy's test server did not take too long (a few hours of Python modifications). Eric is working on adding caching to SPEDAS - he is planning to use daily chunking of data (same as Jeremy).
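The client side of the If-Modified-Since handshake can be sketched as below, assuming daily chunks are cached with a timestamp; the helper names are invented for this sketch, and a real client like Autoplot does considerably more:

```python
from datetime import datetime, timezone
from email.utils import format_datetime

def conditional_headers(cached_at):
    """Build the If-Modified-Since header from the time a daily chunk was
    cached (helper names are invented for this sketch)."""
    return {"If-Modified-Since": format_datetime(cached_at, usegmt=True)}

def choose_body(status, cached_body, new_body):
    """304 means the cached chunk is still fresh; 200 carries a replacement."""
    if status == 304:
        return cached_body
    if status == 200:
        return new_body
    raise RuntimeError(f"unexpected status {status}")

hdrs = conditional_headers(datetime(2018, 5, 1, tzinfo=timezone.utc))
```

The point of the mechanism is that a fresh chunk costs only a header exchange (304), not a re-download.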
- discussion about a generic server - see next paragraph
Generic Server Ideas
Jeremy and Jon want to start a group development of a generic server that is independent of current servers, many of which are modifications of existing, historically motivated servers, and since HAPI is being added as a secondary delivery mechanism, these modified servers are not suitable as generic examples. Also, it would be good to focus on web security in the design from the beginning. So we envision a 2-level system with a front-end that manages incoming requests, and also returns the response. The front end is completely generic and re-usable and as the outward facing element, it is made to be very secure. The back end deals with the data management needed to fulfill the request. It should be made able to handle data arrangements that are nearly HAPI-ready, such as a static HAPI site that has files and metadata as fixed entities (and the back end knows how to subset them, etc).
The back-end could be made generic if the data center can provide three elements of functionality:
- the ability to read a dataset for a given time range and bring it into an internal data structure of that data center's choosing (QDataset for Autoplot, ITableWithTime for MIDL, something similar for CDF programmers). This capability is something each data center will possibly have already.
- the ability to subset this internal data model by parameters or by time
- the ability to turn this internal data model into a HAPI-specific structure that the back end knows about (and is essentially a HAPI-based data model with the right metadata).
If a server can provide these 3 things, the back-end code can handle the rest of the HAPI-specific processing.
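The three capabilities above could be expressed as a small interface; this is a toy sketch with invented class and method names (a plain list of dicts stands in for QDataset / ITableWithTime), not code from any existing server:

```python
from abc import ABC, abstractmethod

class HapiBackend(ABC):
    """The three capabilities a data center must supply (names are
    illustrative, not from any HAPI spec or code base)."""

    @abstractmethod
    def read(self, dataset_id, time_min, time_max):
        """Read a dataset for a time range into an internal structure."""

    @abstractmethod
    def subset(self, data, parameters=None):
        """Subset the internal structure by parameter name."""

    @abstractmethod
    def to_hapi(self, data):
        """Convert the internal structure into HAPI output records."""

class ListBackend(HapiBackend):
    """Toy back end whose internal model is a list of {column: value} dicts."""
    def __init__(self, records):
        self.records = records
    def read(self, dataset_id, time_min, time_max):
        # ISO 8601 strings compare correctly as plain strings.
        return [r for r in self.records if time_min <= r["time"] < time_max]
    def subset(self, data, parameters=None):
        if parameters is not None:
            data = [{k: r[k] for k in ["time"] + parameters} for r in data]
        return data
    def to_hapi(self, data):
        return [",".join(str(r[k]) for k in r) for r in data]  # CSV rows

backend = ListBackend([{"time": "2018-01-01T00", "Bx": 1.0, "By": 2.0},
                       {"time": "2018-01-02T00", "Bx": 3.0, "By": 4.0}])
rows = backend.to_hapi(
    backend.subset(backend.read("demo", "2018-01-01", "2018-01-02"),
                   parameters=["Bx"]))
```

The generic front end would then only ever call `read`, `subset`, and `to_hapi`, keeping all HAPI-specific request handling out of the data center's code.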
Doug mentioned that this essentially reflects the design layout of LISIRD, and some of the code is already on Github, and the upcoming development will likely be another Github project within the current hapi github project. The generic server should not be tied to a single institution's code base, but we can certainly pull ideas from existing implementations.
Jon wants to get Rob Barnes and Bob Schaefer involved, since they both have relevant data that we can try to make available through HAPI, and as we do this, we could also spend a little extra time to create a generic server like the one outlined above. Schaefer's data is interesting since it is ITM data with higher dimensionality, and this would demonstrate that HAPI can be used for ITM data.
Bob Weigel (not on the call today) needs to also be heavily involved in the design and implementation of this generic capability, since he has expressed an interest in it for a long time already.
- create feature request for overlay metadata to identify specific data types; this topic is related to time-varying metadata, so this could be incorporated into any updates to the spec
- write up ideas about generic server and create feature request (or update existing one).
- next meeting is Monday, April 16, when Rick M. from CCMC will demo his HAPI interface; no meeting on April 23 since that is the week of the CCMC meeting
updates:
- HAPI error codes - spec document update almost done - needs example still
- HAPI caching in Autoplot - few small bugs before production; structured so that the cached content could be used by other clients in other languages; detection of stale cache is via the optional modification date (which is not granular) or just age in cache; maybe flesh out a common set of refresh rules on this telecon?
- modification dates and HTTP status codes - Bob, Jeremy and Jon to talk at next week's time slot
- CDAWeb HAPI server; Nand's is running at proto.hapistream.org/hapi ; add this to servers.txt (Jon); being migrated inside CDAWeb
- LASP - getting set up soon
- SPEDAS - bug fixes and time format handling updates (will use regex from Github to handle YYYY-DOY formats); the validator (by Bob Weigel) may have a better tested way to parse times -- see the verifier code here: https://github.com/hapi-server/verifier-nodejs/blob/master/schemas/HAPI-data-access-schema-2.0.json (needs leap seconds updates); also SPEDAS accepts parameter restrictions; also handles first time column OK
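A simplified sketch of parsing the two date styles mentioned above (YYYY-MM-DD and YYYY-DOY); the regular expression here is deliberately reduced and is not the one from the verifier, which is more complete (leap seconds, fractional seconds):

```python
import re
from datetime import datetime, timedelta

# Simplified pattern for the two date styles (YYYY-MM-DD and YYYY-DOY);
# the verifier's expression is more complete.
ISO = re.compile(r"^(?P<y>\d{4})-(?:(?P<m>\d{2})-(?P<d>\d{2})|(?P<doy>\d{3}))"
                 r"(?:T[\d:.]*Z?)?$")

def parse_hapi_date(s):
    """Parse the date portion of a HAPI time string (time of day ignored)."""
    m = ISO.match(s)
    if not m:
        raise ValueError(f"not a recognized HAPI time: {s}")
    if m.group("doy"):
        return datetime(int(m.group("y")), 1, 1) + \
            timedelta(days=int(m.group("doy")) - 1)
    return datetime(int(m.group("y")), int(m.group("m")), int(m.group("d")))
```

Note the ordering trick: the MM-DD alternative is tried first, and a three-digit day-of-year like `032` cannot match it, so the two styles never collide.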
- demo by Larry about MIDL4 HAPI client
- Aaron - need long term organization mechanism
- Jeremy, Bob, Jon to use next week's 1pm slot to talk about modification times and expiration dates
- next telecon: March 26
Action Items:
- HAPI web page (https://hapi-server.github.io/) needs to mention SPEDAS! (Jon)
- discussion about posting a Java client to Github main page (Jeremy and Larry)
Agenda
- demo by Eric of HAPI support in SPEDAS
- touch base on other development efforts underway
- mention upcoming meetings:
- EGU, April 8-13, Vienna, Austria, https://www.egu2018.eu/
- TESS, May 20-24, Leesburg, VA, https://connect.agu.org/tess2018/home
- COSPAR, July 14-22, Pasadena, CA, http://cospar2018.org/
Updates:
- PDS PPI node - server update in progress; works in development; being pushed into Git repo for move to production environment; available for PDS4 datasets (MAVEN and LADEE now; soon Cassini and MESSENGER; migration of everything else underway too)
- Jeremy and Bob - more generic servers; Jeremy: multi-threaded Python; Bob: node.js server in dev.
- GSFC HAPI server - Nand has new version; also has API for HAPI input stream and output stream
- could be some interest in making data from active missions jointly usable; stay tuned for senior review report
Switch to every 2 weeks - next telecon in March 12.
Next time - MIDL demo.
CDAWeb - JSON update still in progress
Bob and Jeremy - working on generic server and developer documentation;
the HAPI verifier - up to 2.0! ability to check JSON and binary is still in progress; ability to set timeout will be added soon
discussion about error codes: the spec points out that when no JSON is requested, only the HTTP status response is available; Bob and Nand already implemented mechanisms that do more than this, and they suggest we add to the spec so that it recommends the following for HAPI server error responses:
- modify the HTTP response text (not the code number) to include the HAPI-specific error code and message
- even for error conditions that report "not found" still return JSON content to describe the error message
Note: These are all small enough changes (and are just recommendations) so that they only trigger a version number increase to 2.0.1
Before adding it to the spec, we need to see which servers can do this, and which clients can utilize this information. We expect it is not a problem, but want to be sure. What we know already about servers: Tomcat (yes), node.js (yes), Perl(?), Python(?). About clients: curl (yes), wget (no)
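From the client side, the recommendation amounts to: try to parse a JSON `status` object out of the error body, and fall back to the bare HTTP code if the server (or an intervening proxy) did not provide one. A sketch - the 1406 example follows the HAPI status-code convention, but the helper itself is illustrative, not from any client library:

```python
import json

def hapi_error(status, body):
    """Extract a HAPI error code and message from an error response body.

    Per the recommendation above, a HAPI server should return a JSON
    'status' object even on HTTP errors such as 404. This helper is
    only an illustration.
    """
    try:
        info = json.loads(body)
        return info["status"]["code"], info["status"]["message"]
    except (ValueError, KeyError, TypeError):
        # No (or malformed) JSON body: only the HTTP status is available.
        return status, f"HTTP {status}"

code, msg = hapi_error(
    404,
    '{"status": {"code": 1406, "message": "HAPI error 1406: unknown dataset id"}}')
```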
Next week: Eric Grimes - will demo IDL HAPI client and SPEDAS crib sheet
action items:
study server capabilities needed to implement the two recommendations above; Jeremy (Python and Perl servers)
see how proxies affect the transmission of the JSON content when there is an HTTP 404 error; was this going to be Bob or Jeremy?
clarify the error handling section in the spec to describe the new recommendations (Jon)
discussion about streaming implication of timeouts - need statement in the spec about servers needing to meet reasonable timeout assumptions for clients; current typical values are around 2 minutes; we need to check these; must specify for time-to-first-byte and time-between-bytes
Bob's verifier currently has multiple tiers of checks; it will be switched to allow the timeout to be an input
also need to clarify expectations about multiple simultaneous requests (do servers need to be multi-threaded?); CDAWeb limits simultaneous connections for security reasons; Apache has settings to limit connections; does Tomcat?
how to clarify any confusion about streaming? record variance is the fastest changing item
make sure the spec mentions that servers can respond with "too much data", which is especially relevant if delivering data in any of the column-based formats we're considering as optional output formats
Discussion about current JSON format - there was a question about the validity of records with different types in the array for one record; JSON Lint parses this fine, claiming mixed values are OK; the JSON spec (RFC 7159) agrees
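The mixed-type question can be checked directly; a record combining a string time tag with numeric values round-trips as valid JSON (sample values made up):

```python
import json

# A single record mixing a string time tag with numeric values is valid
# JSON, consistent with the conclusion above.
record = json.loads('["2018-01-01T00:00:00Z", 1.5, -2.25, 3]')
types = [type(v).__name__ for v in record]
```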
related topic of interest: Open Code / Source white papers
- NASA is serious about its commitment to encourage / require open code.
- people are encouraged to submit short statements with support or opposition or suggestions of pitfalls to avoid, etc.
- some comments about streamlining the legal / formal release process; also documentation is time consuming
- difference between open source project (lots of global developers contributing) versus open code (source code available, but not necessarily supporting active, joint development)
- overlap with SPASE descriptions for publicly available resources
HAPI email list now set up
Web site improvements: minor improvements only; add dates to releases; mention the news listserv and how to subscribe; current telecon members have post capability - new members are moderated starting off; others listen only; eventually have a hapi-help@hapi-server.org; add all the logos from supporting organizations
Lessons from the AGU:
- discussion with Arnaud Masson (Aaron's counterpart at ESA); Aaron will set up a meeting about interoperability at the right level of formality, using HAPI as an example case
- feedback from Hayes: OK to proceed with some HAPI development
plans for the year
conference presence this year? EGU - joint abstract with Baptiste (ask about collaborators) and Arnaud, with the LASP group (Tom, etc.) supporting the presentation of the material at the meeting; Jon will write it tomorrow. TESS in June - abstracts due in February (AGU-based site)
- Jeremy: update from SPEDAS group - re-writing client for latest version
- need to get feedback from CCMC on their server?
- Bob: working on generic HAPI front-end server to manage HAPI requests; if a provider has a command-line way to stream data, it can be connected to the front end to make data available via HAPI; updates in a few weeks; (this would be run on existing servers at the provider site); includes validation mechanism internally
Next telecon is Jan. 22.
Action items:
- Jon: Draft note for SPA email newsletter. Request for comments on HAPI 2.0.0; emphasize good lowest common denominator
- Aaron: start talking with ESA; get names of telecon people
- Todd, Jeremy, Jon: get listserv email set up at hapi-server.org; Todd will look
- all: keep working on implementations
- Bobby: send AGU notes
- all: what standards group to join or become: SPASE, Apache, IVOA, COSPAR
Request:
- Nand wants someone to check the JSON output of his CDAWeb server; Bob says the verifier will eventually do a cross comparison between the CSV and JSON and binary data
Discussion:
Topic 1: how to capture start and stop times
Write-up proposals for handling start and stop times:
- option 1: reserved keywords for the start time column and stop time column
- option 2: keywords that refer to the names of the start time column and stop time column
- option 3: delta +/-; use units on the column to capture a duration
suggestions: accumulationTimeStart / accumulationTimeStop; accumulationStartReference / accumulationStopReference; accumulationStartTimes -> name of start time column, accumulationStopTimes -> name of stop time column
comments: accumulation is too specific; measurementStartTimes / measurementStopTimes instead
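Two of the strawman options could look like this in a hypothetical info header; none of these keywords exist in the spec - they are illustration only:

```python
# Strawman sketches of two of the options above for a hypothetical info
# header; none of these keywords exist in the spec.

# Option 2: keywords that name the start and stop time columns.
option2 = {
    "parameters": [
        {"name": "measurementStart", "type": "isotime"},
        {"name": "measurementStop", "type": "isotime"},
        {"name": "flux", "type": "double"},
    ],
    "measurementStartTimes": "measurementStart",
    "measurementStopTimes": "measurementStop",
}

# Option 3: one time column plus a duration column whose units give the delta.
option3 = {
    "parameters": [
        {"name": "Time", "type": "isotime"},
        {"name": "accumulationDelta", "type": "double", "units": "s"},
        {"name": "flux", "type": "double"},
    ],
}

start_col = option2["measurementStartTimes"]
```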
Topic 2: what about extended request keywords? lots of issues: in capabilities (server-wide) or in info (dataset specific)?
Need a document to capture topics we've discussed and not put in the spec, but need to remember.
next meeting: Jan. 8
- Bernie demonstrated a way for servers to indicate that data has not changed since last requested; servers emit a Last-Modified header value, and clients can include an If-Modified-Since header, to which servers can reply with a 304 "Not Modified" if nothing has changed; this is harder for a service-based approach, since these header values are supposed to relate to the actual content of the response (rather than the underlying data used to construct the response).
- There is already an optional attribute in the HAPI info header for `modificationDate`, and clients can look at this and just not issue a request for data if nothing has changed (rather than issue a request and look for the 304)
- It would take a lot of work for all servers to implement an accurate `modificationDate`, since there could be a lot of granules to examine; for static datasets, it is easier since it does not change
- So for now, we will not make any changes to the spec.
AGU plans - still need to choose a night for the HAPI dinner - Wed. is current winner on doodle poll
- update spec: error if you mix date format within an info header
- next week: Bernie illustrates last-modified in info header or catalog?
action items:
- review Bob's list of 1.0 to 2.0 changes (Jon)
- add example to clarify the single string or array of strings for parameter units and labels (Jon)
- update the spec document to clarify what the data stream should look like for 2D arrays when streaming JSON formatted data; the JSON mechanism of arrays of arrays is what the spec calls for
- look into mailing list options (Jon and Jeremy)
- keep working on implementations (everyone)
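The arrays-of-arrays streaming for 2D parameters mentioned in the action items can be sketched as follows (the values are made up; each record is the time tag plus a nested array whose shape matches the parameter's size):

```python
import json

# One streamed record for a parameter of size [2, 3]: the time tag plus a
# nested array-of-arrays, per the clarification above (values are made up).
line = json.dumps(["2018-01-01T00:00:00Z", [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]])
back = json.loads(line)
shape = (len(back[1]), len(back[1][0]))
```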
Bob showed a simplified version of the website that removed duplicate info on the GitHub developer page and the GitHub Pages web site page. He's attempting to link index.md to README.md to go even farther in avoiding duplication.
We still need a novice friendly landing page at https://hapi-server.github.io
We reviewed modifications to the `units` and `label` attributes within the Parameter definition in the spec. They need some tweaks:
- add "In the latter case," to each, to clarify about array values.
- instead of referring to the one unit or label string as a scalar, just call it "a single string", since scalar sounds too numeric
Lots of discussion about Extensions to HAPI - it is captured here as we discussed it.
maybe have an area where new endpoints can appear:
http://hapi-server.org/hapi/ext/data
- this could serve as both "extensions" and "experimental" in that people can try out new things
Doug: dap2 - does not define extensions; it has simple query mechanism for index-based selection of data
in the CAPABILITIES description, need to capture the fact that the extensions exist:
"extensions": [ "average", "decimate" ]
Or, maybe we define some higher-level functionality as part of the spec (for the `data` endpoint), and just make it optional.
"options": [ { "data": ["average","filter", "interpolate"] } ]
Bob: needs examples to help us see how it works: easy one would be decimation (only include every Nth point)
Lots of different ideas:
http://hapi-server.org/hapi/ext/data?id=ACE_MAG&decimate=10
http://hapi-server.org/hapi/ext/decimate?id=ACE_MAG&everyN=10
- this does not work well since you will want to do more than decimate - it needs to be a request parameter
- Doug: could use function syntax: id(ACE_MAG)&stride(10)&average(60)
- this is similar enough to regular request syntax that it is probably better to stick with one syntax
For constraints on data, recall that we are using time.min and time.max with an eye for extending this to data:
http://hapi-server.org/hapi/ext/data?id=ACE_MAG&everyN=10&param.min=X&param.max=Y
We could have users stuff all their extended capability into one additional parameter (with CSV function calls with parameters to the functions)
http://hapi-server.org/hapi/ext/data?id=ACE_MAG&extensions=average(30),stride(10)
http://hapi-server.org/hapi/data?id=ACE_MAG&extensions=x_average(30),x_stride(10)
Most people liked having extensions right on the data endpoint, but with the `x_` prefix to indicate they are extensions and experimental.
http://hapi-server.org/hapi/data?id=ACE_MAG&x_average=30&x_stride=10
- These could be advertised in the `capabilities` endpoint like this:
"extensions": [ { "data": { "name": "x_UIOWA_average", "description": "mid-western averages", "URL": "http://sample.org/hapi_extensions/average_info.html" }, "x_stride" : {} } ]
Todd: we are talking about two things:
- additional processing done by the data endpoint (averaging, etc)
- different endpoints (listing coverage intervals for a dataset)
Aaron: maybe moving too fast with extensions - let's get a solid base working first
Nand had a question about mixed time resolution - he's going to ask it via email.
Add a SUPPORT email link to the main HAPI page!
- try to use GitHub mechanism for listserv to keep track of asked questions
- we should use the hapi-server.org domain for listserv options
The web site is finally transitioned to show version 2.0 as the latest version. Note that this version was finalized a while ago.
The issue of mixed units was discussed again. With Todd present, we revisited the use of unitsArray and labelArray, and have decided not to add those attributes. Instead, the `units` attribute (which is required) and the `label` attribute (optional) will be allowed to have two meanings. A scalar value must be used for a scalar parameter, but for array parameters, you can use either a scalar or an array. The scalar means that all array elements have the same units, and the array means you have to specify a units value for each element in the array (so the array must have the same shape as given by the `size` attribute). The spec will be updated so people can see if they like that. This is also very backwards compatible.
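The scalar-or-array rule can be sketched as a small validation helper (illustrative only; simplified to 1-D sizes, and `label` would follow the same rule):

```python
def check_units(parameter):
    """Check the scalar-or-array rule described above: a single units string
    always works; an array of units must match the parameter's size.
    (Simplified sketch: 1-D sizes only.)"""
    units, size = parameter.get("units"), parameter.get("size")
    if isinstance(units, str):
        return True  # one string applies to every element
    if isinstance(units, list):
        return size is not None and len(units) == size[0]
    return False

ok = check_units({"name": "B", "size": [3], "units": ["nT", "nT", "nT"]})
```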
Jeremy said the regular expression he mentioned in issue #54 (which some people tried and did not work) does indeed have a problem (with interpreting colons?) and he's looking into it.
CCMC attendees: Chiu Weygand and (I think?) Richard Mullinix
- they showed the beginnings of a HAPI server for ISWA data at the CCMC, namely this one:
- it is online here: https://iswa.gsfc.nasa.gov/IswaSystemWebApp/hapi/catalog
- but note that it is a work in progress and does not fully support the spec
Questions from the discussion with the CCMC people:
- what about extensions to the API? they had additional filters they wanted to allow; we mentioned the possibility of defining how people could add extensions, and then having a suggested set of optional extensions as part of the spec; it would take another working group or a dedicated effort to clarify this
- time parsing was more difficult for them - this might end up being a common difficulty, so we should think about providing time parsing libraries in multiple languages
- they wanted to know about subsetting the catalog and how to arrange their server URLs
We will try to have a HAPI dinner at the AGU on Tuesday, Wednesday or Thursday night. Doodle poll will be taken soon.
actions:
- Jon: update dev spec with new definitions of parameter attributes `units` and `label`
- Jon: Doodle poll for AGU dinner
- Jon and Bob: figure out how best to arrange the main GitHub site and GitHub Pages site to avoid duplication
discussion about mixed units for arrays: we decided to try a `unitsArray` attribute on parameters to capture different units for each array dimension
also decided to add an optional `label` attribute for parameters, with a corresponding `labelArray`
Jeremy has new regular expressions for checking date format compliance - see issue #54
Add Jeremy's regular expressions (for Java (uses named field) and others) to validate allowed ISO8601 date formats.
Client and Server updates:
- any 2.0 servers? not yet
- ask Nand about status of CDAWeb HAPI server (Aaron)
- alternate CDAWeb approach: Bob's server
- datashop - eventually get Cassini APL data
- Iowa HAPI server - Chris has it in non-public beta
- CCMC - still working on it
- SPEDAS - aware of and interested in; not urgent yet?
- IDL client - update from Scott imminent
a. implementation status
Chris Piker has the current spec worked into UIowa's das2 server, and Jeremy has questions about CSV from him:
Question: why NaN for CSV fill?
Answer: keeps it consistent with binary
Question: why no comments allowed in CSV?
Answer: makes readers more complex and slow
Question: how to handle progress info between client and server?
Possible Answers: two-way communication? use multiple connections to the server, one of which is for tracking progress; maybe see web-workers mechanism;
a clever option: track rough progress using the time tags in the data, since the overall time range is known!
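That clever option can be sketched directly: since the requested time range is known, the newest time tag seen in the stream gives a rough progress fraction (function name invented; assumes the simple Z-terminated time format):

```python
from datetime import datetime

def progress(time_min, time_max, latest_tag):
    """Estimate streaming progress from the newest time tag seen so far,
    given the requested range -- the 'clever option' above. Sketch only."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    t0, t1, t = (datetime.strptime(s, fmt)
                 for s in (time_min, time_max, latest_tag))
    return max(0.0, min(1.0, (t - t0).total_seconds() / (t1 - t0).total_seconds()))

p = progress("2018-01-01T00:00:00Z", "2018-01-03T00:00:00Z",
             "2018-01-02T00:00:00Z")
```

This works without any two-way communication, though it assumes the data are roughly uniformly distributed over the requested range.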
Question: How well defined is the CSV spec? Answer: not sure what we decided on this; Jeremy was going to look at cleaning it up?
b. Todd mentioned on the SPASE call last week about the PDS/PPI plans for HAPI servers
c. Aaron is hoping to have an HDMC meeting at some point to solidify plans
d. the Github web site has still not been changed
e. I heard back from Daniel Heynderickx, who works with data servers at ESA and wants to use HAPI
f. update from Doug Lindholm: LASP white paper sent to Aaron; Lattice extensions to implement the HAPI spec; also, a HAPI client reader implementation so Lattice could ingest data from other HAPI servers and re-serve it via a Lattice API
g. Jeremy reports that the SPEDAS group is looking at Scott Boardsen's IDL implementation;
he's hoping to convince them to expose data that's been read via SPEDAS through an IDL HAPI server (so Autoplot could read it from the server); MMS has Level 2 products only available via the IDL routines in SPEDAS
add section numbers to TOC?
next meeting: Monday, Oct 9, 1pm: status of implementations
call with Jon V and Bob Weigel
- We are planning on redoing the web site to make it more coherent for visitors. The landing page should not be the Github page but just the README.md; modify the README to hyperlink not to a release, but directly to the markdown, the PDF, and the HTML, as well as to the JSON schema.
- Use the GitHub Pages mechanism for the web site, possibly using Jeremy's domain "hapi-server.org" so that it points to the README.
- Get rid of the "versions" directory (in the structure branch) in favor of a flatter arrangement.
- Do not expose Github tags to people, since that would lead them to download the whole repository (with all older versions of the spec).
- The SPASE group has been told about our preferred way to indicate the availability of a HAPI server within a SPASE record: there can just be an AccessURL pointing to the "info" endpoint for a particular dataset.
- Bob showed the Matlab and Python clients he has.
- Action items:
- Jon:
- rename current development version to release version
- add updated Table of Contents
- release version 2.0
- Bob:
- fix problem with JSON schema (centers and/or ranges)
- look over the file arrangement before 2.0 is released
- update the verifier to the latest spec (use a separate branch of the verifier code for each version?)
- Jon:
In subsequently looking over the HAPI specification Github page, I think we need to prepare it for long-term stability with multiple releases. The standard approach is to have one directory for each release, and then have a landing page that points to the most recent release, as well as the development version.
Jon is setting up a separate telecon later this week to propose, tweak, and settle on a directory arrangement scheme for this and subsequent releases.
How to incorporate a HAPI URL into SPASE? Options:
- Give an info URL like this:
  http://datashop.elasticbeanstalk.com/hapi/info?id=WEYGAND_GEOTAIL_MAG_GSM
  and let software figure out how to parse it
- Just give a URL to the top of the HAPI server, and assume the SPASE ID (product key) is the dataset name in HAPI
- Give the URL to the top of the HAPI server, but also give the HAPI dataset name (in case the HAPI data server names things differently)
- What about a data request?
  http://datashop.elasticbeanstalk.com/hapi/data?id=WEYGAND_GEOTAIL_MAG_GSM&time.min=1994-09-30T23:59:59.000&time.max=1994-10-01T23:59:59.000
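The data-request pattern above can be sketched as a small URL builder (an illustrative helper, not part of any existing HAPI client library):

```python
from urllib.parse import urlencode

def hapi_data_url(server, dataset, time_min, time_max, parameters=None):
    """Build a HAPI 'data' request URL like the datashop example above.

    server     -- root URL of the HAPI server (ending in /hapi)
    dataset    -- HAPI dataset id
    time_min, time_max -- ISO 8601 time range strings
    parameters -- optional list of parameter names to subset
    """
    query = {"id": dataset, "time.min": time_min, "time.max": time_max}
    if parameters:
        query["parameters"] = ",".join(parameters)
    return server.rstrip("/") + "/data?" + urlencode(query)
```

Note that `urlencode` percent-encodes the colons in the time strings, which HAPI servers must accept since it is standard URL encoding.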
Nand's request: need clarifying use case.
Two other Nand suggestions:
- We should always provide the header; the original reason was to be able to concatenate subsequent requests; the value of always having the header is that data self-identifies when you save it. Discussion: communicating just the numbers is sometimes useful; the API already emphasizes a division between the header and the data; importing just the numbers with no header might be important (in Excel, for instance, or in IDL using its CSV read mechanism). Conclusion: keep the option to leave off the header
- Precision in general and about time values. Conclusion: let the server decide. Good practice is to limit the output to the precision you (the server) actually have.
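Keeping the header optional means clients that do request it (via `include=header`) must split it back off the CSV stream; in HAPI the header lines are prefixed with `#`. A minimal sketch of that split (illustrative only, not from any client):

```python
import json

def split_header(text):
    """Separate a '#'-prefixed JSON header from the CSV data lines, as
    produced by a HAPI data request with include=header."""
    header_lines, data_lines = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            header_lines.append(line[1:])  # drop the '#' prefix
        else:
            data_lines.append(line)
    header = json.loads("\n".join(header_lines)) if header_lines else None
    return header, data_lines
```

This is also why the `#` prefix matters: tools that ignore comment-style lines (or a client that strips them) can recover plain CSV, while HAPI-aware clients get the self-identifying metadata.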
- issue 51: should the time column have the required name "Time"? -- decided not to require this, but to add to the spec a clarification on the importance of having an appropriately named time column (don't leave the time column name as "Unix Milliseconds" when you changed it to be UTC to meet the HAPI spec)
- issue 40: why only string values for fill? -- decision is that it is OK to require fill values be strings; the problem is that JSON does not enforce double precision to be 8-byte IEEE floating point, so we can't rely on JavaScript or the JSON interpreter to convert the fill value ASCII into a proper numeric value; thus, we will just leave it as a string and the programming language on the client will need to do the conversion
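The decision to keep `fill` as a string puts the numeric conversion on the client side; a minimal sketch of what a client might do (the function name is made up for illustration):

```python
def mask_fill(values, fill):
    """Convert string data values to floats, replacing any value equal
    to the string-valued fill with None. The comparison is numeric, so
    '-1e31' and '-1.0E31' identify the same fill value."""
    fill_num = float(fill) if fill is not None else None
    out = []
    for v in values:
        num = float(v)
        out.append(None if fill_num is not None and num == fill_num else num)
    return out
```

Doing the comparison on parsed numbers rather than raw strings matters because the server's CSV formatting of the data values need not match the exact text of the `fill` attribute in the header.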
- issue 42: what about a request for specific parameters that is somehow empty? -- decision: treat this as an error condition; in fact this is generically an error: any optional request parameter, if present, must also have a value associated with it; since it was optional, its presence then requires a value
- issue 46: need to clarify about the length of strings and how to use null termination in a string; the spec currently does not capture what we wanted to say; the null terminator is needed only in binary output, and only when the binary content of the string data ends before filling up the required number of bytes for that element in the binary record; so the length should NOT include any space for a null terminator; if you fill up the entire number of bytes with the string, there is no need for the terminator; if you are less than the number of bytes, then you do use a null, with arbitrary byte content padding to the required length
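The null-termination rule for binary strings can be made concrete with a small encoder (an illustrative sketch of the rule as agreed above, not spec-provided code; zeros are used for the arbitrary padding bytes):

```python
def encode_hapi_string(s, length):
    """Pad a string to a fixed byte length for binary output: if the
    string exactly fills the field, no null terminator is added;
    otherwise a null byte follows the string and the rest of the field
    is padding (arbitrary per the rule; zeros here)."""
    b = s.encode("ascii")
    if len(b) > length:
        raise ValueError("string longer than declared length")
    if len(b) == length:
        return b                        # exactly fills field: no terminator
    return b + b"\x00" * (length - len(b))  # terminator, then padding
```

The key point from the discussion is that the declared length does NOT reserve space for a terminator: a 4-byte field can carry a full 4-character string.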
- issue 49: time precision -- change the spec to say that the server should emit whatever time resolution is appropriate for the data being served; servers should also be able to interpret input times down to a resolution that makes sense for the data; any resolution finer than what the server handles should result in a "start time cannot equal stop time" error; the precision the clients can handle is outside the scope of the spec, so users concerned about high time resolution should be aware of any restrictions of the clients they use.
- email notifications: just use a listserv at APL for people who want notification of any change to the hapi-specification repository (not just issues); so far, this will be: Bob, Todd, Jeremy, Jon; no need for more complex scheme using pull requests with branching and merging (the complexity of that is warranted only with larger source code projects)
- MATLAB client:
  - `hapi.m` is feature complete from my perspective except for some minor changes for the binary read code.
  - `hapiplot.m` is feature complete from my perspective.
  - `hapi.m` and `hapiplot.m` work using data from four different HAPI servers.
  - Neither of the scripts has been systematically tested on invalid HAPI responses. Common errors are caught and other errors generally lead to exceptions. This could be improved, and we'll probably add code to catch errors as we find them.
- Python client:
  - `hapi.py` is feature complete from my perspective. It handles CSV and binary.
  - `hapiplot.py` has far fewer features than `hapiplot.m`. I am now certain that I don't like `matplotlib`.
  - Both scripts work on `dataset1` from http://mag.gmu.edu/TestData/hapi, which includes many possible types of parameters. I have not tested on data from Jon's, Nand's, and Jeremy's servers.
- There are some issues we'll need to discuss about the clients, related to whether there is a difference between a parameter having no `size` vs. `size=[1]`. See also a question about size on the issue tracker.
- Verifier
  - Mostly feature complete; I still need to post the schema that I am using at https://github.com/hapi-server/data-specification. I have a few questions for Todd about encoding conditional requirements.
  - I added a few new checks and emailed Jeremy, Nand, and Jon warning them to expect new errors and warnings.
- Issues
  - Hopefully I am done posting issues and questions ...
- Specification document
  - I made several editorial changes to the HAPI-1.2-dev document
- Outreach
  - Tried to do a phone call with Redmon last week. Will try again next week, as I am out after Wed of this week.
  - Looked over SpacePy and figured I would wait till `hapi.py` was complete before I emailed Morley. Will email him next week.
Discussion 1: clarity needed for multi-dimensional data when one or more dimensions does not have any 'bins' associated with it; right now, the spec pretty much says you have to have bins for all dimensions;
We settled on adding a single line to the spec: if a dimension does not represent binned data, it must still have an entry in the 'bins' array, but that entry should have '"centers": null' to indicate the lack of binned elements.
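The agreed wording can be illustrated with a parameter whose second dimension is not binned (the dataset, parameter names, and values below are made up for illustration; the JSON fragment is shown as a Python dict, where JSON `null` is `None`):

```python
# Hypothetical HAPI 'info' fragment: a 2-D parameter where the first
# dimension has energy bin centers and the second dimension (detector
# index) is not binned, so its 'bins' entry has "centers": null.
parameter = {
    "name": "counts",
    "type": "integer",
    "size": [3, 2],
    "bins": [
        {"name": "energy", "units": "eV", "centers": [10.0, 20.0, 30.0]},
        {"name": "detector", "centers": None},  # non-binned dimension
    ],
}
```

The point of the rule is that `bins` stays parallel to `size`: one entry per dimension, with `null` centers marking the dimensions that carry no physical binning.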
Discussion 2: we need a place on the wiki to describe a common set of routines and calling parameters so that all the scripting languages can use the same names for the various types of calls.
Progress on action items from last time:
- Scott added the IDL client to GitHub
- Jeremy started a Java checking client
- most of the people to be contacted have not been contacted yet - we need more to advertise first...
Bob created basic Python and Matlab clients, creating areas for them at the top level of GitHub; these are ready for others to mess around with and add/augment as a kind of joint development.
Jeremy has a Java API checking app (verifier) also at the top level in GitHub, and also open for joint development.
Action Items:
- Bob: email several people to ask about their interest in and potential use of the HAPI spec for their data serving interface
- Bob: still working on basic Matlab HAPI client
- Bob: email SpacePy people about HAPI client development and status of SpacePy
- Jeremy: work on rudimentary server checking mechanism
- Jon: add code to the verifier and see how it could be migrated to be or at least use a generic Java client
- Bernie, Bobby, Nand - CDAWeb server is progressing but not done
- Jon, Todd - start a collaborative effort to create a Python client in association with the SpacePy people
- Jon: waiting to hear back from Daniel Heynderickx about the newly released version 1.1
- Aaron: update the CCMC people with news about version 1.1
topics discussed:
public versus private data served by HAPI: we won't make usernames and passwords part of the spec, but will have a part of the wiki devoted to implementation guidelines, where we can describe how to best serve data that has both private and public regimes.
Issue: citations - data providers will not like that HAPI obscures the source of the data; data providers won't get credit for serving their data, they won't know who is using it, and the appropriate reference won't get cited. Temporary resolution: we will add an official issue to capture the need to address this concern; for now, the SPASE record that a HAPI dataset can point to can contain a citation; ultimately, it would be great to have a DOI associated with each HAPI dataset. Also, the resourceURL or the resourceID (or both) can serve as substitutes for a more robust citation.
how many different server implementations are needed? The only viable way for lots of data to stay accessible through HAPI is if the providers who install the HAPI servers also maintain them. Unused services will fall into disrepair (like the OPeNDAP services at CDAWeb, which got little use).
Instead of creating an implementation that anyone can use (via a possibly hard-to-design interview process), maybe we focus on getting key providers to have an implementation, and we focus our energy and funding on a team that can help them understand the spec and get a sustainable HAPI installation going.
Multiple groups are working on servers that could be installed by 3rd party users, so this would give users a choice of HAPI server implementations.
We listed organizations that we hope would be interested in providing this kind of common access via a HAPI mechanism:
- CCMC/iSWA
- NOAA - National Weather Service (older:SWPC); Howard Singer
- NGDC -> NCEI (Spyder, now retired; Rob Redmond potentially interested)
- USGS (Jeff Love)
- Madrigal (MIT/Haystack)
- CDAWeb - Nand Lal working on updating his server to HAPI 1.1
- other SPDF data
- PDS PPI Node (Todd King)
- LASP (Doug Lindholm) LISIRD2 / Lattice Evolution to 3rd party use
- GMU / ViRBO / TSDS
- Univ. of Iowa - Heliophysics and planetary missions
- APL - Heliophysics and planetary missions
- SuperMAG
- other ground-based magnetometers
- European groups: VESPA, AMDA (Baptiste Cecconi), other ESA projects (Daniel Heynderickx)
- software/tool providers:
- SpacePy - Steve Morley, also John Niehof
- SPEDAS - Vassilis Angelopoulos (?)
For now, we will focus on working with the set of these groups that are more internal (to the existing HAPI community), such as PDS, CDAWeb, LASP, and CCMC. After we have some success here, we can branch out to groups like NOAA, USGS, SuperMAG, and the Europeans.
Also, we need client libraries first, before HAPI becomes a compelling option, so several people will start working on those.
Action Items:
- Bob: email several people to ask about their interest in and potential use of the HAPI spec for their data serving interface
- Bob: work on basic Matlab HAPI client
- Bob: email SpacePy people about HAPI client development and status of SpacePy
- Jeremy: work on rudimentary server checking mechanism
- Bernie, Bobby: report back with status from Nand about his updating of the CDAWeb HAPI server to meet the 1.1 spec
- Jon, Todd - start a collaborative effort to create a Python client in association with the SpacePy people
- Jon: email Daniel Heynderickx about the newly released version 1.1
- Aaron: update the CCMC people with news about version 1.1
- Scott: commit IDL client to Github area
Agenda:
- final review of changes to HAPI spec document for version 1.1 release
- discussion about implementation activities based on the distributed list of proposed activities
topics discussed:
review of recent edits of spec by Todd, Bob, Jeremy, Jon
new domain hapi-server.org available for examples; Jeremy to make our example links live soon (tonight?)
Question: should we allow HAPI servers to have additional endpoints beyond the 4 required ones in the spec?
Todd: no - put them under another root url outside the hapi/ endpoints.
Bob, Jeremy, Jon : yes, but put in separate namespace under hapi/ext/ or with specified prefix (like underscore)
Answer for now: punt and push this to future version; might be good idea to allow extensions, but we need to figure out how to allow servers to advertise their extensions - it needs to be in the capabilities endpoint. Also, we need to think more about implications. We have a pretty controlled namespace now, so we don't want to dilute that. Silence in the spec for now means people will hopefully realize they are in exploratory territory.
release of new spec! now at Version 1.1.0; tag is v1.1, name is Version 1.1.0
discussion about https: we'll need to address this in the spec at some point
re-arrangement of top level documents:
- move spec to something else besides the README.md
- describe all files in the README.md including the recent versions
- for now, indicate in the README.md where to find the stable release versions
Action Items
- Todd: create PDF stamp of version 1.1.0 and put in repository
- Jon and others?: update the main spec document to indicate that the live version is at the tip of the master branch, and list the released versions; probably use a different name for the key spec document and put a more general explanation in the README.md
- Jon: issues to add:
- extension to endpoints
- support for https; Let's Encrypt offers free certificates
- Jon: create wiki to keep track of longer running issues, like the activities document or telecon notes
- Jon: close out old issues related to release of version 1.1
- all: consider our set of next key activities: creating personal servers, creating drop-in servers for other people, making lots of data available, creating clients in multiple languages, lists/registries of HAPI servers, integration with SPASE