schema YAML files with `slot_usages` #1469

turbomam · 2023-12-04T18:28:09Z

grep -r -c slot_usage src/schema | grep -v ':0'

src/schema/prov.yaml:1
src/schema/annotation.yaml:3
src/schema/core.yaml:11
src/schema/nmdc.yaml:13
src/schema/workflow_execution_activity.yaml:13
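
For anyone without grep handy, the count above can be approximated in Python. This is a minimal sketch, not a replacement for the command: the function name is made up, and it only looks at `*.yaml` files, whereas `grep -r` reads every file under the directory. Like `grep -c`, it counts matching lines, so commented-out `slot_usage` lines are counted too.

```python
from pathlib import Path


def count_slot_usage_lines(root, needle="slot_usage"):
    """Count lines containing `needle` per YAML file under `root` (like `grep -r -c`)."""
    counts = {}
    for path in sorted(Path(root).rglob("*.yaml")):
        lines = path.read_text(encoding="utf-8").splitlines()
        n = sum(1 for line in lines if needle in line)
        if n:  # drop zero counts, mirroring the `grep -v ':0'` filter
            counts[str(path)] = n
    return counts
```

Calling `count_slot_usage_lines("src/schema")` from the repository root returns a dict of path to count, which can be printed as `path:count` to match the grep output.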

slot attributes modified:

annotations
any_of (over ranges)
comments
description
maximum_cardinality
minimum_cardinality
notes
pattern (asserting a pattern on an object property leads to RDF with errors)
range
required
structured_pattern
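
A list like the one above could be derived mechanically rather than by reading each file. The sketch below assumes the schema has already been loaded into nested dicts by a YAML parser (that step is not shown), and `modified_attributes` is a hypothetical helper name. The sample data is transcribed from two classes in this issue.

```python
def modified_attributes(classes):
    """Map each overridden attribute to the sorted Class.slot places that set it."""
    found = {}
    for class_name, class_def in classes.items():
        for slot_name, overrides in (class_def.get("slot_usage") or {}).items():
            for attribute in overrides or {}:
                found.setdefault(attribute, []).append(f"{class_name}.{slot_name}")
    return {attr: sorted(places) for attr, places in sorted(found.items())}


# Two classes from this issue, shaped the way a YAML parser would load them.
ID_SYNTAX = "{id_nmdc_prefix}:%s-{id_shoulder}-{id_blade}{id_version}{id_locus}"
CLASSES = {
    "Activity": {
        "slot_usage": {
            "id": {
                "required": True,
                "structured_pattern": {"syntax": ID_SYNTAX % "act", "interpolated": True},
            }
        }
    },
    "Pooling": {
        "slot_usage": {
            "has_input": {"minimum_cardinality": 2},
            "has_output": {"minimum_cardinality": 1, "maximum_cardinality": 1},
            "id": {
                "required": True,
                "structured_pattern": {"syntax": ID_SYNTAX % "poolp", "interpolated": True},
            },
        }
    },
}

for attribute, places in modified_attributes(CLASSES).items():
    print(attribute, places)
```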


turbomam · 2023-12-04T18:28:48Z

src/schema/prov.yaml:1

Activity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:act-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
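
With `interpolated: true`, each `{name}` placeholder in `syntax` is replaced by a value from the schema's `settings` block to produce an ordinary regular expression. The settings themselves are not shown in this issue, so the values below are assumptions for illustration only, and `interpolate` is a hypothetical helper, not LinkML's own implementation.

```python
import re

# ASSUMED settings values; the real ones live in the schema's `settings` block.
SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
    "id_version": r"(\.[0-9]{1,})?",
    "id_locus": r"(_[A-Za-z0-9_\.-]+)?$",
}


def interpolate(syntax, settings):
    """Replace each {name} placeholder in `syntax` with its settings value."""
    return re.sub(r"\{(\w+)\}", lambda m: settings[m.group(1)], syntax)


# Syntax string copied from the Activity slot_usage above.
ACTIVITY_SYNTAX = "{id_nmdc_prefix}:act-{id_shoulder}-{id_blade}{id_version}{id_locus}"
ACTIVITY_PATTERN = interpolate(ACTIVITY_SYNTAX, SETTINGS)


def is_activity_id(value):
    """True if `value` matches the expanded pattern under the assumed settings."""
    return re.match(ACTIVITY_PATTERN, value) is not None
```

Under these assumed settings, an identifier such as `nmdc:act-11-abc123` would match and `nmdc:wf-11-abc123` would not, because the `act` typecode is a literal part of the pattern.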

turbomam · 2023-12-04T18:30:49Z

src/schema/annotation.yaml

GenomeFeature:
  slot_usage:
    seqid:
      required: true
    type:
      range: OntologyClass
      description: A type from the sequence ontology
    start:
      required: true
    end:
      required: true
Pathway:
  slot_usage:
    has_part:
      range: Reaction
      required: true
      description: >-
        A pathway can be broken down to a series of reaction step
FunctionalAnnotation:
  slot_usage:
    has_function:
      notes:
        - this slot had been called id
        - "Still missing patterns for COG and RetroRules."
        - "These patterns aren't tied to the listed prefixes. A discussion about that possibility had been started, including the question of whether these lists are intended to be open examples or closed"
    type:
      range: OntologyClass
      description: TODO
    was_generated_by:
      description: provenance for the annotation.
      notes: To be consistent with the rest of the NMDC schema we use the PROV annotation model, rather than GPAD
      range: MetagenomeAnnotationActivity

turbomam · 2023-12-04T18:36:27Z

src/schema/core.yaml

ProcessedSample:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:procsm-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
AnalyticalSample:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:ansm-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
Site:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:site-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
PlannedProcess:
  slot_usage:
    designated_class:
      comments:
        - required on all instances in a polymorphic Database slot like planned_process_set
OntologyClass:
  slot_usage:
    id:
      pattern: '^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$'
AttributeValue:
  slot_usage:
    type:
      description: An optional string that specified the type of object.
QuantityValue:
  slot_usage:
    has_raw_value:
      description: Unnormalized atomic string representation, should in syntax {number} {unit}
    has_unit:
      description: The unit of the quantity
    has_numeric_value:
      description: The number part of the quantity
      range: double
PersonValue:
  slot_usage:
    orcid:
      annotations:
        display_hint: Open Researcher and Contributor ID for this person. See https://orcid.org
    email:
      annotations:
        display_hint: Email address for this person.
    has_raw_value:
      description: The full name of the Investigator in format FIRST LAST.
      notes:
        - May eventually be deprecated in favor of "name".
    name:
      description: >-
        The full name of the Investigator.
        It should follow the format FIRST [MIDDLE NAME| MIDDLE INITIAL] LAST, where MIDDLE NAME| MIDDLE INITIAL is optional.
      annotations:
        display_hint: First name, middle initial, and last name of this person.
ProteinQuantification:
  slot_usage:
    best_protein:
      description: the specific protein identifier most correctly grouped to its associated peptide sequences
    all_proteins:
      description: the grouped list of protein identifiers associated with the peptide sequences that were grouped to a best protein
ControlledIdentifiedTermValue:
  slot_usage:
    term:
      required: true
GeolocationValue:
  slot_usage:
    has_raw_value:
      description: The raw value for a geolocation should follow {latitude} {longitude}
    latitude:
      required: true
    longitude:
      required: true
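
The `pattern` on `OntologyClass.id` above is a plain regular expression, so it can be tried against candidate identifiers directly. A minimal sketch with the pattern copied verbatim from the listing; the helper name and the sample identifiers are illustrative, not taken from the schema.

```python
import re

# Pattern copied verbatim from the OntologyClass slot_usage in core.yaml.
ONTOLOGY_CLASS_ID = re.compile(
    r"^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$"
)


def is_valid_ontology_class_id(value):
    """True if `value` looks like prefix:local_id according to the pattern."""
    return ONTOLOGY_CLASS_ID.match(value) is not None


# Illustrative inputs: the first two are well-formed CURIEs, the rest are not.
for candidate in ["GO:0008150", "KEGG.ORTHOLOGY:K00001", "X:1", "GO:", "GO:-1"]:
    print(candidate, is_valid_ontology_class_id(candidate))
```

Note that the prefix must be at least two characters long, so a single-letter prefix such as `X:1` is rejected, and the local part cannot be empty or start with a hyphen.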

turbomam · 2023-12-04T18:36:55Z

src/schema/nmdc.yaml

The following slot_usages are currently commented out. Everything else in this issue is active.

OmicsProcessing: only patterns on object properties
Study: almost all OK now
Biosample: mostly

Pooling:
  slot_usage:
    has_input:
      minimum_cardinality: 2
    has_output:
      minimum_cardinality: 1
      maximum_cardinality: 1
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:poolp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
Extraction:
  slot_usage:
    has_input:
      required: true
    has_output:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:extrp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
LibraryPreparation:
  slot_usage:
    has_input:
      required: true
    has_output:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:libprp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
FieldResearchSite:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:frsite-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
CollectingBiosamplesFromSite:
  slot_usage:
    has_input:
      range: Site
      required: true
    has_output:
      range: Biosample
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:clsite-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
DataObject:
  slot_usage:
    name:
      required: true
    description:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:dobj-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
BiosampleProcessing:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:bsmprc-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
    has_input:
      range: Biosample
SubSamplingProcess:
  slot_usage:
    volume:
      description: The output volume of the SubSampling Process.
    mass:
      description: The output mass of the SubSampling Process.
    has_input:
      any_of:
        - range: Biosample
        - range: ProcessedSample
    has_output:
      range: ProcessedSample
      description: The subsample.
MixingProcess:
  slot_usage:
    volume:
      description: The volume of sample filtered.
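
To make the cardinality overrides concrete, here is a rough sketch of what the `Pooling` constraints above mean for an instance: at least two inputs and exactly one output. `cardinality_errors` is a hypothetical helper, not the LinkML validator, and the sample identifiers are invented.

```python
# Constraints copied from the Pooling slot_usage above.
POOLING_CONSTRAINTS = {
    "has_input": {"minimum_cardinality": 2},
    "has_output": {"minimum_cardinality": 1, "maximum_cardinality": 1},
}


def cardinality_errors(instance, constraints):
    """Return a message for every slot whose value count is out of bounds."""
    errors = []
    for slot, rule in constraints.items():
        n = len(instance.get(slot, []))
        lo = rule.get("minimum_cardinality")
        hi = rule.get("maximum_cardinality")
        if lo is not None and n < lo:
            errors.append(f"{slot}: {n} value(s), expected at least {lo}")
        if hi is not None and n > hi:
            errors.append(f"{slot}: {n} value(s), expected at most {hi}")
    return errors


# Invented identifiers, for illustration only.
ok = {"has_input": ["sample-a", "sample-b"], "has_output": ["pooled-1"]}
bad = {"has_input": ["sample-a"], "has_output": ["pooled-1", "pooled-2"]}
print(cardinality_errors(ok, POOLING_CONSTRAINTS))
print(cardinality_errors(bad, POOLING_CONSTRAINTS))
```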

turbomam · 2023-12-04T19:24:12Z

src/schema/workflow_execution_activity.yaml

WorkflowExecutionActivity:
  slot_usage:
    started_at_time:
      required: true
    ended_at_time:
      required: true
    git_url:
      required: true
    has_input:
      required: true
    has_output:
      required: true
    execution_resource:
      required: true
    type:
      required: true
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wf-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetagenomeAssembly:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmgas-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetatranscriptomeAssembly:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmtas-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetagenomeAnnotationActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmgan-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetatranscriptomeAnnotationActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmtan-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetatranscriptomeActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmt-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MagsAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmag-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetagenomeSequencingActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmsa-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
ReadQcAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfrqc-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
ReadBasedTaxonomyAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfrbt-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetabolomicsAnalysisActivity:
  slot_usage:
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmb-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
MetaproteomicsAnalysisActivity:
  slot_usage:
    used:
      description: The instrument used to collect the data used in the analysis
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfmp-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
NomAnalysisActivity:
  slot_usage:
    used:
      range: string
      description: The instrument used to collect the data used in the analysis
    id:
      required: true
      structured_pattern:
        syntax: "{id_nmdc_prefix}:wfnom-{id_shoulder}-{id_blade}{id_version}{id_locus}"
        interpolated: true
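
Every `id` pattern in these files differs only in the typecode between the prefix and the shoulder (`wf`, `wfmgas`, `wfnom`, and so on). The sketch below pulls that typecode out of a `syntax` string and reports any typecode shared by more than one class. The helper names are hypothetical, and the sample covers only a few of the classes listed above.

```python
import re
from typing import Optional

# Matches the literal placeholder layout used by the syntax strings in this issue.
TYPECODE = re.compile(r"^\{id_nmdc_prefix\}:([a-z]+)-\{id_shoulder\}")


def typecode(syntax: str) -> Optional[str]:
    """Return the typecode embedded in a structured_pattern syntax, if any."""
    m = TYPECODE.match(syntax)
    return m.group(1) if m else None


def shared_typecodes(syntax_by_class: dict) -> dict:
    """Map each typecode used by more than one class to those class names."""
    by_code = {}
    for class_name, syntax in syntax_by_class.items():
        by_code.setdefault(typecode(syntax), []).append(class_name)
    return {code: names for code, names in by_code.items() if len(names) > 1}


# A few syntax strings copied from the listing above.
SUFFIX = "-{id_shoulder}-{id_blade}{id_version}{id_locus}"
SAMPLE = {
    "WorkflowExecutionActivity": "{id_nmdc_prefix}:wf" + SUFFIX,
    "MetagenomeAssembly": "{id_nmdc_prefix}:wfmgas" + SUFFIX,
    "NomAnalysisActivity": "{id_nmdc_prefix}:wfnom" + SUFFIX,
}
print(shared_typecodes(SAMPLE))
```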

turbomam · 2024-01-02T18:09:08Z

oops, this is for some other repo that I work in. will move soon.

pbuttigieg · 2024-01-12T23:21:14Z

Thanks - was confused

turbomam · 2024-01-18T15:08:37Z

shoot, I don't think I can move this issue out of this org. I will just copy and paste and then delete here.

cmungall closed this as completed Feb 14, 2024
