Python script for creating Gemini-valid xml files from CSV

AstunTechnology/gemini_csv_import

A Python script for exporting Gemini-compliant metadata from a CSV file to individual XML files.

This branch targets Gemini 2.3 and requires Python 3.

How do I get set up?

  • Create a Python 3 virtual environment in the root directory: python3 -m venv . (the trailing dot is the target directory)
  • Activate the virtual environment: source bin/activate
  • Install dependencies: pip install -r requirements.txt
  • See the sample CSV file for the correct layout. Alternatively, change the column mappings in metadata_import.py to match your layout.
  • Place your CSV file in the input folder and rename it metadata.csv
  • Change to the python directory
  • Run: python metadata_import.py
    • The script accepts the following command-line arguments:
      • -n [number] or --numrows [number]: sets how many rows of metadata.csv are parsed and exported
      • -a or --all: parses and exports all rows in metadata.csv
      • -h or --help: displays instructions on how to run the script
    • If the script is run without command-line arguments, it prompts for the number of rows to parse and export. Accepted values are a number or all.
  • Your xml files will miraculously appear in the output folder
  • Check error.log in the python folder for details of any records that failed. These are listed by title with the details of the error.
  • Encoding errors in the source CSV may currently cause the script to fail. The offending bytes are shown in the error message so you can replace them in the source data with the correct symbol.
  • When importing the records into GeoNetwork, use the _to_gemini XSL
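
The command-line behaviour described above can be sketched with argparse. This is an illustrative sketch only, not the code in metadata_import.py: the function names build_parser and rows_to_export are hypothetical, and returning None to mean "all rows" is an assumption made for the example.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Export rows of metadata.csv to individual XML files."
    )
    # -n and -a contradict each other, so argparse rejects passing both.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-n", "--numrows", type=int, metavar="NUMBER",
                       help="number of rows to parse and export")
    group.add_argument("-a", "--all", action="store_true",
                       help="parse and export every row")
    return parser


def rows_to_export(argv, ask=input):
    """Return an int row limit, or None meaning 'all rows' (assumed convention)."""
    args = build_parser().parse_args(argv)
    if args.all:
        return None
    if args.numrows is not None:
        return args.numrows
    # No arguments given: fall back to an interactive prompt.
    answer = ask("How many rows to export (number or 'all')? ").strip()
    return None if answer.lower() == "all" else int(answer)
```

argparse supplies -h and --help automatically, so the sketch does not declare them. The ask parameter exists only so the prompt can be replaced in tests.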

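To locate encoding problems before running the import, the source file can be scanned for bytes that do not decode. The sketch below assumes the CSV is expected to be UTF-8, which this README does not state, and the helper name find_undecodable_bytes is hypothetical.

```python
def find_undecodable_bytes(raw, encoding="utf-8"):
    """Return (offset, bad_bytes) for each span that fails to decode.

    The default encoding is an assumption; pass another one if needed.
    """
    problems = []
    start = 0
    while start < len(raw):
        try:
            raw[start:].decode(encoding)
            break  # the rest of the data decodes cleanly
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded.
            bad_start = start + err.start
            bad_end = start + err.end
            problems.append((bad_start, raw[bad_start:bad_end]))
            start = bad_end  # resume scanning after the bad span
    return problems
```

One possible use is to read input/metadata.csv in binary mode, pass the bytes to the helper, and fix each reported offset in the source data.
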
Data Specifics

  • Creation Date and Revision Date can be of the form YYYY-MM-DD or DD/MM/YYYY
  • Descriptive Keywords can be a comma-separated list
  • Topic Category values must come from the following list (case-sensitive); multiple values can be given as a comma-separated list:
    • farming
    • biota
    • boundaries
    • climatologyMeteorologyAtmosphere
    • economy
    • elevation
    • environment
    • geoscientificInformation
    • health
    • imageryBaseMapsEarthCover
    • intelligenceMilitary
    • inlandWaters
    • location
    • oceans
    • planningCadastre
    • society
    • structure
    • transportation
    • utilitiesCommunication
  • West, East, North, South bounding coordinates must be WGS84 latitude/longitude values
  • Temporal Extent can be a comma-separated pair (begin date, end date); dates must be in the form YYYY-MM-DD or DD/MM/YYYY
  • Data Format and Version can be comma-separated lists, but values must come from the provided lists of formats and versions; see iso19139.gemini23/loc/eng/labels.xml
  • Data Quality Info must be one of dataset or nonGeographicDataset (case-sensitive)
  • INSPIRE Theme (case-sensitive) must come from the INSPIRE Themes Thesaurus (can be a comma-separated list)
  • Update Frequency is case-sensitive; choose one of the following codes:
    • continual
    • daily
    • weekly
    • fortnightly
    • monthly
    • quarterly
    • biannually
    • annually
    • asNeeded
    • irregular
    • notPlanned
  • The copyright statement should not include the copyright symbol; a correctly encoded version is added automatically
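
Several of the rules above can be checked before running the import. The following is a minimal sketch under stated assumptions: the function names are hypothetical and are not part of metadata_import.py, and the bounding-box check only tests the standard WGS84 ranges.

```python
from datetime import datetime, date

# Allowed values copied from the lists above (case-sensitive).
TOPIC_CATEGORIES = {
    "farming", "biota", "boundaries", "climatologyMeteorologyAtmosphere",
    "economy", "elevation", "environment", "geoscientificInformation",
    "health", "imageryBaseMapsEarthCover", "intelligenceMilitary",
    "inlandWaters", "location", "oceans", "planningCadastre", "society",
    "structure", "transportation", "utilitiesCommunication",
}
UPDATE_FREQUENCIES = {
    "continual", "daily", "weekly", "fortnightly", "monthly", "quarterly",
    "biannually", "annually", "asNeeded", "irregular", "notPlanned",
}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # YYYY-MM-DD or DD/MM/YYYY


def parse_date(text):
    """Parse a date in either accepted form, or raise ValueError."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unsupported date format: {text!r}")


def split_list(cell):
    """Split a comma-separated cell into trimmed, non-empty items."""
    return [item.strip() for item in cell.split(",") if item.strip()]


def invalid_topic_categories(cell):
    """Return the items in the cell that are not allowed topic categories."""
    return [t for t in split_list(cell) if t not in TOPIC_CATEGORIES]


def is_valid_update_frequency(value):
    """Check a single update frequency code, respecting case."""
    return value.strip() in UPDATE_FREQUENCIES


def is_valid_bounding_box(west, east, north, south):
    """Check that the coordinates fall inside WGS84 longitude/latitude ranges."""
    return (-180 <= west <= 180 and -180 <= east <= 180
            and -90 <= south <= north <= 90)
```

For example, invalid_topic_categories("farming, Biota") reports "Biota" because the comparison is case-sensitive, matching the rule for Topic Category.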

Who do I talk to?
