feat: implement cdot data provider (#43) (#44)
holtgrewe authored Mar 13, 2023
1 parent 8540b73 commit 3a8ed9d
Showing 11 changed files with 16,585 additions and 27 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -20,7 +20,7 @@ chrono = "0.4.23"
 enum-map = "2.4.2"
 flate2 = "1.0.25"
 lazy_static = "1.4.0"
-linked-hash-map = "0.5.6"
+linked-hash-map = { version = "0.5.6", features = ["serde", "serde_impl"] }
 log = "0.4.17"
 md-5 = "0.10.5"
 nom = "7.1.3"
19 changes: 19 additions & 0 deletions README.md
@@ -4,6 +4,7 @@
# hgvs-rs

This is a port of [biocommons/hgvs](https://github.com/biocommons/hgvs) to the Rust programming language.
The `data::cdot::*` code is based on a port of [SACGF/cdot](https://github.com/SACGF/cdot) to Rust.

## Running Tests

@@ -60,3 +61,21 @@ $ bootstrap.sh http://dl.biocommons.org/uta uta_20210129
```

The `*.pgd.gz` file is added to the Git repository via `git-lfs` and in CI, this minimal database will be used.

## Some Timing Results

(I don't want to call these "benchmarks" yet.)

### Deserialization of large cdot JSON files

Host:

- CPU: Intel(R) Xeon(R) E-2174G CPU @ 3.80GHz
- Disk: NVMe (WDC CL SN720 SDAQNTW-1T00-2000)

Single-run timing results (no repetitions, warm start, etc.):

- ENSEMBL: 37s
- RefSeq: 67s

These times cover only loading and deserializing the records.
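
A single-run measurement of this kind can be sketched with `std::time::Instant`. This is only a minimal illustration, not the code used to obtain the numbers above: `time_once` and `load_records` are hypothetical names, and `load_records` merely counts non-empty lines as a stand-in for the actual JSON deserialization.

```rust
use std::time::{Duration, Instant};

/// Run `f` exactly once and return its result together with the elapsed
/// wall-clock time (no repetitions, no warm-up).
fn time_once<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = f();
    (result, start.elapsed())
}

/// Hypothetical stand-in for "load and deserialize": counts non-empty lines.
/// A real measurement would deserialize the cdot JSON data here instead.
fn load_records(text: &str) -> usize {
    text.lines().filter(|line| !line.trim().is_empty()).count()
}

fn main() {
    let text = "{\"id\": 1}\n{\"id\": 2}\n\n{\"id\": 3}\n";
    let (count, elapsed) = time_once(|| load_records(text));
    println!("loaded {count} records in {elapsed:?}");
}
```

Because the closure runs only once, the result includes cold-cache effects and should be read as a rough indication rather than a benchmark.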