refactor: ♻️ ConditionProbe as a proportion of visit #14

Merged
merged 3 commits into from
Dec 22, 2022
4 changes: 4 additions & 0 deletions changelog.md
@@ -1,4 +1,8 @@
# Changelog

## v0.1.3 - 22-12-2022

- ConditionProbe: now computed as a proportion of the number of visits.

## v0.1.2 - 14-12-2022

- ConditionProbe computes the availability of administrative data related to visits with at least one ICD-10 code recorded.
35 changes: 19 additions & 16 deletions docs/components/probe.md
@@ -159,7 +159,7 @@ We list hereafter the Probes that have already been implemented in the library.

=== "NoteProbe"

The [``NoteProbe``][edsteva.probes.note.NoteProbe] computes $c_{note}(t)$ the availability of clinical documents linked to patients' visits:
The [``NoteProbe``][edsteva.probes.note.NoteProbe] computes $c_{note}(t)$, the availability of clinical documents linked to patients' administrative visits, for each care site, stay type and note type over time:

$$
c_{note}(t) = \frac{n_{with\,doc}(t)}{n_{visit}(t)}
@@ -194,27 +194,30 @@ We list hereafter the Probes that have already been implemented in the library.

| care_site_level | care_site_id | care_site_short_name | stay_type | note_type | date | n_visit | c |
| :----------------------- | :----------- | :------------------- | :----------- | :-------------------- | :--------- | :------ | :----- |
| Unité Fonctionnelle (UF) | 8312056386 | Care site 1 | 'Urg_Hospit' | 'All' | 2019-05-01 | 233.0 | 0.841 |
| Unité Fonctionnelle (UF) | 8312056386 | Care site 1 | 'Urg' | 'All' | 2019-05-01 | 233.0 | 0.841 |
| Unité Fonctionnelle (UF) | 8653815660 | Care site 1 | 'All' | 'CRH' | 2011-04-01 | 393.0 | 0.640 |
| Pôle/DMU | 8312027648 | Care site 2 | 'Urg' | 'CRH' | 2021-03-01 | 204.0 | 0.497 |
| Pôle/DMU | 8312027648 | Care site 2 | 'Hospit' | 'CRH' | 2021-03-01 | 204.0 | 0.497 |
| Pôle/DMU | 8312056379 | Care site 2 | 'All' | 'Ordonnance' | 2018-08-01 | 22.0 | 0.274 |
| Hôpital | 8312022130 | Care site 3 | 'Hospit' | 'CR Passage Urgences' | 2022-02-01 | 9746.0 | 0.769 |
| Hôpital | 8312022130 | Care site 3 | 'Urg_Hospit' | 'CR Passage Urgences' | 2022-02-01 | 9746.0 | 0.769 |

=== "ConditionProbe"

The [``ConditionProbe``][edsteva.probes.condition.ConditionProbe] computes $c_{condition}(t)$ the availability of administrative data related to visits with at least one ICD-10 code recorded for each care site according to time:
The [``ConditionProbe``][edsteva.probes.condition.ConditionProbe] computes $c_{condition}(t)$, the availability of claim data in patients' administrative visits, for each care site, stay type, diag type and condition type over time:

$$
c_{condition}(t) = \frac{n_{condition}(t)}{n_{99}}
c_{condition}(t) = \frac{n_{with\,condition}(t)}{n_{visit}(t)}
$$

Where $n_{condition}(t)$ is the number of stays with at least one ICD-10 code recorded, $t$ is the month and $n_{99}$ is the $99^{th}$ percentile of $n_{condition}(t)$.
Where $n_{visit}(t)$ is the number of administrative stays, $n_{with\,condition}(t)$ is the number of stays with at least one claim code (e.g. ICD-10) recorded, and $t$ is the month.

!!!info ""
If the $99^{th}$ percentile $n_{99}$ is equal to 0, we consider that the completeness predictor $c(t)$ is also equal to 0.
If the number of visits $n_{visit}(t)$ is equal to 0, we consider that the completeness predictor $c(t)$ is also equal to 0.

!!!Warning "Care site level"
This probe is only available at hospital level.

```python
from edsteva.probes import VisitProbe
from edsteva.probes import ConditionProbe

condition = ConditionProbe()
condition.compute(
@@ -235,10 +238,10 @@ We list hereafter the Probes that have already been implemented in the library.
condition.predictor.head()
```

| care_site_level | care_site_id | care_site_short_name | stay_type | diag_type | condition_type | date | n_visit | c |
| :----------------------- | :----------- | :------------------- | :-------- | :-------- | :------------------- | :--------- | :------ | :---- |
| Unité Fonctionnelle (UF) | 8312056386 | Care site 1 | 'All' | 'All' | 'Pulmonary_embolism' | 2019-05-01 | 233.0 | 0.841 |
| Unité Fonctionnelle (UF) | 8312056386 | Care site 1 | 'All' | 'DP/DR' | 'Pulmonary_embolism' | 2021-04-01 | 393.0 | 0.640 |
| Pôle/DMU | 8312027648 | Care site 2 | 'Hospit' | 'All' | 'Pulmonary_embolism' | 2011-03-01 | 204.0 | 0.497 |
| Pôle/DMU | 8312027648 | Care site 2 | 'All' | 'All' | 'All' | 2018-08-01 | 22.0 | 0.274 |
| Hôpital | 8312022130 | Care site 3 | 'Hospit' | 'DP/DR' | 'Pulmonary_embolism' | 2022-02-01 | 9746.0 | 0.769 |
| care_site_level | care_site_id | care_site_short_name | stay_type | diag_type | condition_type | date | n_visit | c |
| :-------------- | :----------- | :------------------- | :-------- | :-------- | :------------------- | :--------- | :------ | :---- |
| Hôpital | 8312057527 | Care site 1 | 'All' | 'All' | 'Pulmonary_embolism' | 2019-05-01 | 233.0 | 0.841 |
| Hôpital | 8312057527 | Care site 1 | 'All' | 'DP/DR' | 'Pulmonary_embolism' | 2021-04-01 | 393.0 | 0.640 |
| Hôpital | 8312027648 | Care site 2 | 'Hospit' | 'All' | 'Pulmonary_embolism' | 2011-03-01 | 204.0 | 0.497 |
| Hôpital | 8312027648 | Care site 2 | 'All' | 'All' | 'All' | 2018-08-01 | 22.0 | 0.274 |
| Hôpital | 8312022130 | Care site 3 | 'Hospit' | 'DP/DR' | 'Pulmonary_embolism' | 2022-02-01 | 9746.0 | 0.769 |
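
The proportion documented above for the ConditionProbe, including the convention that $c(t) = 0$ when $n_{visit}(t) = 0$, can be illustrated with a minimal pure-Python sketch. This is not edsteva's implementation: the helper names `completeness` and `monthly_completeness` and the `(month, has_condition)` input format are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def completeness(n_with_condition: int, n_visit: int) -> float:
    """Proportion of visits with at least one claim code.

    Hypothetical helper: returns 0.0 when there are no visits,
    mirroring the convention stated in the documentation.
    """
    if n_visit == 0:
        return 0.0
    return n_with_condition / n_visit


def monthly_completeness(visits):
    """Aggregate (month, has_condition) pairs into c(t) per month."""
    n_visit = defaultdict(int)
    n_with_condition = defaultdict(int)
    for month, has_condition in visits:
        n_visit[month] += 1
        if has_condition:
            n_with_condition[month] += 1
    return {
        month: completeness(n_with_condition[month], n_visit[month])
        for month in sorted(n_visit)
    }


# Toy data (made up for illustration): one entry per administrative visit.
visits = [
    ("2019-05", True),
    ("2019-05", False),
    ("2019-05", True),
    ("2019-05", True),
    ("2019-06", False),
]
result = monthly_completeness(visits)
```

With this toy input, three of the four visits in `2019-05` carry a claim code, so that month's value is the ratio 3/4, while `2019-06` has one visit and no code. Unlike the previous percentile-based normalisation, each month is divided by its own visit count, so the value always stays between 0 and 1.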
29 changes: 16 additions & 13 deletions docs/index.md
@@ -478,7 +478,7 @@ The working example above describes the canonical usage workflow. However, you w

=== "NoteProbe"

The [``NoteProbe``][edsteva.probes.note.NoteProbe] computes $c_{note}(t)$ the availability of clinical documents linked to patients' visits for each care site, stay type and note type according to time:
The [``NoteProbe``][edsteva.probes.note.NoteProbe] computes $c_{note}(t)$, the availability of clinical documents linked to patients' administrative visits, for each care site, stay type and note type over time:

$$
c_{note}(t) = \frac{n_{with\,doc}(t)}{n_{visit}(t)}
@@ -521,19 +521,22 @@ The working example above describes the canonical usage workflow. However, you w

=== "ConditionProbe"

The [``ConditionProbe``][edsteva.probes.condition.ConditionProbe] computes $c_{condition}(t)$ the availability of administrative data related to visits with at least one ICD-10 code recorded for each care site according to time:
The [``ConditionProbe``][edsteva.probes.condition.ConditionProbe] computes $c_{condition}(t)$, the availability of claim data in patients' administrative visits, for each care site, stay type, diag type and condition type over time:

$$
c_{condition}(t) = \frac{n_{condition}(t)}{n_{99}}
c_{condition}(t) = \frac{n_{with\,condition}(t)}{n_{visit}(t)}
$$

Where $n_{condition}(t)$ is the number of stays with at least one ICD-10 code recorded, $t$ is the month and $n_{99}$ is the $99^{th}$ percentile of $n_{condition}(t)$.
Where $n_{visit}(t)$ is the number of administrative stays, $n_{with\,condition}(t)$ is the number of stays with at least one claim code (e.g. ICD-10) recorded, and $t$ is the month.

!!!info ""
If the $99^{th}$ percentile $n_{99}$ is equal to 0, we consider that the completeness predictor $c(t)$ is also equal to 0.
If the number of visits $n_{visit}(t)$ is equal to 0, we consider that the completeness predictor $c(t)$ is also equal to 0.

!!!Warning "Care site level"
This probe is only available at hospital level.

```python
from edsteva.probes import VisitProbe
from edsteva.probes import ConditionProbe

condition = ConditionProbe()
condition.compute(
@@ -554,13 +557,13 @@ The working example above describes the canonical usage workflow. However, you w
condition.predictor.head()
```

| care_site_level | care_site_id | care_site_short_name | stay_type | diag_type | condition_type | date | n_visit | c |
| :----------------------- | :----------- | :------------------- | :-------- | :-------- | :------------------- | :--------- | :------ | :---- |
| Unité Fonctionnelle (UF) | 8312056386 | Care site 1 | 'All' | 'All' | 'Pulmonary_embolism' | 2019-05-01 | 233.0 | 0.841 |
| Unité Fonctionnelle (UF) | 8312056386 | Care site 1 | 'All' | 'DP/DR' | 'Pulmonary_embolism' | 2021-04-01 | 393.0 | 0.640 |
| Pôle/DMU | 8312027648 | Care site 2 | 'Hospit' | 'All' | 'Pulmonary_embolism' | 2011-03-01 | 204.0 | 0.497 |
| Pôle/DMU | 8312027648 | Care site 2 | 'All' | 'All' | 'All' | 2018-08-01 | 22.0 | 0.274 |
| Hôpital | 8312022130 | Care site 3 | 'Hospit' | 'DP/DR' | 'Pulmonary_embolism' | 2022-02-01 | 9746.0 | 0.769 |
| care_site_level | care_site_id | care_site_short_name | stay_type | diag_type | condition_type | date | n_visit | c |
| :-------------- | :----------- | :------------------- | :-------- | :-------- | :------------------- | :--------- | :------ | :---- |
| Hôpital | 8312057527 | Care site 1 | 'All' | 'All' | 'Pulmonary_embolism' | 2019-05-01 | 233.0 | 0.841 |
| Hôpital | 8312057527 | Care site 1 | 'All' | 'DP/DR' | 'Pulmonary_embolism' | 2021-04-01 | 393.0 | 0.640 |
| Hôpital | 8312027648 | Care site 2 | 'Hospit' | 'All' | 'Pulmonary_embolism' | 2011-03-01 | 204.0 | 0.497 |
| Hôpital | 8312027648 | Care site 2 | 'All' | 'All' | 'All' | 2018-08-01 | 22.0 | 0.274 |
| Hôpital | 8312022130 | Care site 3 | 'Hospit' | 'DP/DR' | 'Pulmonary_embolism' | 2022-02-01 | 9746.0 | 0.769 |

=== "Model"

3 changes: 3 additions & 0 deletions edsteva/io/hive.py
@@ -8,6 +8,8 @@
from pyspark.sql import SparkSession
from pyspark.sql.types import LongType, StructField, StructType

from edsteva import koalas_options

from . import settings
from .i2b2_mapping import get_i2b2_table

@@ -100,6 +102,7 @@ def __init__(
        if spark_session is not None:
            self.spark_session = spark_session
        else:
            koalas_options()
            logger.warning(
                """
To improve performances when using Spark and Koalas, please call `edsteva.improve_performances()`