Releases: OxWearables/biobankAccelerometerAnalysis
v5.1.2
v5.1.1
What's Changed
- Fix a bug in pandas inferred freq by @chanshing in #194
- Improve documentation
  - Added a data dictionary section: https://biobankaccanalysis.readthedocs.io/en/latest/datadict.html
  - Added a quality control section: https://biobankaccanalysis.readthedocs.io/en/latest/usage.html#quality-control
v5.1.0
What's Changed
- Improve classification module: Add support for cross-validation and custom max_depth and min_samples_leaf (useful to reduce model size). Improve reporting by outputting a JSON file with model metrics.
- Update trained models (much lighter now) and model URLs.
- Fix writeCmds when paths contain spaces
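To illustrate why paths with spaces matter when commands are written to a file and later run through a shell, here is a minimal sketch using Python's standard shlex module. It is not the package's implementation: the helper build_command is hypothetical, and the command name accProcess is assumed for illustration only.

```python
import shlex


def build_command(input_path, output_folder):
    """Build one shell command line, quoting each part so spaces survive."""
    # "accProcess" is an assumed command name used only for illustration.
    parts = ["accProcess", input_path, "--outputFolder", output_folder]
    return " ".join(shlex.quote(part) for part in parts)


cmd = build_command("my data/subject 001.cwa", "results/subject 001/")
print(cmd)

# Splitting the line the way a POSIX shell would recovers the original paths.
assert shlex.split(cmd)[1] == "my data/subject 001.cwa"
```

Without quoting, the shell would split "my data/subject 001.cwa" into two separate arguments.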
v5.0.0
What's Changed
- The Capture-24 dataset was revised for annotation errors and ML models were retrained accordingly. Differences in the ML classification outputs might be expected.
- Major change: the random forest models are now based on the imbalanced-learn package, replacing the monkeypatch approach employed so far c14c1a3
- Fix spectral entropy normalization feature 8abdc32
- No longer compute/use fft3d features (expensive and not really useful). Also reduce number of fft features from 12 to 10 aca88ca
- GitHub Actions no longer runs classification, as the new models are a bit too large.
- Minor refactoring: summary variable name changes (wear -> wearTime, CutPoint* -> cutPoint*, etc.)
- Other small improvements.
Due to the changes above, older activity classification models are no longer supported (you can still use them by reverting to older releases).
v4.2.3
What's Changed
- Improve docs and fix ReadTheDocs issue
- Fix summary statistics; add new *-week-avg stats #191
- Hardcode ylim range in accPlot for easier comparison between different plots.
- Use mean (not median) absolute error for calibration. This, together with the change introduced in 4.2.0, produces lower calibration error. See plot below.
v4.2.2
v4.2.1
What's Changed
- v4.2.1 by @chanshing in #189
- New flag --extractFeatures for more flexibility, e.g. extract feats w/o classify
- Make versions explicit in setup.py -- mainly to ensure same sklearn/joblib versions with trained model
- No more need to manually specify activity model for accPlot
- Add a progress bar for accCollateSummary
- Fix summary output stats to be computed based on multiple of 24h, otherwise will be biased for days w/o full 24h. The issue appeared mainly at the boundaries of the recording period (first and last days).
- New summary stats: Include in the summary the actual recorded hours for each day. See below.
- Fix wear time stats to be also based on 24h and be consistent with the other metrics.
- Refactor nonwear code.
- Cleanup imputation: Impute only for stats calculations and no longer impute the output time-series, therefore also deprecate --imputation.
Include summary of actual recorded hours of activity
Sometimes we want the actual recorded hours for each separate day. Now we include in the summary:
{
day0-wear(hrs): ...
day1-wear(hrs): ...
...
day0-sleep(hrs): ...
day1-sleep(hrs): ...
...
}
These should visually match the time-series plot.
v4.2.0
What's Changed
Improve gravity calibration routine; Update github workflows by @chanshing in #184
- Following van Hees paper, discard outliers before linear regression and use weighted least squares instead of ordinary least squares. We use a slight variation where we detect outliers based on a percentile (0.5%) instead of some hard threshold as in the paper. We also track the MAE, which is more robust to outliers, instead of the RMSE.
- Use uncentered temperature (deprecate meanTemp), as centering made comparing offsets across devices more cumbersome. Centering the temperature changes the offset coefficients, meaning the same device would have different offsets if the recordings had different mean temps.
- Fix #154 where calibration coeffs were not restored when calibration fails.
- Update github workflows to reflect changes, and also switch to comparing the summary file instead of the epoch file as it is much lighter (28K vs 6.3M). We don't wanna fill up git with historical copies of epoch files.
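The outlier handling and weighted fit described in the first bullet can be sketched on a single axis as follows. This is a simplified illustration, not the package's calibration code: the function names, the single-axis model (target ~ offset + scale * measured), the reading of "0.5%" as "drop the 0.5% largest errors", and the weights are all assumptions made for the example.

```python
import math


def drop_top_percent(errors, pct=0.5):
    """Return indices to keep after discarding the pct% largest errors."""
    n_drop = math.ceil(len(errors) * pct / 100)
    order = sorted(range(len(errors)), key=lambda i: errors[i])
    return sorted(order[: len(errors) - n_drop])


def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y ~ offset + scale * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    scale = sxy / sxx
    return my - scale * mx, scale


def calibrate_axis(measured, target, weights, pct=0.5):
    # 1. Discard outliers (largest |target - measured|) before the regression.
    errors = [abs(t - m) for m, t in zip(measured, target)]
    keep = drop_top_percent(errors, pct)
    x = [measured[i] for i in keep]
    y = [target[i] for i in keep]
    w = [weights[i] for i in keep]
    # 2. Weighted (rather than ordinary) least squares on the remaining points.
    offset, scale = weighted_line_fit(x, y, w)
    # 3. Track the mean absolute error of the fit instead of the RMSE.
    mae = sum(abs(yi - (offset + scale * xi)) for xi, yi in zip(x, y)) / len(x)
    return offset, scale, mae


# Synthetic data: an exact linear relation plus one gross outlier.
measured = [i / 100 for i in range(-100, 100)]
target = [0.02 + 1.01 * m for m in measured]
target[50] += 5.0
weights = [1.0 / (1.0 + abs(m)) for m in measured]  # arbitrary illustrative weights

offset, scale, mae = calibrate_axis(measured, target, weights)
print(round(offset, 6), round(scale, 6), round(mae, 6))
```

With the outlier removed before fitting, the recovered offset and scale are close to the 0.02 and 1.01 used to generate the data.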
Deprecation
Calibration is now based on the absolute temperature rather than the centered temperature. The reason for this is that it makes it easier to compare the calibr params across devices. Centering the temperature changes the offset params, so even the same device would have different offsets depending on the mean temperature. In van Hees paper temperature was centered for interpretation purposes. In the future, setting --meanTemp will be deprecated. Currently, passing --meanTemp will be ignored and force calibration. When processing UKB, we recommend always running calibration.
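A small numeric sketch of the point about centering: the same hypothetical device, recorded at two different mean temperatures, gets different fitted offsets when temperature is centered but the same offset when it is not. The device coefficients and temperatures are made-up numbers, and the one-dimensional linear model is an assumption for illustration.

```python
def line_fit(x, y):
    """Ordinary least squares fit of y ~ intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def true_offset(temp):
    # Hypothetical device: offset drifts linearly with temperature.
    return 0.010 + 0.002 * temp


def fit_offsets(temps):
    """Return (offset with centered temperature, offset with absolute temperature)."""
    offsets = [true_offset(t) for t in temps]
    mean_t = sum(temps) / len(temps)
    centered, _ = line_fit([t - mean_t for t in temps], offsets)
    uncentered, _ = line_fit(temps, offsets)
    return centered, uncentered


cool = fit_offsets([18.0, 20.0, 22.0])  # recording with mean temperature 20
warm = fit_offsets([28.0, 30.0, 32.0])  # same device, mean temperature 30

print("centered:  ", round(cool[0], 4), round(warm[0], 4))
print("uncentered:", round(cool[1], 4), round(warm[1], 4))
```

The centered offsets come out at about 0.05 and 0.07 for the two recordings, while the uncentered offset is about 0.01 in both, which is what makes it comparable across recordings and devices.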
v4.1.0
Two new entry points: accWriteCmds and accCollateSummary
Usage:
accWriteCmds accDir/ -d resultsDir/ -f processCmds.txt
accCollateSummary resultsDir -o all-summary.csv
Major changes in output directory structure
Major change in how outputs are structured: When processing multiple files, results are now grouped by subject instead of by result type. So specifying a separate folder for each result (e.g. --timeSeriesFolder timeSeries/, --summaryFolder summary/) is now deprecated. Instead, simply specify --outputFolder and all result files will be stored there.
Example:
accDir/
    group0/
        subject001.cwa
        subject002.cwa
        ...
    group1/
        subject003.cwa
        subject004.cwa
        ...
Then accWriteCmds accDir/ -d outDir/ -f processCmds.txt ; bash processCmds.txt will result in the following output structure:
outDir/
    group0/
        subject001/
            subject001-summary.json
            subject001-timeSeries.csv
            ...
        subject002/
            subject002-summary.json
            subject002-timeSeries.csv
            ...
        ...
    group1/
        subject003/
            subject003-summary.json
            subject003-timeSeries.csv
            ...
        subject004/
            subject004-summary.json
            subject004-timeSeries.csv
            ...
        ...
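For scripts that need to locate results afterwards, the mapping from an input file to its per-subject outputs can be sketched as below. The helper expected_outputs is hypothetical and only mirrors the layout shown above; it covers just the two file names listed there.

```python
from pathlib import PurePosixPath


def expected_outputs(input_file, acc_dir, out_dir):
    """Map an input file to its per-subject output paths, mirroring the tree above."""
    rel = PurePosixPath(input_file).relative_to(acc_dir)  # e.g. group0/subject001.cwa
    subject = rel.stem                                    # e.g. "subject001"
    subject_dir = PurePosixPath(out_dir) / rel.parent / subject
    return {
        "summary": subject_dir / f"{subject}-summary.json",
        "timeSeries": subject_dir / f"{subject}-timeSeries.csv",
    }


paths = expected_outputs("accDir/group0/subject001.cwa", "accDir", "outDir")
print(paths["summary"])     # outDir/group0/subject001/subject001-summary.json
print(paths["timeSeries"])  # outDir/group0/subject001/subject001-timeSeries.csv
```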
Minor fixes
- better print
- rm nonWearFile intermediate file at exit
- filesCSV defaults to None (calibrate by default)