Skip to content
This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

Make sure pearson_correlation_scalar input is validated #753

Merged
merged 1 commit into from
Sep 21, 2018

Conversation

JanisGailis
Copy link
Member

Input validation in pearson_correlation_scalar had faulty logic

Input validation in pearson_correlation_scalar had faulty logic
@JanisGailis
Copy link
Member Author

Closes #746

The pearson_correlation_scalar is meant to work only on 1D variables (much like the underlying scipy function). I had a check for 3D variables, which was faulty. This PR resolves that, but the operation you were trying to carry out in #746 will then fail anyway, just with a nicer message.

@codecov-io
Copy link

codecov-io commented Sep 20, 2018

Codecov Report

Merging #753 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #753   +/-   ##
=======================================
  Coverage   76.94%   76.94%           
=======================================
  Files          81       81           
  Lines       12498    12498           
=======================================
  Hits         9616     9616           
  Misses       2882     2882
Impacted Files Coverage Δ
cate/ops/correlation.py 100% <100%> (+1.06%) ⬆️
cate/util/process.py 90.15% <0%> (-0.76%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9af8fd2...a044555. Read the comment docs.

@JanisGailis
Copy link
Member Author

@forman Merge when ready. If this is not merged in a day, I'll go ahead and do it myself.

Copy link
Member

@forman forman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that operation 1D only? And why specifically about time series?

Can't we just check if both datasets have same dims and all dims are equal. If so, flatten both nd-arrays and compute correlation.

@forman forman merged commit 6540e61 into master Sep 21, 2018
@forman
Copy link
Member

forman commented Sep 21, 2018

@JanisGailis merging this, but please answer my question once you have time!

@JanisGailis
Copy link
Member Author

@forman Using .values.flatten() on an ND xarray.DataArray, is a bad idea, as it will very likely run into MemoryError in many instances.

However, after some quick investigations, xr.DataArray.stack() could be used to achieve what you suggest in a memory safe way. This probably means that the datasets will have to be coregistered beforehand, for the operation to make any sense on ND inputs. This shouldn't take too long to implement. Should I go ahead?

@forman
Copy link
Member

forman commented Sep 24, 2018

I actually did not mean the numpy flatten(), but just flatten, turn ND into 1D. So, yes, fine!

@JanisGailis JanisGailis deleted the jg-746-pearsonr branch September 27, 2018 07:42
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants