This repository has been archived by the owner on Mar 18, 2024. It is now read-only.

WIP: ENH: add quality prediction to Carbon Flux #23

Open
wants to merge 1 commit into
base: main
from

Conversation

stsievert
Contributor

This adds a prediction from all global sites. This is still a work in progress: I need to make this a better predictor.

@stsievert
Contributor Author

I need a baseline to start off with. To get one, I fit a linear regressor on each site to predict the last year at that same site. I used the same model as the baseline for the global prediction, but there it predicts a different (unseen) site rather than the latest year.
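
A minimal sketch of the two evaluation schemes described above, not the code in this PR. It assumes a single input feature, synthetic stand-in data, and hypothetical helper names (`fit_line`, `last_year_split`, `held_out_site_split`, `evaluate`); the actual features and regressor are not shown in this conversation.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (single feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, xs):
    """Apply a fitted (slope, intercept) pair to new feature values."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def last_year_split(rows):
    """Scheme 1: train on a site's earlier years, test on its last year."""
    last = max(year for year, _, _ in rows)
    train = [r for r in rows if r[0] < last]
    test = [r for r in rows if r[0] == last]
    return train, test


def held_out_site_split(data, held_out):
    """Scheme 2: train on every other site, test on the unseen site."""
    train = []
    for site, rows in data.items():
        if site != held_out:
            train.extend(rows)
    return train, data[held_out]


def evaluate(train, test):
    """Fit on the train rows; return (predictions, targets) for the test rows."""
    model = fit_line([x for _, x, _ in train], [y for _, _, y in train])
    preds = predict(model, [x for _, x, _ in test])
    return preds, [y for _, _, y in test]


# Synthetic stand-in data (an assumption): {site: [(year, feature, flux), ...]}
rng = random.Random(0)
data = {}
for site in ["site_a", "site_b", "site_c"]:
    site_slope = rng.uniform(0.5, 1.5)  # each site gets its own relationship
    rows = []
    for year in (2014, 2015, 2016):
        for _ in range(20):
            x = rng.uniform(0, 10)
            rows.append((year, x, site_slope * x + rng.gauss(0, 0.1)))
    data[site] = rows

# One trial per site under each scheme.
same_site = {s: evaluate(*last_year_split(rows)) for s, rows in data.items()}
unseen_site = {s: evaluate(*held_out_site_split(data, s)) for s in data}
```

Each scheme yields one `(predictions, targets)` pair per site, which can then be scored with whatever metric is of interest.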

Apparently, the correlation coefficient is the metric people care about. I ran these simulations 179 times (once for each station). Here is the distribution of correlation coefficients from these trials:

[Image: screen shot 2018-10-01 at 9 23 19 am]

We can see that the linear model at one site clearly outperforms the global linear model, in both the median and the mean correlation coefficient.

The summary statistics are

| Statistic | One site predicting last year at same site | All sites predicting one held out site |
| --- | --- | --- |
| Median | 0.573 | 0.441 |
| Mean | 0.497 | 0.398 |
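
For reference, a rough sketch of how per-station correlation coefficients could be computed and summarized into a median and a mean like the ones above. It assumes the metric is the Pearson correlation between predictions and targets; the helper names (`pearson_r`, `summarize`) and the toy trials are hypothetical and are not the data behind the reported numbers.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(preds, targets):
    """Pearson correlation coefficient between predictions and targets."""
    p_bar, t_bar = mean(preds), mean(targets)
    cov = sum((p - p_bar) * (t - t_bar) for p, t in zip(preds, targets))
    var_p = sum((p - p_bar) ** 2 for p in preds)
    var_t = sum((t - t_bar) ** 2 for t in targets)
    return cov / sqrt(var_p * var_t)


def summarize(trials):
    """Score each trial; return the scores plus their median and mean."""
    scores = [pearson_r(preds, targets) for preds, targets in trials]
    return scores, median(scores), mean(scores)


# Made-up (predictions, targets) pairs, one per station.
trials = [
    ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),  # perfectly correlated
    ([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0]),  # perfectly anti-correlated
    ([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]),  # partially correlated
]
scores, med, avg = summarize(trials)
print(f"median r = {med:.3f}, mean r = {avg:.3f}")
```

A handful of stations with poor or negative correlation can pull the mean well below the median, which is one possible reading of why the mean is lower than the median in both columns.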

@jbednar
Collaborator

jbednar commented Oct 1, 2018

Sounds promising. Still WIP?

@stsievert
Contributor Author

Still a WIP. So far, I know the baseline performance (what's considered "good") and have a metric to improve upon.
