Random Feature-based CES #194

odunbar · 2022-10-18T23:57:30Z

Purpose

Adds the ability to use (scalar- and vector-valued) RF with uncertainty in place of GP within CES, using RandomFeatures.jl.

Closes #164

Content

Interfaces with the currently registered RandomFeatures.jl
adds ScalarRandomFeatureInterface as a MachineLearningTool
adds VectorRandomFeatureInterface as a MachineLearningTool
new example examples/Emulator/RandomFeature/optimize_and_plot_RF.jl
new example examples/Emulator/RandomFeature/vector_optimize_and_plot_RF.jl
new example examples/Lorenz/calibrate.jl
new example examples/Lorenz/emulate_sample.jl

The current implementation has Scalar RF replacing (exactly) the GP, whereas Vector RF does no SVD and therefore learns the output-space correlations. The hyperparameter learning is more involved, so to reduce some cost I learn the Cholesky factors of an input covariance and an output covariance of the feature distribution, currently described by a matrix-variate normal distribution.
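As a rough illustration of the Cholesky parameterization described above, here is a minimal sketch in plain Python (not the package's Julia API; every function name below is hypothetical). It assumes the feature distribution is a zero-mean matrix-variate normal over an input-dim by output-dim matrix, and it uses the standard fact that X = L_in Z L_out^T, with Z i.i.d. standard normal, has among-row covariance L_in L_in^T and among-column covariance L_out L_out^T. How CES actually constrains or scales these factors is not shown here.

```python
import random

def lower_from_params(params, dim):
    """Fill a dim x dim lower-triangular matrix, row by row, from a flat list."""
    L = [[0.0] * dim for _ in range(dim)]
    k = 0
    for i in range(dim):
        for j in range(i + 1):
            L[i][j] = params[k]
            k += 1
    return L

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def sample_matrix_normal(L_in, L_out, rng):
    """Zero-mean draw X = L_in * Z * L_out^T, with Z i.i.d. standard normal.

    The among-row covariance is L_in * L_in^T (input side) and the
    among-column covariance is L_out * L_out^T (output side).
    """
    d, p = len(L_in), len(L_out)
    Z = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(d)]
    return matmul(matmul(L_in, Z), transpose(L_out))

rng = random.Random(0)
L_in = lower_from_params([1.0, 0.5, 2.0], 2)    # 2 inputs  -> 3 learnable entries
L_out = lower_from_params([1.0, 0.0, 1.0], 2)   # 2 outputs -> 3 learnable entries
X = sample_matrix_normal(L_in, L_out, rng)      # one 2 x 2 draw
U = matmul(L_in, transpose(L_in))               # implied input covariance
```

Learning the entries of `L_in` and `L_out` directly means the implied covariances are symmetric positive semi-definite by construction, which is one common reason to optimize over factors rather than over the covariance matrices themselves.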

new example examples/GCM/emulate_sample_script.jl though currently just the emulation!

In this example we have 5 options (note that in all cases we train on Cholesky factors for the input variables):

GPR trains an output_dim-length vector of scalar GPRs.
Scalar RFR SVD replaces the vector of scalar GPRs with a vector of scalar RFRs.
Vector RFR SVD Diagonal assumes a diagonalized output in the vector problem (i.e. still in the setting of a system of scalar RFs and GPs, but only one object is trained).
Vector RFR SVD nondiagonal still applies the SVD, but does not assume that the resulting output must be diagonal. It therefore learns Cholesky factors of the output.
Vector RFR nondiagonal does not apply the SVD, nor does it assume the output is diagonal. It learns the Cholesky factors of the direct output.
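
To get a feel for what the diagonal versus nondiagonal output choice costs, the sketch below counts the free entries of an output Cholesky factor for the output dimensions that appear in this thread (2, 12 and 96). It is plain Python for illustration only, and `n_factor_entries` is a hypothetical helper, not part of CES or RandomFeatures.jl. It counts only factor entries, not any other hyperparameters, and with SVD truncation the effective output dimension may be smaller than these values.

```python
def n_factor_entries(dim, diagonal):
    """Free entries in a dim x dim Cholesky factor.

    A diagonal factor has dim entries; a full lower-triangular
    factor has dim * (dim + 1) / 2 entries.
    """
    return dim if diagonal else dim * (dim + 1) // 2

for p in (2, 12, 96):
    print(p, n_factor_entries(p, diagonal=True), n_factor_entries(p, diagonal=False))
```

The quadratic growth of the nondiagonal count is the trade-off behind the diagonal options: fewer entries to learn, at the price of not representing output correlations.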

Emulating an R^2 to R^2 function (150 data points)

SVD + Scalar GP (diag in) results
SVD + Scalar RF (nondiag in) results
SVD + vector RF (diag out) results
SVD + vector RF (nondiag out) results
vector RF (diag out) results
vector RF (nondiag out) results

Emulating GCM data R^2 -> R^96, evaluated at a test point

SVD + Scalar RF results

SVD + Vector RF (restrict to diagonal) results [hparam learnt with 202 features]

SVD + vector RF results (full non-diagonal [hparam learnt with 608 features])

No-SVD, with vector RF results (full non-diagonal) + standardize each data-type by median

SVD + GP results

Full CES test (with "E" emulating an R^2 -> R^12 forward map) 250 data points

SVD + Scalar GP (diag in) results
SVD + Scalar RF (diag in) results
SVD + Scalar RF (nondiag in) results
SVD + vector RF (nondiag in, diag out) results
SVD + vector RF (nondiag in, nondiag out) results
vector RF (nondiag in, diag out) results
vector RF (nondiag in, nondiag out) results

codecov · 2022-11-11T00:16:09Z

Codecov Report

Patch coverage: 5.82% and project coverage change: -38.88 ⚠️

Comparison is base (045ee4a) 88.65% compared to head (1ddc959) 49.78%.

❗ Current head 1ddc959 differs from pull request most recent head c8a993c. Consider uploading reports for the commit c8a993c to get more accurate results

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #194       +/-   ##
===========================================
- Coverage   88.65%   49.78%   -38.88%     
===========================================
  Files           4        6        +2     
  Lines         388      697      +309     
===========================================
+ Hits          344      347        +3     
- Misses         44      350      +306

Impacted Files	Coverage Δ
src/ScalarRandomFeature.jl	`0.00% <0.00%> (ø)`
src/VectorRandomFeature.jl	`0.00% <0.00%> (ø)`
src/Emulator.jl	`86.88% <64.28%> (-6.76%)`	⬇️
src/MarkovChainMonteCarlo.jl	`79.85% <100.00%> (ø)`

... and 1 file with indirect coverage changes


odunbar · 2023-05-03T17:48:41Z

Note: Docs build locally, just not with the online setup - which does not dev the local repo.

odunbar · 2023-05-05T02:40:21Z

Unfortunately, I am currently the only one developing the repo, so I am merging without review.
Some reassurance: (1) this is a largely standalone feature, (2) it has comprehensive testing, (3) it has some comprehensive examples, and (4) the API and constructors are documented. A new issue #215 has been opened to schedule the future work.

Use of RF seamlessly in this codebase will still need a little work, but at this point the tool is in a good-to-go state with some reasonable defaults. I'm looking forward to future manageable PRs that will develop on this!

odunbar · 2023-05-05T03:03:33Z

bors r+

194: [WIP] Random Feature-based CES r=odunbar a=odunbar  ## Purpose  Adds the ability to use (Scalar and vector-valued) RF with uncertainty in place of GP within CES. using `RandomFeatures.jl` Closes #164 ## Content  - Interfaces with the currently registered RandomFeatures.jl - adds `ScalarRandomFeatureInterface` as a `MachineLearningTool` - adds `VectorRandomFeatureInterface` as a `MachineLearningTool` - new example `examples/Emulator/RandomFeature/optimize_and_plot_RF.jl` - new example `examples/Emulator/RandomFeature/vector_optimize_and_plot_RF.jl` - new example `examples/Lorenz/calibrate.jl` - new example `examples/Lorenz/emulate_sample.jl` The current implementation has Scalar RF replacing (exactly) the GP, whereas Vector RF does no SVD, and therefore learns the output space correlations. The hyperparameter learning is more involved, so to reduce some cost I learn the cholesky factors of an input and output covariance of the feature distribution, currently described by a MatrixVariate Normal distribution. - new example `examples/GCM/emulate_sample_script.jl` though *currently just the emulation!* In this example we have 4 options: (Note in all cases we train on cholesky factors for the input variables) 1. `GPR` trains an `output_dim`-length vector of scalar GPRs, 2. `Scalar RFR SVD` replaces the vector of scalar GPRs, with a vector of scalar RFRs, 3. `Vector RFR SVD Diagonal` assumes a diagonalized output in the vector problem (i.e. still in the setting of a system of Scalar RFs & GPs but only train one object) 4. `Vector RFR SVD nondiagonal` still applies the SVD, but does not assume that the resulting output must be diagonal. It therefore learns cholesky factors of the output 4. `Vector RFR nondiagonal` does not apply SVD, nor assumes the output is diagonal. It learns the cholesky factors of the direct output. 
### Emulating an R^2 to R^2 function (150 data points) 1) SVD + Scalar GP (diag in) results <img src="https://user-images.githubusercontent.com/47412152/192404099-be8d1241-2dd4-4263-ba2a-31de94763abb.png" width="300"> <img src="https://user-images.githubusercontent.com/47412152/192404100-62d72ccc-2b36-4ba9-ad36-38bcdf4b9f0f.png" width="300"> 2) SVD + Scalar RF (nondiag in) results <img src="https://user-images.githubusercontent.com/47412152/230235711-6bb0557e-8914-4a43-8f91-f5a144659edc.png" width="300"> <img src="https://user-images.githubusercontent.com/47412152/230235715-54fb7d5e-fa24-4528-a3fc-27ff7e9aceb8.png" width="300"> 3) SVD + vector RF (diag out) results <img src="https://user-images.githubusercontent.com/47412152/230229962-c7eefa25-3a57-467c-8ca1-c9ef7b3dbb3e.png" width="300"> <img src="https://user-images.githubusercontent.com/47412152/230229964-fcfcbe7c-73cd-4837-8db7-5eaa339eaec6.png" width="300"> 4) SVD + vector RF (nondiag out) results <img src="https://user-images.githubusercontent.com/47412152/230230124-bb50e4db-8ba7-4570-930e-b6504936a1b5.png" width="300"> <img src="https://user-images.githubusercontent.com/47412152/230230128-2422e85a-1100-4422-a52c-8fd32183b7f0.png" width="300"> 5) vector RF (diag out) results <img src="https://user-images.githubusercontent.com/47412152/230230033-618dcfa8-99a7-4462-b31f-e9adf302dc14.png" width="300"> <img src="https://user-images.githubusercontent.com/47412152/230230040-7dad6ce8-b5dc-4057-9693-a71e0a72379c.png" width="300"> 6) vector RF (nondiag out) results <img src="https://user-images.githubusercontent.com/47412152/230230167-25377c56-c622-493d-bd0e-b98fc672189a.png" width="300"> <img src="https://user-images.githubusercontent.com/47412152/230230169-23c659eb-4218-4b27-9625-94a7fd44a941.png" width="300"> ### Emulating GCM data R^2 -> R^96, evaluated at a test point #### SVD + Scalar RF results <img src="https://user-images.githubusercontent.com/47412152/219200986-9a5f74e4-5e2a-48cf-8e26-5d66de2e751c.png" 
width="150"> <img src="https://user-images.githubusercontent.com/47412152/219200996-f5b88c6e-8b51-4df0-acac-187a6a786a78.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/219201000-afac8c7a-aa9a-4400-93e8-75f560954bba.png" width="150"> #### SVD + Vector RF (restrict to diagonal) results [hparam learnt with 202 features]) <img src="https://user-images.githubusercontent.com/47412152/219200373-7b0e2713-c3db-4891-9012-6852381266b5.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/219252222-f75ba2c8-b11a-42eb-aa1c-f2308a775041.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/219200387-d328a6b6-5f64-4eda-bc7e-3a122054c2a5.png" width="150"> #### SVD + vector RF results (full non-diagonal [hparam learnt with 608 features]) <img src="https://user-images.githubusercontent.com/47412152/220444681-20b3ef41-5347-4406-afd6-ffa8cfc1e1b8.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/220444689-92f24b34-6937-4e19-b733-2d1b263ca9f7.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/220444685-70f88334-3c32-438e-8534-e4ca0a1c24d2.png" width="150"> #### No-SVD, with vector RF results (full non-diagonal) + standardize each data-type by median <img src="https://user-images.githubusercontent.com/47412152/235567696-4c5665b1-33db-4c83-a554-f003e5e015b6.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/235567703-0fae36f9-e49b-4cfc-b1b4-16a313af51b8.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/235567706-1d9513f7-ec37-4fad-b824-a3c62b7759d0.png" width="150"> #### SVD + GP results <img src="https://user-images.githubusercontent.com/47412152/219201341-13acb758-a444-4e05-98c9-ed6975dbd094.png" width="150"> <img src="https://user-images.githubusercontent.com/47412152/219201347-d5c13d9f-3d63-456b-8059-fea0f31346a0.png" width="150"> <img 
src="https://user-images.githubusercontent.com/47412152/219201351-fa2a4a61-bf75-4292-8549-8f304f05cebe.png" width="150"> ### Full CES test (with "E" emulating an R^2 -> R^12 forward map) 250 data points 1) SVD + Scalar GP (diag in) results 2) SVD + Scalar RF (diag in) results 3) SVD + Scalar RF (nondiag in) results 4) SVD + vector RF (nondiag in, diag out) results 5) SVD + vector RF (nondiag in, nondiag out) results 6) vector RF (nondiag in, diag out) results 7) vector RF (nondiag in, nondiag out) results <img src="https://user-images.githubusercontent.com/47412152/236000320-bbf88ee3-6de7-4e8e-8797-13f48696337a.png" width="175"> <img src="https://user-images.githubusercontent.com/47412152/236364924-c41dd024-e56e-4506-ad19-2fc324d0db61.png" width="175"> <img src="https://user-images.githubusercontent.com/47412152/236364926-f74d9467-7083-4fb9-8c89-b6ea4ab8aad3.png" width="175"> <img src="https://user-images.githubusercontent.com/47412152/236364930-a0424381-ccc5-472c-b85f-e24aec7319f0.png" width="175"> <img src="https://user-images.githubusercontent.com/47412152/236364932-3cb2e6c6-0f5b-4544-9916-cddfcbd3c882.png" width="175"> <img src="https://user-images.githubusercontent.com/47412152/236364927-b05e1bc8-0430-4db8-97dc-29b0fce0f305.png" width="175"> <img src="https://user-images.githubusercontent.com/47412152/236364929-5e78ceb3-07e9-4230-95b1-2e638dab1ce3.png" width="175">  Co-authored-by: odunbar <odunbar@caltech.edu>

bors · 2023-05-05T03:12:11Z

Build failed:

docs-build

odunbar · 2023-05-05T17:03:11Z

bors r+


bors · 2023-05-05T17:13:26Z

Build failed:

docs-build

odunbar · 2023-05-05T20:11:19Z

bors r+


bors · 2023-05-05T20:21:10Z

Build failed:

docs-build

odunbar · 2023-05-05T21:08:51Z

bors r+


bors · 2023-05-05T21:28:27Z

Build failed:

test-macos

update project toml examples for RF shuffle data update example example to produce comparable figs updates for compatability with CES 0.2.0 and RF 0.1.0 vector random feature support added fixes to ensure CES pipeline runs regularization and lorenz example format allows training with fewer features than data VRFI with SVD, and cholesky options feature num dep on n GCM example replace data multithreading and rng add ProgressBars remove high-level threading for now (takes place within LinAlg solvers) sbatch script truth at some points increased number optimization features default bugfix reg matrix argument initial tik-reg for EKI working TEKI 0 default eki, small tweaks add logdet complexity more consistent adding of definiteness chol/svd add logdet to scalar learning shape bug logdetI unite common functions in Random Feature, expand Scalar feature learning extend reverse svd for covs add diag terms to MatrixNormal description, default to diagonal regularizations rather than pos-def add diagonal option trimmed, and added const hp for diag cov compat with svd truncation, and more standard posdef corrections added scaling to complexity data change scalar interface lorenz 2d statsplot combine all MLT examples into this improved interfacing, unification and initial unit testing condensed into emulate_sample simplify scalar interface bug improved vector interface reg should be multiplicative! 
fixed small edits update ess.jl MSE on next ensemble, add input-diag case inflation optimizer defaults and cov representation inflation vec inflation utility for ensembles test pass with new defaults and cov structure format format add RF tests GP test fails resolved scalar_optimize_and_plot_RF.jl with new RF accel removed some abstract types, compatible with RandomFeatures 0.2.5 format typo another typo compatible with v0.3 RandomFeatures dispatching over RandomFeatures v0.3.1 multthread options typo more flexible priors compataility for SRF and RF v0.3.1 updates to GCM example scripts docstrings docstring API and format rm duplicate API docs format API docs work locally rename test pass format better messages, bugfix priors for diagonalized options emulation test scenarios for scalar and vector RF multithread supp for lorenz example added cov samples user option, added opt-option for threading in prediction test format Lorenz example config verbose flag tests pass format test for tullio threading format comment api docs until github docs bug fixed format try adding manifest test/RandomFeature/runtests.jl standardized standardization in emulation

odunbar · 2023-05-05T22:03:21Z

bors r+

bors · 2023-05-05T22:38:10Z

Build succeeded!

odunbar force-pushed the orad/RF branch from f533dda to c151909 Compare February 28, 2023 22:51

odunbar force-pushed the orad/RF branch 2 times, most recently from 3d70d68 to a6e0978 Compare April 6, 2023 18:41

odunbar mentioned this pull request Apr 24, 2023

Random Feature development #164

Closed

6 tasks

odunbar force-pushed the orad/RF branch 3 times, most recently from d4f7fca to e35cdd2 Compare May 3, 2023 17:22

odunbar mentioned this pull request May 4, 2023

O3.1.1 Random Feature Development: Phase 2 #215

Open

odunbar force-pushed the orad/RF branch from 1231733 to defb3ee Compare May 5, 2023 02:36

odunbar force-pushed the orad/RF branch from f7941e3 to 5a33edf Compare May 5, 2023 17:03

odunbar force-pushed the orad/RF branch from 3bcb131 to 385d972 Compare May 5, 2023 20:10

odunbar force-pushed the orad/RF branch from 53df0fe to 82c21c0 Compare May 5, 2023 21:08

odunbar changed the title ~~[WIP] Random Feature-based CES~~ Random Feature-based CES May 5, 2023

odunbar force-pushed the orad/RF branch from 82c21c0 to a4f6599 Compare May 5, 2023 21:41

odunbar force-pushed the orad/RF branch from 18526f9 to c8a993c Compare May 5, 2023 21:50

bors bot merged commit 0766a3d into main May 5, 2023

bors bot deleted the orad/RF branch May 5, 2023 22:38
