Merge pull request #38 from rffscghg/readme-and-tests

Add README and simple tests for higher level functions
rffscghg · Jul 30, 2023 · 5b77419
2 parents 9cfcad1 + f20f1da
commit 5b77419
Showing 4 changed files with 239 additions and 11 deletions.
diff --git a/README.md b/README.md
@@ -13,9 +13,34 @@ You probably also want to install the Mimi package into your Julia environment,
 pkg> add Mimi
 ```
 
-## Running the Model
+## The Model and the API
 
-The model uses the Mimi framework and it is highly recommended to read the Mimi documentation first to understand the code structure. This model presents two components, which will most often be used in tandem. The basic way to access the MimiRFFSPs components, both `RFF-SPs` and `RegionAggregatorSum` and explore the results is the following:
+The model uses the Mimi framework and it is highly recommended to read the Mimi documentation first to understand the code structure. This model presents two components, which will most often be used in tandem. 
+
+The sections below show the basic way to access the MimiRFFSPs components (`RFF-SPs` and `RegionAggregatorSum`), run a model or a Monte Carlo Simulation, and explore the results.
+
+### Running the Model
+
+To obtain the default version of the model, you may use the `get_model` function:
+
+```julia
+using Mimi, MimiRFFSPs # Mimi provides `explore` and `getdataframe`
+
+# Build the default model
+m = MimiRFFSPs.get_model()
+
+# Run the model
+run(m)
+
+# Explore interactive plots of all the model output.
+explore(m)
+
+# Access a specific variable
+gdp = m[:rffsp, :gdp]
+gdp_df = getdataframe(m, :rffsp, :gdp)
+```
+
+For some insight into the inner workings of the `get_model` function, the following code uses the required `Mimi` functions to build the model and returns the same results. Note that here we name the `SPs` component `:rffsp` for clarity; without that Symbol in the `add_comp!` call, the component defaults to the name `:SPs`.
 
 ```julia
 using Mimi 
@@ -28,13 +53,12 @@ m = Model()
 set_dimension!(m, :time, 1750:2300)
 
 # Add the Sps component as imported from `MimiRFFSPs`
-
-add_comp!(m, MimiRFFSPs.SPs, first = 2020, last = 2300)
+add_comp!(m, MimiRFFSPs.SPs, :rffsp, first = 2020, last = 2300) # note we name the component :rffsp here
 
 # Set country dimension and related parameter: As of now this must be exactly the 184 countries in the following file, but we will add flexibility for this in the future.
 all_countries = load(joinpath(@__DIR__, "data", "keys", "MimiRFFSPs_ISO3.csv")) |> DataFrame
 set_dimension!(m, :country, all_countries.ISO3)
-update_param!(m, :SPs, :country_names, all_countries.ISO3) # should match the dimension
+update_param!(m, :rffsp, :country_names, all_countries.ISO3) # should match the dimension
 
 # Run the model
 run(m)
@@ -43,13 +67,52 @@ run(m)
 explore(m)
 
 # Access a specific variable
-emissions = m[:SPs, :gdp]
+gdp = m[:rffsp, :gdp]
+gdp_df = getdataframe(m, :rffsp, :gdp)
 ```
-The `id` parameter in this component (default id of 6546) which allows one to run the model with a specified parameter set of ID `id` within the data. By nature of these projections, this component should be run using a Monte Carlo Simulation sampling over the IDs in order to obtain a representative distribution of plausible outcomes, but providing an ID may be useful for debugging purposes. See the section below for more information.
+Importantly, also note that the `rffsp` component has optional arguments `start_year` (default = 2020) and `end_year` (default = 2300) that can be altered to values within the 2020 - 2300 timeframe. Timesteps must be annual.
+
+### The `id` parameter and Monte Carlo Simulations
+
+The `id` parameter in this component (default 6546) allows one to run the model with the specific parameter set of ID `id` within the data. By nature of these projections, this component should be run using a Monte Carlo Simulation sampling over the IDs in order to obtain a representative distribution of plausible outcomes, but providing an ID may be useful for debugging purposes.
+
+The `get_mcs` function defines a simple Monte Carlo simulation. It returns a `Mimi.SimulationDef` object, which can be run and explored using the [`run` function](https://www.mimiframework.org/Mimi.jl/stable/howto/howto_3/#.-The-[run](@ref)-function-1) and, more generally, the [Mimi API for Monte Carlo Simulations](https://www.mimiframework.org/Mimi.jl/stable/howto/howto_3/). By default, the mcs assigns to the sample ID `id` a random variable taking on values of 1 through 10,000 with equal probability. Alternatively, the user may provide a predetermined vector of `sampling_ids` holding the Integer values of the samples they would like run, in order.
+
+```julia
+using Mimi
+using MimiRFFSPs
 
-Also note that the `SPs` component has optional arguments `start_year` (default = 2020) and `end_year` (default = 2300) that can be altered to values within the 2020 - 2300 timeframe. Timesteps must be annual as of now, but we will add flexibility to this in the future.
+# get the SimulationDef
+mcs = MimiRFFSPs.get_mcs()
+
+# build the model `m` that the Monte Carlo Simulation will run on
+m = MimiRFFSPs.get_model()
+
+# Add some data to save
+Mimi.add_save!(mcs, (:rffsp, :id))
+Mimi.add_save!(mcs, (:rffsp, :co2_emissions))
+
+# run the mcs
+results = run(mcs, m, 10)
+
+# Explore the resulting distributions of co2 emissions and ID
+explore(results)
+
+# Get tabular data on outputs
+ids = getdataframe(results, :rffsp, :id)
+co2_emissions = getdataframe(results, :rffsp, :co2_emissions)
+
+# Alternatively, run the Monte Carlo Simulation on model `m` for sample ids 1, 2, and 3.
+# Note that the `num_trials` provided (3) must be less than or equal to the
+# length of the provided vector of IDs
+mcs = MimiRFFSPs.get_mcs([1,2,3])
+Mimi.add_save!(mcs, (:rffsp, :id))
+Mimi.add_save!(mcs, (:rffsp, :co2_emissions))
+results = run(mcs, m, 3)
+
+```
 
----
+### Aggregating by Region
 
 If a user wants to connect the `m[:SPs, :population]` output variable to another Mimi component that requires population at a more aggregated regional level, the `RegionAggregatorSum` component can be helpful. This helper component aggregates countries to regions with a provided mapping via the `sum` function (other functions can be added as desired, this is a relatively new and nimble component). You will need to provide a mapping between the input regions (countries here) and output regions (regions here) in a Vector of the length of the input regions and each element being one of the output regions. Note that this component is not yet optimized for performance.
 
@@ -72,7 +135,7 @@ update_param!(m, :PopulationAggregator, :output_region_names, outputregions)
 update_param!(m, :PopulationAggregator, :input_output_mapping, mapping.Output_Region) # Vector with length of input regions, each element matching an output region in the output_region_names parameter (and outputregions dimension)
 
 # Make SPs component `:population` variable the feed into the `:input` variable of the `PopulationAggregator` component
-connect_param!(m, :PopulationAggregator, :input, :SPs, :population)
+connect_param!(m, :PopulationAggregator, :input, :rffsp, :population)
 
 run(m)
 

diff --git a/src/MimiRFFSPs.jl b/src/MimiRFFSPs.jl
@@ -47,7 +47,7 @@ function get_mcs(sampling_ids::Union{Vector{Int}, Nothing} = nothing)
  mcs = @defsim begin
  end
 
- distrib = isnothing(sampling_ids) ? EmpiricalDistribution(collect(1:10_000)) : SampleStore(sampling_ids)
+ distrib = isnothing(sampling_ids) ? Mimi.EmpiricalDistribution(collect(1:10_000)) : Mimi.SampleStore(sampling_ids)
  Mimi.add_RV!(mcs, :socio_id_rv, distrib)
  Mimi.add_transform!(mcs, :rffsp, :id, :(=), :socio_id_rv)
 

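Review note: the `distrib` line above chooses between two sampling behaviors. The sketch below illustrates that choice in plain Julia without depending on Mimi. `draw_ids` is a hypothetical helper that exists in neither Mimi nor MimiRFFSPs, and throwing an error when more trials are requested than ids were supplied is an assumption based on the README comment, not confirmed Mimi behavior.

```julia
using Random

# Hypothetical helper for illustration only; not part of Mimi or MimiRFFSPs.
function draw_ids(sampling_ids::Union{Vector{Int}, Nothing}, num_trials::Int;
                  rng::AbstractRNG = Random.default_rng())
    if isnothing(sampling_ids)
        # No ids supplied: each trial draws uniformly from 1:10_000
        return rand(rng, 1:10_000, num_trials)
    end
    # ids supplied: replay them in order (assumed: cannot run more trials than ids)
    num_trials <= length(sampling_ids) ||
        throw(ArgumentError("num_trials exceeds the number of supplied ids"))
    return sampling_ids[1:num_trials]
end

random_ids = draw_ids(nothing, 10)    # ten draws, each in 1:10_000
fixed_ids  = draw_ids([1, 2, 3], 3)   # the supplied ids, in order
```

With a fixed vector the run is reproducible trial by trial, which is what the last assertion in `test/test_API.jl` relies on.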
diff --git a/test/runtests.jl b/test/runtests.jl
@@ -12,3 +12,7 @@ end
 @testset "Coupled" begin
  include("test_Coupled.jl")
 end
+
+@testset "API" begin
+ include("test_API.jl")
+end
diff --git a/test/test_API.jl b/test/test_API.jl
@@ -0,0 +1,161 @@
+using MimiRFFSPs
+
+# test higher level functionality
+
+##
+## 1. get_model 
+##
+
+# basic function
+m = MimiRFFSPs.get_model()
+run(m)
+
+# validation
+tolerance = 1e-9
+id = Int(500) # choose any random value between 1 and 10_000
+
+m = MimiRFFSPs.get_model()
+update_param!(m, :rffsp, :id, id)
+run(m)
+
+# check emissions
+ch4 = load(joinpath(datadep"rffsps_v5", "emissions", "rffsp_ch4_emissions.csv")) |> 
+ DataFrame |> @filter(_.year in collect(2020:2300)) |> @filter(_.sample == id) |> DataFrame
+n2o = load(joinpath(datadep"rffsps_v5", "emissions", "rffsp_n2o_emissions.csv")) |> 
+ DataFrame |> @filter(_.year in collect(2020:2300)) |> @filter(_.sample == id) |> DataFrame
+co2 = load(joinpath(datadep"rffsps_v5", "emissions", "rffsp_co2_emissions.csv")) |> 
+ DataFrame |> @filter(_.year in collect(2020:2300)) |> @filter(_.sample == id) |> DataFrame
+
+@test m[:rffsp, :co2_emissions][findfirst(i -> i == 2020, collect(1750:2300)):end] ≈ co2.value atol = tolerance
+@test m[:rffsp, :ch4_emissions][findfirst(i -> i == 2020, collect(1750:2300)):end] ≈ ch4.value atol = tolerance
+@test m[:rffsp, :n2o_emissions][findfirst(i -> i == 2020, collect(1750:2300)):end] ≈ n2o.value atol = tolerance
+
+# check socioeconomics
+
+t = Arrow.Table(joinpath(datadep"rffsps_v5", "pop_income", "rffsp_pop_income_run_$id.feather"))
+socio_df = DataFrame( :Year => copy(t.Year), 
+ :Country => copy(t.Country), 
+ :Pop => copy(t.Pop), 
+ :GDP => copy(t.GDP)
+ ) |>
+ @filter(_.Year in collect(2020:5:2300)) |>
+ DataFrame
+
+for country in all_countries.ISO3
+
+ pop_data_model = getdataframe(m, :rffsp, :population) |>
+ @filter(_.time in collect(2020:5:2300) && _.country == country) |>
+ DataFrame |>
+ @orderby(:time) |>
+ DataFrame
+
+ gdp_data_model = getdataframe(m, :rffsp, :gdp) |>
+ @filter(_.time in collect(2020:5:2300) && _.country == country) |>
+ DataFrame |>
+ @orderby(:time) |>
+ DataFrame
+
+ socio_df_country = socio_df |>
+ @filter(_.Year in collect(2020:5:2300) && _.Country == country) |>
+ DataFrame |>
+ @orderby(:Year) |>
+ DataFrame
+
+ @test pop_data_model.population ≈ socio_df_country.Pop ./ 1e3 atol = tolerance
+ @test gdp_data_model.gdp ≈ socio_df_country.GDP ./ 1e3 .* MimiRFFSPs.pricelevel_2011_to_2005 atol = tolerance
+end
+
+socio_gdf = groupby(socio_df, :Year)
+
+model_population_global = getdataframe(m, :rffsp, :population_global) |> @filter(_.time in collect(2020:5:2300)) |> @orderby(:time) |> DataFrame
+model_gdp_global = getdataframe(m, :rffsp, :gdp_global) |> @filter(_.time in collect(2020:5:2300)) |> @orderby(:time)|> DataFrame
+
+@test model_population_global.population_global ≈ (combine(socio_gdf, :Pop => sum).Pop_sum ./ 1e3) atol = tolerance
+@test model_gdp_global.gdp_global ≈ (combine(socio_gdf, :GDP => sum).GDP_sum ./ 1e3 .* MimiRFFSPs.pricelevel_2011_to_2005) atol = 1e-7 # slightly higher tolerance
+
+# check death rate
+pop_trajectory_key = (load(joinpath(datadep"rffsps_v5", "sample_numbers", "sampled_pop_trajectory_numbers.csv")) |> DataFrame).x
+deathrate_trajectory_id = convert(Int64, pop_trajectory_key[id])
+
+# Load Feather File
+original_years = collect(2023:5:2300)
+t = Arrow.Table(joinpath(datadep"rffsps_v5", "death_rates", "rffsp_death_rates_run_$(deathrate_trajectory_id).feather"))
+deathrate_df = DataFrame(:Year => copy(t.Year), 
+ :Country => copy(t.ISO3), 
+ :DeathRate => copy(t.DeathRate)
+ ) |>
+ @filter(_.Year in original_years) |>
+ DataFrame
+
+for country in all_countries.ISO3
+
+ deathrate_data_model = getdataframe(m, :rffsp, :deathrate) |>
+ @filter(_.time in original_years && _.country == country) |>
+ DataFrame |>
+ @orderby(:time) |>
+ DataFrame
+
+ deathrate_df_country = deathrate_df |> 
+ @filter(_.Year in original_years && _.Country == country) |>
+ DataFrame |>
+ @orderby(:Year) |>
+ DataFrame
+
+ @test deathrate_data_model.deathrate ≈ deathrate_df_country.DeathRate atol = tolerance
+end
+
+# check pop 1990
+
+population1990 = load(joinpath(@__DIR__, "..", "data", "population1990.csv")) |> 
+ DataFrame |>
+ @orderby(_.ISO3) |>
+ DataFrame
+
+@test m[:rffsp, :population1990] ≈ population1990.Population atol = tolerance
+
+# check gdp 1990
+
+ypc1990 = load(joinpath(datadep"rffsps_v5", "ypc1990", "rffsp_ypc1990.csv")) |> 
+ DataFrame |> 
+ i -> insertcols!(i, :sample => 1:10_000) |> 
+ i -> stack(i, Not(:sample)) |> 
+ DataFrame |> 
+ @filter(_.sample == id) |>
+ DataFrame |>
+ @orderby(_.variable) |>
+ DataFrame
+
+gdp1990_model = getdataframe(m, :rffsp, :gdp1990) # billions
+pop1990_model = getdataframe(m, :rffsp, :population1990) # millions
+ypc1990_model = gdp1990_model.gdp1990 ./ pop1990_model.population1990 .* 1e3 # per capita
+
+@test ypc1990_model ≈ ypc1990.value .* MimiRFFSPs.pricelevel_2011_to_2005 atol = tolerance
+
+# 2. get_mcs
+
+# get the SimulationDef
+mcs = MimiRFFSPs.get_mcs()
+
+# build the model `m` that the Monte Carlo Simulation will run on
+m = MimiRFFSPs.get_model()
+
+# Add some data to save
+Mimi.add_save!(mcs, (:rffsp, :id))
+Mimi.add_save!(mcs, (:rffsp, :co2_emissions))
+
+# run the mcs
+results = run(mcs, m, 10)
+
+# examine outputs
+ids = getdataframe(results, :rffsp, :id).id
+for id in ids
+ @test id in collect(1:10_000)
+end
+
+# Alternatively run the Monte Carlo Simulation on model `m` for sample ids 1,2, and 3
+mcs = MimiRFFSPs.get_mcs([1,2,3])
+Mimi.add_save!(mcs, (:rffsp, :id))
+Mimi.add_save!(mcs, (:rffsp, :co2_emissions))
+results = run(mcs, m, 3)
+
+@test getdataframe(results, :rffsp, :id).id == [1,2,3]
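Review note: the emissions checks in `test/test_API.jl` slice model output starting at the index of 2020 within the 1750:2300 time dimension. The short sketch below works through that index arithmetic in plain Julia. `model_output` is a stand-in vector, not real model data, and the variable names are illustrative assumptions.

```julia
model_years = 1750:2300   # model time dimension used in the tests
data_years  = 2020:2300   # years kept from the emissions files by the filters

# Index of the first data year within the model's time dimension
first_idx = findfirst(==(first(data_years)), model_years)

# Stand-in for a model variable indexed by time (one value per model year)
model_output = Float64.(model_years)

# The slice compared against the data starts at 2020 and runs to the end
tail = model_output[first_idx:end]
```

Here `first_idx` equals `2020 - 1750 + 1`, so the slice has one entry per year in 2020:2300 and can be compared elementwise against the filtered data column.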