Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to MAAT model wrappers #2137

Merged
merged 23 commits into from
Jan 28, 2019

Conversation

serbinsh
Copy link
Member

@serbinsh serbinsh commented Oct 4, 2018

Description

Updated models/maat/R/model2netcdf.MAAT.R to match new time variable standards in PEcAn standard output. Updates to scripts in /inst to provide new met drivers from NGEE Tropics met source files. A WIP given the inconsistencies in NGEE source data

Motivation and Context

Match time variable requirements in PEcAN standard output

Review Time Estimate

  • [X ] Immediately
  • Within one week
  • When possible

Types of changes

  • [X ] Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • [X ] I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Shawn P. Serbin added 2 commits October 3, 2018 16:15
…le standards in PEcAn standard output. Updates to scripts in /inst to provide new met drivers from NGEE Tropics met source files. A WIP given the inconsistencies in NGEE source data
@serbinsh serbinsh self-assigned this Oct 4, 2018
@serbinsh serbinsh requested a review from mdietze October 4, 2018 21:21
ashiklom and others added 7 commits November 2, 2018 17:29
Co-Authored-By: serbinsh <sserbin@bnl.gov>
Co-Authored-By: serbinsh <sserbin@bnl.gov>
Co-Authored-By: serbinsh <sserbin@bnl.gov>
Co-Authored-By: serbinsh <sserbin@bnl.gov>
Co-Authored-By: serbinsh <sserbin@bnl.gov>
Co-Authored-By: serbinsh <sserbin@bnl.gov>
leaf_user_met_list <- list(leaf = list(env = list(time = "'Time'", temp = "'Tair_degC'",
par = "'PAR_umols_m2_s'",vpd="'VPD_kPa'",
atm_press="'Atm_press_Pa'")))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is heavily redundant, for no good reason. I would suggest:

output_met_driver <- data.frame(
  Time = met_driver_subset$Time,
  Year = met_years, 
  DOY = lubridate::yday(met_driver_subset$Time),
  Hour = strftime(met_driver_subset$Time,"%H:%M:%S", tz="UTC"),
  Tair_degC = met_driver_subset$Tair_degC,
  Prec_mm = met_driver_subset$Prec_mm,
  Atm_press_Pa = met_driver_subset$Press_Pa,
  RH_perc = met_driver_subset$RH_perc,
  VPD_kPa = met_driver_subset$VPD_kPa,
  PAR_umols_m2_s = met_driver_subset$PAR_umols_m2_s
)

leaf_user_met_list <- list(leaf = list(
  env = list(
    time = "'Time'",
    temp = "'Tair_degC'",
    par = "'PAR_umols_m2_s'",
    vpd = "'VPD_kPa'",
    atm_press = "'Atm_press_Pa'"
)))

if (wind) {
  output_met_driver[["Windspeed_m_s"]] <- met_driver_subset$Windspeed_m_s
  leaf_user_met_list[["leaf"]][["env"]][["wind"]] <- "'Windspeed_m_s'"
}

@@ -62,7 +67,7 @@ model2netcdf.MAAT <- function(rundir, outdir, sitelat = -999, sitelon = -999, st
PEcAn.logger::logger.info(paste0("Skipping conversion for: ", newname))
dat.new <- dat
} else {
dat.new <- try(PEcAn.utils::misc.convert(dat,oldunits,newunits),silent = TRUE)
dat.new <- apply(as.matrix(dat,length(dat),1),1,f_sort)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is misc.convert already vectorized, and robust to infinite/NA values? If so, why not just do this?

dat.new <- PEcAn.utils::misc.convert(dat, oldunits, newunits)
dat.new[!is.finite(dat.new)] <- -999

Then, you also don't need the f_sort function above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not robust to inf/NA which is why I had worked around that

@@ -141,7 +153,7 @@ model2netcdf.MAAT <- function(rundir, outdir, sitelat = -999, sitelon = -999, st
out.year <- as.numeric(rep(year, sub.maat.output.dims[1]))
output <- var_update(out.year, output, "Year", "Year", oldunits='YYYY', newunits=NULL, missval=-999,
longname="Simulation Year", ncdims=ncdims)
output <- var_update(sub.maat.doy + day.steps, output, "FracJulianDay", "FracJulianDay", oldunits='Frac DOY', newunits=NULL, missval=-999,
output <- var_update(tvals, output, "FracJulianDay", "FracJulianDay", oldunits = "Frac DOY", newunits = NULL, missval = -999,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're defining var_update inside this function anyway (which I think is bad style, but no need to change), you might as well give it missval and ncdims as defaults to save you all the typing.

longname = "history time interval endpoints",
dim = list(time_interval, time = t),
prec = "double")
output$dat[[length(output$dat) + 1]] <- c(rbind(bounds[, 1], bounds[, 2]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what having the output object formatted as output$var$<somevariable> and output$dat$<somevariable> gains you, but it causes you a lot of headache and awkward code that could easily by avoided by inverting the structure, such that you have output$<somevariable>$dat/var. Then, you could assign it with a single list call, and loop over it more simply later:

output <- list(
  year = var_update(out.year, ...),
  leaf_assim = var_update(sub.maat.output$A, ...),
  leaf_resp = var_update(sub.maat.output$rd, ...),
  ...
)
# ...
for (out_var in output) {
  ncvar_put(ncout, out_var$avr, out_var$dat)
}

@ashiklom
Copy link
Member

ashiklom commented Nov 6, 2018

@serbinsh I accidentally pushed my refactor up here, but I think it is an improvement on the current code. You should test it and see if it still works (I would, but I don't have any MAAT output to work with), and if you're happy with it, this PR can move forward.

@serbinsh
Copy link
Member Author

serbinsh commented Nov 9, 2018

@ashiklom OK so you you already pushed the suggested changes you show here? I can test and let you know. I suppose I will need to pull this down locally and merge first

@ashiklom
Copy link
Member

ashiklom commented Nov 9, 2018

Yeah, sorry, that was a mistake. You can pull down the changes locally and test. Or just revert my commit -- we can keep it in the history and un-revert it later.

dim = list(time_interval, time = t),
prec = "double"),
dat = c(rbind(bounds[, 1], bounds[, 2]))
))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found the issue! The list call here needs to be wrapped in another list to get this to work as expected.

    output <- c(output, list(list(
      var = ncdf4::ncvar_def(name = "time_bounds", units = "", 
                             longname = "history time interval endpoints", 
                             dim = list(time_interval, time = t), 
                             prec = "double"),
      dat = c(rbind(bounds[, 1], bounds[, 2]))
    )))

Basically, because of the way c.list works, this is currently adding two elements to the output list (var and dat), when it should be adding a single element which is itself a list containing var and dat. By wrapping the list in another list call, you are forcing it to be a single element which is a list containing two objects inside of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still seems like something is not quite right here:

    ## write netCDF data
    nc_vars <- purrr::map(output, "var") # Extract var from each output
    ncout <- ncdf4::nc_create(file.path(outdir, paste(year, "nc", sep = ".")), nc_vars)
    on.exit(ncdf4::nc_close(ncout))
    ncdf4::ncatt_put(ncout, "time", "bounds", "time_bounds", prec=NA)
    for (output_var in output) {
      #print(i)  # for debugging
      ncdf4::ncvar_put(ncout, output_var$var, output_var$dat)
    }

should that be nc_vars in output? Also, getting a dimension error writing to the nc file, perhaps the set dims are wrong

  ncvar_put: error: you asked to write 17520 values, but the passed data array only has 1 entries!
>     for (output_var in output) {
+       print(output_var$var$name)  # for debugging
+       ncdf4::ncvar_put(ncout, output_var$var, output_var$dat)
+     }
[1] "Year"
Error in ncdf4::ncvar_put(ncout, output_var$var, output_var$dat) :
  ncvar_put: error: you asked to write 17520 values, but the passed data array only has 1 entries!
>

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

output[[1]]$var$ndims
[1] 3

$ndims
[1] 3

$varsize
[1] 1 1 17520

hmmm......

Copy link
Member

@ashiklom ashiklom Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, the design here is that output is a list of objects, where each object is a list containing var (the NetCDF variable) and dat (the underlying data values).

purrr::map(output, "var") is convenient shorthand for lapply(output, function(x) x[["var"]]) -- i.e. extracting the thing called var from each object in output. It seems like that's working, since the variable's dimension is correctly identified.

The problem seems to be with the dat part of output.

Copy link
Member

@ashiklom ashiklom Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, another guess at the problem: The year ("Year") variable is a constant -- just one value -- but its variable dimensions are set to the number of timesteps. So the solution is to either:

(1) Cycle the year variable across the dimensions of the NetCDF file (year <- rep(year, ncdims[[X]]) where X is whatever the correct dimension is -- I think 3?), or

(2) Set the dimensions of the "Year" variable to 1 (or [1,1,1]).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just kidding. I think maybe the _real_problem is that the data being passed into process_met_tbl is a length 1 list. In my tibble::tribble calls, try getting rid of the list(...) surrounding each data element -- turns out code like this does exactly what I want without wrapping complex elements in list (because x is automatically determined to be a list column):

a <- 1:10
b <- 2:15
tibble::tribble(
  ~x, ~y,
  a, 5,
  b, 27
)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashiklom I am getting lost in this changes at this point because I havent had uninterrupted attention on this, Can we please put this PR to bed this week so I can move on? Perhaps we can just figure out what else I need to do to make this work on close it, we can work on further edits later.

@mdietze mdietze merged commit d6baacb into PecanProject:develop Jan 28, 2019
serbinsh pushed a commit to serbinsh/pecan that referenced this pull request Jan 28, 2019
serbinsh pushed a commit to serbinsh/pecan that referenced this pull request Jan 28, 2019
mdietze added a commit that referenced this pull request Jan 29, 2019
Reverting back from PR #2137 to fix issues with MAAT wrappers.
@robkooper robkooper mentioned this pull request Sep 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants