aqp::harmonize - method for profile level denormalization (of Lo-RV-Hi or similar) #164

brownag · 2020-09-03T19:47:52Z

soilDB issue 140 raises the issue of comparison of SPC data from different sources as well as among "similar" attributes within a single data source:

ncss-tech/soilDB#140

If columns reflecting a similar property within a layer have different names (e.g. socQ05, socQ50, socQ95; clay, clayQ50, clay_spline, clayQ50_spline) it is inconvenient for them to be shown on a common scale or analyzed in long form within a single vector. In general this is by design but for plots/sketches in particular it becomes an issue.

A generic method would easily encapsulate what I have seen @dylanbeaudette do before, and what I show in the Gist (https://gist.github.com/brownag/d69b899253eef505e1771e7adbef37ad). In my gist, rather than a range, I compare different data sources/splines of the RV by unioning separate SPCs that share a common [calculated] attribute finalclay. I don't think doing it manually is particularly difficult in the simple case, but it would become cumbersome when dealing with multiple properties or many data sources.

The three-statistic representation of variation (a common scale for a property across Low, RV and High) is important for showing "variation" in a concept. I think what Dylan described in part 3 of his comment on ncss-tech/soilDB#140 is a matter of dropping/renaming specific columns from the horizon data (those that match a set of patterns), appending to the profile IDs of each set to make them unique, then union-ing the result [or returning a list of SPCs]. This sequence of operations isn't "fetching" of the data, nor is it specific to the SoilGrids data model: it is a view, and it can be implemented generically in terms of aqp methods (profile_id<-, union). The method would apply to any product [or stack of products] where multiple values are reported for a single property*layer and stored in a single "parent" SPC.
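As a rough illustration of the drop/rename, re-ID, then stack sequence described above, here is a minimal sketch in base R. It is not the aqp::harmonize implementation: the horizon table, the column names (clay_l, clay_r, clay_h), the "source" column and the ID suffix scheme are all made up for the example, and plain data.frames stand in for SPCs.

```r
# Hypothetical horizon data: one property (clay) stored as three statistics
h <- data.frame(
  id     = c("p1", "p1", "p2"),
  top    = c(0, 10, 0),
  bottom = c(10, 30, 25),
  clay_l = c(10, 18, 5),
  clay_r = c(15, 24, 8),
  clay_h = c(20, 30, 12),
  stringsAsFactors = FALSE
)

# Assumed mapping: label used in the new profile IDs -> source column
src <- c(low = "clay_l", rv = "clay_r", high = "clay_h")

pieces <- lapply(names(src), function(s) {
  # keep the ID and depth columns plus one statistic
  d <- h[, c("id", "top", "bottom", src[[s]])]
  # rename the statistic to the shared attribute name
  names(d)[4] <- "clay"
  # make profile IDs unique per statistic, and record where the value came from
  d$id <- paste(d$id, s, sep = "_")
  d$source <- s
  d
})

# stack the pieces: one "clay" column on a common scale, in long form
long <- do.call(rbind, pieces)

# Optional, requires aqp: promote the result to a SoilProfileCollection
# aqp::depths(long) <- id ~ top + bottom
```

Each original profile becomes three profiles (one per statistic) that share a single clay column, which is what allows them to be sketched side by side on one color scale.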

I'll post a prototype of this soon.


dylanbeaudette · 2020-09-04T21:58:45Z

Thanks. This is a far more general solution than what I had put together in my private soilgrids-related functions.

The following does not work as expected. Did I miss something?

library(aqp)
library(soilDB)

x <- fetchSDA(WHERE = "cokey = '19623334'", duplicates = TRUE)
z <- harmonize(x, x.names = list(clay = c(low = 'claytotal_l', rv = 'claytotal_r', high = 'claytotal_h')))

brownag · 2020-09-05T00:52:24Z

I appreciate you testing this out with a realistic example, as it points out a far more problematic issue with my recent work. I admittedly got so caught up "generalizing" that I didn't try it with fetchSDA...

Incidentally, the above bug reveals a rather hefty inconsistency and assumption on my part.

aqp:::.data.frame.j was not properly designed for re-arranging the column names [tangential fix here: e59258a]. Presumably in some cases this could result in corrupt show output? I don't think that I ever noticed that.

In harmonize, there is an explicit order of the "preserved" columns that is broken by the fetchSDA result. The first column in the fetchSDA horizon data is chkey, with cokey much further along [in the middle of the table]. This got some things twisted where I was assuming the columns were rearranged correctly [idname, top depth, bottom depth, harmonized columns, keep columns, hzidname], as they would be with data.frame [,j].
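To make the expected ordering concrete, here is a small sketch that selects columns by name so the result does not depend on where the ID column sits in the input. The helper reorder_columns and the toy table are hypothetical and are not aqp internals; the sketch uses base data.frame indexing only, and a data.table would need its own column-selection idiom.

```r
# Hypothetical horizon table with the profile ID (cokey) mid-table,
# mimicking the column layout described for the fetchSDA result
h <- data.frame(
  chkey    = c(11, 12),
  hzdept_r = c(0, 10),
  cokey    = c("A", "A"),
  hzdepb_r = c(10, 30),
  clay     = c(15, 24),
  hzname   = c("A", "Bt"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: put columns in the order
# idname, top depth, bottom depth, harmonized, keep, hzidname
reorder_columns <- function(d, idname, depthcols, harmonized, keep, hzidname) {
  target <- unique(c(idname, depthcols, harmonized, keep, hzidname))
  absent <- setdiff(target, names(d))
  if (length(absent) > 0) {
    stop("columns not found: ", paste(absent, collapse = ", "))
  }
  # select by name, never by position, so input order is irrelevant
  d[, target, drop = FALSE]
}

out <- reorder_columns(
  h,
  idname     = "cokey",
  depthcols  = c("hzdept_r", "hzdepb_r"),
  harmonized = "clay",
  keep       = "hzname",
  hzidname   = "chkey"
)
names(out)
```

Selecting by name and failing loudly on missing columns is one way to avoid silently assuming that the input already follows the expected layout.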

Had I fully developed the tests to "work" these parts in situations requiring re-arranging, I probably would have come across this... but that wouldn't have been until some time next week at the earliest.

dat <- data.frame(a=1, b=2)
dat[,c("b","a")]
#  b a
#1 2 1

The fix is here: c519a19; now works as expected:

library(aqp)
library(soilDB)

x <- fetchSDA(WHERE = "cokey = '19623334'", duplicates = TRUE)
z <- harmonize(x, x.names = list(clay = c(low = 'claytotal_l', rv = 'claytotal_r', high = 'claytotal_h')))

plot(z, color = "clay", plot.order = c(2,3,1))

dylanbeaudette · 2020-09-30T05:38:04Z

Seems like we are done here, yes?

brownag mentioned this issue Sep 3, 2020

Point comparisons of NASIS/KSSL v.s. SoilGrids using field/standard depth aggregation and mpspline'd data ncss-tech/soilDB#140

Closed

brownag changed the title ~~method for profile level denormalization (of Lo-RV-Hi or similar)~~ aqp::harmonize - method for profile level denormalization (of Lo-RV-Hi or similar) Sep 4, 2020

brownag added a commit that referenced this issue Sep 4, 2020

add harmonize + docs + test #164

818fb62

brownag mentioned this issue Sep 4, 2020

add aqp::harmonize #165

Merged

brownag added a commit that referenced this issue Sep 4, 2020

wordsmithing docs #164

2fd2ce6

brownag added a commit that referenced this issue Sep 5, 2020

fix for data.table-safe reordering of columns #164

c519a19

brownag added a commit that referenced this issue Sep 5, 2020

test: harmonize is immune to column name order #164

6b08626

dylanbeaudette closed this as completed Sep 30, 2020
