aggregate datasets into useful structure before returning #31

katrinleinweber · 2018-01-25T21:55:53Z

noticed while working on #16

retrieve_data() currently appends multiple downloads into one continuous list in which the individual datasets can no longer be addressed. We need a data structure that lets the user $-address the datasets and their fields. Ideally, each dataset is referred to by index = bacdive_id. Something like a sparse list-of-lists?!?

ideas:

~~aggregate JSON strings in character vector, then rjson::fromJSON() them "in-place" or somehow that creates the nested lists "below / as lower hierarchies" of that vector~~
~~write-out each dataset to a file (kind of a local cache), then maybe concatenate files & re-import as a useful data structure~~
~~use jsonlite to create 1 dataframe per bacdive_ID, then add those to a list~~
~~keep on c()ombining downloads, but~~ aggregate into a higher-level list and use an apply variant to extract a field/element from the resulting "megastructure"
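The last idea (a higher-level list, with each dataset addressable by its ID) could look roughly like the following base-R sketch. It is only an illustration, not the code that ended up in the package: `wrap_by_id()` is a hypothetical helper, and the two toy datasets and their `species` values are invented stand-ins for parsed downloads.

```r
# Hypothetical stand-ins for two parsed downloads; real field names and
# values come from the BacDive API and are not reproduced here.
download_a <- list(taxonomy_name = list(species = "species A"))
download_b <- list(taxonomy_name = list(species = "species B"))

# Hypothetical helper: wrap one parsed dataset in a list of length 1
# whose single element is named by the dataset's ID.
wrap_by_id <- function(dataset, id) {
  stats::setNames(list(dataset), as.character(id))
}

# c() on the wrapped datasets keeps each one addressable, whereas c() on
# the raw datasets would merge their top-level fields into one flat list.
aggregated <- c(wrap_by_id(download_a, 1095), wrap_by_id(download_b, 1847))

names(aggregated)
aggregated$`1095`$taxonomy_name$species
aggregated[["1847"]]$taxonomy_name$species
```

Using the ID as a character name rather than a numeric position avoids the "sparse" problem: assigning to position 1847 of an empty list would pad it with NULL entries. The backticks are needed with `$` because the names start with a digit.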

katrinleinweber · 2018-03-12T15:12:06Z

jsonlite::fromJSON(…, flatten = TRUE) and simplifyDataFrame = TRUE both still return a list of nested lists with DFs as "leaves". Still need to work out how to extract a field/element (say, culture_growth_condition$culture_temp$temp) from a combination of these lists-of-lists :-/
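One possible way to pull the same nested field out of every dataset is to walk the path of names with a small helper and apply it across the outer list. This is a sketch under assumptions, not the fix that was eventually committed: `pluck_path()` is a hypothetical helper, and the toy data (including the temperature value) is invented to mirror the field path mentioned above.

```r
# Hypothetical data: two datasets keyed by ID; the second lacks the field.
datasets <- list(
  `1095` = list(
    culture_growth_condition = list(culture_temp = list(temp = "30"))
  ),
  `1847` = list(
    culture_growth_condition = list()
  )
)

# Hypothetical helper: follow a character vector of names down one nested
# list and return NA as soon as a level is missing or is not a list.
pluck_path <- function(x, path) {
  for (key in path) {
    if (!is.list(x) || is.null(x[[key]])) {
      return(NA_character_)
    }
    x <- x[[key]]
  }
  x
}

# lapply() keeps the ID names and tolerates leaves of any length;
# vapply() would be stricter if every leaf is known to be a single string.
temps <- lapply(
  datasets,
  pluck_path,
  path = c("culture_growth_condition", "culture_temp", "temp")
)

temps
```

Because data frames are also lists in R, the same helper descends into a DF "leaf" and returns the requested column if the last name in the path is a column name.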

katrinleinweber · 2018-03-15T10:11:43Z

@sckott: Hello, and thanks for your advice! I got over this data structure problem :-)

katrinleinweber · 2018-04-18T14:49:30Z

For comparison with the above screenshot: between

a) data above / Bac_hal_data in this example, and
c) the lists (taxonomy_name, morphology_physiology, …, environment_sampli…, etc.) within the datasets, there is now
b) a list-of-lists for each dataset, named by its numeric BacDive ID (1095 & 1847)

katrinleinweber added bug help wanted labels Jan 25, 2018

katrinleinweber self-assigned this Jan 25, 2018

katrinleinweber mentioned this issue Mar 7, 2018

use more accessible datatype than list to aggregate data #28

Closed

katrinleinweber mentioned this issue Mar 20, 2018

Remove invalid \n in JSON #43

Closed

katrinleinweber closed this as completed in 949a675 Mar 20, 2018

katrinleinweber mentioned this issue Aug 29, 2018

Brainstorming topic ideas for follow-up events / study group TIBHannover/2018-07-09-FAIR-Data-and-Software#13

Closed

katrinleinweber mentioned this issue Nov 30, 2018

Converting retrieve_data() results to a data frame (tibble) #100

Open
