Error while converting Hmsc model object to JSON: Error in rcpp_to_json(x, unbox, digits, numeric_dates, factors_as_string, : negative length vectors are not allowed #190

Open
elgabbas opened this issue May 16, 2024 · 3 comments

Comments

@elgabbas

elgabbas commented May 16, 2024

I am preparing data for HMSC-HPC. The model implements GPP at the European scale (52K sampling units) for 142 species and 9 covariates:

M_Init
# Hmsc object with 52729 sampling units, 142 species, 9 covariates, 1 traits and 1 random levels
# No posterior samples

I can start sampling without any problem:

Model <- Hmsc::sampleMcmc(hM = M_Init, samples = M_samples, thin = M_thin, 
transient = M_transient, nChains = nChains,  verbose = verbose, engine = "HPC")

However, I receive the following error when I convert the model object into JSON format.

Model <- jsonify::to_json(Model)
# Error in rcpp_to_json(x, unbox, digits, numeric_dates, factors_as_string,  : 
#  negative length vectors are not allowed

I have used the same approach for a subset of the data (a smaller study area and fewer species) without a problem. This error could be due to the large size of the object [but please see my next comment below].

pryr::object_size(Model)
# 1.06 GB

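library(magrittr)  # provides %>% and divide_by()
# Size of each top-level element of the model object, in MB
Model %>% sapply(pryr::object_size) %>% divide_by(1024*1024) %>% sort() %>% round(2)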
#    initPar     samples   transient        thin     nChains 
#      0.00        0.00        0.00        0.00        0.00 
#   verbose   nParallel   useSocket     adaptNf   alignPost 
#       0.00        0.00        0.00        0.00        0.00 
#   Rupdater initParList          X1          hM dataParList 
#       0.00        3.29        6.84      130.59      879.97 

pryr::object_size(Model$dataParList[[1]][[1]]$distMat12)
# 887.53 MB

Is there a solution for this?
Would Hmsc-HPC work if I used a function other than jsonify::to_json to convert the model object to JSON?

Thanks

@elgabbas
Author

elgabbas commented May 16, 2024

Update:

I have multiple model variants for the same locations and species. The conversion to JSON worked for some of them and failed for others. The variants differ in the knots used (their locations and the distances between them) and in the #samples/thin/transient values.

I think there should be no problem with file size or #samples/thin/transient combinations. I can export similar models that use knot distances of 20 and 40 km, but distances of 30, 50, and 60 km failed.

It is unclear why the conversion failed only for particular sets of GPP knot locations. Please note that I ensured that the locations of the GPP knots do not exactly overlap with the locations of sampling units: if by chance any knot exactly overlaps with a sampling unit, I add a small spatial noise (up to 100 m) to it. See this issue.
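The knot-jittering step described above can be sketched roughly as follows. This is not the code used for the models in this issue; `jitter_overlapping_knots` is a hypothetical helper, and it assumes that knots and sampling units are two-column coordinate matrices in metres (a projected CRS).

```r
# Hypothetical helper (not part of Hmsc): shift knots that coincide exactly
# with a sampling unit by a random offset of at most `max_shift` metres.
# Assumes `knots` and `units` are two-column numeric matrices (x, y) in metres.
jitter_overlapping_knots <- function(knots, units, max_shift = 100) {
  key <- function(m) paste(m[, 1], m[, 2])
  overlap <- key(knots) %in% key(units)   # exact coordinate matches only
  n <- sum(overlap)
  if (n > 0) {
    angle <- runif(n, 0, 2 * pi)          # random direction
    shift <- runif(n, 1, max_shift)       # random distance, at least 1 m
    knots[overlap, 1] <- knots[overlap, 1] + shift * cos(angle)
    knots[overlap, 2] <- knots[overlap, 2] + shift * sin(angle)
  }
  knots
}

# Toy example: the first knot sits exactly on a sampling unit, the second does not
set.seed(1)
units_xy <- cbind(c(0, 1000), c(0, 1000))
knots_xy <- cbind(c(0, 500), c(0, 500))
knots_new <- jitter_overlapping_knots(knots_xy, units_xy)
```

Only exactly coinciding knots are moved; all other knots are returned unchanged.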

I can share an example model object if this would help.

@gtikhonov
Member

gtikhonov commented May 20, 2024

Your hypothesis that the size of the Hmsc model object being converted to JSON is the core source of the problem seems the most plausible one. We have observed somewhat similar issues with overflowing JSON ourselves. There is definitely no issue with #samples/thin/transient at the stage of the R->HPC export, since these values have no effect on the size of the exported object.
However, from your description of the "knot distance" effect, it is not clear to me whether the jsonify conversion was less stable with a smaller or with a larger number of knots. Could you please report how many knots you had in each of the variants you tried?
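To get a feeling for the size hypothesis discussed above, here is a rough back-of-the-envelope sketch. It counts only the values of a sampling-units-by-knots distance matrix (such as `distMat12` from the first comment) and multiplies by an assumed number of JSON characters per value. The figure of 18 characters per value is purely a guess, not a measured property of jsonify output, and the knot counts are the ones reported later in this thread. The estimate is compared with 2^31 - 1, which is the maximum number of bytes R allows in a single character string.

```r
# Rough size estimate; a sketch under stated assumptions, not a diagnosis.
n_units <- 52729
n_knots <- c("20 km" = 15046, "30 km" = 7129, "40 km" = 4290,
             "50 km" = 2877, "60 km" = 2096)

# ASSUMPTION: average number of JSON characters per numeric value
# (digits, decimal point, separator). This is a guess.
chars_per_value <- 18

# Doubles are used throughout, so the products do not overflow R integers
n_values  <- n_units * n_knots
est_chars <- n_values * chars_per_value

# R character strings are limited to 2^31 - 1 bytes
string_limit <- .Machine$integer.max

summary_table <- data.frame(
  knots      = n_knots,
  values     = n_values,
  est_chars  = est_chars,
  over_limit = est_chars > string_limit
)
print(summary_table)
```

Whether a given variant crosses the limit depends strongly on the assumed characters per value, so the table should be read as an order-of-magnitude check only.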

@elgabbas
Author

elgabbas commented May 20, 2024

Thanks @gtikhonov for your reply,

Earlier, I tried the following distances. Distances of 20 and 40 km worked (15K and 4K knots), while distances 30, 50, and 60 km failed (7K, 2.8K, and 2K knots).

Distance   # Knots   Result
20 km      15,046    WORKED
30 km       7,129    FAILED
40 km       4,290    WORKED
50 km       2,877    FAILED
60 km       2,096    FAILED

It seems this issue is not directly related to the number of knots or the object size: the model with 20 km knots is 8.82 GB and worked, while smaller models failed (30 km: 4.08 GB; 60 km: 1.58 GB).

I uploaded the unfitted models to this link.

load("Model_unfitted.RData")

nrow(Model_20$rL$sample$sKnot) # 15046
nrow(Model_30$rL$sample$sKnot) # 7129
nrow(Model_40$rL$sample$sKnot) # 4290
nrow(Model_50$rL$sample$sKnot) # 2877
nrow(Model_60$rL$sample$sKnot) # 2096

The following worked:

pryr::object_size(Model_20)    # 665.08 MB
Model_20 <- Hmsc::sampleMcmc(
  hM = Model_20, samples = 2000, thin = 5, transient = 1500, nChains = 4, verbose = 1000, engine = "HPC")
pryr::object_size(Model_20)    # 8.82 GB
Model_20_JSON <- jsonify::to_json(Model_20)
pryr::object_size(Model_20_JSON)


pryr::object_size(Model_40)    # 664.72 MB
Model_40 <- Hmsc::sampleMcmc(
  hM = Model_40, samples = 2000, thin = 5, transient = 1500, nChains = 4, verbose = 1000, engine = "HPC")
pryr::object_size(Model_40)    # 2.62 GB
Model_40_JSON <- jsonify::to_json(Model_40)

The following failed:

Error in rcpp_to_json(x, unbox, digits, numeric_dates, factors_as_string, : negative length vectors are not allowed

pryr::object_size(Model_30)    # 664.82 MB
Model_30 <- Hmsc::sampleMcmc(
  hM = Model_30, samples = 2000, thin = 5, transient = 1500, nChains = 4, verbose = 1000, engine = "HPC")
pryr::object_size(Model_30)    # 4.08 GB
Model_30_JSON <- jsonify::to_json(Model_30)

pryr::object_size(Model_50)    # 664.67 MB
Model_50 <- Hmsc::sampleMcmc(
  hM = Model_50, samples = 2000, thin = 5, transient = 1500, nChains = 4, verbose = 1000, engine = "HPC")
pryr::object_size(Model_50)    # 1.95 GB
Model_50_JSON <- jsonify::to_json(Model_50)

pryr::object_size(Model_60)    # 664.64 MB
Model_60 <- Hmsc::sampleMcmc(
  hM = Model_60, samples = 2000, thin = 5, transient = 1500, nChains = 4, verbose = 1000, engine = "HPC")
pryr::object_size(Model_60)    # 1.58 GB
Model_60_JSON <- jsonify::to_json(Model_60)
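To record which variants convert and which fail without stopping at the first error, the conversion calls above could be wrapped in `tryCatch`. The sketch below is only an illustration: `try_convert` is a hypothetical helper, and the toy converter and toy objects stand in for `jsonify::to_json` and the real model objects.

```r
# Hypothetical helper: run a converter and capture success or the error message
try_convert <- function(obj, converter) {
  tryCatch(
    list(ok = TRUE, json = converter(obj), error = NA_character_),
    error = function(e) list(ok = FALSE, json = NULL, error = conditionMessage(e))
  )
}

# Stand-in converter for the demonstration: fails for "large" inputs
toy_converter <- function(x) {
  if (length(x) > 3) stop("negative length vectors are not allowed")
  paste0("[", paste(x, collapse = ","), "]")
}

toy_models <- list(small = 1:3, large = 1:10)
results <- lapply(toy_models, try_convert, converter = toy_converter)

# One logical per variant: TRUE if the conversion succeeded
status <- vapply(results, function(r) r$ok, logical(1))
print(status)
```

With the real objects, the same pattern would be `lapply(list(Model_20, Model_30, Model_40, Model_50, Model_60), try_convert, converter = jsonify::to_json)`, memory permitting.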
