Add options to handle larger dataset for location models #687
Currently, the `location_choice.py` used by `from activitysim.estimation.larch import component_model` cannot load the estimation data in a reasonable length of time. See issue #686.
Description of the problem
The main reason is that activitysim stores the data as a chooser-variable table by alternatives, “cv” (zones 1, 2, etc. as columns, and attributes as rows), whereas larch uses an idca table by attributes, “ca” (dist, accessibility, and other attributes as columns, and zones as rows). To work with larch, activitysim converts the table format using the `cv_to_ca` function. However, this function does not scale well to large tables, and we found that some modifications were required to process the data.
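To make the two layouts concrete, here is a minimal pandas sketch of a cv-to-ca reshape, with an optional chunked variant. It is not the actual `cv_to_ca` implementation in activitysim; the function names, the `chunk_size` parameter, the index level names (`chooser_id`, `variable`, `alt_id`) and the toy values are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def cv_to_ca_sketch(cv: pd.DataFrame) -> pd.DataFrame:
    """Reshape a "cv" table into a "ca" table (illustrative only).

    Assumes `cv` has a (chooser_id, variable) row MultiIndex and one
    column per alternative (zone). Returns a table indexed by
    (chooser_id, alt_id) with one column per attribute.
    """
    # Wide to long: one row per (chooser, variable, alternative).
    long = cv.reset_index().melt(
        id_vars=["chooser_id", "variable"],
        var_name="alt_id",
        value_name="value",
    )
    # Long to "ca": attributes become columns.
    return long.pivot(index=["chooser_id", "alt_id"], columns="variable", values="value")


def cv_to_ca_chunked(cv: pd.DataFrame, chunk_size=None) -> pd.DataFrame:
    """Convert a block of choosers at a time to limit peak memory.

    `chunk_size` is a hypothetical parameter; None converts everything at once.
    """
    if chunk_size is None:
        return cv_to_ca_sketch(cv)
    chooser_level = cv.index.get_level_values("chooser_id")
    chooser_ids = chooser_level.unique()
    pieces = []
    for start in range(0, len(chooser_ids), chunk_size):
        ids = chooser_ids[start:start + chunk_size]
        pieces.append(cv_to_ca_sketch(cv[chooser_level.isin(ids)]))
    return pd.concat(pieces)


# Toy "cv" table: two choosers, two attributes, two zones (made-up values).
cv = pd.DataFrame(
    {1: [2.5, 0.8, 1.0, 0.3], 2: [4.0, 0.6, 3.5, 0.9]},
    index=pd.MultiIndex.from_tuples(
        [(101, "dist"), (101, "accessibility"), (102, "dist"), (102, "accessibility")],
        names=["chooser_id", "variable"],
    ),
)

ca = cv_to_ca_sketch(cv)
ca_chunked = cv_to_ca_chunked(cv, chunk_size=1)
print(ca)
```

Converting one block of choosers at a time keeps the intermediate long table small, which is the general idea behind chunking a reshape like this; the appropriate block size depends on the number of zones and attributes.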
Changes proposed
Here is a summary of the changes we made to `location_choice.py` to ensure the workplace location model data can be processed faster:
`alt_values_to_feather=False` (which is the default) prevent errors. This however should not happen if location model specifications are properly written.
`chunking_size=None`, which falls back to the original method of processing the entire dataset.
Usage
In the estimation notebook for school or workplace location models, modify the way model and data are loaded with `activitysim.estimation.larch.component_model`: