
Project Meeting 2024.07.18

Michelle Bina edited this page Jul 18, 2024

Agenda

  • Admin
    • Revisit consortium meeting times to be more compatible with Australia time zones
  • Phase 9a wrap-up status

Action Items

  • Joe to send out a scheduling poll for a once-a-month meeting that is compatible with Australia time zones

Meeting Notes

  • Admin
    • Revisit consortium meeting times to be more compatible with Australia time zones
      • Joe to send out a scheduling poll for a once-a-month meeting that is compatible with Australia time zones
    • Meeting cadence: Canceling the next several Tuesday meetings so that consultant time can be focused on Phase 9b work. Tuesday meetings will resume on September 10.
  • Phase 9a wrap-up status
    • SANDAG model runs
      • Full Scale Performance: Multi-Process, Sharrow Off
        • The non-mandatory tour scheduling model was an outlier: its runtime grew as more cores were added, and the cause is not yet understood. This behavior was not seen in the sharrow-on version of the code.
        • Lowest run time is about 360 minutes.
        • Sijia ran similar tests with explicit chunking and had similar results.
      • Full Scale Performance: Multi-Process, Sharrow On
        • The main runtime savings come from the interaction-simulate models, particularly trip destination and location choice.
        • Simple simulate models are relatively unaffected by the increase in the number of cores.
        • Lowest run time is about 180 minutes, a 50% reduction in runtime with sharrow on.
        • The time required to apportion the data across cores grows with the number of cores, which produces an inflection point in total runtime.
      • Both the sharrow-on and sharrow-off versions of the ABM3 model reached their minimum runtime at 20 cores. This suggests the optimal number of cores may be independent of sharrow, and instead a function of the model and the machine.
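The inflection point discussed above (per-core gains competing with growing apportion/coalesce overhead) can be illustrated with a toy runtime model. All constants below are invented for illustration, chosen so the minimum lands near the observed 20 cores; they are not measured values from these benchmarks.

```python
# Toy model of runtime vs. core count: fixed serial work, perfectly
# divisible parallel work, plus apportion/coalesce overhead that grows
# linearly with the number of cores. Constants are hypothetical.

def runtime_minutes(cores, serial=60.0, parallel=2400.0, overhead_per_core=6.0):
    """Estimated total runtime in minutes for a given core count."""
    return serial + parallel / cores + overhead_per_core * cores

candidates = range(4, 33, 4)
best = min(candidates, key=runtime_minutes)
for n in candidates:
    print(f"{n:2d} cores: {runtime_minutes(n):6.1f} min")
print(f"minimum at {best} cores")  # minimum at 20 cores for these constants
```

With these made-up constants, runtime falls steeply from 4 to 12 cores, bottoms out at 20, and rises again as overhead dominates, which is the qualitative shape reported for both ABM3 runs.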
    • MTC model runs
      • Full Scale Performance: Sharrow Off
        • The same diminishing returns seen in the other profiling runs as more and more cores are added.
        • Interestingly, there was no inflection point for the MTC model with sharrow off -- more cores always meant lower runtime. This differs from the sharrow-on MTC runs and from both the sharrow-on and sharrow-off ABM3 benchmarking runs.
        • Lowest run time is about 200 minutes.
      • Full Scale Performance: Sharrow On
        • The 16- and 24-core runs are incomplete due to a multiprocess failure when accessing the sharrow cache.
        • Saw roughly linear decreases in runtime for computationally intensive models going from 4 to 12 cores, but after that the gains decreased.
        • 20 cores took longer than 12 cores. This is due to some models being slower (school escorting, school location, joint tour scheduling, etc.) and to the increased time spent apportioning and coalescing across all of the cores. However, the runtime difference was minimal.
        • The runtime in the final activitysim.log file is slightly longer than the total in the timing_log.csv file across all runs. The difference increases with the number of cores.
        • Lowest run time is about 140 minutes, a 30% reduction in runtime with sharrow on.
    • The priority is to wrap up this phase of work, even though some results remain unexplained (such as the non-mandatory tour scheduling model's increasingly longer run times with more cores, with sharrow off). Jeff to open issues for the still-unexplained results so they can be addressed later.
    • Documentation Updates
      • Adding section to the User's Guide
      • Suggestions for documentation to recommend:
        • Users should avoid string variables and implement categoricals from the beginning. This would remove the preprocessor step of converting strings to categoricals, and would avoid the risk that the preprocessor misses a possible combination -- a corner case not seen during compilation would trigger sharrow recompiling in a production run.
        • Users should specify larger sample sizes for the sharrow compiling step, so that it can best capture the universe of string variables to be converted to categoricals. This isn't foolproof -- there's still a chance that corner cases occur and trigger a recompile.
    • Recommendation for other agencies to test out the new code on their implementations.
      • SEMCOG has tried the new code. They had a couple of hiccups but it’s working now.
      • PSRC is testing out the new code and it's working. Run times are in the 50-60 minute range (they use 60-minute time buckets and do not run vehicle choice, so their specification contributes to the lower run times).
      • ARC hasn't tested the sharrow-on version -- but without sharrow and multiprocessing, the run time with a 100% sample is about 3.5 hours (on WSP's machines).
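The categoricals recommendation above can be sketched in pandas: declaring the full category universe up front means an unseen string can never widen the dtype at run time, which is the kind of corner case that would otherwise trigger a sharrow recompile. The column and category names here are hypothetical, not taken from any model.

```python
# Sketch of the "declare categoricals from the beginning" recommendation.
# Defining the complete set of categories explicitly, rather than inferring
# them from whatever values happen to appear in a sample, keeps the dtype
# stable across runs. Names below are hypothetical.
import pandas as pd

OCCUPANCY_CATEGORIES = ["DRIVEALONE", "SHARED2", "SHARED3"]  # hypothetical universe

df = pd.DataFrame({"occupancy": ["DRIVEALONE", "SHARED2", "DRIVEALONE"]})

# Declare the dtype with the full category universe, not just the values
# present in this (possibly small) sample of the data.
occ_dtype = pd.CategoricalDtype(categories=OCCUPANCY_CATEGORIES)
df["occupancy"] = df["occupancy"].astype(occ_dtype)

# "SHARED3" is part of the dtype even though it never appears in the sample.
print(df["occupancy"].cat.categories.tolist())
```

A value outside the declared universe becomes NaN under this dtype rather than silently extending the category set, so gaps in the declared universe surface immediately instead of at production time.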