Runs with same seed are not producing identical results #1236
Comments
This is a bug that we'll look at as a high priority. The DocumentReference and IDs will likely be fixable. The Array ordering is probably not something we can fix -- because the FHIR JSON export is done by an underlying dependency -- but we'll take a look. |
@KevinCranmer How frequently are you seeing the issue of resource IDs changing between runs? I'm consistently able to reproduce the first two issues (array ordering and insurance plan) and I'm working on fixes, but across a dozen tests I haven't seen any instances of resource IDs differing across runs. (Though I have found a separate issue of ImagingStudy DICOM UIDs occasionally differing.) I'm wondering if I'm missing something or if this is just an exceptionally rare case we need to chase down. |
@dehall It seems like the IDs only change if I re-clone Synthea and run the same command. I'm cloning the same commit each time, so the Synthea versions would be the same. But this would mean that if two machines both cloned and ran Synthea, they would see different IDs. Do you see this ID difference if you re-clone? |
Hmm, I'm still not seeing it. The sequence of commands I ran is below. Are you running two fresh copies now and seeing different IDs, or are you running one fresh copy and comparing the output to an existing dataset you have? If anything changed in between, like any modifications to the code, that would produce different results. Though I'm very confused, because the clinical content of all the records you sent is the same (minus the insurance bit), and if the code was changed then the IDs should always mismatch on all of your test records, not just sometimes. Your first diff has expired, but your diff 2 shows an instance where the IDs do match.
|
I haven't been able to reproduce it today either. My hunch is that it's more likely the longer the gap between runs: if I compare with my runs from yesterday, the IDs changed, but none of today's runs have changed IDs. I'm pretty confident I haven't changed anything. This is what I've been running:
|
Thanks that's helpful info. I also noticed this morning in your example of mismatching IDs https://www.diffchecker.com/OFqRg6FW/ , the IDs are not completely different, they are offset by 13: the first Encounter in the file on the right has the same ID as the Observation 13 resources down in the left file "e1239afc-e199-39f6-1bb1-7488c675e51c". Then as you go down resource-by-resource the IDs do line up (they are on different resources obviously but the sequence of IDs as you work through the resources is the same). That makes sense given how the random number generator works to pick IDs, but it means somehow the patient on the right had their random number generator called 13 times that the patient on the left didn't, in a spot that didn't affect any of the clinical content on the record. Is there any postprocessing done to these records? For example is it possible you previously filtered out certain resource types in a separate step and have switched to using the |
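The offset described above can be reproduced in miniature. This is only a sketch under stated assumptions, not Synthea's actual ID code: `IdOffsetSketch`, `nextId`, and `ids` are hypothetical names, and each ID is assumed to be built from two `nextLong()` draws of a seeded `java.util.Random`. It shows that if one run consumes 13 extra IDs' worth of random values, every later ID matches the ID 13 positions further down in the other run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.UUID;

class IdOffsetSketch {
  // Hypothetical stand-in for an ID generator: builds a UUID from the
  // next two longs of a seeded java.util.Random.
  static UUID nextId(Random rng) {
    return new UUID(rng.nextLong(), rng.nextLong());
  }

  // Burns `extraIds` IDs (the "extra" generator calls), then returns the
  // next `count` IDs from the same seeded generator.
  static List<UUID> ids(long seed, int extraIds, int count) {
    Random rng = new Random(seed);
    for (int i = 0; i < extraIds; i++) {
      nextId(rng);
    }
    List<UUID> out = new ArrayList<>();
    for (int i = 0; i < count; i++) {
      out.add(nextId(rng));
    }
    return out;
  }

  public static void main(String[] args) {
    long seed = 1668660039331L; // seed taken from the command in this issue
    List<UUID> left = ids(seed, 0, 20);   // run without extra generator calls
    List<UUID> right = ids(seed, 13, 7);  // run with 13 extra IDs' worth of calls

    // The first ID on the right is the 14th ID on the left, and the
    // sequences stay aligned from there on.
    System.out.println(left.subList(13, 20).equals(right)); // prints true
  }
}
```

The IDs themselves are never "wrong" in either run; they are just handed out to different resources because the generator is at a different position.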
We do post-processing to convert the resources into our database versions of the resources, filtering out some records. However, that's all done after Synthea has run, and we aren't tweaking Synthea's output. We have always used the |
I ran Synthea twice, changed my computer's date and time to 2 days from now (Sat Jan 14th), and ran Synthea a third time. I'm seeing the ID difference only on the third run. My diff from your above command is now mostly the ID difference. Here's a chunk:
|
Thanks for the extra testing -- that's really interesting that the date has an effect on it, but that could explain why I haven't been able to replicate it by running multiple times in short succession. Maybe our "reference date" logic isn't as consistent as we expected and something from the current date/time sneaks in. I'll give that a shot as well |
You had mentioned that the ID differences mean the randomNumberGenerator was being called more in one run than in another. I had read this response about how Synthea will re-create patients that are deceased while trying to reach the population size: I'm wondering if my third run created one or two patients that were deceased on Jan 14th but alive on Jan 12th, so Synthea had to re-create these (now) deceased patients, whereas before, the patients were alive and only created once. This could explain the extra randomNumberGenerator calls. Edit: I took a look at the Synthea output from the three runs and they all had only 14 people listed as DECEASED; I would've expected 15 or 16 on the third run. So perhaps this is not the issue. |
Ok, that last clue of the date being relevant solved it. I had thought the Reference Date config setting was also when the simulation ends, but no, those are separate settings. So even though you specify the reference date, the end date is set to "today" by default, and if you run the same set of patients a week later, the simulation runs a week longer. (I'm a little surprised you didn't get any records that have additional data as a result.) The simulation running a little longer means the random number generator gets called a few more times for certain patients, resulting in the IDs being different, as we saw.
Feel free to change 20230112 to another YYYYMMDD of your choice. More broadly I want to do the following:
|
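To make the end-date effect concrete, here is a toy model, not Synthea's simulation loop. `EndDateSketch` and `nextIdAfterRun` are hypothetical names, and the 7-day step and one-draw-per-step behavior are assumptions chosen for illustration. The point is only that a later end date means more steps, more random draws, and therefore a different next ID.

```java
import java.time.LocalDate;
import java.util.Random;

class EndDateSketch {
  // Toy simulation: one random draw per 7-day step from `start` through
  // `end`, then one more draw standing in for the next resource ID.
  static long nextIdAfterRun(long seed, LocalDate start, LocalDate end) {
    Random rng = new Random(seed);
    for (LocalDate day = start; !day.isAfter(end); day = day.plusDays(7)) {
      rng.nextDouble(); // stands in for any per-step random decision
    }
    return rng.nextLong();
  }

  public static void main(String[] args) {
    long seed = 1668660039331L; // seed taken from the command in this issue
    LocalDate start = LocalDate.of(2020, 1, 1); // arbitrary start for the toy
    LocalDate pinnedEnd = LocalDate.of(2023, 1, 12);

    // End date pinned: the result does not depend on when the run happens.
    System.out.println(
        nextIdAfterRun(seed, start, pinnedEnd) == nextIdAfterRun(seed, start, pinnedEnd));

    // End date defaulting to "today": a run one week later takes one more
    // step, so the generator is at a different position afterwards.
    LocalDate aWeekLater = pinnedEnd.plusDays(7);
    System.out.println(
        nextIdAfterRun(seed, start, pinnedEnd) == nextIdAfterRun(seed, start, aWeekLater));
  }
}
```

In this toy, moving the end date by only two days does not cross a step boundary, so the result is unchanged. That would be consistent with the mismatch showing up only for some patients and only after enough time had passed between runs.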
Awesome! Thank you so much for looking into this quickly. It is much appreciated! |
@KevinCranmer just confirming we haven't forgotten about this - I have a PR up on branch |
@dehall Thanks for working on this so quickly. I just tested and saw that all my original reproducibility issues are no longer appearing, even without the Something I've recently noticed is that I'm getting different results on different machines. For example, running my original command on your new branch on my work machine, I get a file The files have the same UUID, but a different name and vastly different contents: Many files have differences like this. Is the randomness machine dependent? |
The short answer is that it looks like you're running two different versions here: you can see the commit hash that each was run on in the Patient.text field. The longer answer is that we don't expect there to be differences across machines. We use a seeded version of the java.util.Random class, which has the following note in the docs. So in general, the assumption should be: equal command and equal version of Synthea = equal results, even on a different system.
If you do find something that's different across systems and not attributable to anything else, definitely let us know and we can try to figure out what's going on and fix it. |
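For reference, the `java.util.Random` Javadoc states that two instances created with the same seed, and given the same sequence of method calls, generate identical sequences of numbers. A minimal check of that guarantee follows; `SeededRandomSketch` and `draws` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.Random;

class SeededRandomSketch {
  // Returns the first `n` longs produced by a Random seeded with `seed`.
  static long[] draws(long seed, int n) {
    Random rng = new Random(seed);
    long[] out = new long[n];
    for (int i = 0; i < n; i++) {
      out[i] = rng.nextLong();
    }
    return out;
  }

  public static void main(String[] args) {
    long seed = 1668660039331L; // seed taken from the command in this issue

    // Same seed, same calls: identical sequences.
    System.out.println(Arrays.equals(draws(seed, 5), draws(seed, 5))); // prints true

    // A different seed gives a different sequence.
    System.out.println(Arrays.equals(draws(seed, 5), draws(seed + 1, 5)));
  }
}
```

The guarantee covers only the generator. Anything that changes which calls are made, or in what order, still changes the output.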
Ah yeah, looks like I forgot to fetch on my personal machine. Unfortunately, I'm still seeing differences in a Linux Docker container, which was what originally caught my eye. Perhaps this could be due to errors that appear on Linux for me but not on my Mac:
|
Ok yeah, I can confirm the same thing. Runs on bare metal on my Mac are always consistent, runs with the exact same code copied into a Docker image are always consistent, but between the two it's not identical. At a glance my Mac output matches the left in your diff and my Docker output matches the right, so to me that suggests a difference by OS rather than by individual machine. I'll keep digging. |
Ok, I've pushed one more update to the branch that should hopefully fix everything. Turns out HashMap iteration order tends to be consistent on a given OS, but different across OSes. I've updated the spots I found in testing, but it's very possible I missed some that might get hit via different patient trajectories. One other thing to make sure of is that your two instances are running with the same time zone. The internal simulation uses UTC, but all dates are exported in the system time zone, so that can result in records that are conceptually equivalent but textually different. |
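A small sketch of why map iteration order matters for seeded output, and of one common remedy. This is not the actual Synthea fix: `StableIterationSketch` and its methods are hypothetical, and two `LinkedHashMap`s with different insertion orders stand in for a `HashMap` that iterates differently on two platforms. Copying the keys into a `TreeMap` before iterating is just one way to pin the order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

class StableIterationSketch {
  // One random draw per key, consumed in whatever order the map iterates.
  static Map<String, Long> drawInMapOrder(Map<String, String> attrs, long seed) {
    Random rng = new Random(seed);
    Map<String, Long> out = new TreeMap<>();
    for (String key : attrs.keySet()) {
      out.put(key, rng.nextLong());
    }
    return out;
  }

  // Same, but copies into a TreeMap first so keys are always visited in
  // sorted order, whatever order the input map happens to iterate in.
  static Map<String, Long> drawInSortedOrder(Map<String, String> attrs, long seed) {
    return drawInMapOrder(new TreeMap<>(attrs), seed);
  }

  // Builds a map whose iteration order is the given key order.
  static Map<String, String> ordered(String... keys) {
    Map<String, String> m = new LinkedHashMap<>();
    for (String k : keys) {
      m.put(k, "value");
    }
    return m;
  }

  public static void main(String[] args) {
    long seed = 1668660039331L; // seed taken from the command in this issue
    // Stand-ins for the same map iterating differently on two platforms.
    Map<String, String> platformA = ordered("allergies", "conditions", "medications");
    Map<String, String> platformB = ordered("medications", "conditions", "allergies");

    // Draws land on different keys when the iteration order differs.
    System.out.println(
        drawInMapOrder(platformA, seed).equals(drawInMapOrder(platformB, seed)));

    // Sorting the keys first makes the assignment independent of map order.
    System.out.println(
        drawInSortedOrder(platformA, seed).equals(drawInSortedOrder(platformB, seed))); // prints true
  }
}
```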
@KevinCranmer just wanted to make sure you saw this latest update above -- hopefully everything should be consistent across OSes as well now |
@dehall Hey, I've unfortunately found some more differences... The right side is on Linux, the left on Mac. Both ran today. Looks like there are some differences in dates (expected, not an issue), differences in Observation values, and the right side has some additional resources that seem to screw up the IDs afterwards. The Synthea command I'm running: |
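On the expected date differences: the earlier note about time zones can be illustrated with `java.time`. This is a sketch, not Synthea's exporter; `TimeZoneExportSketch`, `export`, and `sameInstant` are hypothetical names, and the instant and zones are arbitrary. It shows one instant rendered in two zones producing different text that still denotes the same moment.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class TimeZoneExportSketch {
  // Renders an instant as an ISO-8601 timestamp with offset in the given zone.
  static String export(Instant t, ZoneId zone) {
    return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(t.atZone(zone));
  }

  // True when two exported timestamps denote the same moment, even if the
  // text differs.
  static boolean sameInstant(String a, String b) {
    return OffsetDateTime.parse(a).toInstant().equals(OffsetDateTime.parse(b).toInstant());
  }

  public static void main(String[] args) {
    Instant t = Instant.parse("2023-01-12T03:30:00Z"); // arbitrary example instant
    String utc = export(t, ZoneOffset.UTC);
    String newYork = export(t, ZoneId.of("America/New_York"));

    System.out.println(utc);
    System.out.println(newYork);
    System.out.println(utc.equals(newYork));       // textual comparison
    System.out.println(sameInstant(utc, newYork)); // comparison by instant
  }
}
```

When diffing output from two machines, comparing parsed timestamps rather than raw text (or running both with the same time zone) keeps these differences from drowning out the real ones.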
Ahh ok thanks for the update. I'll be able to look into this early next week |
@KevinCranmer I just pushed up a new branch for the latest fixes -- |
What happened?
Reading the wiki it looks like "Populations generated with the same seed and the same version of Synthea should be identical".
But I've noticed that this is not the case. I'm consistently seeing the same 3 types of differences across multiple runs on the same Synthea version.
Command I'm using:
./run_synthea -s 1668660039331 -cs 1668660039331 -r 20230101 -p 83 Florida
Environment
Relevant log output
No response