The fast export seems to be working as desired, but when I add a typeFilter it falls back to the search-based export implementation and the result is that the resources are being split across many more COS objects than they should be.
I turned up tracing and it looks like the currentUploadSize is being computed incorrectly:
Specifically, note how the "sizeThreshold" bytes are growing faster than the "readyToWrite" bytes even before we've started an upload and cleared the buffer.
Environment
Which version of IBM FHIR Server?
To Reproduce
Steps to reproduce the behavior:
configure the server with a bulkdata storageProvider that uses s3 (e.g. aws-s3 or ibm-cos)
issue an export command with a typeFilter (e.g. GET [base]/$export?_type=Patient&_typeFilter=Patient?_elements=gender)
note that the results are split across more COS objects than they should be
Expected behavior
The system should continue writing to a single COS object until either the objectSizeThresholdMB or the objectResourceCountThreshold has been reached.
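The expected rollover rule can be sketched as a single predicate. This is only an illustration, not the server's code: the class, method, and counter names are hypothetical, and it assumes that 1 MB means 1024 * 1024 bytes and that a threshold counts as reached once the value is equal to or greater than it. Only objectSizeThresholdMB and objectResourceCountThreshold come from the issue text.

```java
// Hypothetical helper; only the two threshold names come from the issue text.
class CosObjectRollover {

    // Returns true when the current object should be finished and a new one started.
    // Assumption: 1 MB = 1024 * 1024 bytes, and "reached" means >= the threshold.
    static boolean shouldFinishObject(long bytesInObject, int resourcesInObject,
                                      int objectSizeThresholdMB, int objectResourceCountThreshold) {
        long sizeThresholdBytes = objectSizeThresholdMB * 1024L * 1024L;
        return bytesInObject >= sizeThresholdBytes
                || resourcesInObject >= objectResourceCountThreshold;
    }

    public static void main(String[] args) {
        // 1 KiB written and 10 resources, with thresholds of 1 MB / 100 resources
        System.out.println(shouldFinishObject(1024L, 10, 1, 100));
    }
}
```

Under this rule the number of objects depends only on the bytes and resources actually written, so an inflated byte count would trigger the size check too early.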
Additional context
Previously, we added the entire size of the buffer after each page of
results was read. This leads us to think that we have a lot more data
than we actually do. Now we will add only the new bytes.
Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
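The difference between the two accounting strategies described in the commit message can be shown with a minimal sketch. It is not the actual patch: the class and method names are hypothetical, and the buffer is modeled as a plain ByteArrayOutputStream that is never cleared between pages. Only the name currentUploadSize is taken from the issue.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration; names are not taken from the IBM FHIR Server source.
class UploadSizeAccounting {

    // Old behavior as described: after each page, add the size of the whole buffer.
    static long cumulativeBufferSize(List<String> pages) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        long currentUploadSize = 0;
        for (String page : pages) {
            byte[] bytes = page.getBytes(StandardCharsets.UTF_8);
            buffer.write(bytes, 0, bytes.length);
            currentUploadSize += buffer.size(); // re-counts bytes from earlier pages
        }
        return currentUploadSize;
    }

    // New behavior as described: add only the bytes appended by this page.
    static long newBytesOnly(List<String> pages) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        long currentUploadSize = 0;
        for (String page : pages) {
            int before = buffer.size();
            byte[] bytes = page.getBytes(StandardCharsets.UTF_8);
            buffer.write(bytes, 0, bytes.length);
            currentUploadSize += buffer.size() - before;
        }
        return currentUploadSize;
    }

    public static void main(String[] args) {
        List<String> pages = List.of("aaaa", "bbbb", "cccc"); // three 4-byte pages
        System.out.println("whole-buffer accounting: " + cumulativeBufferSize(pages));
        System.out.println("new-bytes accounting:    " + newBytesOnly(pages));
    }
}
```

With three 4-byte pages, the whole-buffer variant reports 4 + 8 + 12 = 24 bytes while only 12 were written. The overcount grows roughly quadratically with the number of pages, which would be consistent with the size threshold being hit far too early.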
Ran a variety of $export scenarios (type=file) against 250K resources, focusing on system-level and patient-level exports. Also used _type and _typeFilter query parameter combinations in a subset of the exports. Varied the config parameters (i.e. writeTriggerSizeMB, sizeThresholdMB, resourceCountThreshold) and confirmed that they were honored.
Ran bulk data sniff test which utilizes type=ibm-cos.
Describe the bug
I have a bulkdata config with an ibm-cos storageProvider and COS settings as follows: