TMRHarrison changed the title from "Add .bgz file decompression (compatible with .gz)" to "Add .bgz file decompression to data.table::fread() (compatible with .gz)" on Sep 16, 2022
Adding a key insight from the bgzip man page (it's hinted in OP's post):
Bgzip compresses files in a similar manner to, and compatible with, gzip(1).
That is, any tool that can read gz files should also be able to read bgz files. Generating bgz files would be another story, but we don't do that in fread().
Currently, fread reads .bgz files as plain text, which fails due to invalid characters. .bgz files are compatible with gunzip and start with the same header bytes as gzip files (0x1F 0x8B). Renaming *.bgz files to *.gz allows them to be decompressed normally.
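To illustrate the header claim, here is a minimal base-R sketch. `has_gzip_magic` is a hypothetical helper name, not part of data.table or R.utils, and the demo file is ordinary gzip output saved under a .bgz name as a stand-in, not real bgzip output. It only shows that the 0x1F 0x8B check and a `gzfile()` read do not depend on the file extension.

```r
# Hypothetical helper: TRUE if the file starts with the gzip magic bytes 0x1F 0x8B.
has_gzip_magic <- function(path) {
  # raw = TRUE so the bytes are read exactly as stored on disk
  con <- file(path, "rb", raw = TRUE)
  on.exit(close(con))
  identical(readBin(con, what = "raw", n = 2L), as.raw(c(0x1f, 0x8b)))
}

# Stand-in input: a plain gzip stream written under a .bgz name
# (an assumption for the demo; a real file would come from bgzip).
tmp <- tempfile(fileext = ".bgz")
con <- gzfile(tmp, "w")
writeLines(c("a,b", "1,2", "3,4"), con)
close(con)

# A plain-text file for comparison
plain <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2"), plain)

is_gz_bgz   <- has_gzip_magic(tmp)
is_gz_plain <- has_gzip_magic(plain)

# gzfile() decompresses based on content, so the .bgz name is not an obstacle
df <- read.csv(gzfile(tmp))
```

Until fread handles the extension itself, reading through `gzfile()` like this, or renaming the file to *.gz as described above, may serve as a workaround.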
Adding .bgz to the list of files that can be decompressed by data.table::fread shouldn't require anything other than R.utils. I think adding ".bgz" to the vector on this line (data.table/R/fread.R, line 121 in c4a2085) and checking for w<=2 on this line (data.table/R/fread.R, line 124 in c4a2085) would allow fread to decompress .bgz files automatically. However, I haven't tested this.
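A rough sketch of the idea, written as standalone R rather than as a patch to fread.R. `match_suffix` and `pick_decompressor` are hypothetical names, and the actual code at the referenced lines may be structured differently. The assumption is that the matched index `w` selects between `gzfile` and `bzfile`, so putting ".bgz" second in the vector means `w <= 2` covers both gzip-compatible extensions.

```r
# Hypothetical helper: index of the first suffix that `path` ends with, 0L if none.
match_suffix <- function(path, suffixes) {
  hits <- which(endsWith(path, suffixes))
  if (length(hits)) hits[1L] else 0L
}

# Hypothetical dispatcher: choose a connection constructor from the extension.
pick_decompressor <- function(path) {
  w <- match_suffix(path, c(".gz", ".bgz", ".bz2"))
  if (w == 0L) return(NULL)      # not a recognised compressed extension
  if (w <= 2L) gzfile else bzfile # .gz and .bgz both go through gzfile
}

gz_fun  <- pick_decompressor("variants.vcf.gz")
bgz_fun <- pick_decompressor("variants.vcf.bgz")
bz2_fun <- pick_decompressor("table.csv.bz2")
txt_fun <- pick_decompressor("table.csv")
```

Note that "x.bgz" does not end with ".gz" (the character before "gz" is "b", not "."), so the order of the first two suffixes does not cause a false match.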