I tried CSV.File, CSV.read with NamedTuples, DTable, DataFrame, etc.; all have the same issue.
There's just some sticky memory left over from loading the CSV files.
The table generated is about 1.6 GB (4 columns × 1e8 rows × 4 bytes per Int32).
# generate
using DataFrames
d = DataFrame((; [Symbol("a$i") => rand(Int32(1):Int32(1000), Int(1e8)) for i in 1:4]...));
# run GC.gc() a few times, memory usage settles at ~

# prep
using CSV
genchunk = () -> (; [Symbol("a$i") => rand(Int32(1):Int32(1000), Int(1e7)) for i = 1:4]...)
mkpath("data")

for i = 1:10
    CSV.write(joinpath(["data", "datapart_$i.csv"]), genchunk())
end

# load from multiple files
files = readdir("data", join=true)  # paths to the chunk files (assumed; the original snippet doesn't show how `files` was built)
d = CSV.read(files, DataFrame)
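A quick way to compare the two cases is to measure the table itself rather than watching process memory. A minimal sketch, assuming `d` is the table from either snippet above (the numbers are what I'd expect, not measured output from the original report):

using DataFrames
# 4 Int32 columns × 1e8 rows × 4 bytes ≈ 1.6e9 bytes for the generated table
Base.summarysize(d) / 1e9
# the column element types show how many bytes each value actually costs in memory
eltype.(eachcol(d))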
(screenshots: memory usage when the table is generated in-process vs. loaded from files)
d = CSV.read(files, DataFrame, types=Int32)
Forgot it parses integers as Int64 by default, and that's where my double memory usage was coming from.
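To make the difference concrete, here's a small sketch comparing the default parse with the explicit `types=Int32` call. The `files` vector is assumed to be the chunk paths from the prep step above, and the annotated results are what I'd expect rather than measured output:

using CSV, DataFrames
files = readdir("data", join=true)               # chunk files written in the prep step (assumed)

d64 = CSV.read(files, DataFrame)                 # integers parse as Int64 by default
d32 = CSV.read(files, DataFrame, types=Int32)    # force Int32 columns

eltype.(eachcol(d64))                            # Int64 for every column
eltype.(eachcol(d32))                            # Int32 for every column
Base.summarysize(d64) / Base.summarysize(d32)    # ≈ 2, i.e. the "double" memory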
The sticky memory related to the glibc issue is still observable on my end, though, but that's a different issue.
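For reference, the glibc part is commonly worked around on Linux by asking the allocator to return freed pages to the OS after a GC pass. A minimal sketch (Linux/glibc only, so not applicable to my Windows setup, and not part of the original report):

# Linux/glibc only: collect garbage, then ask glibc to give freed heap pages back to the OS
GC.gc(); GC.gc()
ccall(:malloc_trim, Cint, (Csize_t,), 0)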
I'm on Julia master/1.8 and Windows.