Description
"I'm using bounter to count the frequency of items in a large set. I was periodically pickling the bounter object. Doing this causes the memory to continually increase" (based on https://groups.google.com/forum/#!topic/gensim/LsReiXXOzKY thread)
Steps/Code/Corpus to Reproduce
import pickle as pkl
from bounter import bounter
import numpy as np
import psutil
import gc


def get_used_memory():
    """ Return the current am't of used memory, in GB """
    return '{:.3f}'.format(psutil.virtual_memory().used / 1024.0 / 1024.0 / 1024.0)


def log(msg):
    print(msg, ', memory =', get_used_memory())


def main():
    log('Starting with np array')
    # 8 x 33554432 int32 values = 1 GiB of data
    a = np.random.randint(0, 512, (8, 33554432), dtype='int32')
    log('Initialized array')
    for i in range(6):
        with open('array.pkl', 'wb') as f:
            pkl.dump(a, f, protocol=pkl.HIGHEST_PROTOCOL)
        log('Finished saving the ' + str(i) + 'th copy of the array')
    del a
    gc.collect()
    log('deleted array and performed gc.collect() ')

    counter = bounter(size_mb=1024, need_iteration=False, log_counting=1024)
    log('Initialized counter')
    for i in range(6):
        with open('counter.pkl', 'wb') as f:
            pkl.dump(counter, f, protocol=pkl.HIGHEST_PROTOCOL)
        log('Finished saving the ' + str(i) + 'th copy of the bounter')
    del counter
    gc.collect()
    log('deleted array and performed gc.collect() ')
    log('Finished')


if __name__ == '__main__':
    main()
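Note that `get_used_memory()` above reads `psutil.virtual_memory().used`, which is system-wide, so other processes can shift the readings. As a cross-check, a per-process variant could look roughly like the sketch below. The names `format_gb` and `get_process_memory` are hypothetical and not part of the original script; `psutil.Process().memory_info().rss` is the resident set size of the given process in bytes.

```python
import os


def format_gb(num_bytes):
    """Format a byte count as GB with three decimals (same style as the script above)."""
    return '{:.3f}'.format(num_bytes / 1024.0 / 1024.0 / 1024.0)


def get_process_memory():
    """Return the resident set size (RSS) of the current process, in GB.

    Hypothetical helper, assuming psutil is installed. psutil is imported
    here so that format_gb() stays usable without it.
    """
    import psutil
    process = psutil.Process(os.getpid())
    return format_gb(process.memory_info().rss)
```

Swapping this in for `get_used_memory()` inside `log()` would show whether the growth belongs to the Python process itself rather than to the rest of the system.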
Expected Results
Memory shouldn't increase significantly after each dump
Actual Results
I get the following log output, along with two pkl files of 1.1 GB each:
('Starting with np array', ', memory =', '3.539')
('Initialized array', ', memory =', '4.540')
('Finished saving the 0th copy of the array', ', memory =', '4.540')
('Finished saving the 1th copy of the array', ', memory =', '4.544')
('Finished saving the 2th copy of the array', ', memory =', '4.549')
('Finished saving the 3th copy of the array', ', memory =', '4.549')
('Finished saving the 4th copy of the array', ', memory =', '4.553')
('Finished saving the 5th copy of the array', ', memory =', '4.562')
('deleted array and performed gc.collect() ', ', memory =', '3.561')
('Initialized counter', ', memory =', '3.561')
('Finished saving the 0th copy of the bounter', ', memory =', '4.567')
('Finished saving the 1th copy of the bounter', ', memory =', '5.573')
('Finished saving the 2th copy of the bounter', ', memory =', '6.577')
('Finished saving the 3th copy of the bounter', ', memory =', '7.576')
('Finished saving the 4th copy of the bounter', ', memory =', '8.579')
('Finished saving the 5th copy of the bounter', ', memory =', '9.582')
('deleted array and performed gc.collect() ', ', memory =', '9.580')
('Finished', ', memory =', '9.580')
Here I see two suspicious places. First, memory grows by roughly 1 GB after every dump of the bounter:
('Finished saving the 0th copy of the bounter', ', memory =', '4.567')
('Finished saving the 1th copy of the bounter', ', memory =', '5.573')
('Finished saving the 2th copy of the bounter', ', memory =', '6.577')
('Finished saving the 3th copy of the bounter', ', memory =', '7.576')
('Finished saving the 4th copy of the bounter', ', memory =', '8.579')
('Finished saving the 5th copy of the bounter', ', memory =', '9.582')
Second, the memory is not released after the counter is deleted and gc.collect() is called, which looks like a memory leak:
('Finished saving the 5th copy of the bounter', ', memory =', '9.582')
('deleted array and performed gc.collect() ', ', memory =', '9.580')
('Finished', ', memory =', '9.580')
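To put numbers on both observations, the per-dump growth can be computed directly from the readings in the log. This is only a small arithmetic sketch: the values are copied from the log output in this report, and the function name `step_deltas` is made up for illustration.

```python
# Readings (GB of used memory) copied from the log in this report.
array_baseline = 4.540      # 'Initialized array'
array_dumps = [4.540, 4.544, 4.549, 4.549, 4.553, 4.562]
counter_baseline = 3.561    # 'Initialized counter'
counter_dumps = [4.567, 5.573, 6.577, 7.576, 8.579, 9.582]
after_delete = 9.580        # after 'del counter' and gc.collect()


def step_deltas(readings, start):
    """Return the change between consecutive readings, beginning at `start`."""
    deltas = []
    previous = start
    for value in readings:
        deltas.append(round(value - previous, 3))
        previous = value
    return deltas


array_growth = step_deltas(array_dumps, array_baseline)
counter_growth = step_deltas(counter_dumps, counter_baseline)
# Memory still held after the counter was deleted and collected.
retained = round(after_delete - counter_baseline, 3)

print('array growth per dump:  ', array_growth)
print('counter growth per dump:', counter_growth)
print('retained after delete:  ', retained)
```

Each bounter dump adds about 1 GB, which matches the `size_mb=1024` setting, while dumping the numpy array adds at most a few MB per iteration. About 6 GB stays allocated after the counter is deleted.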
Versions