batch task memory leak #7696
Comments
If it's caused by the spawned task not being freed, then it's reasonable that it happens along with #7694, since the task is not freed.
Here's another test and I'm not sure how to interpret it 🥵 😇 I inserted 5,000,000 rows in the first hill (of the memory-usage graph). After a while the memory usage dropped to 480M. Then I inserted another 5,000,000 rows; this time the memory usage didn't drop. Then I inserted again -- OOM.

I tested using the following script:

```sql
create table t (
    id int primary key,
    uid int,
    v1 int,
    v2 float,
    s1 varchar,
    s2 varchar,
    update_time timestamp
);
```
```python
import datetime
import random

import psycopg2


class InsertValue(object):
    def __init__(self):
        self.conn = psycopg2.connect(
            database="dev", user="root", host="127.0.0.1", port=4566
        )

    def parse(self, c1, c2):
        cursor = self.conn.cursor()
        # Insert rows [c1, c2) in batches of 10 rows per INSERT statement.
        for step in range(c1, c2, 10):
            li = []
            for j in range(step, step + 10):
                v2 = random.uniform(1, 1000)
                update_time = datetime.datetime.now()
                vv = """({},{},{},{},'test1','test2','{}')""".format(
                    j, j, j, v2, update_time
                )
                li.append(vv)
            v = ",".join(li)
            sql = """insert into t values {};""".format(v)
            cursor.execute(sql)
            self.conn.commit()
        cursor.close()

    def run(self, c1, c2):
        self.parse(c1, c2)
        self.conn.close()


if __name__ == "__main__":
    iv = InsertValue()
    iv.run(0, 5000000)
```
@xxchan Can you help run a heap profile? https://github.com/risingwavelabs/risingwave/blob/main/docs/memory-profiling.md
This is caused by …
We seem to have fixed a similar problem 🤔 cc. @BowenXiao1999 @liurenjie1024
IIRC that's query execution, which increases memory usage in the frontend: #5827
Seems another …
Yes, there will be a reserved-memory problem with the hashmap. We might need a background coroutine to do the clean-up. A WeakHashMap or a naive approach may not help, because the task's upstream may keep reading from it (specifically, getting the receiver), and we don't know when it is safe to delete the task. We cannot delete it immediately once the task finishes, since the upstream may not have started yet, and then the task's output channel would be dropped. So we might need some status flag.
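The clean-up scheme described above (keep a finished task in the map until the upstream has taken its receiver, flip a status flag, and let a background coroutine reclaim consumed entries) could be sketched roughly like this. This is a Python sketch under stated assumptions, not RisingWave's actual implementation; `TaskManager`, `TaskStatus`, and the method names are all hypothetical:

```python
import asyncio
import enum


class TaskStatus(enum.Enum):
    FINISHED = 1   # task is done, but the upstream may still fetch the output
    CONSUMED = 2   # upstream has taken the receiver; safe to delete


class TaskManager:
    """Keeps finished tasks in a map until the upstream has fetched their
    output; a background coroutine then garbage-collects them, instead of
    deleting a task the moment it finishes (the upstream may not have
    started yet)."""

    def __init__(self, gc_interval=1.0):
        self.tasks = {}            # task_id -> [status, output]
        self.gc_interval = gc_interval

    def add_task(self, task_id, output):
        # The task has finished, but we must not free it yet:
        # the upstream may still ask for the output channel.
        self.tasks[task_id] = [TaskStatus.FINISHED, output]

    def take_output(self, task_id):
        # The upstream fetches the receiver; flip the status flag so the
        # background GC knows this entry can now be reclaimed.
        entry = self.tasks[task_id]
        entry[0] = TaskStatus.CONSUMED
        return entry[1]

    def gc_once(self):
        # One sweep: drop every task whose output has been consumed.
        dead = [tid for tid, (status, _) in self.tasks.items()
                if status is TaskStatus.CONSUMED]
        for tid in dead:
            del self.tasks[tid]
        return len(dead)

    async def gc_loop(self):
        # Background coroutine: sweep the map periodically.
        while True:
            await asyncio.sleep(self.gc_interval)
            self.gc_once()
```

Without the status flag, the sweep would have to guess whether an entry is still needed; with it, each sweep only frees tasks whose output has definitely been handed over.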
Yes, this is a known issue for distributed mode, and we need a fix later.
BTW, should we have a test for such problems (memory leaks after running a lot of batch tasks), e.g. in the longevity test?
Good idea, but the test team currently has no resources for longevity testing of distributed queries, so we can postpone it.
Fixed. |
Describe the bug
I ran the following script:
And observed that memory usage keeps increasing. It seems the memory used by the batch task is not freed. Is this normal?
To Reproduce
No response
Expected behavior
No response
Additional context
Found together with #7694, but they seem to be different issues?