
Text generation - getting slower and slower? #105

Open · ziqizhang opened this issue Aug 15, 2019 · 6 comments

ziqizhang commented Aug 15, 2019

I have fine-tuned a model and am now using it to generate text on Google Colab (26 GB of RAM by default). My process loops through a collection of sentences, using each one as the 'prefix' to generate a paragraph. However, the process gets gradually slower: each iteration of the loop takes longer than the last, even though the input sentences do not get any longer.

I don't really understand this, as I thought text generation should take roughly constant time each time the method is called.

My code is below. Have I used the library in the wrong way?

EDIT: I have also noticed that memory usage keeps increasing during this time. When I started the process on an AWS server dedicated to this task, it used only 5% of memory. It has now been running for over 24 hours and is using 53%. Why is that?

EDIT 2: I can confirm the pattern again. The process has been running for almost 2 days and memory usage has gone up to 73%. It now takes 3 minutes to generate one output, up from about 10 seconds at the beginning.

Can anyone please help? This does not look normal to me.

Thanks

import csv
import datetime
import re

import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name='run1')

with open(outFile, 'a+', newline='\n') as f:
    writer = csv.writer(f, delimiter=",", quotechar='"')
    count = 0
    for l in lineList:
        print(str(datetime.datetime.now()) + "," + str(count))
        # Strip everything except alphanumerics before using the line as a prefix.
        l = re.sub('[^0-9a-zA-Z]+', ' ', l).strip()
        texts = gpt2.generate(sess, return_as_list=True,
                              temperature=1.0,
                              nsamples=2,
                              batch_size=2,
                              length=200,
                              prefix=l,
                              include_prefix=False)
        row = [l]
        for t in texts:
            # Drop the prefix if the generated text echoes it back.
            if t.startswith(l):
                t = t[len(l):].strip()
            row.append(t)

        writer.writerow(row)
        count += 1

Log showing the increase in time taken: per-sentence generation goes from about 10 seconds to more than a minute.

2019-08-15 12:49:49.720246,4162
2019-08-15 12:50:00.720310,4163
2019-08-15 12:50:11.065400,4164
2019-08-15 12:50:21.630609,4165
2019-08-15 12:50:32.572490,4166
2019-08-15 12:50:47.027083,4167
2019-08-15 12:50:58.078473,4168
2019-08-15 12:51:09.834870,4169
2019-08-15 12:51:21.490914,4170
2019-08-15 12:51:34.091284,4171
2019-08-15 12:51:48.238152,4172
2019-08-15 12:52:01.631092,4173
2019-08-15 12:52:14.451645,4174
2019-08-15 12:52:27.794607,4175
2019-08-15 12:52:43.495325,4176
.....
2019-08-15 15:23:28.228918,4391
2019-08-15 15:24:39.403824,4392
2019-08-15 15:25:48.217059,4393
2019-08-15 15:26:59.058952,4394
2019-08-15 15:28:09.956804,4395
2019-08-15 15:29:21.806861,4396
2019-08-15 15:30:30.500894,4397
2019-08-15 15:31:41.235117,4398
2019-08-15 15:32:49.256143,4399
minimaxir (Owner) commented Aug 18, 2019

This may be due to the memory-leak issues that pollute the graph. In that case, use the technique I use in the Cloud Run apps to reset it:

tf.reset_default_graph()
sess.close()
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess)

ziqizhang (Author) commented Aug 19, 2019

Thank you.

I suppose you mean putting this inside the loop to reset the graph periodically, e.g. every 100 iterations. Is that correct? Something like the sketch below.
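
A minimal sketch of what I have in mind (the every-100-iterations interval and run_name='run1' are just my own placeholders; lineList is my list of input sentences):

import tensorflow as tf
import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name='run1')

for count, l in enumerate(lineList):
    # Periodically tear down the polluted graph and reload the model, as suggested above.
    if count > 0 and count % 100 == 0:
        tf.reset_default_graph()
        sess.close()
        sess = gpt2.start_tf_sess()
        gpt2.load_gpt2(sess, run_name='run1')
    texts = gpt2.generate(sess, return_as_list=True,
                          temperature=1.0,
                          nsamples=2,
                          batch_size=2,
                          length=200,
                          prefix=l,
                          include_prefix=False)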

I have tried this for 2 hours and the problem seems to have gone, so I guess that is indeed the cause. It would be great to have this fixed in a future release if possible, but for now I am happy with this temporary fix.

Thanks again

@greatblueheron

Related: it would be great if prefix could be given as a list, with the elements processed and returned in parallel -- like a batch, but with different prefixes. The time generate takes seems far too long. A rough workaround sketch is below.
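
In the meantime, the closest thing I can see with the current gpt_2_simple API (where generate() takes a single prefix string) is to batch samples per prefix and loop over the prefixes; a rough sketch, with the prefix list and sampling parameters as placeholders:

import gpt_2_simple as gpt2

sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name='run1')

prefixes = ["first prompt", "second prompt", "third prompt"]  # placeholder inputs
results = {}
for p in prefixes:
    # One generate() call per prefix; batch_size only parallelises samples
    # that share the same prefix, not different prefixes.
    results[p] = gpt2.generate(sess, return_as_list=True,
                               nsamples=4,
                               batch_size=4,
                               length=200,
                               prefix=p)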

@RandomStrangerOnTheInternet

See #140

@ramanshrivastava

I am facing issues with generation time too, since in production the expectation is that it should be < 100 ms. Do you think TensorFlow Serving can help here? And how would encoding work during inference with TensorFlow Serving?

@only-yao

> I am facing issues with generation time too, since in production the expectation is that it should be < 100 ms. Do you think TensorFlow Serving can help here?

Have you solved the speed problem you encountered with TF Serving?
