
text-davinci-003 has been deprecated && the results of demo are not good #30

Lins-01 opened this issue Mar 21, 2024 · 1 comment

Lins-01 commented Mar 21, 2024

Hi! Thank you for releasing the code! This is a very interesting piece of work. Congratsssss on the NeurIPS acceptance! 🎉

I ran into a problem when using your code: directly running demo.ipynb fails with the error below.

Sampling with best hyper... defaultdict(<class 'dict'>, {'model': 'text-davinci-003', 'temp': 0.7, 'alpha': 0.95, 'beta': 0.3, 'basic': False, 'settings': SerializerSettings(base=10, prec=3, signed=True, fixed_length=False, max_val=10000000.0, time_sep=' ,', bit_sep=' ', plus_sign='', minus_sign=' -', half_bin_correction=True, decimal_point='', missing_str=' Nan'), 'dataset_name': 'AirPassengersDataset'}) 
 with NLL inf
  0%|          | 0/1 [00:00<?, ?it/s]
---------------------------------------------------------------------------
InvalidRequestError                       Traceback (most recent call last)
Cell In[3], line 11
      9 hypers = list(grid_iter(model_hypers[model]))
     10 num_samples = 10
---> 11 pred_dict = get_autotuned_predictions_data(train, test, hypers, num_samples, model_predict_fns[model], verbose=False, parallel=False)
     12 out[model] = pred_dict
     13 plot_preds(train, test, pred_dict, model, show_samples=True)

File [e:\Document\CodeSpace\OpenProject\llmtime-main\models\validation_likelihood_tuning.py:119](file:///E:/Document/CodeSpace/OpenProject/llmtime-main/models/validation_likelihood_tuning.py:119), in get_autotuned_predictions_data(train, test, hypers, num_samples, get_predictions_fn, verbose, parallel, n_train, n_val)
    117     best_val_nll = float('inf')
    118 print(f'Sampling with best hyper... {best_hyper} \n with NLL {best_val_nll:3f}')
--> 119 out = get_predictions_fn(train, test, **best_hyper, num_samples=num_samples, n_train=n_train, parallel=parallel)
    120 out['best_hyper']=convert_to_dict(best_hyper)
    121 return out

File [e:\Document\CodeSpace\OpenProject\llmtime-main\models\llmtime.py:228](file:///E:/Document/CodeSpace/OpenProject/llmtime-main/models/llmtime.py:228), in get_llmtime_predictions_data(train, test, model, settings, num_samples, temp, alpha, beta, basic, parallel, **kwargs)
    226 completions_list = None
    227 if num_samples > 0:
--> 228     preds, completions_list, input_strs = generate_predictions(completion_fn, input_strs, steps, settings, scalers,
    229                                                                 num_samples=num_samples, temp=temp, 
    230                                                                 parallel=parallel, **kwargs)
    231     samples = [pd.DataFrame(preds[i], columns=test[i].index) for i in range(len(preds))]
    232     medians = [sample.median(axis=0) for sample in samples]
...
    776         rbody, rcode, resp.data, rheaders, stream_error=stream_error
    777     )
    778 return resp

InvalidRequestError: The model `text-davinci-003` has been deprecated, learn more here: https://platform.openai.com/docs/deprecations

After checking OpenAI's deprecations page, I changed this code:

model_predict_fns = {
    'LLMTime GPT-3': get_llmtime_predictions_data,
    'LLMTime GPT-4': get_llmtime_predictions_data,
    'PromptCast GPT-3': get_promptcast_predictions_data,
    'ARIMA': get_arima_predictions_data,
}

to

model_predict_fns = {
    'LLMTime GPT-3.5': get_llmtime_predictions_data,
    # 'LLMTime GPT-4': get_llmtime_predictions_data,
    # 'PromptCast GPT-3': get_promptcast_predictions_data,
    'ARIMA': get_arima_predictions_data,
}
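
Note that renaming the key alone does not change which OpenAI model is called; the model name comes from the 'model' entry in the hyperparameter grid (visible in the "Sampling with best hyper..." printout above). A minimal sketch of the companion change, assuming the demo builds model_hypers in the same shape as the dict printed in the traceback (settings is the SerializerSettings object the demo already defines):

model_hypers['LLMTime GPT-3.5'] = {
    'model': 'gpt-3.5-turbo-instruct',  # replaces the deprecated text-davinci-003
    'temp': 0.7, 'alpha': 0.95, 'beta': 0.3, 'basic': False,
    'settings': settings,  # the SerializerSettings object from the demo
}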

Here are the results I get. They seem no better than ARIMA; the bold purple line is farther from the actual values. Is this because of gpt-3.5-turbo-instruct?
Please, could you update the demo for the new API, or advise me on how to improve the performance? Or can the result plots in your paper only be reproduced with text-davinci-003 or LLaMA-70B?
Sorry for taking up your time; I'd appreciate any help when you are free. The results are below:

[Screenshots: GPT-3.5 forecast plots (1 and 2) and ARIMA forecast plots (1 and 2)]

@shikaiqiu (Collaborator) commented

Hi Changling,

It's indeed unfortunate that OpenAI has deprecated text-davinci-003. As mentioned in the README, we found gpt-3.5-turbo-instruct to perform worse than text-davinci-003. Using a lower temperature (e.g. 0.3) improved performance slightly, but it still did not match text-davinci-003. We therefore do not recommend using gpt-3.5-turbo-instruct as a drop-in replacement. Other models, such as LLaMA 2, will work much better.
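
Concretely, in the demo this is a one-line change to the hypers sketched earlier in this thread (a sketch under the gpt-3.5-turbo-instruct substitution above, not the paper's setup):

model_hypers['LLMTime GPT-3.5']['temp'] = 0.3  # lower temperature, as suggested above
# If grid_iter expands list-valued entries into a search grid (an assumption,
# based on the autotuned "best hyper" printout above), the validation NLL can
# also choose among several temperatures:
model_hypers['LLMTime GPT-3.5']['temp'] = [0.3, 0.5, 0.7]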

Shikai
