Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added pipline args to HuggingFacePipeline.from_model_id #5268

Merged
merged 4 commits into from
May 26, 2023

Conversation

solomspd
Copy link
Contributor

The current HuggingFacePipeline.from_model_id does not allow passing of pipeline arguments to the transformer pipeline.
This PR enables adding important pipeline parameters like setting max_new_tokens for example.
Previous to this PR it would be necessary to manually create the pipeline through huggingface transformers then handing it to langchain.

For example instead of this

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline(
    "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10
)
hf = HuggingFacePipeline(pipeline=pipe)

You can write this

hf = HuggingFacePipeline.from_model_id(
    model_id="gpt2", task="text-generation", pipeline_kwargs={"max_new_tokens": 10}
)

Who can review?

@hwchase17
@agola11

@dev2049 dev2049 added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label May 25, 2023
@dev2049 dev2049 merged commit 2ef5579 into langchain-ai:master May 26, 2023
@danielchalef danielchalef mentioned this pull request Jun 5, 2023
Undertone0809 pushed a commit to Undertone0809/langchain that referenced this pull request Jun 19, 2023
…ai#5268)

The current `HuggingFacePipeline.from_model_id` does not allow passing
of pipeline arguments to the transformer pipeline.
This PR enables adding important pipeline parameters like setting
`max_new_tokens` for example.
Previous to this PR it would be necessary to manually create the
pipeline through huggingface transformers then handing it to langchain.

For example instead of this
```py
model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
pipe = pipeline(
    "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10
)
hf = HuggingFacePipeline(pipeline=pipe)
```
You can write this
```py
hf = HuggingFacePipeline.from_model_id(
    model_id="gpt2", task="text-generation", pipeline_kwargs={"max_new_tokens": 10}
)
```


Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
This was referenced Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm PR looks good. Use to confirm that a PR is ready for merging.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants