
[AIC-py] hf machine translation (text to text) model parser #753

Merged · 2 commits into main · Jan 5, 2024

Conversation

@jonathanlastmileai (Contributor) commented Jan 4, 2024

[AIC-py] hf mt model parser

Text translation, without streaming for now. Streaming is not trivial here, since the underlying pipeline returns the full translation at once.

Test:

```
import asyncio
from aiconfig_extension_hugging_face.local_inference.text_translation import HuggingFaceTextTranslationTransformer

from aiconfig import AIConfigRuntime, InferenceOptions, CallbackManager

# Register the translation model parser, then load the aiconfig.
mp = HuggingFaceTextTranslationTransformer()
AIConfigRuntime.register_model_parser(mp, "translation_en_to_fr")

config = AIConfigRuntime.load("/Users/jonathan/Projects/aiconfig/test_hf_transl.aiconfig.json")
config.callback_manager = CallbackManager([])


# Stream callback (unused while the streaming path below is commented out).
def print_stream(data, _accumulated_data, _index):
    print(data, end="", flush=True)


async def run():
    # print("Stream")
    # options = InferenceOptions(stream=True, stream_callback=print_stream)
    # out = await config.run("test_hf_trans", options=options)
    # print("Output:\n", out)

    print("no stream")
    options = InferenceOptions(stream=False)
    out = await config.run("test_hf_trans", options=options)
    print("Output:\n", out)


asyncio.run(run())
```
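The aiconfig used by this test isn't included in the PR. For reference, a minimal sketch of what test_hf_transl.aiconfig.json could look like, mirroring the summarization config below; the prompt name is taken from the script above, everything else is assumed:

```
{
  "name": "test_hf_transl",
  "prompts": [
    {
      "name": "test_hf_trans",
      "input": "Hello, how are you today?",
      "metadata": {
        "model": {
          "name": "translation_en_to_fr"
        }
      }
    }
  ]
}
```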


Stack created with Sapling. Best reviewed with ReviewStack.

Very similar to the text generation parser.

Test:

```
...
    {
      "name": "test_hf_sum",
      "input": "HMS Duncan was a D-class destroyer ...", # [contents of https://en.wikipedia.org/wiki/HMS_Duncan_(D99)]
      "metadata": {
        "model": {
          "name": "stevhliu/my_awesome_billsum_model",
          "settings": {
            "min_length": 100,
            "max_length": 200,
            "num_beams": 1
          }
        }
      }
    },
...
}
```

```
import asyncio
from aiconfig_extension_hugging_face.local_inference.text_summarization import HuggingFaceTextSummarizationTransformer

from aiconfig import AIConfigRuntime, InferenceOptions, CallbackManager

# Register the summarization model parser, then load the aiconfig.
mp = HuggingFaceTextSummarizationTransformer()
AIConfigRuntime.register_model_parser(mp, "stevhliu/my_awesome_billsum_model")

config = AIConfigRuntime.load("/Users/jonathan/Projects/aiconfig/cookbooks/Getting-Started/travel.aiconfig.json")
config.callback_manager = CallbackManager([])


def print_stream(data, _accumulated_data, _index):
    print(data, end="", flush=True)


async def run():
    print("Stream")
    options = InferenceOptions(stream=True, stream_callback=print_stream)
    out = await config.run("test_hf_sum", options=options)
    print("Output:\n", out)

    print("no stream")
    options = InferenceOptions(stream=False)
    out = await config.run("test_hf_sum", options=options)
    print("Output:\n", out)


asyncio.run(run())


# OUT:

Stream
Token indices sequence length is longer than the specified maximum sequence length for this model (2778 > 512). Running this sequence through the model will result in indexing errors
/opt/homebrew/Caskroom/miniconda/base/envs/aiconfig/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:430: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
  warnings.warn(
/opt/homebrew/Caskroom/miniconda/base/envs/aiconfig/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:449: UserWarning: `num_beams` is set to 1. However, `length_penalty` is set to `2.0` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `length_penalty`.
  warnings.warn(
<pad> escorted the 13th Destroyer Flotilla in the Mediterranean and escorited the carrier Argus to Malta during the war  The ship was recalled home to be converted into an escoter destroyer in late 1942. The vessel was repaired and given a refit at Gibraltar on 16 November, and was sold for scrap later that year. The crew of the ship escores the ship to the Middle East, and the ship is a 'disaster' of the </s>Output:
 [ExecuteResult(output_type='execute_result', execution_count=0, data="<pad> escorted the 13th Destroyer Flotilla in the Mediterranean and escorited the carrier Argus to Malta during the war  The ship was recalled home to be converted into an escoter destroyer in late 1942. The vessel was repaired and given a refit at Gibraltar on 16 November, and was sold for scrap later that year. The crew of the ship escores the ship to the Middle East, and the ship is a 'disaster' of the </s>", mime_type=None, metadata={})]
no stream
Output:
 [ExecuteResult(output_type='execute_result', execution_count=0, data="escorted the 13th Destroyer Flotilla in the Mediterranean and escorited the carrier Argus to Malta during the war . The ship was recalled home to be converted into an escoter destroyer in late 1942. The vessel was repaired and given a refit at Gibraltar on 16 November, and was sold for scrap later that year. The crew of the ship escores the ship to the Middle East, and the ship is a 'disaster' of the .", mime_type=None, metadata={})]

```
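A note on the warnings in the streamed run: the input article far exceeds the model's 512-token limit, and early_stopping / length_penalty only matter for beam search, which num_beams: 1 disables. Assuming the settings pass straight through to the transformers pipeline (the num_beams warning suggests they do), a tweak along these lines would likely silence both warnings:

```
"settings": {
  "min_length": 100,
  "max_length": 200,
  "num_beams": 4,
  "truncation": true
}
```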
Text translation sans streaming; streaming is not trivial here.

Test: same script as in the PR description above. A sketch of one possible streaming approach follows.
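Since streaming is the open question, here is a rough sketch of how it might eventually be wired up with transformers' TextIteratorStreamer. This is an assumption about one possible approach, not part of this PR; t5-small is a stand-in model and the prompt string is illustrative:

```
from threading import Thread

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, TextIteratorStreamer

# Stand-in seq2seq translation model, purely for illustration.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("translate English to French: Hello, world!", return_tensors="pt")
# Decode tokens as they are generated, skipping <pad>/</s> markers.
streamer = TextIteratorStreamer(tokenizer, skip_special_tokens=True)

# generate() blocks until completion, so run it in a background thread
# and consume partial text from the streamer on the main thread.
thread = Thread(target=model.generate, kwargs={**inputs, "streamer": streamer, "max_new_tokens": 50})
thread.start()
for chunk in streamer:
    print(chunk, end="", flush=True)
thread.join()
```

This is part of why streaming doesn't come for free: the pipeline call blocks and returns the finished translation, so streaming needs this kind of thread-plus-streamer plumbing.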
@saqadri (Contributor) commented Jan 4, 2024

What does "mt" stand for?

@jonathanlastmileai (Contributor, Author) replied:

> What does "mt" stand for?

machine translation :)

@jonathanlastmileai changed the title from "[AIC-py] hf mt model parser" to "[AIC-py] hf machine translation (text to text) model parser" on Jan 4, 2024
```
@@ -2,9 +2,15 @@
from .local_inference.text_generation import HuggingFaceTextGenerationTransformer
from .remote_inference_client.text_generation import HuggingFaceTextGenerationClient
from .local_inference.text_summarization import HuggingFaceTextSummarizationTransformer
from .local_inference.text_translation import HuggingFaceTextTranslationTransformer

# from .remote_inference_client.text_generation import HuggingFaceTextGenerationClient
```
A reviewer (Contributor) commented on the commented-out import:

I know this was from the last diff, but can we delete? Thanks

@rossdanlm (Contributor) left a review:

Generally looks good to me, thanks!

@jonathanlastmileai merged commit 00b7acd into main on Jan 5, 2024