
Conversation

@hanishkvc
Contributor

Add a flag --interactive-specials to examples/main

This controls whether text entered by the user in interactive mode is tokenized with the parse_special flag set or not.

When main is run with this flag, users can enter any special tokens for fill-in-the-middle or equivalent modes supported by an AI model, so that the model can perform the fill at the appropriate location, based on the surrounding context in the user text.
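
For reference, a minimal sketch (not the actual patch) of how such a flag could gate parse_special when tokenizing interactive input; it assumes the common llama_tokenize helper from common/common.h, and params.interactive_specials is an assumed name for the field behind the new flag:

```cpp
// Sketch only: tokenize the text the user just entered (buffer), honoring
// the new flag. params.interactive_specials is an assumed field name for
// --interactive-specials.
std::vector<llama_token> line_inp = ::llama_tokenize(
        ctx,                             // llama_context of the running model
        buffer,                          // user-entered text
        /*add_special*/   false,         // no BOS for mid-session input
        /*parse_special*/ params.interactive_specials); // parse <fim_*> etc. only if flag set
```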

@ggerganov @teleprint-me

I haven't tested this yet, but logically it should allow for the above-mentioned use case.

@hanishkvc
Contributor Author

hanishkvc commented May 6, 2024

Tested the fill-in-the-middle sample prompt mentioned for Refact 1.6B.

When the --interactive-specials argument is passed to main, it correctly generates a suitable comment, as expected, to be placed in between the user-provided code context in the sample prompt, and the proper token ids are generated for <fim_prefix>, <fim_suffix> and <fim_middle>.

Without the --interactive-specials argument, it generates a somewhat repetitive general note about the code in the given sample prompt, and, as expected, <fim_prefix>, <fim_suffix> and <fim_middle> don't get converted to the proper tokens, because parse_special won't be set when tokenizing.
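
To make the difference concrete, here is an illustrative sketch (assumptions: the common llama_tokenize helper, a Refact-style FIM prompt, and a hypothetical helper name show_fim_tokenization):

```cpp
#include <cstdio>
#include <string>
#include <vector>
#include "common.h"   // ::llama_tokenize helper

// Tokenize the same FIM prompt twice to show the effect of parse_special.
static void show_fim_tokenization(llama_context * ctx) {
    const std::string prompt =
        "<fim_prefix>def add(a, b):\n<fim_suffix>\n    return a + b\n<fim_middle>";

    // parse_special = true: the <fim_*> markers collapse to single
    // special-token ids, so the model sees a proper fill-in-the-middle request.
    auto toks_fim   = ::llama_tokenize(ctx, prompt, /*add_special*/ false, /*parse_special*/ true);

    // parse_special = false: the markers are tokenized as literal text
    // ("<", "fim", "_prefix", ">", ...), so the model never sees FIM tokens.
    auto toks_plain = ::llama_tokenize(ctx, prompt, /*add_special*/ false, /*parse_special*/ false);

    fprintf(stderr, "with parse_special: %zu tokens, without: %zu tokens\n",
            toks_fim.size(), toks_plain.size());
}
```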

@hanishkvc changed the title from "main-interactive-mode: optionally allow for special tokens from user in interactive mode for fill-in etal" to "main-interactive-mode: optionally allow for special tokens from user in interactive mode for fill-in-middle etal" May 6, 2024
@mofosyne added the enhancement and Review Complexity : Low labels May 9, 2024
Merge master as of 20240510IST1236 into this branch.

Fix a merge conflict with the newly added conversation flag in the master branch.
@hanishkvc force-pushed the hkvc_chat_interactivespecials branch from b6f2e53 to 9566de9 May 10, 2024 07:15
@hanishkvc
Contributor Author

hanishkvc commented May 10, 2024

Have force-pushed a merge with master. @mofosyne, I think I overwrote your equivalent merge. Sorry, I hadn't noticed that "Allow edits by maintainers" was enabled.

@mofosyne
Collaborator

Not a problem. I would prefer that as I can't always guess your full intent. You are a more accurate author than I am. But I was hoping to at least make your life easier :)

Collaborator

@mofosyne mofosyne left a comment


Double-checked that llama_tokenize()'s fourth argument is indeed the special/control-token flag in the method definitions. Looks right.

Can observe that the change adds a new flag which enables the ability to include special tokens in the middle of the embedding stream.
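
For context, the overload being referred to is the common helper, paraphrased here from common/common.h around this time (treat the exact formatting as approximate):

```cpp
// common/common.h (paraphrased): the fourth argument, parse_special, controls
// whether special/control tokens written in the text are parsed into their ids.
std::vector<llama_token> llama_tokenize(
    const struct llama_context * ctx,
    const std::string          & text,
    bool                         add_special,
    bool                         parse_special = false);
```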

@mofosyne mofosyne merged commit f89fe27 into ggml-org:master May 10, 2024