Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

epic: Jan Context Length issues #2320

Open
1 of 2 tasks
Tracked by #3614
hahuyhoang411 opened this issue Mar 12, 2024 · 5 comments
Open
1 of 2 tasks
Tracked by #3614

epic: Jan Context Length issues #2320

hahuyhoang411 opened this issue Mar 12, 2024 · 5 comments
Labels
category: model settings Inference params, presets, templates category: threads & chat Threads & chat UI UX issues move to Cortex needs eng decision Needs product or engineering specs needs pm Feature request is not clear, needs product decisions P1: important Important feature / fix type: feature request A new feature

Comments

@hahuyhoang411
Copy link
Contributor

hahuyhoang411 commented Mar 12, 2024

Goal

  • Jan needs an elegant way to deal with model context length issues

Possible Scope

  • e.g. Logic for Thread > context length?
  • e.g. User can adjust the context length to the model within model bounds
  • e.g. Can support longer context length support if model supported and hardware supported
  • e.g. Jan has adaptive context length, given GGUF or model.yaml, and hardware detection

Linked Issues

Cortex Issue

Original Post

Problem
In some cases, users can use the model to exceed the limit of 4096 tokens (~4000 words). But we haven't implemented any solutions to handle it.

Success Criteria

  1. Have an alert that notifies users are exceed the context length
  2. We can delete the very first user message (not the system) when exceed the context length

Additional context
Bug:


@imtuyethan

As discussed with @hahuyhoang411:

  • Error when thread exceeds the context length
  • Recommend users to delete message by themselves or create a new thread

Design:

https://www.figma.com/file/ytn1nRZ17FUmJHTlhmZB9f/Jan-App-(version-1)?type=design&node-id=6847-111809&mode=design&t=ErX19MBkMjVhBSjO-4

Screenshot 2024-03-27 at 3 59 07 PM

(This is the MVP for now, in the future we will have a standardized error format that will direct users to Discourse forum & users can see the answer there, see specs: https://www.notion.so/jan-ai/Standardized-Error-Format-for-Jan-abea56d32d6648bb8c6835f9176f800c?pvs=4)

@lv333ming
Copy link

Will this issue be improved? 4000 is too few conversations

@imtuyethan imtuyethan moved this to Planned in Jan & Cortex Mar 25, 2024
@imtuyethan imtuyethan added the needs designs Needs designs label Mar 25, 2024
@imtuyethan imtuyethan moved this from Planned to In Progress in Jan & Cortex Mar 27, 2024
@imtuyethan
Copy link
Contributor

imtuyethan commented Mar 27, 2024

As discussed with @hahuyhoang411:

  • Error when thread exceeds the context length
  • Recommend users to delete message by themselves or create a new thread

Design:

https://www.figma.com/file/ytn1nRZ17FUmJHTlhmZB9f/Jan-App-(version-1)?type=design&node-id=6847-111809&mode=design&t=ErX19MBkMjVhBSjO-4

Screenshot 2024-03-27 at 3 59 07 PM

(This is the MVP for now, in the future we will have a standardized error format that will direct users to Discourse forum & users can see the answer there, see specs: https://www.notion.so/jan-ai/Standardized-Error-Format-for-Jan-abea56d32d6648bb8c6835f9176f800c?pvs=4)

@imtuyethan imtuyethan assigned louis-jan and namchuai and unassigned imtuyethan Mar 27, 2024
@imtuyethan imtuyethan moved this from In Progress to Planned in Jan & Cortex Mar 27, 2024
@imtuyethan imtuyethan removed the needs designs Needs designs label Mar 27, 2024
@Van-QA Van-QA added this to the v0.4.11 milestone Apr 2, 2024
@namchuai namchuai moved this from Planned to In Progress in Jan & Cortex Apr 3, 2024
@Van-QA Van-QA modified the milestones: v0.4.11, v0.4.12 Apr 4, 2024
@louis-jan louis-jan removed their assignment Apr 5, 2024
@Van-QA Van-QA modified the milestones: v0.4.12, v0.4.13 Apr 17, 2024
@Propheticus
Copy link

How about a 'sliding window' that only uses the last X messages that fit in the context length?
The number of evaluated (prompt) and generated tokens are reported after every call, so the data is there. If the last inference evaluated+generated tokens comes close to the max context, you need to start excluding the first turn.

@IngEyn
Copy link

IngEyn commented Apr 24, 2024

I do not know if there are best practices regarding this but I'd just suggest to maybe not exclude the very first message as I believe most users set the stage with the first message. I could imagine there being some sort of placeholder put in between the first and the next query, when excluding message(s), like 'There have been messages in between these ones, that have been removed due to a moving context length window. Pretend this bit makes sense but disregard it as context going forward.'

@Propheticus
Copy link

inspiration from the competition:
image

@louis-jan louis-jan moved this from In Progress to Planned in Jan & Cortex Apr 26, 2024
@Van-QA Van-QA modified the milestones: v0.5.0 Broken Rice, v0.5.1 May 2, 2024
@Van-QA Van-QA removed this from the v.0.5.0 🍵 Bubur Ayam milestone May 29, 2024
@Van-QA Van-QA modified the milestones: v.0.6.0, v0.6.1 Aug 5, 2024
@namchuai namchuai assigned Van-QA and unassigned namchuai Aug 19, 2024
@imtuyethan imtuyethan removed this from the v0.7.1 milestone Aug 28, 2024
@dan-homebrew dan-homebrew added needs designs Needs designs needs eng decision Needs product or engineering specs needs pm Feature request is not clear, needs product decisions labels Aug 30, 2024
@dan-homebrew dan-homebrew moved this from Scheduled to Planning in Jan & Cortex Aug 30, 2024
@dan-homebrew dan-homebrew removed the P1: important Important feature / fix label Aug 30, 2024
@imtuyethan imtuyethan added P1: important Important feature / fix and removed needs designs Needs designs labels Aug 30, 2024
@dan-homebrew dan-homebrew changed the title feat: Jan can handle the exceed context length feat: Jan allows user to control context length Aug 30, 2024
@dan-homebrew dan-homebrew changed the title feat: Jan allows user to control context length feat: Jan context length issues Sep 10, 2024
@dan-homebrew dan-homebrew changed the title feat: Jan context length issues feat: Jan Context Length issues Sep 10, 2024
@dan-homebrew dan-homebrew changed the title feat: Jan Context Length issues epic: Jan Context Length issues Sep 11, 2024
@dan-homebrew dan-homebrew moved this from Planning to Scheduled in Jan & Cortex Sep 26, 2024
@0xSage 0xSage added category: threads & chat Threads & chat UI UX issues category: model settings Inference params, presets, templates and removed category: engines category: threads & chat Threads & chat UI UX issues labels Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: model settings Inference params, presets, templates category: threads & chat Threads & chat UI UX issues move to Cortex needs eng decision Needs product or engineering specs needs pm Feature request is not clear, needs product decisions P1: important Important feature / fix type: feature request A new feature
Projects
Status: Scheduled
Development

No branches or pull requests

10 participants