Interface improvements and --multiline-input (previously --author-mode) #1040
Compiles and appears to work fine on Intel macOS |
Looking through the pull requests I've noticed others wanting to add their own options. If we're going to keep adding options, it might be nice to separate common options from advanced options. When someone runs The "advanced" options would be things your average user isn't likely to use. Things such as I was even hesitant myself in adding another command switch when making this pull request. If this goes through, I'll submit another pull request to divide these options up. Although perhaps there are already enough options for such a separation. |
I think it's best to rename this to something that more clearly describes what it does. VIM has a paste mode. My two alternative names are And this commit still contains quality of life improvements authors may like. I'm worried the size of this pull request is keeping reviewers away. I may also try to separate the console changes into its own pull request but it's less justified without the different line ending support added here. |
This prevents any non‐ASCII input for me on GNU/Linux. I tried with various terminals, shells, locales, and input methods. I verified that the input is not simply invisible (not echoed), but that it is not being put into the input buffer. Here are some strings that become impossible to enter:
|
I did have this working, at least partially. This was the bug:
It should have been:
It's working with unicode for me if you want to try again @Senemu. |
It is working for me now too. 👏 |
I can enter all characters now, but when deleting them, the terminal output does not match if the characters are fullwidth or combining. For example, if I type Something similar happens when using combining characters: If I type |
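The mismatch described above comes down to column accounting: erasing a character means moving the cursor back by that character's display width, which is two cells for fullwidth characters and zero for combining marks. A minimal sketch of that idea follows. `display_width` and `erase_sequence` are hypothetical helpers, not the PR's code, and the ranges are an approximate, illustrative subset rather than a full East Asian Width table.

```cpp
#include <string>

// Hypothetical helper: approximate terminal column width of one code point.
// The ranges are an illustrative subset, not a complete Unicode table.
inline int display_width(char32_t cp) {
    // Combining Diacritical Marks draw over the previous cell.
    if (cp >= 0x0300 && cp <= 0x036F) return 0;
    // Roughly: Hangul Jamo, CJK/kana blocks, Hangul syllables,
    // CJK compatibility ideographs, fullwidth forms.
    if ((cp >= 0x1100 && cp <= 0x115F) ||
        (cp >= 0x2E80 && cp <= 0xA4CF) ||
        (cp >= 0xAC00 && cp <= 0xD7A3) ||
        (cp >= 0xF900 && cp <= 0xFAFF) ||
        (cp >= 0xFF01 && cp <= 0xFF60) ||
        (cp >= 0xFFE0 && cp <= 0xFFE6)) return 2;
    return 1;
}

// Hypothetical helper: bytes to print so that deleting `cp` clears
// exactly the cells it occupied (back, overwrite with space, back).
inline std::string erase_sequence(char32_t cp) {
    std::string out;
    for (int i = 0; i < display_width(cp); ++i) {
        out += "\b \b";
    }
    return out;
}
```

With this accounting, deleting a combining mark emits nothing on its own, so the input code would also have to redraw the base character it was attached to.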
This change is great - merge it when it is ready |
Ok, pretty sure I got it this time. I tested it with "Čüßtøm åçćéñtš ğḥḯṯ ḵḷṃñṓṕ ŕśẗŭṿẅẋýź Жизнь Волга ありがとう 你好". This string includes Latin-based characters with diacritics, Cyrillic characters ("Жизнь Волга" means "Life Volga"), and phrases in Japanese ("ありがとう" means "Thank you") and Chinese ("你好" means "Hello"). These characters should display correctly in most modern terminals and fonts. No problems in Linux but the default Windows command prompt has an issue with a couple characters (the newer Windows Terminal shows them all). Either way, the code seems to handle everything correctly. @Senemu Did you want to try breaking it again? 😄 Pretty sure I got it this time. |
I think that you did! 😃 The only thing I have found is that if I type a combining character by itself at the start of an input, it applies to the preceding character instead of being treated as an isolated combining character. As such, it is impossible to (visually) delete it. Ideally, a combining character without a corresponding base character (which is what conceptually happens in this case, even if there happens to be a character before it in the terminal output) should display by itself (as if combined with a blank space). But note that wanting to do this and expecting it to work is weird, and that the unpatched Combining characters work as expected in any other circumstance I tested. I couldnʼt make what I see and what LLaMA sees mismatch. 👏 |
Oh, good catch. That's something else I didn't consider. The original getline doesn't handle this any better, but it would be easy to prevent the user from starting with a combining character since they are clearly defined. I guess the bigger question is, is there any utility in starting your input with a combining character? If the text ended on "cafe", you could turn it into "café", and it does seem LLaMA models can tokenize the combining character separately, but when tokenizing, it doesn't break it out automatically, it's got to be written like that. I wonder if the text used for training was normalized in some way or if it's been trained on both representations of the letters.
I guess since it "breaks" the output on the screen, I feel like the correct thing to do is to prevent user input from starting with a combining character. But at the same time I'm hesitant to take that away if someone depends on that use case. Any thoughts? |
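If the decision were to reject a combining character at the start of the input, the check could look roughly like this. It is a sketch only: `is_combining_mark` and `accept_code_point` are hypothetical names, and the block list covers the main combining-mark blocks rather than every code point with a combining property.

```cpp
#include <string>

// Hypothetical helper: true if cp lies in one of the main Unicode
// combining-mark blocks (illustrative, not exhaustive).
inline bool is_combining_mark(char32_t cp) {
    return (cp >= 0x0300 && cp <= 0x036F) ||  // Combining Diacritical Marks
           (cp >= 0x1AB0 && cp <= 0x1AFF) ||  // ... Extended
           (cp >= 0x1DC0 && cp <= 0x1DFF) ||  // ... Supplement
           (cp >= 0x20D0 && cp <= 0x20FF) ||  // ... for Symbols
           (cp >= 0xFE20 && cp <= 0xFE2F);    // Combining Half Marks
}

// Hypothetical policy: accept every code point except a combining mark
// typed while the input buffer is still empty.
inline bool accept_code_point(const std::u32string & input_so_far, char32_t cp) {
    return !(input_so_far.empty() && is_combining_mark(cp));
}
```

This keeps "cafe" followed by U+0301 working mid-input while blocking only the case that cannot be visually deleted.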
After this update, the program began to hang after entering the input. I'm using w64devkit to run. |
Can you explain hang? Like it just waits? Is your CPU going? Can you put more input in? |
I enter an input, press enter and for some reason the cursor does not go to a new line. After that, my 6 core 6 thread processor is loaded by 20% by this process, although 3 cores are specified in the arguments. I can’t add anything more to the input, all my additional inputs are simply ignored. I tried to wait for 5 minutes, nothing changes, the load on the processor remains the same, the RAM is filled in the same way. Changing the model does not change anything. |
@newTomas: Try You could also pass the argument |
@Senemu |
It's strange I see "Ярлык" in your title bar, so your system is configured for Russian? I wonder if it's related to that or if it could be due to something misconfigured in w64devkit. I wonder if it would still get stuck with a different keyboard layout. |
Well, it looks like you've narrowed it down to the
Either way, I may be able to replace |
Indeed, the official build works correctly. If you make a version for w64devkit with mingw, I'll test it. |
If it's only happening in w64devkit then this has to be a mingw bug. More precisely, w64devkit uses the mingw-w64 toolchain, which is a fork of the original mingw project. Are you able to try other versions of w64devkit/mingw-w64? It might be fixed in a later version or not be there in an earlier version. I'll try to swap out Are you a programmer or developer yourself? We could file a bug report with w64devkit/mingw-w64, but you typically want the smallest example code that features the bug. I don't have time to test it right now but I believe I've boiled down the bug to just this code:

    #include <windows.h>
    #include <winnls.h>
    #include <fcntl.h>
    #include <wchar.h>
    #include <stdio.h>
    #include <io.h>

    int main() {
        // Initialize for reading wchars and writing out UTF-8
        DWORD dwMode = 0;
        HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE);
        if (hConsole == INVALID_HANDLE_VALUE || !GetConsoleMode(hConsole, &dwMode)) {
            hConsole = GetStdHandle(STD_ERROR_HANDLE);
            if (hConsole != INVALID_HANDLE_VALUE && (!GetConsoleMode(hConsole, &dwMode))) {
                hConsole = NULL;
            }
        }
        if (hConsole) {
            SetConsoleMode(hConsole, dwMode | ENABLE_VIRTUAL_TERMINAL_PROCESSING);
            SetConsoleOutputCP(CP_UTF8);
        }
        HANDLE hConIn = GetStdHandle(STD_INPUT_HANDLE);
        if (hConIn != INVALID_HANDLE_VALUE && GetConsoleMode(hConIn, &dwMode)) {
            _setmode(_fileno(stdin), _O_WTEXT);
            dwMode &= ~(ENABLE_LINE_INPUT | ENABLE_ECHO_INPUT);
            SetConsoleMode(hConIn, dwMode);
        }

        // Echo input
        while (1) {
            // Read
            wchar_t wcs[2] = { getwchar(), L'\0' };
            if (wcs[0] == WEOF) break;
            if (wcs[0] >= 0xD800 && wcs[0] <= 0xDBFF) { // If we have a high surrogate...
                wcs[1] = getwchar(); // Read the low surrogate
                if (wcs[1] == WEOF) break;
            }
            // Write
            char utf8[5] = {0};
            int result = WideCharToMultiByte(CP_UTF8, 0, wcs, (wcs[1] == L'\0') ? 1 : 2, utf8, 4, NULL, NULL);
            if (result > 0) {
                printf("%s", utf8);
            }
        }
        return 0;
    }

There might be other settings we change that trigger the bug in mingw-w64, but I'm guessing it's in this snippet. (We might even be able to make it smaller by cutting out some of that initialization.) If you're able to test that and see if the bug still happens, we could consider opening a bug report with mingw-w64. |
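For readers without a Windows toolchain, the conversion step in the snippet above (one or two UTF-16 units in, UTF-8 bytes out) can be sketched portably. This is an illustration of what `WideCharToMultiByte` is being asked to do there, not a replacement for it. `combine_surrogates` and `encode_utf8` are hypothetical names, and the encoder assumes a valid code point.

```cpp
#include <string>

// Combine a UTF-16 high/low surrogate pair into one code point.
inline char32_t combine_surrogates(char16_t high, char16_t low) {
    return 0x10000 + ((static_cast<char32_t>(high) - 0xD800) << 10)
                   + (static_cast<char32_t>(low) - 0xDC00);
}

// Encode one code point as UTF-8.
// Assumes cp <= 0x10FFFF and that cp is not a lone surrogate.
inline std::string encode_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

A character outside the Basic Multilingual Plane arrives as two `wchar_t` reads on Windows, which is why the repro reads a second unit after seeing a value in the range 0xD800 to 0xDBFF.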
I have quite a bit of experience with C/C++, my main language is typescript. I created a folder with the I also have the latest version of I also found a separate I also found Then I downloaded the latest version of w64devkit, but this bug is still there. Is this a bug in the compiler itself and has been for a long time? |
That's great for two reasons. First, there's nothing crazy we're doing. This is a perfectly acceptable way to read Unicode in Windows. And second, that means we can open a bug report with that small snippet. You spent the time isolating the function at large, so if you'd like you can take credit and open a bug report. (I edited the code in my previous reply to use
It even happens in cygwin? Is that compiler even based on mingw? Either way it could be the compiler itself or their implementation (or wrapper around) Windows' Edit: Oh, in your cygwin output it says "Built by MinGW-W64". For some reason I thought they used regular GCC internally or something. |
I like to test with these two strings: You may have to use Windows Terminal for some of these characters to show up properly. Limitations in the Windows Command Prompt prevent some of them from displaying there. |
I think the best thing to do is open a bug report. We aren't using it wrong. It shouldn't be hanging. |
Okay, will you open it? |
I've tried it in Debian on my Linux system, in an Ubuntu WSL install, and in Windows with the MS compilers. I haven't been able to reproduce the bug. I believe you, but when filling out the bug report it asks you about your system and setup and versions and stuff. Also, if they ask me questions about something or suggest I try something different, I want to be in a position where I can check. So I'll probably get around to it, but it's going to take a little time. |
As I understand it, you don't really want to deal with this bug report. Could you then help open the bug report itself, and I will test possible fixes myself? Can we discuss this somewhere else, for example on Telegram? |
Don't worry, I haven't abandoned you. I caught COVID. I'm sure I'll be fine but it's wiping me out a bit. I'm just going to rewrite the console reads to process the Windows key events. We already have |
@newTomas Can you try out #1462 and verify that it fixes the problem for you? If you don't know how to checkout a pull request you can apply this patch with |
@DannyDaemonic Get well. Let me know when you can start creating a bug report. |
I wanted to make it easier for authors to use this in their writing process.
I've tested the code on both Linux and Windows. I don't see why it wouldn't also work on a Mac, but it would be nice to have confirmation of that.
I also apologize for the volume of lines changed, but each change seemed tied to the next so I ended up putting it through as one commit. With any code changes, my goal was to make `main.cpp` simpler to read. There's still a lot that can be done, but I do believe the input loop is a lot cleaner now.

Changes:
- […] `--author-mode` where we read from input until the user ends their line with a `\`
- […] `/` such that inference starting from that position (without the `\n`)
- […] `--color` is enabled) so it's harder to accidentally end a line with `\` or `/`
Before:

After:


Detailed changes
Author mode allows one to write (or paste) multiple lines of text without ending each line in a `\`. LLMs do better the more context you give them, but the original input flow made this difficult. (Before pasting your writing you'd have to edit a `\` into each line or paste it to a file, save that file, and then pass that path in with `-f`. If you're doing multiple revisions, that can become overwhelming.)

Another useful feature for writing is the ability to start inference in the middle of a line. It's not uncommon for writers to get stuck in the middle of a paragraph or piece of dialogue. I wanted to provide a way for the language model to generate text from that point, but previously, the best one could do was to hit enter and have the code append a `\n` to the buffer before starting inference.

I considered making author mode the default and creating a `--chat-mode` option with the previous behavior (and implying chat mode with `--instruct`). However, I decided against changing the default interaction people were familiar with. As a compromise, I allow the `/` operator to work in both modes.

The other quality of life improvement I really wanted for writing with the AI was to get rid of all the superfluous newlines and control characters that jumble up your text while working. To do this and have it display properly, I had to change the code to read one character at a time. By reading one character at a time, we don't have to print the newline when the user hits enter. This allows the use of `/` and `\` to flow more smoothly and lets us clean up any control characters from the input before we start text inference. This also makes the Windows and Linux versions behave the same with regard to input buffering, for better and for worse. For example, if you hit a key while the model is inferring, it no longer gets mixed into the text on Linux, but you also won't see anything you type while the model is loading.

The other code changes are to `console_state`. It now holds the previous `termios` settings so we can safely reset the terminal on exit for POSIX systems. This also means including `termios.h`. The other change was to add the `author_mode` boolean. I thought about calling it `input_mode` and making an `enum`, but unless we want to have different input modes for chat, instruct, etc., this is just simpler (and it's easily changed in the future).

I also considered finding a way of hiding that first space that's generated when priming the llama models since the original llama tokenizer also starts each prompt with a single space. Are we sure this is necessary? If so, would it be acceptable to just hide that first space?
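The interaction between the two modes and the `\` and `/` line endings can be summarized in a small decision function. This is a sketch of one reading of the description above, not the code in `main.cpp`: `InputAction` and `classify_line` are hypothetical names, and whether a `\`-terminated submission in author mode keeps its trailing newline is an assumption.

```cpp
#include <string>

// Hypothetical result of inspecting one line of user input.
enum class InputAction {
    SubmitWithNewline,  // append '\n' and start inference
    SubmitNoNewline,    // start inference exactly where the text ends
    KeepReading         // append '\n' and read another line
};

// Strip a trailing control character from `line` and decide what to do.
// Assumed rules: '/' submits without a newline in both modes;
// '\' continues in the default mode but submits in author mode;
// a plain line submits in the default mode but continues in author mode.
inline InputAction classify_line(std::string & line, bool author_mode) {
    if (!line.empty() && line.back() == '/') {
        line.pop_back();
        return InputAction::SubmitNoNewline;
    }
    if (!line.empty() && line.back() == '\\') {
        line.pop_back();
        return author_mode ? InputAction::SubmitWithNewline
                           : InputAction::KeepReading;
    }
    return author_mode ? InputAction::KeepReading
                       : InputAction::SubmitWithNewline;
}
```

Written this way, the two modes differ only in which ending means "keep reading", which is why a single boolean is enough for now.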
Future possible changes
Further quality of life enhancements:
- […] `isatty` on POSIX systems and `_isatty` on Windows
- […] `grep` already do this
- […] `--color=always,never,auto` just like `grep` has, but perhaps switch from `color=never` to `color=auto` on any interactive mode (for compatibility, `--color` without an option would just indicate `always`)
- […] `use_color` is true, since changing the cursor will leave artifacts in stdout if piped
- […] `\` and `/`

Further code improvements:
- […] `last_n_tokens` with a ring-buffer
- […] `main.cpp` code

Of the future possible changes, I think using the other control keys is most appealing to me, followed by putting the input on a separate thread to allow context chugging on previous lines as the user writes subsequent lines. I'm not very active on GitHub so I have no way of proving it, but I'm quite experienced with threading and synchronization. Both of these (particularly the multithreading) may add more complexity to the code than is desired for an "example" program.
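The `--color=always,never,auto` idea from the list above could be sketched like this. Nothing here exists in the PR: `ColorMode`, `parse_color_mode`, and `should_use_color` are hypothetical names, and the only behavior taken from the description is that a bare `--color` means `always`.

```cpp
#include <string>

enum class ColorMode { Always, Never, Auto };

// Hypothetical parser for the value of a `--color[=WHEN]` switch.
// An empty value models a bare `--color`, which maps to `always`.
// Returns false for an unrecognized value and leaves `mode` untouched.
inline bool parse_color_mode(const std::string & value, ColorMode & mode) {
    if (value.empty() || value == "always") { mode = ColorMode::Always; return true; }
    if (value == "never")                   { mode = ColorMode::Never;  return true; }
    if (value == "auto")                    { mode = ColorMode::Auto;   return true; }
    return false;
}

// Decide whether to emit color codes. `stdout_is_tty` is supplied by the caller.
inline bool should_use_color(ColorMode mode, bool stdout_is_tty) {
    switch (mode) {
        case ColorMode::Always: return true;
        case ColorMode::Never:  return false;
        case ColorMode::Auto:   return stdout_is_tty;
    }
    return false;
}
```

The caller would pass `isatty(fileno(stdout)) != 0` on POSIX systems or `_isatty(_fileno(stdout)) != 0` on Windows, so piped output stays free of escape sequences in `auto` mode.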
Edit: This is also a first step towards moving away from `Ctrl-C` to interrupt inference if that's desired. Once we are handling keys one at a time like this, we can more easily make `Esc` (or any other key) interrupt inference.
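On POSIX systems, handling keys one at a time depends on taking the terminal out of canonical mode and restoring it afterwards, which is what saving the previous `termios` settings is for. A minimal sketch of that pattern follows. `enter_char_mode` and `leave_char_mode` are hypothetical names, not the functions in the PR, and the Windows side is not covered.

```cpp
#include <termios.h>
#include <unistd.h>

// Hypothetical helper: save the current settings of `fd` in `saved`, then
// disable line buffering and echo so read() returns one byte at a time.
// Returns false (and changes nothing) if `fd` is not a terminal.
inline bool enter_char_mode(int fd, termios & saved) {
    if (!isatty(fd) || tcgetattr(fd, &saved) != 0) {
        return false;
    }
    termios raw = saved;
    raw.c_lflag &= ~(ICANON | ECHO);
    raw.c_cc[VMIN]  = 1;  // block until at least one byte is available
    raw.c_cc[VTIME] = 0;  // no inter-byte timeout
    return tcsetattr(fd, TCSANOW, &raw) == 0;
}

// Hypothetical helper: restore the settings captured by enter_char_mode.
inline bool leave_char_mode(int fd, const termios & saved) {
    return tcsetattr(fd, TCSANOW, &saved) == 0;
}
```

A program would call `enter_char_mode(STDIN_FILENO, saved)` at startup and `leave_char_mode(STDIN_FILENO, saved)` on every exit path, including signal handlers, so the user's shell is not left without echo.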