
the new mmap method does not work on Windows 11? #639

Closed
cmp-nct opened this issue Mar 30, 2023 · 19 comments
Assignees
Labels
need more info The OP should provide more details about the issue

Comments

@cmp-nct
Contributor

cmp-nct commented Mar 30, 2023

I tried both the migration script and creating the new weights from the pth files; in both cases the mmap fails.
It always says "failed to mmap".

@jart
Contributor

jart commented Mar 30, 2023

Could you modify llama.cpp on your local machine to add the following error reporting code? (Sorry, it isn't in master yet.)

static int WinStrerror(int err, char *buf, int size) {
    // Render a Win32 error code as human-readable text.
    return FormatMessageA(
        FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
        NULL, err, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
        buf, size, NULL);
}

static void LogWindowsError(const char *file, int line, const char *thing) {
// The macro shadows the function so each call site reports its own file/line.
#define LogWindowsError(thing) LogWindowsError(__FILE__, __LINE__, thing)
    char s[256];
    int e = GetLastError();
    WinStrerror(e, s, sizeof(s));
    fprintf(stderr, "%s:%d: error[%#x]: %s failed: %s\n", file, line, e, thing, s);
}

Then modify map_file() so it calls LogWindowsError("mmap") from all the places in the function where it does a return 0;. This should hopefully give us a better clue why.

@jart jart added the need more info The OP should provide more details about the issue label Mar 30, 2023
@jart jart self-assigned this Mar 30, 2023
@cmp-nct
Contributor Author

cmp-nct commented Mar 30, 2023

Hi jart,
That's a neat error reporting function. With it I get:

void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0);
error[0x8]: mmap failed: Not enough memory resources are available to process this command.

(I have 50 GB of RAM free.)
To work around it, I tried passing the number of bytes as the last parameter, which makes the mmap go through without error:

void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, size);

But then something goes bad, likely some type of overflow issue.

I have three cases after adding "size":

1. Using the 16-bit model (converted from PTH, 7B LLaMA) as a Release build, it just stops midway:
.\Release\main.exe  -m .\models\7B\ggml-model-f16_ggjt.bin -p Fuck -c 10
main: seed = 1680218175
llama_model_load: loading model from '.\models\7B\ggml-model-f16_ggjt.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 10
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 1
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 12853.45 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 2357.53 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from '.\models\7B\ggml-model-f16_ggjt.bin'

2. The same model as a Debug build:

.\Debug\main.exe  -m .\models\7B\ggml-model-f16_ggjt.bin -p Fuck -c 10
main: seed = 1680218185
llama_model_load: loading model from '.\models\7B\ggml-model-f16_ggjt.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 10
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 1
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 12853.45 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 2357.53 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from '.\models\7B\ggml-model-f16_ggjt.bin'
llama_model_load: unknown tensor '<garbled binary data>' in model file

3. The 4-bit version (the migrate script was used on this one):

 .\Debug\main.exe  -m .\models\7B\ggml-model-q4_ggjt.bin -p Fuck -c 10
main: seed = 1680219253
llama_model_load: loading model from '.\models\7B\ggml-model-q4_ggjt.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 10
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
Failed to map file: .\models\7B\ggml-model-q4_ggjt.bin of size 4212859520
Q:\llama.cpp:366: error[0x57]: mmap failed: The parameter is incorrect.

@prusnak
Collaborator

prusnak commented Mar 30, 2023

@cmp-nct does it help if you drop the fileSize.QuadPart = -1; line?

@cmp-nct
Contributor Author

cmp-nct commented Mar 30, 2023

@prusnak
Nope, it makes no difference.
I also tried a completely different implementation, with the same result.

For your reference, both variants are below.
The Linux-style version needs the usual includes:
#include <io.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

    /*
    int fd = _open(fname, _O_RDONLY);
    if (fd == -1) {
        LogWindowsError("mmap");
        return NULL;
    }

    int64_t length = _filelengthi64(fd);
    if (length == -1) {
        LogWindowsError("mmap");
        _close(fd);
        return NULL;
    }
    *mm_length = length;

    HANDLE hFile = (HANDLE)_get_osfhandle(fd);
    if (hFile == INVALID_HANDLE_VALUE) {
        LogWindowsError("mmap");
        _close(fd);
        return NULL;
    }

    HANDLE hMapping = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    if (!hMapping) {
        LogWindowsError("mmap");
        _close(fd);
        return NULL;
    }

    void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, length);
    CloseHandle(hMapping);
    _close(fd);
    if (!addr) {
        fprintf(stderr, "Failed to map file: %s of size %lld\n", fname, (long long)length);
        LogWindowsError("mmap");
        return NULL;
    }*/
    HANDLE hFile = CreateFileA(fname,
                               GENERIC_READ,
                               FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                               NULL,
                               OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL | FILE_ATTRIBUTE_NOT_CONTENT_INDEXED,
                               NULL);
    if (hFile == INVALID_HANDLE_VALUE) 
    {
        LogWindowsError("mmap");
        return 0;
    }

    LARGE_INTEGER fileSize;
    // fileSize.QuadPart = -1;
    if (!GetFileSizeEx(hFile, &fileSize)) {
        LogWindowsError("mmap");
        CloseHandle(hFile);
        return 0;
    }
    int64_t length = fileSize.QuadPart;
    HANDLE hMapping = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    CloseHandle(hFile);
    if (!hMapping) 
    {
        LogWindowsError("mmap");
        return 0;
    }
    void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0);
    CloseHandle(hMapping);
    if (!addr) 
    {
        LogWindowsError("mmap");
        return 0;
    }

@CoderRC

CoderRC commented Mar 31, 2023

What are you using to compile this program, CMake or make? (#103 (comment))

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

What are you using to compile this program, CMake or make? (#103 (comment))

VS Code cmake

@CoderRC

CoderRC commented Mar 31, 2023

Ohhh for me it works because I used a POSIX compatibility layer.

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Ohhh for me it works because I used a POSIX compatibility layer.

You mean you compile it with gcc/mingw?
I can give that a try, though I think the Microsoft compiler is likely the best in terms of performance optimizations
(based on experience from a decade ago, so times might have changed).

@CoderRC

CoderRC commented Mar 31, 2023

Just try to follow my guide it is in #103 (comment)

@CoderRC

CoderRC commented Mar 31, 2023

I can debug the problem for you there, OK?

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Changing the whole build environment plus external dependencies over to your custom libs is a bit of a pain.
I like what you are doing with the POSIX functions, but I think we really need to fix the bug that causes this.

@CoderRC

CoderRC commented Mar 31, 2023

Did it work using my strategy?

@CoderRC

CoderRC commented Mar 31, 2023

(The comment above was deleted.)
My mmap is in https://github.com/CoderRC/libmingw32_extended.git

@CoderRC

CoderRC commented Mar 31, 2023

I separate each function into its own file.

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Damn it, I found the issue: wrong compiler!
I reinstalled the build tools, selected the AMD64 compiler, and now it works.

@CoderRC

CoderRC commented Mar 31, 2023

Try with my steps; I want to see if I have bugs.

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Try with my steps; I want to see if I have bugs.

The problem was that I had an x86 compiler linked.
I believe the size variable (the last parameter of the mapping function) overflowed in the Windows system call; that's why I had such inconsistent results (the larger model partly worked, the smaller one didn't).
Sorry for creating a fuss, that was the last place I expected such an error to originate from.

@CoderRC

CoderRC commented Mar 31, 2023

Yes, the models are larger than 2^32 bytes, so I also had this problem in my library, but I fixed it.
That is why I want this project to be compatible with different OSes.

@cmp-nct cmp-nct closed this as completed Mar 31, 2023
@kevingosse

kevingosse commented Apr 1, 2023

For future people running into the same issue: in my case the project was building as x86 despite using the x64 command prompt, because a separate installation of CMake was taking precedence in my PATH:

E:\git\llama.cpp\build>where cmake
C:\Program Files\CMake\bin\cmake.exe
C:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe

I regenerated the files by explicitly calling the VS CMake, and it worked.
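One way to catch this mix-up at configure time (a sketch, assuming a CMake-based build like the one used here) is to assert pointer width in CMakeLists.txt:

```cmake
# Fail the configure step early if a 32-bit toolchain was picked up:
# mapping a multi-gigabyte model needs a 64-bit SIZE_T and address space.
if(CMAKE_SIZEOF_VOID_P EQUAL 4)
    message(FATAL_ERROR "32-bit compiler detected; configure with an x64 toolchain")
endif()
```

`CMAKE_SIZEOF_VOID_P` is 4 for x86 toolchains and 8 for x64, regardless of which command prompt launched CMake.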

5 participants