
the new mmap method does not work on Windows 11? #639

Closed
cmp-nct opened this issue Mar 30, 2023 · 19 comments
Assignees
Labels
need more info The OP should provide more details about the issue

Comments

@cmp-nct
Contributor

cmp-nct commented Mar 30, 2023

I tried both the migration script and creating the new weights from the pth files; in both cases the mmap fails.
It always says "failed to mmap".

@jart
Contributor

jart commented Mar 30, 2023

Could you modify llama.cpp on your local machine to add the following error reporting code? (Sorry, it isn't in master yet.)

static int WinStrerror(int err, char *buf, int size) {
    // Render a Win32 error code as human-readable text.
    return FormatMessageA(
        FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
        NULL, err, MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
        buf, size, NULL);
}

static void LogWindowsError(const char *file, int line, const char *thing) {
// The macro shadows the function so each call site reports its own file/line.
#define LogWindowsError(thing) LogWindowsError(__FILE__, __LINE__, thing)
    char s[256];
    int e = GetLastError();
    WinStrerror(e, s, sizeof(s));
    fprintf(stderr, "%s:%d: error[%#x]: %s failed: %s\n", file, line, e, thing, s);
}

Then modify map_file() so it calls LogWindowsError("mmap") from all the places in the function where it does a return 0;. This should hopefully give us a better clue why.

@jart jart added the need more info The OP should provide more details about the issue label Mar 30, 2023
@jart jart self-assigned this Mar 30, 2023
@cmp-nct
Contributor Author

cmp-nct commented Mar 30, 2023

Hi jart,
That's a neat error reporting function. With it I get:

void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0);
error[0x8]: mmap failed: Not enough memory resources are available to process this command.

(I have 50 GB of RAM free.)
To work around it, I tried passing the number of bytes as the last parameter, which makes the mmap go through without error:

void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, size);

But then something goes bad, likely some type of overflow issue.

I have three cases after adding "size":

1. Using the 16-bit model (converted from PTH, 7B LLaMA) as a Release build, it just stops midway:
.\Release\main.exe  -m .\models\7B\ggml-model-f16_ggjt.bin -p Fuck -c 10
main: seed = 1680218175
llama_model_load: loading model from '.\models\7B\ggml-model-f16_ggjt.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 10
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 1
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 12853.45 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 2357.53 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from '.\models\7B\ggml-model-f16_ggjt.bin'

2. The same model as a Debug build:

.\Debug\main.exe  -m .\models\7B\ggml-model-f16_ggjt.bin -p Fuck -c 10
main: seed = 1680218185
llama_model_load: loading model from '.\models\7B\ggml-model-f16_ggjt.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 10
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 1
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
llama_model_load: ggml map size = 12853.45 MB
llama_model_load: ggml ctx size =  81.25 KB
llama_model_load: mem required  = 2357.53 MB (+ 1026.00 MB per state)
llama_model_load: loading tensors from '.\models\7B\ggml-model-f16_ggjt.bin'
llama_model_load: unknown tensor '<garbled binary data>' in model file

3. The 4-bit version (the migrate script was used on this one):

 .\Debug\main.exe  -m .\models\7B\ggml-model-q4_ggjt.bin -p Fuck -c 10
main: seed = 1680219253
llama_model_load: loading model from '.\models\7B\ggml-model-q4_ggjt.bin' - please wait ...
llama_model_load: n_vocab = 32000
llama_model_load: n_ctx   = 10
llama_model_load: n_embd  = 4096
llama_model_load: n_mult  = 256
llama_model_load: n_head  = 32
llama_model_load: n_layer = 32
llama_model_load: n_rot   = 128
llama_model_load: f16     = 2
llama_model_load: n_ff    = 11008
llama_model_load: n_parts = 1
llama_model_load: type    = 1
Failed to map file: .\models\7B\ggml-model-q4_ggjt.bin of size 4212859520
Q:\llama.cpp:366: error[0x57]: mmap failed: The parameter is incorrect.

@prusnak
Collaborator

prusnak commented Mar 30, 2023

@cmp-nct does it help if you drop the fileSize.QuadPart = -1; line?

@cmp-nct
Contributor Author

cmp-nct commented Mar 30, 2023

@prusnak
Nope, it makes no difference.
I also tried a completely different implementation, with the same result.

For your reference, both variants are below.
The Linux-style version needs the usual includes:
#include <io.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

    /*
    int fd = _open(fname, _O_RDONLY);
    if (fd == -1) {
        LogWindowsError("mmap");
        return NULL;
    }

    int64_t length = _filelengthi64(fd);
    if (length == -1) {
        LogWindowsError("mmap");
        _close(fd);
        return NULL;
    }
    *mm_length = length;

    HANDLE hFile = (HANDLE)_get_osfhandle(fd);
    if (hFile == INVALID_HANDLE_VALUE) {
        LogWindowsError("mmap");
        _close(fd);
        return NULL;
    }

    HANDLE hMapping = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    if (!hMapping) {
        LogWindowsError("mmap");
        _close(fd);
        return NULL;
    }

    void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, length);
    CloseHandle(hMapping);
    _close(fd);
    if (!addr) {
        fprintf(stderr, "Failed to map file: %s of size %lld\n", fname, (long long)length);
        LogWindowsError("mmap");
        return NULL;
    }*/
    HANDLE hFile = CreateFileA(fname,
                               GENERIC_READ,
                               FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                               NULL,
                               OPEN_EXISTING,
                               FILE_ATTRIBUTE_NORMAL | FILE_ATTRIBUTE_NOT_CONTENT_INDEXED,
                               NULL);
    if (hFile == INVALID_HANDLE_VALUE) 
    {
        LogWindowsError("mmap");
        return 0;
    }

    LARGE_INTEGER fileSize;
    // fileSize.QuadPart = -1;
    if (!GetFileSizeEx(hFile, &fileSize)) {
        LogWindowsError("mmap");
        CloseHandle(hFile);
        return 0;
    }
    int64_t length = fileSize.QuadPart;
    HANDLE hMapping = CreateFileMappingA(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    CloseHandle(hFile);
    if (!hMapping) 
    {
        LogWindowsError("mmap");
        return 0;
    }
    void *addr = MapViewOfFile(hMapping, FILE_MAP_READ, 0, 0, 0);
    CloseHandle(hMapping);
    if (!addr) 
    {
        LogWindowsError("mmap");
        return 0;
    }

@CoderRC

CoderRC commented Mar 31, 2023

What are you using to compile this program, CMake or make? (#103 (comment))

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

What are you using to compile this program, CMake or make? (#103 (comment))

VS Code cmake

@CoderRC

CoderRC commented Mar 31, 2023

Ohhh for me it works because I used a POSIX compatibility layer.

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Ohhh for me it works because I used a POSIX compatibility layer.

You mean you compile it with gcc/mingw?
I can give that a try, though I think the Microsoft compiler is likely the best in terms of performance optimizations
(based on experience from a decade ago, so times might have changed).

@CoderRC

CoderRC commented Mar 31, 2023

Just try to follow my guide it is in #103 (comment)

@CoderRC

CoderRC commented Mar 31, 2023

I can debug the problem for you there, OK?

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Changing the whole build environment plus external dependencies over to your custom libs is a bit of a pain.
I like what you are doing with the POSIX functions, but I think we really need to fix the bug that causes this.

@CoderRC

CoderRC commented Mar 31, 2023

Did it work using my strategy?

@CoderRC

CoderRC commented Mar 31, 2023

(The comment above was deleted.)
My mmap is in https://github.com/CoderRC/libmingw32_extended.git

@CoderRC

CoderRC commented Mar 31, 2023

I separate each function into its own file.

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Damn it, I found the issue: wrong compiler!
I reinstalled the build tools, selected the AMD64 compiler, and now it works.

@CoderRC

CoderRC commented Mar 31, 2023

Try with my steps; I want to see if I have bugs.

@cmp-nct
Contributor Author

cmp-nct commented Mar 31, 2023

Try with my steps; I want to see if I have bugs.

The problem was that I had an x86 compiler linked.
I believe the size variable (the last parameter of the mapping function) overflowed in the Windows system call; that's why I had such inconsistent results (the larger model partly worked, the smaller one didn't).
Sorry for creating a fuss, that was the last place I expected such an error to originate from.

@CoderRC

CoderRC commented Mar 31, 2023

Yes, the models are larger than 2^32 bytes, so I also had this problem in my library, but I fixed it.
That is why I want this project to be compatible with different OSes.

@cmp-nct cmp-nct closed this as completed Mar 31, 2023
@kevingosse

kevingosse commented Apr 1, 2023

For future people running into the same issue: in my case the project was building as x86 despite using the x64 command prompt, because a separate installation of CMake was taking precedence in my PATH:

E:\git\llama.cpp\build>where cmake
C:\Program Files\CMake\bin\cmake.exe
C:\Program Files\Microsoft Visual Studio\2022\Preview\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe

I regenerated the files by explicitly calling the VS CMake, and it worked.
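One way to catch this mix-up at configure time (a sketch, assuming a CMake-based build like the one used here) is to assert pointer width in CMakeLists.txt:

```cmake
# Fail the configure step early if a 32-bit toolchain was picked up:
# mapping a multi-gigabyte model needs a 64-bit SIZE_T and address space.
if(CMAKE_SIZEOF_VOID_P EQUAL 4)
    message(FATAL_ERROR "32-bit compiler detected; configure with an x64 toolchain")
endif()
```

`CMAKE_SIZEOF_VOID_P` is 4 for x86 toolchains and 8 for x64, regardless of which command prompt launched CMake.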

5 participants