Using compiled version on Windows #98
Comments
Update: I managed to load the extension into
db.execute("""
CREATE VIRTUAL TABLE IF NOT EXISTS vss_post USING vss0(embeddings(3));
""")
it crashes the program without any error. |
Thanks for the detailed report and updates! You're the first person to report being able to compile A few questions:
|
See it in action:
C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>sqlite3.exe
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .open random.db
sqlite> .load vector0.dll
sqlite> select vector_version();
v0.1.2
sqlite> .load vss0.dll
sqlite> select vector_version();
v0.1.2
sqlite> select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');
0.200000002980232
sqlite>
When running the same from select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]'); |
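The CLI session above prints 0.200000002980232 rather than 0.2. A minimal sketch of why that value is plausible, assuming the distance is accumulated in single precision (an assumption about the implementation, not something stated in this thread); `to_f32` and `l1_distance_f32` are hypothetical helper names:

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def l1_distance_f32(a, b):
    """L1 distance with every intermediate rounded to float32 (assumed precision)."""
    total = 0.0
    for x, y in zip(a, b):
        diff = to_f32(abs(to_f32(x) - to_f32(y)))
        total = to_f32(total + diff)
    return total


print(l1_distance_f32([0.1, 0.1], [0.2, 0.2]))  # 0.20000000298023224
```

Under that assumption the printed value is simply the float32 nearest to 0.2, so the CLI result looks like a correct answer rather than a symptom of the crash.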
Does it throw an |
No error, it just exits from
Code here
import sqlite3

conn = sqlite3.connect('random.db')
conn.enable_load_extension(True)  # extension loading is disabled by default
conn.load_extension('vector0.dll')
conn.load_extension('vss0.dll')
cur = conn.cursor()
cur.execute('select vector_version();')
version = cur.fetchone()
print(version)  # Working
cur.execute("select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');")  # <---- Crash
res = cur.fetchone()
print(res)
cur.close()
conn.close() |
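Since the script above dies without a Python traceback, one way to at least capture its exit status is to run it in a child interpreter. This is a minimal sketch, not a fix: `run_isolated` is a hypothetical helper name, and the inline source is a stand-in for the real script.

```python
import subprocess
import sys


def run_isolated(source: str, timeout: float = 60.0) -> int:
    """Run Python source in a child interpreter and return its exit code."""
    proc = subprocess.run([sys.executable, "-c", source], timeout=timeout)
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for the crashing script; replace with the real source to test it.
    code = run_isolated("import sys; sys.exit(3)")
    # Mask to 32 bits so a Windows status code reads as unsigned hex.
    print(f"child exited with {code} (0x{code & 0xFFFFFFFF:08X})")
```

The parent process survives even if the child is killed by a native crash, so the exit status can be logged or compared between interpreters.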
The fact that it only happens when executing functions that use Faiss's vector computations (i.e. fails on That's my guess at least - my knowledge of Windows is very limited. I'd say double-check that those DLLs exist and work correctly (probably with ldd?). I'd be curious to see if there's a sample Faiss C++ project you could compile + execute on your Windows machine, to see if it's a Faiss compilation error or a |
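One way to check which DLLs a process actually loaded, assuming a Cygwin strace log in the `--- Process N loaded PATH at ADDR` form that appears later in this thread. A rough sketch; `loaded_modules` is a hypothetical helper and the regular expression is only fitted to that line shape:

```python
import re

# Matches lines such as:
# --- Process 17912 loaded C:\Windows\System32\ntdll.dll at 00007ff9bb4f0000
LOADED_RE = re.compile(r"^--- Process (\d+) loaded (.+) at ([0-9a-fA-F]+)$")


def loaded_modules(log_text: str) -> list[str]:
    """Return the module paths from 'loaded' lines, in the order they appear."""
    paths = []
    for line in log_text.splitlines():
        match = LOADED_RE.match(line.strip())
        if match:
            paths.append(match.group(2))
    return paths


sample = "\n".join([
    r"--- Process 17912 loaded C:\Windows\System32\ntdll.dll at 00007ff9bb4f0000",
    r"--- Process 17912 thread 19168 created",
    r"--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\vss0.dll at 00000005ca2f0000",
])
for path in loaded_modules(sample):
    print(path)
```

Comparing this list against the DLLs the extension is expected to need shows whether a dependency was never loaded or the crash came after everything was loaded.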
I used the same |
I added
Strace output
User@DESKTOP-HPEE9O3 /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs
$ strace /cygdrive/c/Users/User/AppData/Local/Programs/Python/Python311/python.exe main.py
--- Process 17912 created
--- Process 17912 loaded C:\Windows\System32\ntdll.dll at 00007ff9bb4f0000
--- Process 17912 loaded C:\Windows\System32\kernel32.dll at 00007ff9ba020000
--- Process 17912 loaded C:\Windows\System32\KernelBase.dll at 00007ff9b8cf0000
--- Process 17912 thread 19168 created
--- Process 17912 thread 9276 created
--- Process 17912 loaded C:\Windows\System32\ucrtbase.dll at 00007ff9b8b10000
--- Process 17912 thread 3052 created
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\vcruntime140.dll at 00007ff9206e0000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python311.dll at 00007ff915b90000
--- Process 17912 loaded C:\Windows\System32\version.dll at 00007ff9af240000
--- Process 17912 loaded C:\Windows\System32\ws2_32.dll at 00007ff9b9fa0000
--- Process 17912 loaded C:\Windows\System32\msvcrt.dll at 00007ff9b92b0000
--- Process 17912 loaded C:\Windows\System32\rpcrt4.dll at 00007ff9ba740000
--- Process 17912 loaded C:\Windows\System32\advapi32.dll at 00007ff9ba9f0000
--- Process 17912 loaded C:\Windows\System32\sechost.dll at 00007ff9b94f0000
--- Process 17912 loaded C:\Windows\System32\bcrypt.dll at 00007ff9b8210000
--- Process 17912 loaded C:\Windows\System32\bcryptprimitives.dll at 00007ff9b8a90000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python3.dll at 000002db01780000
--- Process 17912 unloaded DLL at 000002db01780000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python3.dll at 000002db01780000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\DLLs\_sqlite3.pyd at 00007ff9b31c0000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\DLLs\sqlite3.dll at 00007ff923b90000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\vector0.dll at 000000055d4d0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggcc_s-seh-1.dll at 00000003ff870000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygwin1.dll at 00007ff921030000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygstdc++-6.dll at 00000003fec80000
0 0 [main] python (17912) **********************************************
206 206 [main] python (17912) Program name: c:\Users\User\AppData\Local\Programs\Python\Python311\python.exe (windows pid 17912)
165 371 [main] python (17912) OS version: Windows NT-10.0
132 503 [main] python (17912) **********************************************
--- Process 17912 loaded C:\Windows\System32\cryptbase.dll at 00007ff9b8040000
3365 3868 [main] python (17912) sigprocmask: 0 = sigprocmask (0, 0x0, 0x7FF9213093B0)
533 4401 [main] python (17912) open_shared: name shared.5, shared 0x1A4000000 (wanted 0x1A4000000), h 0x1A0, m 0, created 1
216 4617 [main] python (17912) shared_info::initialize: Installation root: <┬ג┬ה> key: <a32a5794382fce65>
167 4784 [main] python (17912) user_heap_info::init: heap base 0xA00000000, heap top 0xA00000000, heap size 0x20000000 (536870912)
169 4953 [main] python (17912) open_shared: name S-1-5-21-567552140-2017299312-2275771347-1001.1, shared 0x1A4010000 (wanted 0x1A4010000), h 0x1A4, m 1, created 1
158 5111 [main] python (17912) user_info::create: opening user shared for 'S-1-5-21-567552140-2017299312-2275771347-1001' at 0x1A4010000
171 5282 [main] python (17912) user_info::create: user shared version 0
175 5457 [main] python (17912) dll_crt0_0: finished dll_crt0_0 initialization
202 5659 [main] python (17912) time: 1693222268 = time(0x0)
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\vss0.dll at 00000005ca2f0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygblas-0.dll at 00000003f8ca0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggomp-1.dll at 00000003fe310000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyglapack-0.dll at 00000003f8160000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggfortran-5.dll at 00000003f8890000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygquadmath-0.dll at 00000003fc150000
('v0.1.2',)
28871 34530 [main] python (17912) mmap: addr 0x0, len 34319826944, prot 0x3, flags 0x22, fd -1, off 0x0
233972 268502 [main] python (17912) mmap: 0x6FF802610000 = mmap()
--- Process 17912, exception c0000005 at 00007ff921031026
--- Process 17912 thread 19168 exited with status 0xc0000005
--- Process 17912 thread 9276 exited with status 0xc0000005
--- Process 17912 thread 3052 exited with status 0xc0000005
--- Process 17912 exited with status 0xc0000005
Segmentation fault
Also, when I run the script using cygwin64 python
output
$ file /usr/bin/python3.9.exe
/usr/bin/python3.9.exe: PE32+ executable (console) x86-64, for MS Windows, 11 sections
$ /usr/bin/python3.9.exe main.py
('v0.1.2',)
(0.20000000298023224,) |
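The strace output above ends with `exited with status 0xc0000005`, right after an mmap request of 34319826944 bytes. A small sketch for reading such values; the table holds a few well-known NTSTATUS codes only, and `describe_exit_status` is a hypothetical helper name:

```python
# A few well-known NTSTATUS values; deliberately not an exhaustive table.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000139: "STATUS_ENTRYPOINT_NOT_FOUND",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}


def describe_exit_status(status: int) -> str:
    """Format an exit status as unsigned 32-bit hex plus a name if known."""
    normalized = status & 0xFFFFFFFF  # also handles signed values
    name = KNOWN_NTSTATUS.get(normalized, "unknown")
    return f"0x{normalized:08X} ({name})"


print(describe_exit_status(0xC0000005))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)

# Size of the mmap request seen just before the exception, in GiB.
print(f"{34319826944 / 2**30:.2f} GiB")  # 31.96 GiB
```

So the native Python process hit an access violation inside code that had just asked for roughly 32 GiB of address space, while the Cygwin Python ran the same script to completion.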
I tried faiss sample code
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <random>

#include <sys/time.h>

#include <faiss/IndexFlat.h>
#include <faiss/IndexIVFFlat.h>
#include <faiss/IndexPQ.h>
#include <faiss/index_io.h>

double elapsed() {
    struct timeval tv;
    gettimeofday(&tv, nullptr);
    return tv.tv_sec + tv.tv_usec * 1e-6;
}

int main() {
    double t0 = elapsed();

    // dimension of the vectors to index
    int d = 128;

    // size of the database we plan to index
    size_t nb = 1000 * 1000;

    // make a set of nt training vectors in the unit cube
    // (could be the database)
    size_t nt = 100 * 1000;

    //---------------------------------------------------------------
    // Define the core quantizer
    // We choose a multiple inverted index for faster training with less data
    // and because it usually offers best accuracy/speed trade-offs
    //
    // We here assume that its lifespan of this coarse quantizer will cover the
    // lifespan of the inverted-file quantizer IndexIVFFlat below
    // With dynamic allocation, one may give the responsibility to free the
    // quantizer to the inverted-file index (with attribute do_delete_quantizer)
    //
    // Note: a regular clustering algorithm would be defined as:
    // faiss::IndexFlatL2 coarse_quantizer (d);
    //
    // Use nhash=2 subquantizers used to define the product coarse quantizer
    // Number of bits: we will have 2^nbits_coarse centroids per subquantizer
    // meaning (2^12)^nhash distinct inverted lists
    size_t nhash = 2;
    size_t nbits_subq = int(log2(nb + 1) / 2);     // good choice in general
    size_t ncentroids = 1 << (nhash * nbits_subq); // total # of centroids

    faiss::MultiIndexQuantizer coarse_quantizer(d, nhash, nbits_subq);

    printf("IMI (%ld,%ld): %ld virtual centroids (target: %ld base vectors)",
           nhash,
           nbits_subq,
           ncentroids,
           nb);

    // the coarse quantizer should not be dealloced before the index
    // 4 = nb of bytes per code (d must be a multiple of this)
    // 8 = nb of bits per sub-code (almost always 8)
    faiss::MetricType metric = faiss::METRIC_L2; // can be METRIC_INNER_PRODUCT
    faiss::IndexIVFFlat index(&coarse_quantizer, d, ncentroids, metric);
    index.quantizer_trains_alone = true;

    // define the number of probes. 2048 is for high-dim, overkilled in practice
    // Use 4-1024 depending on the trade-off speed accuracy that you want
    index.nprobe = 2048;

    std::mt19937 rng;
    std::uniform_real_distribution<> distrib;

    { // training
        printf("[%.3f s] Generating %ld vectors in %dD for training\n",
               elapsed() - t0,
               nt,
               d);

        std::vector<float> trainvecs(nt * d);
        for (size_t i = 0; i < nt * d; i++) {
            trainvecs[i] = distrib(rng);
        }

        printf("[%.3f s] Training the index\n", elapsed() - t0);
        index.verbose = true;
        index.train(nt, trainvecs.data());
    }

    size_t nq;
    std::vector<float> queries;

    { // populating the database
        printf("[%.3f s] Building a dataset of %ld vectors to index\n",
               elapsed() - t0,
               nb);

        std::vector<float> database(nb * d);
        for (size_t i = 0; i < nb * d; i++) {
            database[i] = distrib(rng);
        }

        printf("[%.3f s] Adding the vectors to the index\n", elapsed() - t0);

        index.add(nb, database.data());

        // remember a few elements from the database as queries
        int i0 = 1234;
        int i1 = 1244;

        nq = i1 - i0;
        queries.resize(nq * d);
        for (int i = i0; i < i1; i++) {
            for (int j = 0; j < d; j++) {
                queries[(i - i0) * d + j] = database[i * d + j];
            }
        }
    }

    { // searching the database
        int k = 5;
        printf("[%.3f s] Searching the %d nearest neighbors "
               "of %ld vectors in the index\n",
               elapsed() - t0,
               k,
               nq);

        std::vector<faiss::idx_t> nns(k * nq);
        std::vector<float> dis(k * nq);

        index.search(nq, queries.data(), k, dis.data(), nns.data());

        printf("[%.3f s] Query results (vector ids, then distances):\n",
               elapsed() - t0);

        for (int i = 0; i < nq; i++) {
            printf("query %2d: ", i);
            for (int j = 0; j < k; j++) {
                printf("%7ld ", nns[j + i * k]);
            }
            printf("\n dis: ");
            for (int j = 0; j < k; j++) {
                printf("%7g ", dis[j + i * k]);
            }
            printf("\n");
        }
    }
    return 0;
}
Compile output
$ g++ main.cpp -I./sqlite-vss/vendor/faiss ./sqlite-vss/build_release/vendor/faiss/faiss/libfaiss.a -fopenmp -lblas -llapack
$ ls
a.exe main.cpp sqlite-vss
Program output
$ ./a.exe
IMI (2,9): 262144 virtual centroids (target: 1000000 base vectors)[0.005 s] Generating 100000 vectors in 128D for training
[0.648 s] Training the index
Training level-1 quantizer
IVF quantizer trains alone...
Training IVF residual
IndexIVF: no residual training
[8.816 s] Building a dataset of 1000000 vectors to index
[15.241 s] Adding the vectors to the index
MultiIndexQuantizer::search: 0:32768 / 1000000
MultiIndexQuantizer::search: 32768:65536 / 1000000
MultiIndexQuantizer::search: 65536:98304 / 1000000
MultiIndexQuantizer::search: 98304:131072 / 1000000
MultiIndexQuantizer::search: 131072:163840 / 1000000
MultiIndexQuantizer::search: 163840:196608 / 1000000
MultiIndexQuantizer::search: 196608:229376 / 1000000
MultiIndexQuantizer::search: 229376:262144 / 1000000
MultiIndexQuantizer::search: 262144:294912 / 1000000
MultiIndexQuantizer::search: 294912:327680 / 1000000
MultiIndexQuantizer::search: 327680:360448 / 1000000
MultiIndexQuantizer::search: 360448:393216 / 1000000
MultiIndexQuantizer::search: 393216:425984 / 1000000
MultiIndexQuantizer::search: 425984:458752 / 1000000
MultiIndexQuantizer::search: 458752:491520 / 1000000
MultiIndexQuantizer::search: 491520:524288 / 1000000
MultiIndexQuantizer::search: 524288:557056 / 1000000
MultiIndexQuantizer::search: 557056:589824 / 1000000
MultiIndexQuantizer::search: 589824:622592 / 1000000
MultiIndexQuantizer::search: 622592:655360 / 1000000
MultiIndexQuantizer::search: 655360:688128 / 1000000
MultiIndexQuantizer::search: 688128:720896 / 1000000
MultiIndexQuantizer::search: 720896:753664 / 1000000
MultiIndexQuantizer::search: 753664:786432 / 1000000
MultiIndexQuantizer::search: 786432:819200 / 1000000
MultiIndexQuantizer::search: 819200:851968 / 1000000
MultiIndexQuantizer::search: 851968:884736 / 1000000
MultiIndexQuantizer::search: 884736:917504 / 1000000
MultiIndexQuantizer::search: 917504:950272 / 1000000
MultiIndexQuantizer::search: 950272:983040 / 1000000
MultiIndexQuantizer::search: 983040:1000000 / 1000000
IndexIVFFlat::add_core: added 1000000 / 1000000 vectors
[20.667 s] Searching the 5 nearest neighbors of 10 vectors in the index
[20.684 s] Query results (vector ids, then distances):
query 0: 1234 65776 815632 518751 168411
dis: 0 13.2041 13.7313 13.9331 13.9852
query 1: 1235 235209 32981 339156 485140
dis: 0 12.5675 13.2757 13.3526 13.3626
query 2: 1236 46384 393794 279123 803578
dis: 0 13.2079 13.337 13.5685 13.5999
query 3: 1237 172600 435871 490284 116815
dis: 0 12.9845 13.4125 13.4894 13.5741
query 4: 1238 185348 630264 685103 672356
dis: 0 11.3711 12.2562 12.2871 12.2897
query 5: 1239 820990 306204 3096 549432
dis: 0 12.4804 13.2535 13.5721 13.5853
query 6: 1240 122701 687644 802575 350632
dis: 0 13.7758 13.9611 14.1327 14.2155
query 7: 1241 985126 686744 336958 926803
dis: 0 13.2923 13.636 13.7428 14.0614
query 8: 1242 880999 488401 181311 712631
dis: 0 13.1505 13.4343 13.488 13.6331
query 9: 1243 829029 233144 108428 402759
dis: 0 12.5892 12.7777 12.8653 13.1447
$ echo $?
0 |
So to summarize, on your Windows machine using cygwin64:
Is that right? |
Everything correct, except for the last one - |
Cool - so is there anything actionable you'd like from this issue then? My guess is that since I'll probably try compiling sqlite-vss with cygwin64 on a GitHub Actions runner, but it's been very difficult in the past |
Currently I want to figure out why I get It will not be usable if we can't use it with regular |
I managed to compile |
Hello, Thanks in advance, |
Here's a way that worked for me, based on @thewh1teagle's approach. I haven't tested it in depth, but the sqlite3 cli can load the extensions and
# see https://github.com/facebookresearch/faiss/issues/3067#issuecomment-1873007384
pacman --needed -S $MINGW_PACKAGE_PREFIX-{toolchain,cmake,make,swig,autotools,lapack} git
git clone https://github.com/asg017/sqlite-vss.git && cd sqlite-vss
# see https://github.com/asg017/sqlite-vss/blob/main/docs.md
./vendor/get_sqlite.sh
cd vendor/sqlite
./configure && make
cd ../../
diff --git a/faiss/CMakeLists.txt b/faiss/CMakeLists.txt
index 16eb9e9c..940ba03f 100644
--- a/faiss/CMakeLists.txt
+++ b/faiss/CMakeLists.txt
@@ -214,8 +214,8 @@ add_library(faiss_avx2 ${FAISS_SRC})
if(NOT FAISS_OPT_LEVEL STREQUAL "avx2")
set_target_properties(faiss_avx2 PROPERTIES EXCLUDE_FROM_ALL TRUE)
endif()
-if(NOT WIN32)
- target_compile_options(faiss_avx2 PRIVATE $<$<COMPILE_LANGUAGE:CXX>:-mavx2 -mfma -mf16c -mpopcnt>)
+if(NOT MSVC)
+ target_compile_options(faiss_avx2 PRIVATE $<$<COMPILE_LANGUAGE:CXX>:-mavx2 -mfma -mf16c -mpopcnt -fpermissive>)
else()
# MSVC enables FMA with /arch:AVX2; no separate flags for F16C, POPCNT
# Ref. FMA (under /arch:AVX2): https://docs.microsoft.com/en-us/cpp/build/reference/arch-x64
diff --git a/faiss/impl/platform_macros.h b/faiss/impl/platform_macros.h
index 9cec8260..44293e3e 100644
--- a/faiss/impl/platform_macros.h
+++ b/faiss/impl/platform_macros.h
@@ -83,6 +83,17 @@ inline int __builtin_clzll(uint64_t x) {
#endif
#else
+
+/*******************************************************
+ * Windows MinGW
+ *******************************************************/
+#ifdef _WIN32
+
+#define posix_memalign(p, a, s) \
+ (((*(p)) = _aligned_malloc((s), (a))), *(p) ? 0 : errno)
+#endif
+
+
/*******************************************************
* Linux and OSX
*******************************************************/
diff --git a/faiss/invlists/InvertedListsIOHook.cpp b/faiss/invlists/InvertedListsIOHook.cpp
index 0081c4f9..2c3a6006 100644
--- a/faiss/invlists/InvertedListsIOHook.cpp
+++ b/faiss/invlists/InvertedListsIOHook.cpp
@@ -13,9 +13,9 @@
#include <faiss/invlists/BlockInvertedLists.h>
-#ifndef _MSC_VER
+#ifndef _WIN32
#include <faiss/invlists/OnDiskInvertedLists.h>
-#endif // !_MSC_VER
+#endif // !_WIN32
namespace faiss {
@@ -33,7 +33,7 @@ namespace {
/// std::vector that deletes its contents
struct IOHookTable : std::vector<InvertedListsIOHook*> {
IOHookTable() {
-#ifndef _MSC_VER
+#ifndef _WIN32
push_back(new OnDiskInvertedListsIOHook());
#endif
push_back(new BlockInvertedListsIOHook());
cmake -B build-release . -G "MinGW Makefiles" -DCMAKE_BUILD_TYPE=Release
cmake --build build-release -- -j<number of cores here>
# copy dlls for use outside of MSYS2
cp /ucrt64/bin/{libgcc_s_seh-1.dll,libwinpthread-1.dll,libblas.dll,libgomp-1.dll,liblapack.dll,libgfortran-5.dll,libquadmath-0.dll,libstdc++-6.dll} ./build-release/
|
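The last step above copies a set of runtime DLLs next to the built extension. A minimal sketch for confirming they are all present before trying to load anything; the file names come from the `cp` command above, while `missing_runtime_dlls` and the `build-release` default are illustrative assumptions:

```python
from pathlib import Path

# Runtime DLLs named in the cp command above.
RUNTIME_DLLS = [
    "libgcc_s_seh-1.dll",
    "libwinpthread-1.dll",
    "libblas.dll",
    "libgomp-1.dll",
    "liblapack.dll",
    "libgfortran-5.dll",
    "libquadmath-0.dll",
    "libstdc++-6.dll",
]


def missing_runtime_dlls(directory) -> list[str]:
    """Return the expected runtime DLLs that are not files in `directory`."""
    folder = Path(directory)
    return [name for name in RUNTIME_DLLS if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_runtime_dlls("build-release")
    print("missing:", ", ".join(missing) if missing else "none")
```

This only checks that the files exist; it says nothing about whether they match the toolchain the extension was built with.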
Hi, @ma-chengyuan could you please share the dll file? |
@ma-chengyuan hi, I followed your instructions. The DLLs work only with the sqlite tool. The built-in version sqlite3 (installed using the .msi file on Windows) and node sqlite3 cannot load that DLL as an extension. |
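When a host other than the sqlite CLI refuses to load the DLL, the error text is usually the first thing worth collecting. A hedged sketch using Python's standard `sqlite3` module as one such host; `try_load_extension` is a hypothetical helper, and the path in the example is a placeholder:

```python
import sqlite3


def try_load_extension(path: str):
    """Return None on success, otherwise a short description of the failure."""
    conn = sqlite3.connect(":memory:")
    try:
        # Some Python builds ship sqlite3 without extension loading support.
        if not hasattr(conn, "enable_load_extension"):
            return "this sqlite3 module was built without extension loading"
        conn.enable_load_extension(True)
        try:
            conn.load_extension(path)
        except sqlite3.OperationalError as exc:
            return f"load failed: {exc}"
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    print("SQLite library version:", sqlite3.sqlite_version)
    # Placeholder path; point this at the built vector0/vss0 DLL.
    print(try_load_extension("./vss0"))
```

Reporting the SQLite library version together with the exact message makes it easier to compare hosts that load the extension with hosts that do not.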
Hi
I built sqlite-vss on Windows 11 x64 using cygwin64
I followed the instructions in #building-sqlite-vss-yourself
And as shown, I got vector0.dll and vss0.dll
Then I placed the dll files in the same folder of sqlite-vss/vendor/sqlite/.libs which has the compiled sqlite.exe
and tried to load the extension into sqlite
Looks like it loads vector0.dll successfully, but it fails to load vss0.dll
ldd output