* Add new plugin : Genkit HNSW
* chore: fix npm ci and sync package lock
* fix glob dependancies conflict and update code
* fix glob types dependancies
* sync package lock
* update lockfile, standardize plugin to existent ones
* update lockfile
* doc: remove coverage docs
* remove LICENSE and update tsconfig.json
* add source file headers
* try to fix build error
* revert tsconfig change
* add logo and update readme

---------

Co-authored-by: EPMatt <30753195+EPMatt@users.noreply.github.com>
Co-authored-by: David Alonso <davidoort@hotmail.com>
1 parent 8bc8aa9, commit 1982eee
Showing 28 changed files with 5,862 additions and 4,416 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,3 +3,6 @@ node_modules/ | |
lib/ | ||
.env | ||
.vscode/launch.json | ||
plugins/.DS_Store | ||
lerna-debug.log | ||
.DS_Store |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# Contributions needed | ||
- Implement Genkit dot prompt | ||
- Implement new Genkit generate function to pass HNSW context to new context parameter | ||
- Implement another model for more model options | ||
- Implement multimodal | ||
- Implement leaderboard or ranking system | ||
- Increase validations | ||
- Implement more Genkit functionality of the latest release | ||
- Catch any bugs | ||
|
||
|
||
# How to contribute | ||
Learn more about how to contribute [here](https://github.com/TheFireCo/genkit-plugins/blob/main/CONTRIBUTING.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,146 @@ | ||
![Firebase Genkit + HNSW](https://github.com/TheFireCo/genkit-plugins/blob/main/assets/genkit-hnsw.png?raw=true) | ||
|
||
<h1 align="center"> Firebase Genkit <> HNSW Vector Plugin</h1> | ||
|
||
<h4 align="center">HNSW Community Plugin for Google Firebase Genkit </h4> | ||
|
||
<div align="center"> | ||
<img alt="Github lerna version" src="https://img.shields.io/github/lerna-json/v/TheFireCo/genkit-plugins?label=version"> | ||
<img alt="NPM Downloads" src="https://img.shields.io/npm/dw/genkitx-hnsw"> | ||
<img alt="GitHub Org's stars" src="https://img.shields.io/github/stars/TheFireCo?style=social"> | ||
<img alt="GitHub License" src="https://img.shields.io/github/license/TheFireCo/genkit-plugins"> | ||
<img alt="Static Badge" src="https://img.shields.io/badge/yes-a?label=maintained"> | ||
</div> | ||
|
||
<div align="center"> | ||
<img alt="GitHub Issues or Pull Requests" src="https://img.shields.io/github/issues/TheFireCo/genkit-plugins?color=blue"> | ||
<img alt="GitHub Issues or Pull Requests" src="https://img.shields.io/github/issues-pr/TheFireCo/genkit-plugins?color=blue"> | ||
<img alt="GitHub commit activity" src="https://img.shields.io/github/commit-activity/m/TheFireCo/genkit-plugins"> | ||
</div> | ||
|
||
**`genkitx-hnsw`** is a community plugin for using HNSW Vector Store with | ||
[Firebase GenKit](https://github.com/firebase/genkit). Built by [**The Fire Company**](https://github.com/TheFireCo). 🔥 | ||
|
||
## Installation | ||
|
||
Install the plugin in your project with your favorite package manager: | ||
|
||
- `npm install genkitx-hnsw` | ||
- `yarn add genkitx-hnsw` | ||
- `pnpm add genkitx-hnsw` | ||
|
||
## Usage | ||
|
||
|
||
## Usage HNSW Indexer plugin | ||
This Genkit plugin flow saves your data into an HNSW vector store, using the Gemini embedder and the Gemini LLM. | ||
|
||
#### Data preparations | ||
Prepare your data or documents in a folder. | ||
![Restaurants data](https://github.com/TheFireCo/genkit-plugins/blob/main/plugins/hnsw/assets/restaurants-data.png?raw=true) | ||
|
||
#### Register HNSW Indexer Plugin | ||
Import the plugin into your Genkit project | ||
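```ts | ||
// configureGenkit ships with @genkit-ai/core, a peer dependency of this plugin | ||
import { configureGenkit } from "@genkit-ai/core"; | ||
import { hnswIndexer } from "genkitx-hnsw"; | ||
|
||
export default configureGenkit({ | ||
  plugins: [ | ||
    // Replace the placeholder with your Google AI API key | ||
    hnswIndexer({ apiKey: "GOOGLE_API_KEY" }) | ||
  ] | ||
}); | ||
``` | ||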
#### Genkit UI HNSW Indexer flow running | ||
Open the Genkit UI and choose the registered flow `HNSW Indexer`. | ||
Run the flow with the two required parameters: | ||
- `dataPath`: path to the data and documents the AI should learn from | ||
- `indexOutputPath`: path where the vector store index built from that data is written | ||
![Genkit UI HNSW Indexer Flow](https://github.com/TheFireCo/genkit-plugins/blob/main/plugins/hnsw/assets/hnsw-indexer-flow.png?raw=true) | ||
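As a rough illustration of the two flow inputs described above, the sketch below models them as a TypeScript interface and checks that both paths are non-empty before indexing starts. `IndexerFlowInput` and `validateIndexerInput` are hypothetical names introduced here for illustration; the plugin's own validation may look different.

```typescript
// Assumed shape of the indexer flow input, based on the two parameters above.
interface IndexerFlowInput {
  dataPath: string;
  indexOutputPath: string;
}

// Hypothetical validator: returns a list of problems, empty when the input is usable.
function validateIndexerInput(input: Partial<IndexerFlowInput>): string[] {
  const problems: string[] = [];
  if (!input.dataPath || input.dataPath.trim() === "") {
    problems.push("dataPath must be a non-empty path");
  }
  if (!input.indexOutputPath || input.indexOutputPath.trim() === "") {
    problems.push("indexOutputPath must be a non-empty path");
  }
  return problems;
}
```

Returning a list of problems rather than throwing lets a caller report every missing field in one pass.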
#### Vector Store Index Result | ||
![HNSW Vector](https://github.com/retzd-tech/TheFireCo/genkit-plugins/blob/main/plugins/hnsw/assets/hnsw-indexer-result.png?raw=true) | ||
The vector store is saved to the output path you defined. The HNSW Retriever plugin uses this index during prompt generation, so you can continue the implementation with the HNSW Retriever plugin. | ||
### Optional Parameter | ||
- `chunkSize: number` | ||
How much data is processed at a time. Breaking a large input into smaller pieces makes it more manageable, and the chunk size decides how much information the AI handles in one go, which can affect both the speed and the accuracy of indexing. | ||
`default value : 12720` | ||
- `separator: string` | ||
The symbol or character used to separate pieces of information in the input data while the vector index is created. It tells the AI where one unit of data ends and the next begins, so it can process and learn from the data more effectively. | ||
`default value : "\n"` | ||
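To make the interaction between `chunkSize` and `separator` concrete, here is a minimal sketch of separator-based chunking. `splitIntoChunks` is a hypothetical helper written for this illustration and measures size in characters, which is an assumption; it is not the plugin's actual splitting code.

```typescript
// Hypothetical helper: splits text on `separator`, then packs the pieces
// into chunks of at most `chunkSize` characters (assumed unit).
// A single piece longer than chunkSize is kept whole rather than cut.
function splitIntoChunks(
  text: string,
  chunkSize: number = 12720,
  separator: string = "\n"
): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(separator)) {
    const candidate = current === "" ? piece : current + separator + piece;
    if (current === "" || candidate.length <= chunkSize) {
      // The piece still fits (or starts a new chunk), so keep accumulating.
      current = candidate;
    } else {
      // Adding the piece would overflow: close the chunk and start a new one.
      chunks.push(current);
      current = piece;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}
```

In this sketch a smaller `chunkSize` yields more, shorter chunks, and the separator decides where a chunk is allowed to end.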
## Usage HNSW Retriever plugin | ||
This Genkit plugin flow processes your prompt with the Gemini LLM, enriched with the specific information or knowledge stored in the HNSW vector database you provide. With this plugin, you get an LLM response that draws on that additional context. | ||
#### Register HNSW Retriever Plugin | ||
Import the plugin into your Genkit project | ||
```ts | ||
import { configureGenkit } from "@genkit-ai/core"; | ||
import { googleAI } from "@genkit-ai/googleai"; | ||
import { hnswRetriever } from "genkitx-hnsw"; | ||
|
||
export default configureGenkit({ | ||
  plugins: [ | ||
    googleAI(), | ||
    // Replace the placeholder with your Google AI API key | ||
    hnswRetriever({ apiKey: "GOOGLE_API_KEY" }) | ||
  ] | ||
}); | ||
``` | ||
Make sure you also register the GoogleAI plugin, which provides the Gemini LLM. This plugin currently supports only Gemini; more models are planned. | ||
#### Genkit UI HNSW Retriever flow running | ||
Open the Genkit UI and choose the registered flow `HNSW Retriever`. | ||
Run the flow with the required parameters: | ||
- `prompt`: your prompt. The answer is enriched with context from the vector index you provide. | ||
- `indexPath`: path to the folder containing the vector index to use as the knowledge reference. This is the output path you gave the HNSW Indexer plugin. | ||
In this example, let's ask about the price list of a restaurant in Surabaya city, which is included in the vector index. | ||
Type the prompt and run the flow. When it finishes, you get a response enriched with specific knowledge from your vector index. | ||
![Genkit UI Prompt Result](https://github.com/TheFireCo/genkit-plugins/blob/main/plugins/hnsw/assets/hnsw-retriever-flow.png?raw=true) | ||
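Conceptually, the retrieval step finds the stored chunks whose embeddings are closest to the prompt's embedding and passes them to the LLM as context. The sketch below shows that idea with an exact, brute-force cosine-similarity search; an HNSW index serves the same purpose approximately, without scanning every vector. All names (`IndexedChunk`, `retrieveTopK`, `buildPrompt`), the toy embeddings, and the prompt layout are assumptions made for illustration, not the plugin's implementation.

```typescript
// Hypothetical record: a text chunk plus its embedding vector.
type IndexedChunk = { text: string; embedding: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force retrieval: score every chunk, keep the k most similar.
function retrieveTopK(
  queryEmbedding: number[],
  chunks: IndexedChunk[],
  k: number
): IndexedChunk[] {
  return chunks
    .map((chunk) => ({
      chunk,
      score: cosineSimilarity(queryEmbedding, chunk.embedding),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}

// Assumed prompt layout: retrieved context first, then the question.
function buildPrompt(question: string, context: IndexedChunk[]): string {
  const contextText = context.map((c) => `- ${c.text}`).join("\n");
  return `Answer using this context:\n${contextText}\n\nQuestion: ${question}`;
}
```

The brute-force scan is linear in the number of chunks; avoiding that cost on large collections is the reason to use an index such as HNSW.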
### Optional Parameter | ||
- `temperature: number` | ||
temperature controls the randomness of the generated output. Lower temperatures result in more deterministic output, with the model selecting the most likely token at each step. Higher temperatures increase the randomness, allowing the model to explore less probable tokens, potentially generating more creative but less coherent text. | ||
`default value : 0.1` | ||
- `maxOutputTokens: number` | ||
This parameter specifies the maximum number of tokens (words or subwords) the model should generate in a single response. It helps control the length of the generated text. | ||
`default value : 500` | ||
- `topK: number` | ||
Top-K sampling restricts the model's choices to the top K most likely tokens at each step. This helps prevent the model from considering overly rare or unlikely tokens, improving the coherence of the generated text. | ||
`default value : 1` | ||
- `topP: number` | ||
Top-P sampling, also known as nucleus sampling, considers the cumulative probability distribution of tokens and selects the smallest set of tokens whose cumulative probability exceeds a predefined threshold (often denoted as P). This allows for dynamic selection of the number of tokens considered at each step, depending on the likelihood of the tokens. | ||
`default value : 0` | ||
- `stopSequences: string[]` | ||
These are sequences of tokens that, when generated, signal the model to stop generating text. This can be useful for controlling the length or content of the generated output, such as ensuring the model stops generating after reaching the end of a sentence or paragraph. | ||
`default value : []` | ||
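The sampling parameters above can be easier to follow with a toy example. The sketch below applies `temperature` to a made-up set of logits and then narrows the candidates with `topK` and `topP`. The function names and the logit values are hypothetical, and real model providers may combine these filters differently.

```typescript
type TokenProb = { token: string; prob: number };

// Temperature-scaled softmax over toy logits, sorted by descending probability.
function applyTemperature(
  logits: Record<string, number>,
  temperature: number
): TokenProb[] {
  const t = Math.max(temperature, 1e-6); // avoid division by zero
  const entries = Object.entries(logits);
  const maxLogit = Math.max(...entries.map(([, logit]) => logit));
  const weights = entries.map(([token, logit]) => ({
    token,
    prob: Math.exp((logit - maxLogit) / t),
  }));
  const total = weights.reduce((sum, w) => sum + w.prob, 0);
  return weights
    .map((w) => ({ token: w.token, prob: w.prob / total }))
    .sort((a, b) => b.prob - a.prob);
}

// Keep at most topK candidates, stopping early once their cumulative
// probability reaches topP (nucleus sampling).
function filterCandidates(
  sorted: TokenProb[],
  topK: number,
  topP: number
): TokenProb[] {
  const kept: TokenProb[] = [];
  let cumulative = 0;
  for (const candidate of sorted.slice(0, Math.max(1, topK))) {
    kept.push(candidate);
    cumulative += candidate.prob;
    if (cumulative >= topP) break;
  }
  return kept;
}
```

In this sketch, the defaults listed above (`topK` of 1, `topP` of 0) leave a single candidate, so the choice is effectively greedy; raising them widens the pool that the final random draw picks from.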
## Contributing | ||
Want to contribute to the project? That's awesome! Head over to our [Contribution Guidelines](https://github.com/TheFireCo/genkit-plugins/blob/main/CONTRIBUTING.md). | ||
## Need support? | ||
> \[!NOTE\]\ | ||
> This repository depends on Google's Firebase Genkit. For issues and questions related to GenKit, please refer to instructions available in [GenKit's repository](https://github.com/firebase/genkit). | ||
Reach out by opening a discussion on [GitHub Discussions](https://github.com/TheFireCo/genkit-plugins/discussions). | ||
## Credits | ||
This plugin is proudly maintained by the team at [**The Fire Company**](https://github.com/TheFireCo). 🔥 | ||
## License | ||
This project is licensed under the [Apache 2.0 License](https://github.com/TheFireCo/genkit-plugins/blob/main/LICENSE). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
/** @type {import('ts-jest').JestConfigWithTsJest} */ | ||
module.exports = { | ||
preset: 'ts-jest', | ||
testEnvironment: 'node', | ||
testPathIgnorePatterns: [ | ||
"/node_modules/", | ||
"/lib/", | ||
], | ||
}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,75 @@ | ||
{ | ||
"name": "genkitx-hnsw", | ||
"description": "Firebase Genkit AI framework plugin for HNSW vector database. Get AI response enriched with additional context and knowledge with HNSW Vector Database using RAG Implementation", | ||
"keywords": [ | ||
"genkit-plugin", | ||
"genkit-indexer", | ||
"genkit-retriever", | ||
"genkit-embedder", | ||
"genkit-flow", | ||
"genkit-rag", | ||
"hnsw", | ||
"vector", | ||
"rag", | ||
"ai", | ||
"generative-ai", | ||
"genai" | ||
], | ||
"version": "0.1.5-rc.1", | ||
"type": "commonjs", | ||
"repository": { | ||
"type": "git", | ||
"url": "git+https://github.com/TheFireCo/genkit-plugins.git", | ||
"directory": "plugins/hnsw" | ||
}, | ||
"author": "TheFireCo", | ||
"license": "Apache-2.0", | ||
"dependencies": { | ||
"@genkit-ai/flow": "^0.5.0", | ||
"@genkit-ai/googleai": "^0.5.0", | ||
"@langchain/google-genai": "^0.0.11", | ||
"@types/node": "^20.12.12", | ||
"@types/jest": "^29.5.12", | ||
"fs": "^0.0.1-security", | ||
"glob": "^10.4.1", | ||
"hnswlib-node": "^1.3.0", | ||
"langchain": "^0.0.11", | ||
"node-fetch": "^3.2.6", | ||
"redis": "^4.6.13", | ||
"zod": "^3.22.4" | ||
}, | ||
"peerDependencies": { | ||
"@genkit-ai/ai": "^0.5.0", | ||
"@genkit-ai/core": "^0.5.0" | ||
}, | ||
"devDependencies": { | ||
"jest": "^29.7.0", | ||
"ts-jest": "^29.1.2", | ||
"typescript": "^5.4.5" | ||
}, | ||
"main": "lib/index.js", | ||
"types": "lib/index.d.ts", | ||
"exports": { | ||
".": { | ||
"types": "./lib/index.d.ts", | ||
"import": "./lib/index.mjs", | ||
"require": "./lib/index.js", | ||
"default": "./lib/index.js" | ||
} | ||
}, | ||
"files": [ | ||
"lib" | ||
], | ||
"publishConfig": { | ||
"provenance": true, | ||
"access": "public" | ||
}, | ||
"scripts": { | ||
"check": "tsc", | ||
"compile": "tsup-node", | ||
"build:clean": "rm -rf ./lib", | ||
"build:watch": "tsup-node --watch", | ||
"test": "jest --coverage", | ||
"build": "npm run test && npm run build:clean && npm run compile" | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
/** | ||
* Copyright 2024 The Fire Company | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
const { saveVectorIndexer } = require('../indexer'); | ||
|
||
jest.mock('../indexer', () => ({ | ||
saveVectorIndexer: jest.fn(), | ||
})); | ||
|
||
describe('hnswIndexerAction on HNSW Indexer', () => { | ||
beforeEach(() => { | ||
jest.clearAllMocks(); | ||
jest.resetAllMocks(); | ||
}); | ||
|
||
it('should successfully save vector indexer data when provided valid options', async () => { | ||
const mockFlowOptions = { | ||
dataPath: 'valid/path', | ||
indexOutputPath: 'output/path', | ||
}; | ||
const mockPluginOptions = { apiKey: 'valid-api-key' }; | ||
saveVectorIndexer.mockResolvedValue('Indexing completed'); | ||
const { hnswIndexerAction } = require('./index'); | ||
const result = await hnswIndexerAction(mockFlowOptions, mockPluginOptions); | ||
expect(saveVectorIndexer).toHaveBeenCalledWith( | ||
mockFlowOptions, | ||
mockPluginOptions | ||
); | ||
expect(result).toEqual('Indexing completed'); | ||
}); | ||
|
||
it('should handle errors when dataPath is missing or invalid', async () => { | ||
const mockFlowOptions = { dataPath: '', indexOutputPath: 'output/path' }; | ||
const mockPluginOptions = { apiKey: 'valid-api-key' }; | ||
const error = new Error('Invalid data path'); | ||
saveVectorIndexer.mockRejectedValue(error); | ||
const { hnswIndexerAction } = require('./index'); | ||
const result = await hnswIndexerAction(mockFlowOptions, mockPluginOptions); | ||
expect(saveVectorIndexer).toHaveBeenCalledWith( | ||
mockFlowOptions, | ||
mockPluginOptions | ||
); | ||
expect(result).toEqual(`Vector index saving error, ${error}`); | ||
}); | ||
}); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
/** | ||
* Copyright 2024 The Fire Company | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
import { saveVectorIndexer } from './../indexer'; | ||
import { retrieveResponseWithVector } from './../retriever'; | ||
import { | ||
RetrieverFlowOptions, | ||
PluginOptions, | ||
IndexerFlowOptions, | ||
} from './../interfaces'; | ||
|
||
export const hnswIndexerAction = async ( | ||
flowOptions: IndexerFlowOptions, | ||
pluginOptions: PluginOptions | ||
) => { | ||
try { | ||
return await saveVectorIndexer(flowOptions, pluginOptions); | ||
} catch (error) { | ||
return `Vector index saving error, ${error}`; | ||
} | ||
}; | ||
|
||
export const hnswRetrieverAction = async ( | ||
flowOptions: RetrieverFlowOptions, | ||
pluginOptions: PluginOptions | ||
) => { | ||
try { | ||
return await retrieveResponseWithVector(flowOptions, pluginOptions); | ||
} catch (error) { | ||
return `Error generating prompt response, ${error}`; | ||
} | ||
}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
/** | ||
* Copyright 2024 The Fire Company | ||
* | ||
* Licensed under the Apache License, Version 2.0 (the "License"); | ||
* you may not use this file except in compliance with the License. | ||
* You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 | ||
* | ||
* Unless required by applicable law or agreed to in writing, software | ||
* distributed under the License is distributed on an "AS IS" BASIS, | ||
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
* See the License for the specific language governing permissions and | ||
* limitations under the License. | ||
*/ | ||
|
||
// Verify that indexerFlowConfig correctly uses FLOW_NAME_INDEXER for its name | ||
it('should use HNSW Indexer as the name for indexerFlowConfig', () => { | ||
const { indexerFlowConfig } = require('./index.ts'); | ||
expect(indexerFlowConfig.name).toEqual('HNSW Indexer'); | ||
}); | ||
|
||
it('should use HNSW Retriever as the name for retrieverflowConfig', () => { | ||
const { retrieverflowConfig } = require('./index.ts'); | ||
expect(retrieverflowConfig.name).toEqual('HNSW Retriever'); | ||
}); |