Backend repository and playground for experimental datagovmy AI/ML services:
- 👨💻 Open API Documentation Assistant (See in action)
- 📈 MyDataGPT Assistant (Coming Soon)
- Install pyenv and then use it to install the Python version in
.python-version
. - Create virtual environment in root directory with
python -m venv env
- Activate virtual environment and install all dependencies from
requirements.txt
. - Create your own
.env
file from.env.example
. - Run
docker-compose up
. - Interact with the chat endpoint at
/chat/stream
This project has the following dependencies:
Click on the link of the respective projects to find out how to set them up for your environment.
Built to assist developers in getting started with using the data.gov.my open API. The docs assistant is a Retrieval Augmented Generation (RAG) application powered by OpenAI's gpt-4o
(Updated from gpt-3.5-turbo
in Oct 2024) model. Its data pipeline indexes .mdx
files from the API documentation in the datagovmy-front repository and stores embeddings in a Weaviate vectorstore for retrieval.
This assistant is part of a bigger effort to build a one-stop data assistant for the nation's open data designed to eventually answer data queries and show insights on all data released on data.gov.my. Similar to the docs assistant, it is also an RAG application that leverages a Weaviate vector index loaded with metadata. This is currently under development. Stay tuned!
- Bahasa Malaysia (BM) language support - in our testing with BM queries, OpenAI models tend to lean towards responses that sound more like Bahasa Indonesia despite our best efforts in prompting. YMMV, but more work to be done here!
Full understanding of the Data Catalogue API fields (coming soon)
This is an experimental product that utilizes the OpenAI API. It is provided for testing and educational purposes only. The government and its representatives make no warranties or guarantees regarding the accuracy, completeness, or suitability of the information provided by this product.
Thank you for your willingness to contribute to this free and open source project by the Malaysian public sector! When contributing, consider first discussing your desired change with the core team via GitHub issues or discussions!
data.gov.my is licensed under MIT
Copyright © 2024 Government of Malaysia