This project creates a visual map of Apache projects and allows filtering based on user queries.
- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/apache-projects-visualizer.git
  cd apache-projects-visualizer
  ```
- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
- Create a `.env` file in the root directory and add your configuration:

  ```
  LLM_PROVIDER=openai                       # or 'local'
  OPENAI_API_KEY=your_api_key_here
  OPENAI_MODEL=gpt-4o                       # or another OpenAI model
  LOCAL_MODEL_NAME=your_local_model_name    # if using a local LLM
  HUGGINGFACE_TOKEN=your_huggingface_token  # if using Hugging Face models
  ```
- Getting your OpenAI API key:
  - Go to https://platform.openai.com/signup and sign up for an account if you don't have one.
  - After logging in, navigate to https://platform.openai.com/account/api-keys.
  - OpenAI will suggest creating a project-scoped API key, but for the moment this project only works with the older secret API key.
  - Click on "Create new secret key".
  - Copy the generated key (you won't be able to see it again).
  - Paste this key as the value for `OPENAI_API_KEY` in your `.env` file (a short verification sketch follows these setup steps).
- Run the initial data collection script:

  ```bash
  python src/data_collector.py --collect
  ```
- (Optional) If using a local LLM, train it using the collected data:

  ```bash
  python src/fine_tune_model.py
  ```
- Run the enhanced data collection using the configured LLM:

  ```bash
  python src/data_collector.py --enhance
  ```
- Start the Flask server:

  ```bash
  python src/app.py
  ```
- Open `http://127.0.0.1:5000` in a web browser.
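Before running the collection scripts, you can quickly confirm that the key in your `.env` file works. The snippet below is a minimal sketch, assuming the `python-dotenv` and `openai` packages are available (check `requirements.txt`); the file name `check_openai_key.py` is purely illustrative.

```python
# check_openai_key.py -- illustrative helper, not part of the repository.
import os

from dotenv import load_dotenv  # python-dotenv package
from openai import OpenAI       # openai package

# Load the variables defined in the project's .env file into the environment.
load_dotenv()

api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise SystemExit("OPENAI_API_KEY is not set in .env")

# Listing the available models is a cheap way to confirm the key is valid.
client = OpenAI(api_key=api_key)
models = client.models.list()
print(f"Key OK - {len(models.data)} models available, e.g. {models.data[0].id}")
```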
- Use the dimension selector to choose how projects are grouped (category, key features, refined category, or programming language).
- Enter your requirements in the input field and click "Query" to find relevant Apache projects (see the API sketch after this list for doing the same from a script).
- Use the checkboxes to filter projects by their groupings.
- Click on a project to view more details, including its description, features, and latest release information.
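If you prefer to script queries instead of using the web UI, you can call the Flask backend directly. The route name, payload shape, and response fields below are assumptions for illustration only; check `src/app.py` for the actual endpoints.

```python
# query_example.py -- illustrative only: the /query route, payload fields, and
# response shape are assumptions; see src/app.py for the real API.
import requests

BASE_URL = "http://127.0.0.1:5000"

payload = {
    "query": "a distributed message broker with strong durability guarantees",
    "dimension": "category",  # grouping dimension, as in the UI's selector
}

response = requests.post(f"{BASE_URL}/query", json=payload, timeout=30)
response.raise_for_status()

# Print whatever project entries the backend returns.
for project in response.json().get("projects", []):
    print(project)
```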
This project supports two LLM providers: OpenAI and a local LLM. You can configure which one to use by setting the `LLM_PROVIDER` environment variable in the `.env` file.

Set `LLM_PROVIDER` to `openai` and provide your `OPENAI_API_KEY` in the `.env` file. This is currently the recommended option due to its superior performance and quality of results.

Alternatively, set `LLM_PROVIDER` to `local` and specify your `LOCAL_MODEL_NAME` in the `.env` file.
Note: The local LLM option is currently experimental and not yet as performant as the OpenAI backend. The fine-tuning process and training algorithm need further improvement to match the quality of OpenAI's models. We welcome contributions from the community to enhance the local LLM training and performance.
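For reference, switching providers usually boils down to a branch on `LLM_PROVIDER`. The sketch below only illustrates that pattern under the environment variables listed above; the project's real selection logic lives in `src/llms.py` and `src/config.py` and may differ.

```python
# Sketch of provider selection on LLM_PROVIDER -- illustrative only; the real
# implementation lives in src/llms.py and src/config.py.
import os


def get_llm_client():
    provider = os.getenv("LLM_PROVIDER", "openai").lower()

    if provider == "openai":
        from openai import OpenAI
        # Uses OPENAI_API_KEY (and OPENAI_MODEL elsewhere) from the .env file.
        return OpenAI(api_key=os.environ["OPENAI_API_KEY"])

    if provider == "local":
        from transformers import pipeline
        # LOCAL_MODEL_NAME names a Hugging Face model; HUGGINGFACE_TOKEN is
        # only needed for gated or private models.
        return pipeline(
            "text-generation",
            model=os.environ["LOCAL_MODEL_NAME"],
            token=os.getenv("HUGGINGFACE_TOKEN"),
        )

    raise ValueError(f"Unsupported LLM_PROVIDER: {provider}")
```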
The main components of the codebase are:
- `src/data_collector.py`: Handles data collection and enhancement for Apache projects.
- `src/app.py`: Flask server that provides API endpoints for the frontend.
- `src/llms.py`: Contains the LLM interface for querying project information.
- `src/config.py`: Manages configuration and environment variables.
- `src/fine_tune_model.py`: Script for fine-tuning a local LLM (if used).
- `static/`: Contains the frontend files (HTML, CSS, JavaScript).
Contributions are welcome! Here are some areas where we particularly need help:
- Improving the fine-tuning process for the local LLM to enhance its performance.
- Developing better training algorithms for the local model to improve the quality of its outputs.
- Expanding the dataset used for training to cover a wider range of Apache projects and their characteristics.
If you're interested in contributing to these areas or have other ideas for improvement, please feel free to submit a Pull Request or open an Issue for discussion.
This project is licensed under the Apache License, Version 2.0. See the LICENSE file for details.