Assignment: INSPIREHEP Search & Summarization Web App

Setup Application

Backend

Prerequisites: Docker, Docker-compose

Rename the ".env.example" file to ".env" and adjust the values as needed.

Setup

Run the Django app:

    make up 
    make bootstrap 
    # visit localhost:8000
    # user: admin
    # password: admin

The backend app should run on http://localhost:8000/. To access the admin view, go to http://localhost:8000/admin and log in with the user and password shown above.

Celery Data Harvester

Run the backend as described above.
In another terminal, copy the Celery Docker container ID by running:

    docker ps

Run the Celery worker:

    docker exec -it <celery-docker-container-id> celery -A sis_exercise worker --loglevel=info

The command should print something like "celery@00769f42c709 ready." Depending on the system, sudo might be needed for this command.

Run Celery beat:

    docker exec -it <celery-docker-container-id> celery -A sis_exercise beat --loglevel=info

The command should print something like "beat: Starting... Scheduler: Sending due task scheduler (api.tasks.harvest_literature)". Depending on the system, sudo might be needed for this command.

The harvester fills the database with the API results at the interval set in CELERY_SCHEDULE_MINUTES in the .env file (default: 1 day). For testing, setting it to 1 minute is recommended.
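As an illustration of how the interval could be wired up, the sketch below builds a Celery beat schedule from CELERY_SCHEDULE_MINUTES. The function name `build_beat_schedule` and the entry name `harvest-literature` are hypothetical; only the task path `api.tasks.harvest_literature` and the one-day default come from the text above. Celery accepts a `timedelta` as a schedule value, so the sketch needs only the standard library.

```python
import os
from datetime import timedelta

# Assumption: one day (1440 minutes) is used when the variable is unset.
DEFAULT_MINUTES = 24 * 60


def build_beat_schedule(environ=os.environ):
    """Return a beat schedule dict driven by CELERY_SCHEDULE_MINUTES."""
    minutes = int(environ.get("CELERY_SCHEDULE_MINUTES", DEFAULT_MINUTES))
    return {
        # Hypothetical entry name; the task path is taken from the beat log.
        "harvest-literature": {
            "task": "api.tasks.harvest_literature",
            "schedule": timedelta(minutes=minutes),
        }
    }
```

In a Django project the returned dict would typically be assigned to the Celery app's `beat_schedule` configuration; how this repository actually does it is not shown here.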

Frontend

Prerequisites: Node.js (developed on v22.9.0)

Go to the frontend folder, rename frontend/.env.example to frontend/.env, and adjust REACT_APP_BACKEND_URL if needed (the default matches the backend URL above).
In the frontend folder, run:

    npm install
    npm start

The frontend app should run on http://localhost:3000/

Overview solution: Searching App

The solution provides a simple web application that allows users to search for high-energy physics papers using the INSPIREHEP REST API and receive a summary generated by the OpenAI API and the list of results. The application consists of a Django backend and a React frontend.

- The Celery harvester queries the INSPIREHEP REST API every day by default, fetches the first 40 papers, and saves them.
- The frontend UI provides an accessible search bar for search queries and displays the OpenAI-generated summary and the list of results.
- The backend provides the endpoint for the search query, extracts titles and abstracts from the top search results, summarizes the information using the OpenAI API or a mock function, and returns the summary together with the search results.
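The search-and-extract step described above can be sketched roughly as follows. This is not the repository's code: the function names are hypothetical, and the endpoint URL and response layout (`hits.hits[].metadata.titles[].title`, `abstracts[].value`) are assumptions about the public INSPIREHEP API that should be checked against its documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: public INSPIREHEP literature endpoint.
INSPIRE_URL = "https://inspirehep.net/api/literature"


def build_search_url(query, size=40):
    """Build the search URL; size=40 mirrors the 'first 40 papers' above."""
    params = {"q": query, "size": size, "fields": "titles,abstracts"}
    return f"{INSPIRE_URL}?{urlencode(params)}"


def extract_papers(payload):
    """Pull the first title and abstract out of each hit, tolerating gaps."""
    papers = []
    for hit in payload.get("hits", {}).get("hits", []):
        meta = hit.get("metadata", {})
        titles = meta.get("titles") or [{}]
        abstracts = meta.get("abstracts") or [{}]
        papers.append(
            {
                "title": titles[0].get("title", ""),
                "abstract": abstracts[0].get("value", ""),
            }
        )
    return papers


def search(query, size=40):
    """Fetch and parse results (performs a network request)."""
    with urlopen(build_search_url(query, size), timeout=10) as resp:
        return extract_papers(json.load(resp))
```

Keeping `extract_papers` separate from the network call makes it easy to unit test against a saved response.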

Assumptions or shortcuts

- For the tested results, the API returns only the publication year, not the publication date. When mocking the date, the publication date is assumed to be "01/01/year".
- ant.design is used for quick prototyping of the frontend.
- The unit tests are implemented as examples only and do not cover the use cases.
- The frontend uses the simplest setup, create-react-app with JavaScript. For real-world applications, a TypeScript setup is recommended.
- CORS policies allow all origins; this would need to be changed for production.
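The date assumption above amounts to mapping a year onto the first of January. A minimal sketch, with a hypothetical helper name:

```python
from datetime import date


def mock_publication_date(year):
    """Return January 1st of the given publication year (assumed date)."""
    return date(int(year), 1, 1)
```

For example, `mock_publication_date("2021").strftime("%d/%m/%Y")` yields `01/01/2021`, matching the "01/01/year" convention.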

Given task: Create a task description for the following request:

The product owner wants to have metrics about the OPENAI API response time and the most common user queries.
- Create an issue for this task that you will give to your team to solve; please be specific in the implementation.
- OPTIONAL: Implement your suggestion.

Task Description: Metrics for OPENAI API Response Time and User Queries

Task Title: Implement Metrics for OPENAI API Performance and User Interaction
User Story: As a product owner, I want to gather metrics about the OPENAI API response time and the most common user queries, displayed in the Django admin view, so that we can analyze its performance.
Acceptance Criteria:
- Response Time Measurement: Implement functionality to track the response time for each API call to the OPENAI API. Log response times in milliseconds for each request.
- User Queries Tracking: Capture and store all user queries sent to the OPENAI API, together with their counts. Ensure queries are logged with timestamps for tracking trends over time.
- Data Aggregation: Aggregate the count and the average, minimum, and maximum response time per query.
- Reporting Dashboard: Create a user-friendly dashboard in the Django admin panel that displays:
  - Logs with real-time response time metrics for each query (OpenAI API metrics)
  - A list of the user queries along with their frequency and average, maximum, and minimum response time
- Error Handling: Implement robust error handling to log and report failed API requests separately.
- Documentation: Provide clear documentation on how metrics are collected and reported, and instructions for accessing and interpreting the dashboard.
Definition of Done:
- Code is implemented, tested, and reviewed.
- The metrics are being logged and can be accessed in the Django admin view via:
  - http://localhost:8000/admin/api/openaiapimetrics/ (logs with timestamps and response times per query)
  - http://localhost:8000/admin/api/openaiapistatistics/ (statistics per query with count, avg, min, max response time)
- The admin dashboard displays accurate metrics.
- Documentation is updated and accessible.
- Metrics are being logged and can be accessed by the product owner.

Sprint considerations: Estimate story points for this task and prioritize it in the upcoming sprint. After implementation, gather feedback from the product owner to refine metrics further.
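One possible way to meet the response-time and error-handling criteria is a small timing wrapper around the summarization call. The sketch below is an assumption, not the repository's implementation: `MetricRecord`, `timed_summary`, and the list-based `store` are hypothetical stand-ins for whatever model and persistence layer the team chooses.

```python
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetricRecord:
    """One logged call: query, duration in milliseconds, outcome, timestamp."""
    query: str
    response_time_ms: float
    success: bool
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def timed_summary(query, summarize, store):
    """Call summarize(query) and append a MetricRecord to store.

    A record is written even when summarize raises, so failed requests
    can be reported separately; the exception is re-raised to the caller.
    """
    start = time.perf_counter()
    success = False
    try:
        result = summarize(query)
        success = True
        return result
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        store.append(MetricRecord(query, elapsed_ms, success))
```

`time.perf_counter` is used because it is monotonic and intended for measuring short durations.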

Overview solution: Metrics for OPENAI API

Run the backend following the instructions above.
Access the metrics at:
- http://localhost:8000/admin/api/openaiapimetrics/ (logs with timestamps and response times per query)
- http://localhost:8000/admin/api/openaiapistatistics/ (statistics per query with count, avg, min, max response time)
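The per-query statistics (count, avg, min, max) could be derived from the logged response times along these lines. This is a sketch under assumptions: the function name is hypothetical, the input is a plain list of `(query, response_time_ms)` pairs, and normalizing queries by trimming and lowercasing is a design choice not stated in the source.

```python
from collections import defaultdict
from statistics import mean


def aggregate_query_stats(records):
    """Group (query, response_time_ms) pairs and compute per-query stats."""
    grouped = defaultdict(list)
    for query, response_time_ms in records:
        # Assumption: "Higgs " and "higgs" count as the same query.
        grouped[query.strip().lower()].append(response_time_ms)

    stats = {
        query: {
            "count": len(times),
            "avg_ms": mean(times),
            "min_ms": min(times),
            "max_ms": max(times),
        }
        for query, times in grouped.items()
    }
    # Most frequent queries first.
    return dict(
        sorted(stats.items(), key=lambda item: item[1]["count"], reverse=True)
    )
```

In a Django setup the same figures could instead be computed in the database with aggregate queries; the plain-Python version is shown only to make the calculation explicit.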

Constraints, assumptions or shortcuts

Queries that return 0 results from Elasticsearch are not sent to OpenAI and therefore do not appear in the statistics.
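The constraint above can be pictured as an early return before the summarization call. In this sketch the function name and the `summarize` callable are hypothetical; `summarize` stands in for either the OpenAI call or the mock function.

```python
def summarize_results(papers, summarize):
    """Summarize papers, skipping the API call when there are no results."""
    if not papers:
        # Nothing to summarize: no OpenAI request, so no metric is logged.
        return None
    text = "\n\n".join(f"{p['title']}\n{p['abstract']}" for p in papers)
    return summarize(text)
```

Because the guard runs before any request is made, empty searches never reach the metrics tables.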

elizavetaRa/sis-exercise-ragozina
