Prerequisites: Docker, Docker Compose
- Rename the ".env.example" file to ".env" and adjust the values as needed
- Setup
- Run the Django app
make up
make bootstrap
# visit localhost:8000
# user: admin
# password: admin
The backend app should run on http://localhost:8000/. To access the admin view, go to http://localhost:8000/admin and log in with user *admin* and password *admin*.
- Run the backend
- In another terminal, copy the Celery Docker container ID by running
docker ps
- Run celery worker
The harvester fills the database with the API results at the interval set by CELERY_SCHEDULE_MINUTES in the .env file (default: 1 day). For testing, setting it to 1 minute is recommended.
docker exec -it <celery-docker-container-id> celery -A sis_exercise worker --loglevel=info
The command should return something like "celery@00769f42c709 ready." Depending on the system, sudo might be needed for this command.
- Run celery beat
docker exec -it <celery-docker-container-id> celery -A sis_exercise beat --loglevel=info
The command should return something like "beat: Starting... Scheduler: Sending due task scheduler (api.tasks.harvest_literature)". Depending on the system, sudo might be needed for this command.
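How CELERY_SCHEDULE_MINUTES might feed the beat schedule can be sketched as follows. This is only an illustration, not the project's actual settings: the function name `build_beat_schedule`, the entry name `harvest-literature`, and the fallback constant are hypothetical, while the task path `api.tasks.harvest_literature` comes from the beat output above. Celery accepts a `timedelta` as the `schedule` value of a beat entry.

```python
import os
from datetime import timedelta

# Assumed fallback of one day, mirroring the default described in this README.
DEFAULT_SCHEDULE_MINUTES = 24 * 60


def build_beat_schedule(env=None):
    """Return a Celery-style beat schedule dict for the harvester task."""
    env = os.environ if env is None else env
    minutes = int(env.get("CELERY_SCHEDULE_MINUTES", DEFAULT_SCHEDULE_MINUTES))
    if minutes <= 0:
        raise ValueError("CELERY_SCHEDULE_MINUTES must be a positive integer")
    return {
        "harvest-literature": {  # hypothetical entry name
            "task": "api.tasks.harvest_literature",
            "schedule": timedelta(minutes=minutes),
        }
    }


if __name__ == "__main__":
    # With CELERY_SCHEDULE_MINUTES=1 the harvester would be due every minute.
    print(build_beat_schedule({"CELERY_SCHEDULE_MINUTES": "1"}))
```

In a Django settings module, a dict like this would typically be assigned to the Celery beat schedule setting; the exact setting name depends on how the Celery app is configured.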
Prerequisites: Node.js (developed on version v22.9.0)
- Go to the frontend folder, rename frontend/.env.example to frontend/.env, and adjust REACT_APP_BACKEND_URL if needed (the default matches the backend URL above)
- In the frontend folder, run
npm install
npm start
The frontend app should run on http://localhost:3000/
The solution provides a simple web application that allows users to search for high-energy physics papers using the INSPIREHEP REST API and receive a summary generated by the OpenAI API and the list of results. The application consists of a Django backend and a React frontend.
- The Celery harvester queries the INSPIREHEP REST API every day, fetches the first 40 papers, and saves them
- The frontend UI provides an accessible search bar for the search queries and displays the OpenAI generated summary and the list of results
- The backend provides the endpoint for the search query, extracts titles and abstracts from the top search results, summarizes the information using the OpenAI API or a mock function, and returns the summary with the search results
- The API does not return a publication date for the tested results, only a publication year. The date is therefore mocked as "01/01/year".
- Usage of ant.design for quick prototyping of the frontend
- Unit tests do not cover all use cases; they are implemented as examples
- The frontend uses the simplest setup: create-react-app with JavaScript. For real-world applications, a TypeScript setup is recommended.
- CORS policies allow all origins; this would need to be changed for production
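The title and abstract extraction and the "01/01/year" date assumption above can be sketched as follows. The field layout (`titles[].title`, `abstracts[].value`, `publication_info[].year`) is an assumption about the record metadata, and `normalize_record` is a hypothetical helper name, not necessarily what the backend uses.

```python
from datetime import date


def normalize_record(metadata):
    """Extract title, abstract, and a mocked publication date from one record.

    Assumes the metadata dict carries lists under "titles", "abstracts",
    and "publication_info"; missing fields fall back to empty values.
    """
    titles = metadata.get("titles") or [{}]
    abstracts = metadata.get("abstracts") or [{}]
    pub_info = metadata.get("publication_info") or [{}]
    year = pub_info[0].get("year")
    return {
        "title": titles[0].get("title", ""),
        "abstract": abstracts[0].get("value", ""),
        # Only the year is known, so the date is pinned to 1 January.
        "publication_date": date(int(year), 1, 1) if year else None,
    }


if __name__ == "__main__":
    sample = {
        "titles": [{"title": "An example paper"}],
        "abstracts": [{"value": "An example abstract."}],
        "publication_info": [{"year": 2021}],
    }
    print(normalize_record(sample))
```

Returning `None` when no year is present keeps the mocked date from being mistaken for real data; a caller could equally choose to skip such records.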
- The product owner wants to have metrics about the OpenAI API response time and the most common user queries.
- Create an issue for this task that you will give to your team to solve. Please be specific about the implementation.
- OPTIONAL: Implement your suggestion.
- Task Title: Implement Metrics for OpenAI API Performance and User Interaction
- User Story: As a product owner, I want to gather metrics about the OpenAI API response time and the most common user queries, displayed in the Django admin view, so that we can analyze its performance.
- Acceptance Criteria:
- Response Time Measurement: Implement functionality to track the response time for each call to the OpenAI API. Log response times in milliseconds for each request.
- User Queries Tracking: Capture and store all user queries sent to the OpenAI API, along with their counts. Ensure queries are logged with timestamps for tracking trends over time.
- Data Aggregation:
- Reporting Dashboard: Create a user-friendly dashboard in the Django admin panel that displays:
  - Logs with real-time response time metrics for each query (OpenAI API metrics)
  - A list of the user queries along with their frequency and their average, maximum, and minimum response times
- Error Handling: Implement robust error handling to log and report failed API requests separately.
- Documentation: Provide clear documentation on how metrics are collected and reported, and instructions for accessing and interpreting the dashboard.
- Definition of Done:
- Code is implemented, tested, and reviewed.
- The metrics are being logged and can be accessed in the Django admin view via http://localhost:8000/admin/api/openaiapimetrics/ (logs with timestamps and response times per query) and http://localhost:8000/admin/api/openaiapistatistics/ (statistics per query with count and average, minimum, and maximum response time).
- The admin dashboard displays accurate metrics.
- Documentation is updated and accessible.
- Metrics are being logged and can be accessed by the product owner.
- Sprint considerations: Estimate story points for this task and prioritize it in the upcoming sprint. After implementation, gather feedback from the product owner to refine metrics further.
- Run the backend with the instructions above
- Access the metrics at http://localhost:8000/admin/api/openaiapimetrics/ (logs with timestamps and response times per query) and http://localhost:8000/admin/api/openaiapistatistics/ (statistics per query with count and average, minimum, and maximum response time)
- Queries that return 0 results from Elasticsearch are not sent to OpenAI and do not appear in the statistics
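The response-time measurement and per-query statistics described in the acceptance criteria could be sketched roughly as below. This is a standalone illustration under stated assumptions, not the code behind the admin views: `timed_call` and `aggregate` are hypothetical helper names, and a real implementation would persist the log rows in Django models rather than in a list.

```python
import time
from collections import defaultdict


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed_ms, error).

    Failures are captured instead of raised so they can be logged separately.
    """
    start = time.perf_counter()
    try:
        result, error = func(*args, **kwargs), None
    except Exception as exc:  # report the failure alongside the timing
        result, error = None, exc
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms, error


def aggregate(logs):
    """Group (query, elapsed_ms) pairs into per-query statistics."""
    grouped = defaultdict(list)
    for query, elapsed_ms in logs:
        grouped[query].append(elapsed_ms)
    return {
        query: {
            "count": len(times),
            "avg": sum(times) / len(times),
            "min": min(times),
            "max": max(times),
        }
        for query, times in grouped.items()
    }


if __name__ == "__main__":
    # Stand-in for the summarization call; the real call would hit the OpenAI API.
    _, ms, err = timed_call(lambda text: text.upper(), "higgs boson")
    print(f"elapsed: {ms:.3f} ms, error: {err}")
    print(aggregate([("higgs", 100.0), ("higgs", 300.0), ("qcd", 50.0)]))
```

Using `time.perf_counter()` rather than wall-clock time avoids distortions from system clock adjustments when measuring short intervals.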