Xabier branch to merge with main #11
base: main
put in new files for summary generation to fill in from colab notebooks.
updated summary_generation.py with abstractive summary stage of summary process.
summary generation added to hangul with necessary files: summary_generation.py and sentence_ranking.py
new disaster detection module added
update agg_summary_input in hangul.py
fixed new_disaster_detection in hangul.py
Only ranked sentences for summary generation
Code performed locally as expected. Need to remove some large commented-out sections in hangul.py and add modification notes in the header.
Tested using:
Create virtual environment
pip install virtualenv
python -m venv env
ACTIVATE VIRTUAL ENVIRONMENT
git checkout xabier_branch
source env/bin/activate
pip install -r requirements.txt
OPEN THE APP
waitress-serve --port=8080 wsgi:app
Open another terminal to use the app
TEST CHETAH
curl -X POST http://127.0.0.1:8080/api/v1/products/chetah \
  -H "Content-Type: application/json" \
  -d '{"query": "Are there hurricane in Mexico?"}'
Result:
[{"cluster":"Camp Coordination","date":"2019-11-13","link":"https://reliefweb.int/sites/reliefweb.int/files/resources/DPR-Synthesis-Report-Final_02.08.2019.pdf","summary_full":"Legal and Institutional Frameworks .............................................................................................. 13 2. Disaster risk finance ...................................................................................................................... 13 3. Legal facilities ................................................................................................................................ 16 7. Disaster-Related Human Mobility ................................................................................................. 16 8. Emergency Shelter and Housing, Land and Property Ri ......................................
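For anyone who prefers to script this check, here is a minimal Python sketch of the same chetah call using only the standard library. It assumes the app is still being served locally on port 8080 as above; `build_chetah_request` and `query_chetah` are hypothetical helper names, not part of the repository.

```python
import json
import urllib.request

# Assumption: the app is served locally by waitress on port 8080, as in the test steps.
BASE_URL = "http://127.0.0.1:8080"


def build_chetah_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request for the chetah endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/products/chetah",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_chetah(query: str, base_url: str = BASE_URL, timeout: float = 60.0):
    """Send the request and decode the JSON response (needs the server running)."""
    request = build_chetah_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `query_chetah("Are there hurricane in Mexico?")` with the server running should return the decoded list of results; building the request separately makes it possible to inspect the payload without a live server.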
PASSED TEST
API was moved to wsgi.py
partial file not used in chetah. Chetah_v1.py PASSED TEST
Tested Hangul V1
curl -X POST http://127.0.0.1:8080/api/v1/products/hangul \
  -F "file=@han-21 Report Type Detection/Data/News and Press Release/3266793.pdf" \
  -F "kw_num=5"
TESTED HANGUL V2 LOCALLY:
Memory usage after Hangul 2.0 second API call:
Memory usage: 3122.53 MB
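The comment does not say how this figure was measured. As one possible approach, the sketch below reads the peak resident set size of the current process with the standard-library `resource` module (Unix only). Note that this is peak usage, not current usage, and the helper names are hypothetical.

```python
import resource
import sys


def peak_memory_mb() -> float:
    """Return the peak resident set size of this process in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def format_memory(mb: float) -> str:
    """Format a reading in the same style as the log line above."""
    return f"Memory usage: {mb:.2f} MB"
```

Printing `format_memory(peak_memory_mb())` after each API call inside the server process would give readings comparable across calls, which helps spot growth between the first and second request.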
REMOVE LARGE COMMENTED SECTIONS
NEED TO ADD HEADER, EXAMPLE:
"""
Filename: hangul.py
Description: Extracts and analyzes content and metadata from PDF files, detects language, generates keywords, etc.
Author: Sidra Effendi
Created on: ?
Modification History:
- 2024-10-29: Explain update by Xabier Urruchua
- 2024-10-30: Explain update by Xabier Urruchua
"""
Code performs as expected during testing. Need to add a header with modification notes.
need to add header with created date and author
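Since several files in this PR are missing header fields, a small check could flag them before review. The sketch below is only a suggestion: it assumes the header is the module docstring and that the field labels match the example header proposed earlier in this review; `missing_header_fields` is a hypothetical helper.

```python
import ast

# Assumption: field labels mirror the example header proposed in this review.
REQUIRED_FIELDS = (
    "Filename:",
    "Description:",
    "Author:",
    "Created on:",
    "Modification History:",
)


def missing_header_fields(source: str) -> list[str]:
    """Return the required header fields absent from a module's docstring."""
    docstring = ast.get_docstring(ast.parse(source)) or ""
    return [field for field in REQUIRED_FIELDS if field not in docstring]
```

Running it over each changed file, for example `missing_header_fields(open("hangul.py").read())`, returns an empty list when every field is present.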
new title extractions added, and tested
tested in local virtual environment, and works as expected.
testing script:
Specify python version
python3.10 -m venv /path/to/your/venv
Create virtual environment
pip install virtualenv
python -m venv env
ACTIVATE VIRTUAL ENVIRONMENT
git checkout xabier_branch
source env/bin/activate
pip install -r requirements.txt
OPEN THE APP
waitress-serve --port=8080 wsgi:app
Open another terminal to use the app
TEST CHETAH
curl -X POST http://127.0.0.1:8080/api/v1/products/chetah \
  -H "Content-Type: application/json" \
  -d '{"query": "Are there hurricane in Mexico?"}'
HANGUL_V1
curl -X POST http://127.0.0.1:8080/api/v1/products/hangul \
  -F "file=@$HOME/3266793.pdf" \
  -F "kw_num=5"
HANGUL V2
curl -X POST http://127.0.0.1:8080/api/v2/products/hangul -F "file=@$HOME/3266793.pdf" -F "kw_num=5"
SUMMARY
curl -X POST http://127.0.0.1:8080/api/v2/products/summary -H "Content-Type: application/json" -d '{
"themes_detected": [
"Food and Nutrition",
"Contributions",
"Health",
"Water Sanitation Hygiene",
"Protection and Human Rights",
"Shelter and Non-Food Items",
"Education"
],
"ranked_sentences": [
"In Iraq, support for Syrian refugees has been included in the wider UK Iraq Crisis response from 2015.",
"Agriculture/Livelihoods and Education include results achieved under the DFID CSSF portfolio in Syria, in addition to the DFID Syria humanitarian portfolio.",
"Syria Lebanon Jordan Turkey Iraq",
"Figures do not include allocations made and spend incurred under the Home Office resettlement scheme for Syrian refugees or UK support to Syrian refugees who have migrated to Europe.",
"Note: UK support for Syrian refugees in Turkey is ongoing.",
"As the brutal conflict continues in Syria, millions of people continue to be in need.",
"This includes DFID allocations to over 30 implementing partners (including United Nations agencies, international non-governmental organisations and the Red Cross) and is helping to meet the immediate needs of vulnerable people in Syria and of refugees in the region.",
"Our support is reaching millions of people and has saved lives in Syria, Jordan, Lebanon, Turkey, Iraq and Egypt.",
"Key Facts: 11.7 million People in need of humanitarian assistance in Syria (Syria 2019 HNO March 2019)",
"The 2018 UN inter-agency appeals for the Syria crisis are an estimated $9 billion, including $3.36 billion for projects inside Syria and $5.6 billion for regional projects."
],
"top_locations": [
{"name": "Syrian Arab Republic", "occurrences": 20},
{"name": "United Kingdom of Great Britain and Northern Ireland", "occurrences": 11},
{"name": "Iraq", "occurrences": 6},
{"name": "Türkiye", "occurrences": 6},
{"name": "Jordan", "occurrences": 4},
{"name": "Lebanon", "occurrences": 4},
{"name": "Egypt", "occurrences": 4}
],
"_detected_disasters": ["Earthquake", "Epidemic"]
}'
EXIT
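Hand-writing a JSON body this large inside a shell string is easy to get wrong, so it may be worth generating it instead. The sketch below builds the summary request body in Python; the key names are copied from the curl example in this script, and `build_summary_payload` is a hypothetical helper, not something in the repository.

```python
import json


def build_summary_payload(themes, ranked_sentences, top_locations, detected_disasters) -> str:
    """Serialize the summary request body; key names follow the curl example."""
    payload = {
        "themes_detected": list(themes),
        "ranked_sentences": list(ranked_sentences),
        # top_locations is given as (name, occurrences) pairs.
        "top_locations": [
            {"name": name, "occurrences": count} for name, count in top_locations
        ],
        "_detected_disasters": list(detected_disasters),
    }
    # json.dumps escapes embedded quotes, so sentences containing '"' stay valid JSON.
    return json.dumps(payload, ensure_ascii=False)
```

The returned string can be written to a file and passed to curl with `-d @payload.json`, which sidesteps shell quoting entirely.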
No description provided.