Research a topic in depth and make it searchable
python3 start.py
python -m pip install pytest
python -m pytest test.py
Pull data from various sources, determine whether it's relevant, crawl the links from those sources, and build up the source data set.
- Currently, this supports accepting a list of URLs and crawling them for links.
- In the future, it will accept a folder of items, a link to a Google Drive folder, links to social profiles, etc.
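As a rough illustration of the "crawl a URL for links" step, here is a minimal sketch that pulls absolute links out of an HTML page using only the standard library. The names `LinkCollector` and `extract_links` are hypothetical and not taken from this repository; the actual crawler may work differently, and fetching the page itself is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects unique absolute http(s) links from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL and drop "#fragment".
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                is_web_link = absolute.startswith(("http://", "https://"))
                if is_web_link and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html, base_url):
    """Hypothetical helper: return the links found in one HTML document."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    page = '<a href="/about">About</a> <a href="mailto:me@example.com">Mail</a>'
    print(extract_links(page, "https://example.com/docs/"))
```

Each returned link could then be queued for the next crawl pass, with a visited set to avoid fetching the same page twice.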
Extract summary, facts and metadata from the source data set.
- Extracts facts from the text in batches with GPT-3.5, verifies that they aren't hallucinations, and prepares context for clustering.
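The batching and verification idea can be sketched without calling any model. The helpers below (`batched`, `is_grounded`) are hypothetical, and the word-overlap check is only a crude stand-in assumed for illustration; it is not necessarily how this project detects hallucinations.

```python
import re


def batched(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def _words(text):
    """Lowercase alphanumeric tokens of `text` as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_grounded(fact, source_text, threshold=0.8):
    """Assumed heuristic: a fact counts as grounded when most of its
    words also appear somewhere in the source text."""
    fact_words = _words(fact)
    if not fact_words:
        return False
    overlap = len(fact_words & _words(source_text)) / len(fact_words)
    return overlap >= threshold


if __name__ == "__main__":
    source = "We use DBSCAN; it clusters the data into topics."
    candidates = ["DBSCAN clusters the data", "The moon is made of cheese"]
    for batch in batched(candidates, 1):
        # In a real pipeline, each batch of text would be sent to the model here.
        print([fact for fact in batch if is_grounded(fact, source)])
```

A word-overlap filter is cheap but misses paraphrases, so a stricter pipeline might ask the model to quote the supporting passage instead.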
Relate the extracted data to each other and cluster them into topics.
- Uses DBSCAN to cluster the data into topics.
- Topics are derived from the clusters.
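To show what density-based clustering does here, the following is a small pure-Python sketch of DBSCAN over toy 2D points, followed by grouping points by cluster label. It is an illustration only: the function names are hypothetical, the real project presumably clusters text embeddings rather than 2D points, and it may rely on a library implementation instead.

```python
from collections import defaultdict
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE (-1) marks outliers."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels


def group_by_cluster(items, labels):
    """Group items by label, skipping noise; each group is a candidate topic."""
    groups = defaultdict(list)
    for item, label in zip(items, labels):
        if label != NOISE:
            groups[label].append(item)
    return dict(groups)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(group_by_cluster(pts, dbscan(pts, eps=1.5, min_samples=2)))
```

DBSCAN does not need the number of topics up front, which suits a data set whose topics are unknown; the trade-off is that `eps` and `min_samples` have to be tuned to the data.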
Utility code that is shared across the steps of the research process.
Coming soon... for now, please see the code.
If you like this library and want to contribute in any way, please feel free to submit a PR and I will review it. Please note that the goal here is simplicity and accessibility, using common language and few dependencies.
If you have any questions, please feel free to reach out to me on Twitter or Discord @new.moon.