A sync tool that scrapes websites and stores their content as Markdown files for a knowledge base system.
To use the project, run it with Go:
go run main.go
Provide a .metadata.json file that specifies the websites to scrape and the local directory where the Markdown files are stored, for example:
{
  "input": {
    "urls": ["https://coral.org"]
  }
}
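As an illustration of the file format, here is a minimal Go sketch of how a .metadata.json file like the one above could be decoded. The names Metadata, parseMetadata, and loadMetadata are hypothetical and not taken from the project. Only the input.urls field shown in the sample is modeled, because the key for the output directory is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Metadata mirrors only the fields visible in the sample file.
// The type name is hypothetical; the project may model this differently.
type Metadata struct {
	Input struct {
		URLs []string `json:"urls"`
	} `json:"input"`
}

// parseMetadata decodes raw JSON bytes into a Metadata value.
func parseMetadata(data []byte) (Metadata, error) {
	var m Metadata
	if err := json.Unmarshal(data, &m); err != nil {
		return Metadata{}, fmt.Errorf("parse metadata: %w", err)
	}
	return m, nil
}

// loadMetadata reads the file at path and decodes it.
func loadMetadata(path string) (Metadata, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return Metadata{}, fmt.Errorf("read metadata: %w", err)
	}
	return parseMetadata(data)
}

func main() {
	// An inline sample keeps the sketch runnable without a file on disk;
	// a real caller would use loadMetadata(".metadata.json") instead.
	sample := []byte(`{"input": {"urls": ["https://coral.org"]}}`)
	m, err := parseMetadata(sample)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, u := range m.Input.URLs {
		fmt.Println("would scrape:", u)
	}
}
```

Keeping parsing separate from file reading makes the decoding step easy to check against a literal sample like the one above.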
The tool can run in two modes, colly and firecrawl, selected with the MODE environment variable:
MODE=colly go run main.go
MODE=firecrawl go run main.go
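The mode selection shown in the commands above could be handled roughly as follows. This is a sketch under assumptions, not the project's actual code: the function name resolveMode is hypothetical, and falling back to colly when MODE is unset is an assumption, since the default mode is not stated here.

```go
package main

import (
	"fmt"
	"os"
)

// The two mode names come from the README; the constants are illustrative.
const (
	modeColly     = "colly"
	modeFirecrawl = "firecrawl"
)

// resolveMode validates the raw MODE value.
// ASSUMPTION: an empty value falls back to colly; the real default may differ.
func resolveMode(raw string) (string, error) {
	switch raw {
	case "":
		return modeColly, nil
	case modeColly, modeFirecrawl:
		return raw, nil
	default:
		return "", fmt.Errorf("unknown MODE %q: expected %q or %q", raw, modeColly, modeFirecrawl)
	}
}

func main() {
	mode, err := resolveMode(os.Getenv("MODE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// A real implementation would dispatch to the matching scraper here.
	fmt.Println("running in mode:", mode)
}
```

Rejecting unknown values up front means a typo in MODE fails with a clear message instead of silently running the wrong scraper.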