Fix Bitcoin Book Scraper #84

elraphty · 2024-10-22T12:04:37Z

Currently the Bitcoin Book scraper does not work: the URLs no longer exist, and Beautiful Soup cannot scrape the chapters of the book.

Write a fix so that the Bitcoin Book scraper works with the new version of the book on GitHub
Update the README with the changes

kouloumos · 2024-10-23T07:09:44Z

Good point. As you can see in sources, we last scraped the Bitcoin Book (Mastering Bitcoin) a year ago. Normally, this wouldn’t be an issue since books aren’t frequently updated, but a new edition has been released since then.

The chapters were indexed as separate documents during the last scrape (#46), which aligns with the chunking concept we've been discussing. While we successfully broke the content into chunks, there isn’t yet a system in place to connect these chunks meaningfully. If we decide to adopt the chunking strategy moving forward, this would be a good opportunity to implement it effectively for the book as well.
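
As a rough sketch of what connecting the chunks could look like: the snippet below splits one chapter into section-level chunks and gives each chunk a pointer to its parent chapter and to its previous and next siblings. Everything here is an assumption for illustration only. The `Chunk` fields, the `chunk_chapter` helper, the id format, and the idea that chapters arrive as AsciiDoc-style text with `==` headings are all hypothetical and not taken from the existing scraper or index schema.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical chunk record; field names are illustrative, not the project's schema.
@dataclass
class Chunk:
    id: str
    title: str
    body: str
    parent_id: str                  # id of the chapter this chunk belongs to
    prev_id: Optional[str] = None   # previous chunk in reading order
    next_id: Optional[str] = None   # next chunk in reading order

# Assumes AsciiDoc-style section headings such as "== Title" or "=== Subtitle".
# A bare delimiter line like "====" has no title text and does not match.
HEADING = re.compile(r"^={1,6}\s+(.+)$")

def chunk_chapter(chapter_id: str, text: str) -> List[Chunk]:
    """Split a chapter on headings and link the resulting chunks together."""
    sections = []
    current = ("Preamble", [])      # text that appears before the first heading
    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append(current)
            current = (match.group(1).strip(), [])
        else:
            current[1].append(line)
    sections.append(current)

    chunks: List[Chunk] = []
    for title, lines in sections:
        body = "\n".join(lines).strip()
        if not body:
            continue                # skip headings that have no text of their own
        chunks.append(
            Chunk(
                id=f"{chapter_id}#{len(chunks)}",
                title=title,
                body=body,
                parent_id=chapter_id,
            )
        )

    # Connect neighbours so a search hit can be expanded with surrounding context.
    for left, right in zip(chunks, chunks[1:]):
        left.next_id = right.id
        right.prev_id = left.id
    return chunks
```

With links like these, a match on a single chunk could be widened to its neighbours or rolled up to the whole chapter at query time, without re-scraping the book.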
