- Install the requirements:
$ pip install -r requirements.txt
- Run the scraper with:
$ python scraper.py
or from Python:
from scraper import Scraper  # assumes the Scraper class is defined in scraper.py
scraper = Scraper(duckdb_file_path=None)
scraper.fetch_all()
If duckdb_file_path is None, a file named poesie_francaise.duckdb is created in the working directory.
The DuckDB database has two tables:
- poets
- poems
D SELECT table_catalog, table_schema, table_name, table_type FROM INFORMATION_SCHEMA.TABLES;
┌──────────────────┬──────────────┬────────────┬────────────┐
│ table_catalog │ table_schema │ table_name │ table_type │
│ varchar │ varchar │ varchar │ varchar │
├──────────────────┼──────────────┼────────────┼────────────┤
│ poesie_francaise │ main │ poems │ BASE TABLE │
│ poesie_francaise │ main │ poets │ BASE TABLE │
└──────────────────┴──────────────┴────────────┴────────────┘
D SELECT COUNT(*) FROM poets;
┌──────────────┐
│ count_star() │
│ int64 │
├──────────────┤
│ 72 │
└──────────────┘
D SELECT COUNT(*) FROM poems;
┌──────────────┐
│ count_star() │
│ int64 │
├──────────────┤
│ 5873 │
└──────────────┘
D SELECT table_name, column_name FROM INFORMATION_SCHEMA.COLUMNS;
┌────────────┬─────────────┐
│ table_name │ column_name │
│ varchar │ varchar │
├────────────┼─────────────┤
│ poems │ poet_slug │
│ poems │ poem_title │
│ poems │ poem_slug │
│ poems │ poet_name │
│ poems │ poem_book │
│ poems │ poem_text │
│ poets │ poet_slug │
│ poets │ poet_name │
│ poets │ poet_dob │
│ poets │ poet_dod │
├────────────┴─────────────┤
│ 10 rows 2 columns │
└──────────────────────────┘