This program crawls Spanish idioms from the Refranero Multilingüe, maintained by the Centro Virtual Cervantes: http://cvc.cervantes.es/lengua/refranero/Default.aspx. The center reserves all rights ("reservados todos los derechos"), so you can likely use the data only for non-commercial purposes.
It outputs all idioms with Spanish definitions and their usage. If you just want the output, see the idioms.tsv file included here.
Usage:
0. Install dependencies:
go get .
1. Build the refranero-scraper program:
go build
2. Get link slugs for all idioms:
refranero-scraper -print-slugs > slugs.txt
3. Parse the link slugs and print the idioms and their definitions. This can take a while, since it parses about 2000 pages.
refranero-scraper -read-slugs < slugs.txt > idioms_and_definitions.txt