webScraping

About

A JavaScript/Node.js web scraper that pulls images from the Library of Congress website. It creates a directory called photos and downloads the images there.

It also creates an info.json file with the information about each photo. The h3 header ids are the keys and the li items are the values. When a key has multiple lines, they are stored in an array under that key. This happens often with notes.
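The key/value layout described above can be sketched as follows. This is a minimal illustration, not the code in main.js: the helper name `buildInfo`, the `{ id, lines }` input shape, and the sample values are all assumptions made for the example. It also assumes that a key with a single line is stored as a plain string.

```javascript
// Hypothetical helper: collapse parsed (key, lines) pairs into one info object.
// `sections` is assumed to be an array of { id, lines }, where `id` is an h3
// header id and `lines` holds the text of the li items that belong to it.
function buildInfo(sections) {
  const info = {};
  for (const { id, lines } of sections) {
    // One line is stored as a plain string; any other count is stored as an array.
    info[id] = lines.length === 1 ? lines[0] : lines;
  }
  return info;
}

// Made-up sample input, for illustration only.
const sample = buildInfo([
  { id: "title", lines: ["Example photo title"] },
  { id: "notes", lines: ["First note.", "Second note."] },
]);
console.log(JSON.stringify(sample, null, 2));
```

With this input, title maps to a single string and notes maps to an array of two strings.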

Installation

Make sure you have Node.js installed. Then clone the repo and run:

npm install axios cheerio request

The fs module is built into Node.js, so it does not need to be installed.

Usage

Open the project directory in a terminal and run:

node main.js

This downloads 21 photos into ./photos and stores their information in info.json.
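The storage step can be sketched with Node's built-in fs and path modules. The helper names `savePhoto` and `saveInfo` are hypothetical, and the sketch assumes the image bytes have already been fetched. It is not taken from main.js.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: write one photo's bytes into `dir`, creating it if needed.
function savePhoto(dir, fileName, bytes) {
  // `recursive: true` means no error is thrown if the directory already exists.
  fs.mkdirSync(dir, { recursive: true });
  const filePath = path.join(dir, fileName);
  fs.writeFileSync(filePath, bytes);
  return filePath;
}

// Hypothetical helper: write the collected metadata as pretty-printed JSON.
function saveInfo(filePath, info) {
  fs.writeFileSync(filePath, JSON.stringify(info, null, 2));
}
```

A call such as savePhoto("./photos", "example.jpg", buffer) would write the file under ./photos. Here buffer stands for image data obtained elsewhere, for example an axios response requested with responseType set to "arraybuffer".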

To change which photos are downloaded, edit the keywords array so that it contains the LCCNs (Library of Congress Control Numbers) of the items you want.
