Scrape information about Kindle books and highlights from the Amazon site.
Add this line to your application's Gemfile:
gem 'kindle_manager'
And then execute:
$ bundle
Or install it yourself as:
$ gem install kindle_manager
chromedriver is required. Please download chromedriver and keep it up to date.
Create a .env file by following the instructions at https://github.com/kyamaguchi/amazon_auth
amazon_auth
vi .env
Dotenv.load or gem 'dotenv-rails' may also be required when you use this in your app.
In console
require 'kindle_manager'
client = KindleManager::Client.new(keep_cookie: true, verbose: true, limit: 1000)
client.fetch_kindle_list
books = client.load_kindle_books
client.quit
Once fetch_kindle_list succeeds, you can load book information from the downloaded pages at any time.
(You don't need to launch the browser and fetch the pages every time.)
client = KindleManager::Client.new
books = client.load_kindle_books
Example of data
console> pp books.first.to_hash
{"asin"=>"B0026OR2TU",
"title"=>
"Rails Cookbook: Recipes for Rapid Web Development with Ruby (Cookbooks (O'Reilly))",
"tag"=>"Sample",
"author"=>"Rob Orsini",
"date"=>Fri, 17 Mar 2017,
"collection_count"=>0}
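Because each book converts to a plain hash, ordinary Ruby collection methods are enough to filter and sort the list. The following is a minimal sketch that uses hand-written hashes in the shape shown above, so it runs without the gem or a browser. The two "Hypothetical Book" entries, the empty-string tag for non-sample books, and the helper name purchased_titles are assumptions made for illustration; it also assumes "date" is a Date.

```ruby
require 'date'

# Hand-written stand-ins for books.map(&:to_hash).
# Only the first entry comes from the example above; the others are made up.
BOOK_HASHES = [
  { "asin" => "B0026OR2TU",
    "title" => "Rails Cookbook: Recipes for Rapid Web Development with Ruby (Cookbooks (O'Reilly))",
    "tag" => "Sample", "author" => "Rob Orsini",
    "date" => Date.new(2017, 3, 17), "collection_count" => 0 },
  { "asin" => "B000000001", "title" => "Hypothetical Book A",
    "tag" => "", "author" => "Someone",
    "date" => Date.new(2016, 1, 5), "collection_count" => 0 },
  { "asin" => "B000000002", "title" => "Hypothetical Book B",
    "tag" => "", "author" => "Someone Else",
    "date" => Date.new(2017, 6, 21), "collection_count" => 1 }
].freeze

# Hypothetical helper: titles of non-sample books, newest first.
def purchased_titles(book_hashes)
  book_hashes
    .reject { |book| book["tag"] == "Sample" } # drop sample downloads
    .sort_by { |book| book["date"] }           # oldest first
    .reverse                                   # newest first
    .map { |book| book["title"] }
end

puts purchased_titles(BOOK_HASHES)
```

With real data, the same helper could be called as purchased_titles(books.map(&:to_hash)).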
In console
require 'kindle_manager'
client = KindleManager::Client.new(keep_cookie: true, verbose: true, limit: 10)
client.fetch_kindle_highlights
books = client.load_kindle_highlights
Example of data
console> pp books.first.to_hash
{"asin"=>"B004YW6M6G",
"title"=>
"Design Patterns in Ruby (Adobe Reader) (Addison-Wesley Professional Ruby Series)",
"author"=>"Russ Olsen",
"last_annotated_on"=>Wed, 21 Jun 2017,
"highlights_count"=>8,
"notes_count"=>7,
"highlights_and_notes"=>
[{"location"=>350,
"highlight"=>
"Design Patterns: Elements of Reusable Object-Oriented Software,",
"color"=>"orange",
"note"=>""},
{"location"=>351,
"highlight"=>"\"Gang of Four book\" (GoF)",
"color"=>"yellow",
"note"=>""},
{"location"=>356, "highlight"=>nil, "color"=>nil, "note"=>"note foo"},
...
{"location"=>385,
"highlight"=>nil,
"color"=>nil,
"note"=>"object oriented"}]}
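In the data above, entries whose "highlight" is nil are notes without highlighted text. A minimal sketch of turning such a hash into readable lines and counting highlights per color follows. It uses a hand-written hash trimmed to the fields it needs, so it runs without the gem; the helper names format_entries and color_counts are hypothetical.

```ruby
# Hand-written stand-in for books.first.to_hash, trimmed to the fields used here.
BOOK = {
  "title" => "Design Patterns in Ruby (Adobe Reader) (Addison-Wesley Professional Ruby Series)",
  "highlights_and_notes" => [
    { "location" => 350,
      "highlight" => "Design Patterns: Elements of Reusable Object-Oriented Software,",
      "color" => "orange", "note" => "" },
    { "location" => 351, "highlight" => "\"Gang of Four book\" (GoF)",
      "color" => "yellow", "note" => "" },
    { "location" => 356, "highlight" => nil, "color" => nil, "note" => "note foo" },
    { "location" => 385, "highlight" => nil, "color" => nil, "note" => "object oriented" }
  ]
}.freeze

# Hypothetical helper: one text line per entry, marking note-only entries.
def format_entries(book_hash)
  book_hash["highlights_and_notes"].map do |entry|
    text = entry["highlight"] || "(note) #{entry['note']}"
    "loc #{entry['location']}: #{text}"
  end
end

# Hypothetical helper: number of highlights per color, skipping note-only entries.
def color_counts(book_hash)
  book_hash["highlights_and_notes"].map { |entry| entry["color"] }.compact.tally
end

puts BOOK["title"]
puts format_entries(BOOK)
p color_counts(BOOK)
```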
Limit the number of books fetched: client = KindleManager::Client.new(limit: 100)
Change sleep duration on scrolling (default 3 seconds): client = KindleManager::Client.new(fetching_interval: 5)
Change max scroll attempts (default 20): client = KindleManager::Client.new(max_scroll_attempts: 30)
Use a new directory for downloads: create: true
Use Firefox as the browser: driver: :firefox
Pass login and password: login: 'xxx', password: 'yyy'
Output debug log: debug: true
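The options above are listed one at a time, but the earlier examples pass several in a single call. A minimal sketch of keeping them in one hash and overriding a few per run follows. It assumes the options can be combined freely. The names DEFAULT_OPTIONS, client_options and build_client are hypothetical, and the values are only examples.

```ruby
# Example values for options named in the list above.
DEFAULT_OPTIONS = {
  limit: 100,
  fetching_interval: 5,
  max_scroll_attempts: 30,
  debug: false
}.freeze

# Hypothetical helper: merge per-run overrides into the defaults.
def client_options(overrides = {})
  DEFAULT_OPTIONS.merge(overrides)
end

# Hypothetical helper: build a client from the merged options.
# The gem is required lazily so this file loads even where it is not installed.
def build_client(overrides = {})
  require 'kindle_manager'
  KindleManager::Client.new(**client_options(overrides))
end

p client_options(limit: 10, debug: true)
```

A call such as build_client(limit: 10) would then create a client for a short test run.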
- Limit the number of books fetched by date
Applications using this gem
- tsundoku 積読
- kindle_highlight app
- Let me know (create a pull request) if you create an app
After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/kyamaguchi/kindle_manager.
The gem is available as open source under the terms of the MIT License.