My Python scripts. "sorno" is just a brand name I use for my stuff; it's convenient to use as a package name instead of the usual "org.xxx".
The source code of the whole project is on GitHub: https://github.com/hermantai/sorno-py-scripts
PyPI page: https://pypi.python.org/pypi/sorno-py-scripts
All scripts support the "-h" or "--help" option for documentation. The documentation often lives in the script's __doc__, so take a look at that as well.
All scripts are prefixed with "sorno_" so that, when this suite is installed, they do not clash with Python itself or other binaries in the Scripts folder (on Windows) or /usr/local/bin (on *nix).
This project also includes the sorno library.
If you don't have Python installed, you first need to install it from https://www.python.org/downloads/. You need a version that is at least 2.7 but lower than 3.0.
For the following commands, add sudo in front of them if you get a permission error.
A Python package management system will make your life easier, so install pip by:
$ easy_install pip
If easy_install is not on your system, see https://pythonhosted.org/setuptools/easy_install.html#installing-easy-install for how to install it.
$ pip install sorno_py_scripts # note that the project name is in underscores, not dashes
$ easy_install sorno_py_scripts # note that the project name is in underscores, not dashes
You can install sorno-py-scripts from the source code by cloning the git repo:
$ git clone https://github.com/hermantai/sorno-py-scripts
Then cd to the sorno-py-scripts directory:
$ cd sorno-py-scripts
Install it:
$ python setup.py install
The scripts should be installed somewhere on your $PATH. On *nix systems, that is usually /usr/local/bin; on Windows, they should be in the Scripts directory under the Python installation.
Make sure that directory is in your PATH environment variable.
Then you can run the scripts by simply invoking them from the command line console ($ is the prompt of the console):
$ sorno_gtasks.py -h
Go to the directory containing the test.sh file, then run it:
$ ./test.sh
You can run tests only for the sorno library:
$ ./test_sorno.sh
Or tests only for the scripts:
$ ./test_scripts.sh
Use the -h or --help option to get more detailed documentation for each script.
A console alarm which uses the system bell as the alarm bell by default. You set how many seconds before the alarm goes off, not an absolute time in the future. After you respond to the bell (e.g. press "Enter" in the console after the system bell rings), it restarts the countdown and will ring again after the specified time. Use Control-C to exit the alarm completely.
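The core of the alarm is basically the following loop (a minimal sketch only, not the actual implementation of the script; the real command line interface may differ):

import sys
import time

def alarm_loop(seconds):
    # Ring the system bell after the given number of seconds, wait for the
    # user to acknowledge with Enter, then start the countdown over.
    # Control-C (KeyboardInterrupt) exits completely.
    try:
        while True:
            time.sleep(seconds)
            sys.stdout.write("\a")  # the system bell
            sys.stdout.flush()
            raw_input("Alarm! Press Enter to restart the countdown...")  # Python 2
    except KeyboardInterrupt:
        pass

if __name__ == "__main__":
    alarm_loop(int(sys.argv[1]))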
A script to scrape Amazon product reviews from the web page.
A script to scrape items from an Amazon wishlist. The script only works for wishlists that are "Public". You can change the setting by following the instructions at: http://www.amazon.com/gp/help/customer/display.html?nodeId=501094
Generate an appcache file to be used for html5 application cache of a web application. The goal is to make the whole web app cached, so the app can be run offline.
Attaches the actual time in human-readable format to timestamps found in incoming lines.
Example:
$ cat /tmp/abc
once upon a time 1455225387
there is 1455225387 something
called blah and 1455225387 then foo
$ cat /tmp/abc | python scripts/sorno_attach_realdate.py
once upon a time 1455225387(2016-02-11 13:16:27)
there is 1455225387(2016-02-11 13:16:27) something
called blah and 1455225387(2016-02-11 13:16:27) then foo
sorno_cloud_vision.py makes using the Google Cloud Vision API easier.
Doc: https://cloud.google.com/vision/docs
The script generates requests for the given photos, sends the requests to Cloud Vision, then puts the results into the corresponding response files.
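For reference, a single request/response round trip against the public REST endpoint looks roughly like the sketch below. It assumes the requests library, an API key, and label detection; the real script uses its own flags and credential handling.

import base64
import json

import requests  # third-party; assumed to be installed

VISION_URL = "https://vision.googleapis.com/v1/images:annotate"

def annotate_photo(photo_path, api_key):
    # Build a single annotate request asking for label detection.
    with open(photo_path, "rb") as f:
        content = base64.b64encode(f.read()).decode("utf-8")
    body = {
        "requests": [{
            "image": {"content": content},
            "features": [{"type": "LABEL_DETECTION", "maxResults": 5}],
        }],
    }
    resp = requests.post(VISION_URL, params={"key": api_key}, json=body)
    # Write the raw response next to the photo, mirroring the idea of
    # "corresponding response files".
    with open(photo_path + ".response.json", "w") as out:
        json.dump(resp.json(), out, indent=2)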
Compresses all photos in a directory by saving them in JPEG format at reduced quality.
Downloads the items from all links found at a URL.
Provides utilities to work with Dropbox just like the official Dropbox CLI (http://www.dropboxwiki.com/tips-and-tricks/using-the-official-dropbox-command-line-interface-cli), but as a script rather than a REPL. sorno_dropbox also has higher-level features like copying directories recursively.
Sends a simple plain-text email.
The script first tries to use the Mail Transfer Agent (MTA) configured on your system; otherwise, it prompts for a login to use the Gmail SMTP server.
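For reference, sending a plain-text message through the Gmail SMTP server from Python looks roughly like this (a sketch only; the sender address and password are placeholders, and the script's actual flags and MTA handling differ):

import smtplib
from email.mime.text import MIMEText

def send_plain_text(sender, recipient, subject, body, password):
    msg = MIMEText(body)
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # Gmail's SMTP server over SSL; a local MTA would be
    # smtplib.SMTP("localhost") instead.
    server = smtplib.SMTP_SSL("smtp.gmail.com", 465)
    server.login(sender, password)
    server.sendmail(sender, [recipient], msg.as_string())
    server.quit()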
Extracts Simon Property Group property information from its 10-K filings
Sample usage:
$ sorno_extract_spg_properties.py spg_10-k.html
If you get a UnicodeEncodeError, prefix your command with "PYTHONIOENCODING=UTF-8". E.g.:
$ PYTHONIOENCODING=UTF-8 sorno_extract_spg_properties.py html_file
Prints out a random fact for fun
Manages feeds stored in Feedly.
This script does not implement an oauth flow, so just get a developer token from https://developer.feedly.com/v3/developer to use this script.
Quickstart:
First, get a developer access token through https://developer.feedly.com/v3/developer, then set the environment variable SORNO_FEEDLY_ACCESS_TOKEN.
$ export SORNO_FEEDLY_ACCESS_TOKEN='YOUR ACCESS TOKEN HERE'
Print all categories:
$ sorno_feedly.py categories
Print all feeds:
$ sorno_feedly.py feeds
Print all entries, duplicated entries, and get prompted for marking duplicated entries as read:
$ sorno_feedly.py entries
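Under the hood these are plain HTTP calls against the Feedly cloud API. A minimal sketch of the categories call (assuming the requests library and the v3 endpoint from Feedly's docs; not the script's actual code):

import json
import os

import requests  # third-party; assumed to be installed

def print_categories():
    # Reads the token from the environment variable set above.
    token = os.environ["SORNO_FEEDLY_ACCESS_TOKEN"]
    resp = requests.get(
        "https://cloud.feedly.com/v3/categories",
        headers={"Authorization": "OAuth " + token},
    )
    resp.raise_for_status()
    for category in resp.json():
        print(json.dumps(category, indent=2))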
Demos the use of Google Cloud BigQuery
The script can be run to get the result of a query.
You need to get the json credentials file before using this script. See https://developers.google.com/identity/protocols/application-default-credentials#howtheywork.
Quickstart:
sorno_gcloud_bigquery.py --google-json-credentials <your-json-credentials-file> "SELECT author,text FROM [bigquery-public-data:hacker_news.comments] where text is not null LIMIT 10"
Reference: https://cloud.google.com/bigquery/create-simple-app-api#authorizing
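If you prefer a library over the script, roughly the same query can be issued with the google-cloud-bigquery client. This is a sketch that assumes application default credentials are already configured (the script itself takes the credentials file explicitly), and the table reference is rewritten in standard SQL syntax instead of the legacy [project:dataset.table] form:

from google.cloud import bigquery  # third-party; assumed to be installed

def run_query(sql):
    # Relies on application default credentials (see the link above).
    client = bigquery.Client()
    for row in client.query(sql).result():
        print(row)

run_query(
    "SELECT author, text "
    "FROM `bigquery-public-data.hacker_news.comments` "
    "WHERE text IS NOT NULL LIMIT 10"
)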
Demos the use of Google Cloud Pub/Sub.
The script can be run as a publisher or a subscriber for a Pub/Sub topic.
You need to get the json credentials file before using this script. See https://developers.google.com/identity/protocols/application-default-credentials#howtheywork.
Quickstart:
To run as a publisher:
sorno_gcloud_pubsub_demo.py --google-json-credentials <your-json-credentials-file> --publisher
To run as a subscriber:
sorno_gcloud_pubsub_demo.py --google-json-credentials <your-json-credentials-file> --subscriber
Reference: https://cloud.google.com/pubsub/configure
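For comparison, publishing a single message with the google-cloud-pubsub client library looks roughly like this (a sketch; the project id and topic name are placeholders and the script's own flags differ):

from google.cloud import pubsub_v1  # third-party; assumed to be installed

def publish_message(project_id, topic_name, data):
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_name)
    # publish() takes bytes and returns a future that resolves to the message id.
    future = publisher.publish(topic_path, data.encode("utf-8"))
    print("Published message id: %s" % future.result())

publish_message("my-project", "my-topic", "hello from the demo")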
A command line client for accessing Google Docs. The API doc used to implement it is in https://developers.google.com/drive/web/quickstart/quickstart-python
You can search for a file and download its content (only if it's a doc).
A command line client for Google Drive. The API doc used to implement this is in https://developers.google.com/drive/web/quickstart/quickstart-python
Currently, you can upload files with the script.
Gets the remote location of a local file/directory from a local git repository.
Often, you want to treat multiple lines as one chunk and check whether the chunk matches a regex; if it does, you want to print the whole chunk instead of only the line that matches. sorno_grepchunks lets you define what a chunk is by giving a chunk-starting regex: all the lines from a line that matches that regex up to (but not including) the next match are treated as one chunk. You can then apply another regex to match against each chunk.
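In other words, the logic is roughly the following (a simplified sketch, ignoring the script's actual options and input handling):

import re

def grep_chunks(lines, chunk_start_regex, match_regex):
    # Group lines into chunks: a chunk starts at a line matching
    # chunk_start_regex and ends right before the next such line.
    chunks, current = [], []
    for line in lines:
        if re.search(chunk_start_regex, line) and current:
            chunks.append(current)
            current = []
        current.append(line)
    if current:
        chunks.append(current)
    # Print every chunk whose combined text matches match_regex.
    for chunk in chunks:
        text = "".join(chunk)
        if re.search(match_regex, text):
            print(text)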
A command line client for Google Tasks.
Prints the class dependency graph given a bunch of java source files.
Joins the mall information from different CSV files.
The first line of each CSV file should be the headers, and one of the headers should be "Name".
Sample run:
sorno_join_malls_info_in_csv.py --columns-kept-last "Total Mall Store GLA" *.csv
sorno_ls.py is just like the Unix "ls" command
Merges PDFs.
A script to prompt for choosing items generated from different sources, then print those items out. For example, if you have a script to generate common directories that you use, e.g. gen-fav-dir.sh, you can put the following in your .bashrc, assuming sorno_pick.py and gen-fav-dir.sh are in your PATH:
$ alias cdf='cd $(sorno_pick.py -c gen-fav-dir.sh)'
Then you can just type:
$ cdf
And you will be given a list of directories to "cd" to.
P.S. You probably want to set the alias to the following:
$ alias cdf='tmp="cd $(sorno_pick.py -c gen-fav-dir.sh)";history -s "$tmp";$tmp'
This ensures the history is inserted in a useful way, e.g. when you run "history", you see the actual command instead of just "cdf".
Downloads podcasts given a feed url.
The downloaded podcasts have useful file names (e.g. they contain the title of the podcast and are prefixed with the published date).
Converts the text format of protobufs to a Python dict.
The script launches ipython for you to play with the parsed python dict.
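The conversion itself can be sketched with the protobuf library as below; my_module_pb2 and MyMessage are placeholders for your own compiled proto definitions, and the script's actual flow may differ:

from google.protobuf import text_format
from google.protobuf.json_format import MessageToDict

import my_module_pb2  # placeholder: your own compiled proto module

def textproto_to_dict(path):
    message = my_module_pb2.MyMessage()  # placeholder message type
    with open(path) as f:
        text_format.Parse(f.read(), message)
    return MessageToDict(message)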
Prints the human readable date for timestamps
Example:
$ sorno_realdate.py 1455223642 1455223642000 1455223642000000 1455223642000000000
1455223642: 2016-02-11 12:47:22-0800 PST in s
1455223642000: 2016-02-11 12:47:22-0800 PST in ms
1455223642000000: 2016-02-11 12:47:22-0800 PST in us
1455223642000000000: 2016-02-11 12:47:22-0800 PST in ns
Saves images with reduced sizes.
Reduces the sizes of all images in a directory and its subdirectories by saving them with lower quality jpg format. The directory structure is preserved but the new directory is created with a timestamp suffix.
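The per-image step is essentially a re-save with Pillow at a lower JPEG quality, roughly like the sketch below (the quality value and destination handling here are only illustrative):

import os

from PIL import Image  # Pillow; assumed to be installed

def save_reduced(src_path, dest_path, quality=60):
    img = Image.open(src_path)
    # JPEG cannot store an alpha channel, so convert first if needed.
    if img.mode != "RGB":
        img = img.convert("RGB")
    dest_dir = os.path.dirname(dest_path)
    if dest_dir and not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    img.save(dest_path, "JPEG", quality=quality)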
sorno_rename.py renames files given a regex that matches the names of the existing files; backreferences can be used in the filenames to rename to.
Replaces constants with literal values for a thrift file except for the declaration. This is mainly for thrift compilers which cannot handle constants within lists or other collection structures.
Scrapes the 1000 pegs from http://www.rememberg.com/Peg-list-1000/
Fills up disk space with a specified amount of garbage data.
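A minimal version of the idea (a sketch that assumes a size given in megabytes; the script's actual interface may differ):

def fill_disk(path, megabytes):
    # Write the requested amount of zero bytes in 1 MB chunks.
    chunk = b"\0" * (1024 * 1024)
    with open(path, "wb") as f:
        for _ in range(megabytes):
            f.write(chunk)

fill_disk("/tmp/garbage.bin", 100)  # 100 MB of filler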
Gets stock quotes and other information for stock symbols.
The script can print real-time or close to real-time stock quotes, historical quotes, and also fundamental ratios for the stock (company).
Prints a summary of a code file.
It presents the layout of the code so it can be read easily. Currently it only supports Python files.
Prints the top files in terms of size.
Prints the largest files under a directory and its subdirectories.
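Conceptually it boils down to walking the tree and sorting by file size, roughly like this sketch:

import os

def top_files_by_size(root, n=10):
    sizes = []
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                sizes.append((os.path.getsize(path), path))
    for size, path in sorted(sizes, reverse=True)[:n]:
        print("%12d  %s" % (size, path))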
Batch-posts tweets on Twitter.
Before using the script, go to https://dev.twitter.com/oauth/overview/application-owner-access-tokens to get the necessary credentials.
Use Google Doc to edit your tweets, one line per tweet. You should not use naked links (i.e. each link should be associated with some text). Then "File" -> "Download as" -> "Web Page (.html zipped)".
Unzip the downloaded file. Then run the following command with the appropriate parameters. path_to_file should be the path to the html file you unzipped.
$ sorno_twitter_post_tweets.py --consumer-key consumer_key --consumer-secret consumer_secret --access-token-key access_token_key --access-token-secret access_token_secret --parse-tweets-from-file path_to_file
The script prints each tweet, and asks if you want to post the tweet indicated by "Tweet preview". Enter "y" if you want it posted, "n" otherwise.
For scripts like "sorno_gdoc.py", "sorno_gdrive.py" and "sorno_gtasks.py", a Google App project is required to account for the quota of using the API. You need to get an OAuth2 client id and secret for your Google App project, then export them as environment variables "GOOGLE_APP_PROJECT_CLIENT_ID" and "GOOGLE_APP_PROJECT_CLIENT_SECRET" respectively (replace "xxx" and "yyy" with your actual values) before running the script:
export GOOGLE_APP_PROJECT_CLIENT_ID='xxx'
export GOOGLE_APP_PROJECT_CLIENT_SECRET='yyy'
You probably want to put the two lines above in your bashrc file.
You can get the oauth2 client id and secret by the following steps:
- Choose a Google App project or create a new one in https://console.developers.google.com/project
- After you have chosen a Google App project, go to the "APIs & auth" tab on the left.
- Click on the APIs subtab, and search for the API needed for the script you want to use. The help page of the script tells you what API your project needs. For example, sorno_gtasks.py needs the Tasks API with the scope 'https://www.googleapis.com/auth/tasks'. Enable it.
- Go to the "Credentials" subtab, click "Add credentials", choose "OAuth 2.0 client ID", enter some information on the OAuth consent screen if prompted. In that screen, only email address and product name are required to be filled out. For the Application type, choose "Other".
- After the credential is created, click on it and you should see your Client ID and Client secret there.
If you get an import error when running the script, make sure you have the newest Google API Client Library for Python. You can find the installation instructions here: https://developers.google.com/api-client-library/python/start/installation
A sample script template can be obtained from python/script_template.py in https://github.com/hermantai/samples.
You can run the unit tests in the scripts/tests directory. First, set up the testing environment by running:
$ source setup_test_env.sh
If you have installed sorno-py-scripts on your machine, the sorno library from that installation is used instead of your local changes, because easy-install messes with the search path. In that case you need to either remove the egg manually or bump the version and install your local changes to override the existing version.
Then you can run individual unit tests with:
$ python scripts/tests/test_xxx.py
The only deployment destinations for now are GitHub and PyPI. On GitHub, this project resides in the sorno-py-scripts repository: https://github.com/hermantai/sorno-py-scripts
To deploy to PyPI, first install twine:
$ pip install twine
Then you can use the script to deploy to PyPI:
$ ./pypi_deploy_with_twine.sh
Use sudo if you encounter permission issues when running the commands.
Use the following if you get an error saying twine cannot be found even though twine is on your PATH:
$ sudo env "PATH=$PATH" ./pypi_deploy_with_twine.sh
If twine does not work, use the old school:
$ ./pypi_deploy.sh