Skip to content

Latest commit

 

History

History
19 lines (12 loc) · 784 Bytes

README.md

File metadata and controls

19 lines (12 loc) · 784 Bytes

nutrition-data

The USDA provides a database of nutritional data, which is useful for food related software. It is structured as a zip of flat files and a PDF that describes the format.

This is a quick and dirty python script that converts that zip file to a sqlite database. To avoid hard-coding schemas, I try to parse the schema from the pdftotext output of the documentation.

To run this you will need the requests and sqlite3 python modules and the pdftotext executable.

The resulting database uses text fields for NDB_NO, per the specification in the PDF file. If you don't need an exact match for the upstream identifiers, which have leading zeros, you could convert these fields to integers.