Skip to content

A simple program to read filled PDF forms in a directory and write a CSV file with the field names as the headers and the form information as the values. The CSV file can readily be imported into MS Excel or Google Sheets

License

Notifications You must be signed in to change notification settings

acaird/pdfform2csv

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pdfform2csv: Turn Filled PDF Forms into a CSV file

This is a small tool that lets you choose a directory that contains filled PDF Form files, reads those files, and writes a CSV file with the form data from each file. The CSV file can then be opened using Microsoft Excel, Google Sheets, Apache OpenOffice or Libre Office Calc, or other spreadsheet or CSV reader programs.

The CSV File

The CSV file will have the format:

FileField 1Field 2Field 3etc
file01.pdfvalue 1value 2etc
file02.pdfvalue 1value 2etc
file03.pdfvalue 1value 2etc
data01.pdfvalue 1value 3etc

Where each of the Fields is taken from the name of the form field in the PDF file, and each of the values is taken from whatever the person filling the PDF file entered. It is not necessary that the PDF files all have the same form fields; that will results in some columns having data and others not.

The Graphical User Interface

If you double-click the program, it will start with a short welcome message; pressing OK on that will then show a file browser where you should choose the directory (or folder) that contains the filled PDF form files. After you choose it the CSV summary file will be written to the same directory (folder) as the PDF files, and a dialog box showing you the path of the summary file will be shown.

It looks like this:

MacOSWindows
images/ss-welcome-mac.pngimages/ss-welcome-win.png
images/ss-filedialog-mac.pngimages/ss-filedialog-win.png
images/ss-results-mac.pngimages/ss-results-win.png

The Command-line Interface

You can also run the same program from the Windows Command Prompt or the MacOS Terminal.app by typing the name of the program followed by the path to a directory (or folder) containing the PDF form files.

Logging and Errors

Everytime you run the program, it will write a file to $TMPDIR or $TMP or /tmp or /temp called pdfform2csv.log. If there are problems or you need help, look at that file, or use it to open an issue.

Development Notes

Building

Make a directory (or folder) called bin, first.

For MacOS

  • Build the binary
    	 go build -o pdfform2csv main.go
        
  • Make the application bundle structure
    	 mkdir -p bin/pdfform2csv/Contents/MacOS
    	 mkdir -p bin/pdfform2csv/Contents/Resources
        
  • Copy the binary pdfform2csv to bin/pdfform2csv/Contents/MacOS
  • Run python support/makeiconset.py support/pdfform2csv.png and copy the resulting icns file to bin/pdfform2csv/Contents/Resources
  • Put the file Info.plist from the support directory in bin/pdfform2csv/Contents/; the contents of that file are something like:
    	   <?xml version="1.0" encoding="UTF-8"?>
    	   <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    	   <plist version="1.0">
    	   <dict>
    		   <key>CFBundleExecutable</key>
    		   <string>pdfform2csv</string>
    		   <key>CFBundleIconFile</key>
    		   <string>pdfform2csv.icns</string>
    		   <key>CFBundleIdentifier</key>
    		   <string>io.github.acaird</string>
    		   <key>NSHighResolutionCapable</key>
    		   <true/>
    		   <key>LSUIElement</key>
    		   <true/>
    	   </dict>
    	   </plist>
        

For Windows

  • edit the data in the winres/ directory if needed; replacing the icon files also, if needed
  • Using go-winres make the syso files
  • build the binary
    GOOS=windows GOARCH=amd64 go build -ldflags -H=windowsgui -i -o bin/pdfform2csv.exe main.go
        

Generic Notes

  • Packaging a Go Application for MacOS:

Actions

Note that the icons tend not to work at all. All of those steps to make them seem to be a waste of time. 🤷

About

A simple program to read filled PDF forms in a directory and write a CSV file with the field names as the headers and the form information as the values. The CSV file can readily be imported into MS Excel or Google Sheets

Resources

License

Stars

Watchers

Forks

Packages

No packages published