Skip to content

csvs (CSV Sql) is a command-line tool that simplifies working with CSV or TSV files by enabling SQL queries through an embedded SQLite engine. Binaries for macOS, Windows & Linux.

License

Notifications You must be signed in to change notification settings

koma-private/csvs

Repository files navigation

csvs

csvs (CSV Sql) is a command-line tool that simplifies working with CSV or TSV files by enabling SQL queries through an embedded SQLite engine. It is ideal for data analysts and developers who need SQL's flexibility to manage text-based data efficiently.

GitHub Release GitHub Tag Crates.io Version

Banner of executing csvs Banner of interactive mode Banner of interactive mode


Features

SQL Power for CSV Files

Run advanced SQL queries, including JOIN, GROUP BY, SUM(), or COUNT() on CSV data. Gain unparalleled flexibility to query, filter, sort, group, and combine data compared to traditional spreadsheet tools. csvs also supports regular expressions in SQL queries. Refer to Regular Expressions Document.

Automatic Encoding Detection

Eliminate encoding issues with automatic detection of character encodings. Avoid garbled text and broken queries effortlessly.

Decide Data Type for Each Column

csvs scans CSV rows to determine the most appropriate data types for SQLite tables. This dynamic analysis ensures compatibility and precision, even when handling nullable fields.

See Decide Data Type for Each Column and Validating Number Document for details.

Multi-File Handling

Combine data from multiple CSV or TSV files by creating a temporary SQLite database using --in-file. Easily perform SQL joins across files in seconds.

Common use cases:

  • Merging datasets from separate files.
  • Cross-referencing data using SQL joins.

Customizable Output

Export query results as:

  • CSV or TSV: Ideal for data sharing or further processing.
  • SQLite Database: Retain results as .db files for future queries.

Control delimiters, headers, and quoting styles to suit your needs.

Interactive Mode

Explore datasets interactively without specifying queries upfront. Features include:

  • Browsing imported tables.
  • Ad-hoc query execution.
  • Previewing and saving query results.

Multi-Statement Query Support

Execute multiple SQL statements in a single command. Transform and query data across multiple steps, with only the final result displayed.


Usage

  • Display help:
csvs --help

Interactive Mode

Start csvs in interactive mode when neither --query nor --source is specified. This mode allows you to:

  • View imported tables.
  • Preview table content.
  • Save query results to files interactively.

See Interactive Mode Guide

Animation of interactive mode

Command Options

See Command Options Guide

Animation of executing csvs


Example

  • Display version:
csvs --version
  • Select specific fields from ./data/address.csv and save the results to picked.csv:
csvs -i ./data/address.csv -q 'SELECT "city","town","phone" FROM "address.csv"' -o picked.csv
  • Process CSV data from STDIN:
csvs -q 'SELECT "city","town","phone" FROM "stdin"' < ./data/address.csv > picked.csv

or

cat ./data/address.csv | csvs -q 'SELECT "city","town","phone" FROM "stdin"' > picked.csv
  • Perform SQL joins across multiple files:
csvs -i ./left.csv -i ./right.tsv -q 'SELECT * FROM "left.csv" AS l JOIN "right.tsv" AS r ON l."name"=r."name"'
  • Leverage SQLite functions like UPPER(), COUNT(), etc., and export results to a SQLite database:
csvs -i people.csv -q 'SELECT "city",COUNT(*) FROM "people.csv" GROUP BY "city" ORDER BY COUNT(*) DESC' --out-database out.db
  • Start in interactive mode:
csvs -i MOCK_DATA.csv

SQL Query Notes

Mapping CSVs to Table Names

File names provided with --in-file map directly to SQLite tables (e.g. ./sample/address.csv becomes "address.csv", and data.2024.csv becomes "data.2024.csv").

Quoting Columns with Special Characters

Columns or table names with spaces, punctuation, or reserved words must be quoted. Example:

SELECT "first name", "last name"
FROM "contacts.csv"

--in-no-header Option

If specified, column names default to "c1", "c2", "c3", etc., for header-less CSV files.

Execute Multiple Statements in a Single Query

Separate SQL statements with semicolons to execute multiple queries in sequence. Only the result of the final query is displayed.

Example:

SELECT "first name"
FROM "contacts.csv";
SELECT "age"
FROM "contacts.csv"; 

Error Handling

See Error Handling Guide


Building csvs

See Build Guide


Limitations

  • Interactive mode cannot be invoked when CSV data is provided via STDIN. Use --in-file to specify CSV files instead.
  • Large files may require significant RAM since csvs loads entire files into memory when --out-database is not specified.
  • CSV files with names starting with sqlite_ cannot be used with --in-file due to SQLite's reserved naming convention.

Acknowledgments

csvs relies on the following open-source projects:

License

csvs is licensed under the MIT license.

About

csvs (CSV Sql) is a command-line tool that simplifies working with CSV or TSV files by enabling SQL queries through an embedded SQLite engine. Binaries for macOS, Windows & Linux.

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages