Lowest Common Denominator I/O

"Everything is a list of dictionaries!"

LCDIO lets you read through the records in the same way for a variety of file formats:

csv
tsv
json
jsonl
parquet
field-separated text
SQLite
toml
yaml

Usage

For files with named columns: The file is a list of dictionaries (key is the column name).

import lcdio

file = lcdio.open('testdata/planets.parquet')
for row in file:
  print(f'Planet {row["name"]} is {row["distance"]} light-seconds away from the sun.')

For files without named columns: The file is a list of dictionaries (key is the column number).

import lcdio

file = lcdio.open('testdata/planets.csv')
for row in file:
  print(f'Planet {row[0]} is {row[1]} light-seconds away from the sun.')

Additional features

The rows returned by LCDIO are dict-like, but they have a few other features:

You can read multiple columns at a time (with slicing):

import lcdio

file = lcdio.open('testdata/planets.csv')
for row in file:
  print(f'The first two columns are {row[0:2]}')

This includes all the slicing syntax, such as skipping every other item (row[::2]), skipping the first one (row[1:]), skipping the last one (row[:-1]) etc.

You can use column index (and slices) even when the columns are named:

import lcdio

file = lcdio.open('testdata/planets.parquet')
for row in file:
  print(f'Planet {row[0]}')

If the row contains an array, you can access it by adding an argument:

>>> row['days_in_office']
['monday', 'wednesday']
>>> row['days_in_office', 0]
'monday'

If the row contains a JSON object, you can access members by adding arguments:

>>> row[0]
{'age': 30, 'secrets': {'password': 'foo', 'closet': '2 skeletons'}}
>>> row[0, 'secrets', 'password']
'foo'

Philosophy

This is not about making the most performant or the most featureful reader for these formats.

What this is about is making the library that is the easiest to use for reading a bunch of formats. You don't need to remember the names of all the specific libraries to import. You don't need to remember each of their syntax. You only need to remember one thing: "everything is a list of dictionaries."

Other options

the open method has a mode argument to tell it what format to read the file as (if it can't be guessed from the file extension, say), and a has_header argument to flag whether the csv has a header row or not.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
testdata		testdata
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
lcdio.py		lcdio.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Lowest Common Denominator I/O

Usage

Additional features

Philosophy

Other options

About

Releases

Packages

Languages

License

jean-philippe-martin/lcdio

Folders and files

Latest commit

History

Repository files navigation

Lowest Common Denominator I/O

Usage

Additional features

Philosophy

Other options

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages