Revamped data type sniffing for CSV/TSV files #260

Open
wants to merge 3 commits into
base: master

Commits on Dec 8, 2022

  1. Highlights of the changes from

    https://gitlab.com/hjhornbeck/datapusher/-/commits/master
    
    Added pandas to the package requirements.
    Allowed PostgreSQL types to pass through to the underlying database.
    Wrote a routine to sanitize column names for CKAN/PostgreSQL.
    Added support for TSV files.
    Added code to drop a column pandas tends to add if there's a column with no name.
    Added automatically-generated descriptions for all columns in pandas_sniff_algorithm().
    Added a global variable that controls how descriptions are generated for text fields; this should allow category information to be stored in a machine-readable form that is still human-readable.
    Added dummy descriptions to make up for the lack of them in old_sniff_algorithm().
    
    Plus, all that has been synced up with some more recent commits.
    hjhornbeck committed Dec 8, 2022
    9e69458
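
The commit above mentions a routine that sanitizes column names for CKAN/PostgreSQL, but the routine itself is not shown here. The following is only a minimal sketch of what such a step might look like, not the code from this pull request. The function name `sanitize_column_names` and the fallback name `column_<n>` are hypothetical. The rules it applies are assumptions drawn from general behaviour: PostgreSQL truncates identifiers to 63 bytes by default, and CKAN's DataStore is understood to reject field names that are empty, start with an underscore, or contain a double quote.

```python
import re

# PostgreSQL's default identifier limit (NAMEDATALEN - 1), in bytes.
MAX_IDENTIFIER_BYTES = 63


def _truncate(name, limit):
    """Cut a name to at most `limit` UTF-8 bytes without splitting a character."""
    return name.encode("utf-8")[:limit].decode("utf-8", errors="ignore")


def sanitize_column_names(names):
    """Return cleaned, unique column names (hypothetical helper, not from the PR)."""
    seen = set()
    cleaned = []
    for position, raw in enumerate(names):
        name = str(raw).strip()
        # Replace runs of double quotes and whitespace with one underscore.
        name = re.sub(r'["\s]+', "_", name)
        # Assumed CKAN rule: no leading underscore. Trailing ones are
        # trimmed as well, purely for tidiness.
        name = name.strip("_")
        if not name:
            # Hypothetical fallback for empty headers.
            name = f"column_{position + 1}"
        name = _truncate(name, MAX_IDENTIFIER_BYTES)
        # Append a numeric suffix until the name is unique.
        candidate = name
        suffix = 2
        while candidate in seen:
            tail = f"_{suffix}"
            candidate = _truncate(name, MAX_IDENTIFIER_BYTES - len(tail)) + tail
            suffix += 1
        seen.add(candidate)
        cleaned.append(candidate)
    return cleaned
```

With pandas, the result could be assigned back with `df.columns = sanitize_column_names(df.columns)` before the column definitions are sent to the DataStore.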

Commits on Dec 9, 2022

  1. Update jobs.py

    Minor tweak: an unnamed first column is almost certainly an index column, but an unnamed column in any other position may be an artifact of a poorly made header and does not warrant automatic deletion. The code now distinguishes between these two cases.
    hjhornbeck authored Dec 9, 2022
    051c65c
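
The distinction described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the actual change to jobs.py: the helper name `split_unnamed_columns` is hypothetical, and the sketch relies on pandas' convention of labelling a column with an empty header cell as `Unnamed: <position>` when reading a CSV. It works on a plain list of labels, so it can be tried without pandas installed.

```python
import re

# pandas.read_csv labels a column whose header cell is empty as
# "Unnamed: <position>".
UNNAMED_PATTERN = re.compile(r"^Unnamed: \d+$")


def split_unnamed_columns(columns):
    """Separate unnamed columns into (to_drop, to_review).

    Hypothetical helper: an unnamed column in position 0 is treated as a
    leftover index and marked for dropping; unnamed columns elsewhere are
    only reported, since they may come from a malformed header.
    """
    to_drop, to_review = [], []
    for position, label in enumerate(columns):
        if not UNNAMED_PATTERN.match(str(label)):
            continue
        if position == 0:
            to_drop.append(label)
        else:
            to_review.append(label)
    return to_drop, to_review


# Possible use with a DataFrame `df` (assumes pandas is available):
#   to_drop, to_review = split_unnamed_columns(list(df.columns))
#   df = df.drop(columns=to_drop)
```

Keeping the decision separate from the actual drop makes it easy to log or surface the columns in `to_review` instead of silently discarding data.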

Commits on Jun 19, 2023

  1. Merge pull request #1 from ckan/master

    Updating datapusher to 0.0.20.
    hjhornbeck authored Jun 19, 2023
    81efcf3