Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix filename syntax in datapkg_to_sqlite #870

Closed
grgmiller opened this issue Dec 30, 2020 · 1 comment
Closed

fix filename syntax in datapkg_to_sqlite #870

grgmiller opened this issue Dec 30, 2020 · 1 comment
Labels
windows Issues specific to the Microsoft Windows operating system wontfix

Comments

@grgmiller
Copy link
Collaborator

I'm running the dev environment on my Windows 10 machine.

When running the datapkg_to_sqllite module, an OSError is raised when the script is checking if the sqllite database exists on line 154:

(pudl-dev) C:\Users\Greg\pudl_workspace>datapkg_to_sqlite datapkg/to_sqlite/eia/datapackage.json
2020-12-29 16:43:48 [    INFO] pudl:150 pudl_in=C:\Users\Greg\pudl_workspace
2020-12-29 16:43:48 [    INFO] pudl:151 pudl_out=C:\Users\Greg\pudl_workspace
Traceback (most recent call last):
  File "C:\Users\Greg\Anaconda3\envs\pudl-dev\Scripts\datapkg_to_sqlite-script.py", line 33, in <module>
    sys.exit(load_entry_point('catalystcoop.pudl', 'console_scripts', 'datapkg_to_sqlite')())
  File "c:\users\greg\github\pudl\src\pudl\convert\datapkg_to_sqlite.py", line 154, in main
    if not args.clobber and pathlib.Path(pudl_settings["pudl_db"]).exists():
  File "C:\Users\Greg\Anaconda3\envs\pudl-dev\lib\pathlib.py", line 1391, in exists
    self.stat()
  File "C:\Users\Greg\Anaconda3\envs\pudl-dev\lib\pathlib.py", line 1197, in stat
    return self._accessor.stat(self)
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: 'sqlite:\\C:\\Users\\Greg\\pudl_workspace\\sqlite\\pudl.sqlite'

If we look at the code at line 154, it is calling pathlib.Path(pudl_settings["pudl_db"]).exists():

pudl_settings["pudl_db"] is the following string: 'sqlite:///C:\\Users\\Greg\\pudl_workspace\\sqlite\\pudl.sqlite', which becomes 'sqlite:\\C:\\Users\\Greg\\pudl_workspace\\sqlite\\pudl.sqlite' when interpretedy by pathlib.

However, when you call .exists() it throws the OSError because of the sqlite:\\ prefix on the filepath.

I tried to fix this issue by editing line 154 in datapkg_to_sqllite.py to:

if not args.clobber and pathlib.Path(pudl_settings["pudl_db"].replace('sqlite:///','')).exists():

Which seemed to fix the issue, and allowed me to successfully create the database.

I could create a pull request for this change, although I was not sure what branch to merge this into. I'm also not sure if this edit would work on other OS's.

@zaneselvans
Copy link
Member

Hmm, yeah that looks like a bug. The sqlite:// prefix only makes sense if we're trying to make a database connection with SQLAlchemy. It seems weird that this ever worked, but maybe Path on unix systems knows how to tell the difference? I also kind of suspect that fixing this particular issue won't make it work for you since the database URL will still be a mix of windows and unix path elements, unless SQLAlchemy knows how to manage that. We really need to get the Docker image for local usage created.

zaneselvans added a commit that referenced this issue Jan 19, 2021
Attempts to address issue #870 which notes we're checking for the
existence of an SQLAlchemy database URL, not a file path, when looking
to see if we're clobbering an existing PUDL database.

Also fixes a bug in which the special case of doing "ALL" states in the
EPA CEMS processing was failing, since they were listed as dictionary
keys, instead of a list or tuple.
@zaneselvans zaneselvans added the windows Issues specific to the Microsoft Windows operating system label Aug 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
windows Issues specific to the Microsoft Windows operating system wontfix
Projects
None yet
Development

No branches or pull requests

3 participants