Increase input files extensions #100
Comments
Hi,
I imagine that by "adding more extensions" you mean not only accepting different extensions in the filename, but also correctly parsing these other data file formats, such as comma-separated values. If that is the case and you are willing to implement it, I say go ahead :) (I would try to review it.)
It would be a nice addition, especially considering that most people are used to working with .csv and not with tab-separated values.
Yes. You are right. I meant to parse different file formats. I will do the
pull request :)
(In reply to Benjamin Maluenda's comment above, sent Tue, Nov 28, 2017, 9:37 AM.)
--
Pedro Andrés Sánchez Pérez
|
Thanks Benjamin for being a better communicator than me! :)

The use of .tab files was one of the more established paths in the pyomo codebase, and is a hold-over from their initial desire to match the AMPL file formats, although the formats are increasingly diverging. The pyomo DataPortal interface seemed the most flexible for our purposes, but it didn't have everything we wanted, and we've already written a wrapper around it to provide more features. It can be slow at parsing and assembling data, so it's not ideal, but it's also code we don't have to maintain.

If you can write support for csv input files, that seems dandy; there's a chance DataPortal already supports it in an undocumented way, but I haven't looked into that yet. As far as .tsv files go, I think they have the same conventions as a .tab file, but a different extension. If I'm correct, then you won't have to write any new code to support tsv; just pass a different file name to DataPortal.load().

If you have to write new code to implement support in general, I'd suggest using pandas for reading files from disk, then stuffing the data into the DataPortal dictionary. Pandas works efficiently with a wide variety of file formats, is fairly well known, and is maintained and expanded by a broad community. We'll need a little bit of glue code to link pandas to DataPortal, but that shouldn't be too lengthy or difficult to maintain. If you look under the hood, a DataPortal object has a massive nested dictionary that stores everything it has read in, and I've read some Pyomo documentation saying you can add to that dictionary directly as long as you follow their conventions. I think some of the code for parsing partial load heat rates already manipulates that dictionary directly.

Best of luck.

https://software.sandia.gov/downloads/pub/pyomo/PyomoOnlineDocs.html#_data_input
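To make the "same conventions, different extension" point concrete, here is a minimal sketch of choosing a delimiter from the file extension and reading a table into nested dictionaries. It uses the standard-library csv module to stay dependency-free; `pandas.read_csv(path, sep=...)` could fill the same role. The helper names (`delimiter_for`, `read_columns`), the extension mapping, and the sample column names are assumptions made for illustration, not Switch or Pyomo code, and the sketch does not touch DataPortal itself.

```python
import csv
import io
import os

# Assumed mapping from file extension to delimiter; .tab and .tsv are
# both treated as tab-separated here.
DELIMITERS = {".csv": ",", ".tab": "\t", ".tsv": "\t"}


def delimiter_for(path):
    """Pick a delimiter from the file extension (hypothetical helper)."""
    ext = os.path.splitext(path)[1].lower()
    try:
        return DELIMITERS[ext]
    except KeyError:
        raise ValueError(f"unsupported input file extension: {ext!r}")


def read_columns(stream, delimiter, index_col):
    """Return {column_name: {index_value: cell}} for every non-index column."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    columns = {}
    for row in reader:
        key = row[index_col]
        for name, cell in row.items():
            if name != index_col:
                columns.setdefault(name, {})[key] = cell
    return columns


# In-memory stand-ins for files on disk, so the sketch runs as-is.
csv_text = "PROJECT,capacity_mw\nwind_1,50\nsolar_1,20\n"
tsv_text = csv_text.replace(",", "\t")

from_csv = read_columns(io.StringIO(csv_text), delimiter_for("projects.csv"), "PROJECT")
from_tsv = read_columns(io.StringIO(tsv_text), delimiter_for("projects.tsv"), "PROJECT")
```

Both calls yield the same nested dictionary, which is the kind of intermediate structure the glue code would then hand to the data-loading layer. Cells are left as strings; type conversion is deliberately out of scope here.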
I think we could pretty easily support any file format that Pyomo allows. I'm not sure if .tsv is on that list. But I would actually be more in favor of just standardizing on .csv for both input and output. A few reasons for this:
As an amendment to my first point: I'm working on code to allow users to specify aliases for any input file from the command line (or in scenarios.txt or options.txt), e.g., |
@mfripp I wrote a quick patch to support csv files (in addition to tab) (02aa13d), but didn't change the rest of the code. Not sure if that commit would be better off in the 2.0.1 branch or master. That idea could be extended to support xlsx; it just needs to customize the header parsing code. Supporting other file formats (or direct DB connections) would require more thought for how to allow optional columns. Good points; a few comments:
If we had reason to stick with tab-separated-value, our best bet might be to write a new pyomo data plugin called tsv_table.py that was almost identical to csv, but with a different separator. Then submit a pull request. FTR, Pyomo's DataPortal now supports way more data formats than when we first wrote data loading code:
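The "almost identical to csv, but with a different separator" idea can be sketched as a one-attribute subclass. The classes below are hypothetical stand-ins written for illustration; they do not use or imitate Pyomo's actual data plugin interface, which a real tsv_table.py would have to follow.

```python
import csv
import io


class CSVTable:
    """Hypothetical stand-in for a comma-separated table reader."""

    delimiter = ","

    def read(self, stream):
        # Return (header, rows), where each row is a list of strings.
        reader = csv.reader(stream, delimiter=self.delimiter)
        header = next(reader)
        return header, list(reader)


class TSVTable(CSVTable):
    """Same reader; only the separator differs."""

    delimiter = "\t"


header, rows = TSVTable().read(io.StringIO("ZONE\tpeak_mw\nnorth\t120\n"))
```

Keeping the separator as the only difference means any fix to the parsing logic applies to both formats.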
Also worth noting is documentation & official support for skipping DataPortal and directly using Python dictionaries. |
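As a rough illustration of the dictionary route, the sketch below builds the nested `{None: {...}}` structure that Pyomo documents for passing data to `create_instance`, starting from CSV text. The function name `to_pyomo_data`, the set and parameter names, and the sample values are assumptions made for this example; the final Pyomo call is left as a comment so the block runs without Pyomo installed.

```python
import csv
import io


def to_pyomo_data(stream, set_name, index_col, param_cols):
    """Build a {None: {...}} data dictionary from a CSV stream."""
    reader = csv.DictReader(stream)
    members = []
    params = {name: {} for name in param_cols}
    for row in reader:
        key = row[index_col]
        members.append(key)
        for name in param_cols:
            params[name][key] = float(row[name])
    # Set members are stored under the key None; parameters are keyed by index.
    data = {set_name: {None: members}}
    data.update(params)
    # The outer None key is the default namespace.
    return {None: data}


text = "ZONE,peak_mw\nnorth,120\nsouth,80.5\n"
data = to_pyomo_data(io.StringIO(text), "ZONES", "ZONE", ["peak_mw"])

# With Pyomo installed, an AbstractModel that declares a Set named ZONES and a
# Param named peak_mw indexed by ZONES could then be instantiated with:
#     instance = model.create_instance(data)
```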
Release 2.0.5 transitions all input & output files to .csv. Well, all outputs except the trivial total_cost.txt that stores a single number and the results.pickle file that stores the solution in binary format.

There's another option

Any developer who wishes to use other input file formats for their modules may write new modules that use any allowed DataPortal format via standard calls to

@pesap Does this address your issue? Please reply in the next month or two, or we may close this issue as part of housekeeping.

Cheers,
Hey @josiahjohnston
I was wondering if we could add more extensions for the input files, such as *.csv, *.tsv, etc. I think this would give more flexibility to some users of Switch. I can do the pull request for this. It should be an easy feature to implement.