Datasources #2959

Closed
wants to merge 62 commits into from

Member

@rolnico rolnico commented Mar 28, 2024

Please check if the PR fulfills these requirements

  • The commit message follows our guidelines
  • Tests for the changes have been added (for bug fixes / features)
  • Docs have been added / updated (for bug fixes / features)

Does this PR already have an issue describing the problem?
No

What kind of change does this PR introduce?
Bug fix + feature

What is the current behavior?
to be completed

What is the new behavior (if this is a feature change)?
Two types of DataSources now exist:

  • DirectoryDataSource (extended by GzDataSource, Bzip2DataSource, ZstdDataSource, XZDataSource): considers files in a directory
  • AbstractArchiveDataSource (extended by ZipDataSource, TarDataSource): considers files in an archive

Directory DataSources consider the following parameters:

  • A directory (where the files are located)
  • A base name (start of the file names to consider)
  • A source format (end of the file names to consider, excluding the compression format)
  • A compression format (files to use through the datasource should have this format)
  • An observer

Archive DataSources consider the following parameters:

  • A directory (where the archive is located)
  • A base name (start of the file names to consider)
  • A source format (end of the file names to consider, excluding the compression format)
  • An archive format (format of the archive: zip or tar)
  • A compression format (compression of the archive: zip, gz, xz, etc.)
  • The archive file
  • An observer

The parameter sourceFormat corresponds to the extension of the files the users want to consider in their datasources. For example, if sourceFormat == ".xiidm", only files ending with ".xiidm" will be considered. The value can be anything. If the parameter is empty, it won't be considered in the file listing.

The method listNames(String) now works like this:

  • For DirectoryDataSource, it lists the files located in the directory that start with the base name, have the same source format and compression format as the datasource, and match the regex given as parameter
  • For Archive DataSources, it lists the files located in the archive that start with the base name, have the same source format as the datasource, and match the regex given as parameter
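
The directory case of this listing rule can be sketched in plain Java. This is only an illustration under stated assumptions, not the code of this PR: the class ListNamesSketch and its listNames method are hypothetical, the compression format is assumed to be passed as an extension such as ".gz", and the regex is assumed to be matched against the file name with that compression extension stripped.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of powsybl: mimics the filtering described above.
final class ListNamesSketch {

    private ListNamesSketch() {
    }

    static List<String> listNames(List<String> fileNames, String baseName, String sourceFormat,
                                  String compressionExt, String regex) {
        Pattern pattern = Pattern.compile(regex);
        return fileNames.stream()
                // keep only files with the datasource's compression format
                .filter(name -> name.endsWith(compressionExt))
                // assumption: the remaining checks apply to the name without the compression extension
                .map(name -> name.substring(0, name.length() - compressionExt.length()))
                // the name must start with the base name
                .filter(name -> name.startsWith(baseName))
                // an empty source format is not considered in the listing
                .filter(name -> sourceFormat.isEmpty() || name.endsWith(sourceFormat))
                // the whole name must match the regex given as parameter
                .filter(name -> pattern.matcher(name).matches())
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> files = List.of("network.xiidm.gz", "network.json.gz", "network.xiidm", "other.xiidm.gz");
        System.out.println(listNames(files, "network", ".xiidm", ".gz", ".*")); // [network.xiidm]
        System.out.println(listNames(files, "network", "", ".gz", ".*"));       // [network.json, network.xiidm]
    }
}
```

With an empty sourceFormat the second call keeps every compressed file starting with the base name, which is the "won't be considered in the file listing" behaviour described for that parameter.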

A DataSourceBuilder is now provided in addition to the usual DataSourceUtil.createDataSource methods.

Note: when creating a datasource by giving a filename, for example via DataSourceUtil.createDataSource(Path directory, String fileName, DataSourceObserver observer), the different parameters are extracted from the filename. If the source format is empty or not a usual one (list defined in the new FileInformation class), a warning will be raised.
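
To make the extraction concrete, here is a rough sketch of how a file name could be split into a base name, a source format and a compression format. It is an assumption-laden illustration, not the FileInformation class of this PR: the class FileNameSketch, its parse method, and both extension lists are hypothetical placeholders, and archive handling is left out.

```java
import java.util.Set;

// Hypothetical helper, not part of powsybl: splits "base.source.compression" style names.
final class FileNameSketch {

    // Placeholder lists; the real ones are defined in the PR's FileInformation class and may differ.
    private static final Set<String> COMPRESSION_EXTENSIONS = Set.of("gz", "bz2", "xz", "zst", "zip");
    private static final Set<String> USUAL_SOURCE_FORMATS = Set.of(".xiidm", ".xml", ".json");

    final String baseName;
    final String sourceFormat;      // with leading dot, "" when absent
    final String compressionFormat; // without dot, "" when absent

    private FileNameSketch(String baseName, String sourceFormat, String compressionFormat) {
        this.baseName = baseName;
        this.sourceFormat = sourceFormat;
        this.compressionFormat = compressionFormat;
    }

    static FileNameSketch parse(String fileName) {
        String remaining = fileName;
        String compression = "";
        // 1. A trailing known compression extension is taken as the compression format.
        int dot = remaining.lastIndexOf('.');
        if (dot > 0 && COMPRESSION_EXTENSIONS.contains(remaining.substring(dot + 1))) {
            compression = remaining.substring(dot + 1);
            remaining = remaining.substring(0, dot);
        }
        // 2. The next extension, if any, is taken as the source format.
        String sourceFormat = "";
        dot = remaining.lastIndexOf('.');
        if (dot > 0) {
            sourceFormat = remaining.substring(dot);
            remaining = remaining.substring(0, dot);
        }
        // 3. Whatever is left is the base name.
        return new FileNameSketch(remaining, sourceFormat, compression);
    }

    // False when the source format is empty or unusual, i.e. the case where a warning would be raised.
    boolean hasUsualSourceFormat() {
        return USUAL_SOURCE_FORMATS.contains(sourceFormat);
    }

    public static void main(String[] args) {
        FileNameSketch info = parse("network.xiidm.gz");
        System.out.println(info.baseName + " | " + info.sourceFormat + " | " + info.compressionFormat);
        System.out.println(parse("network.gz").hasUsualSourceFormat());
    }
}
```

In this sketch a name such as "network.gz" yields an empty source format, so hasUsualSourceFormat() returns false, which corresponds to the situation where the note says a warning will be raised.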

Does this PR introduce a breaking change or deprecate an API?

  • Yes
  • No

If yes, please check if the following requirements are fulfilled

  • The Breaking Change or Deprecated label has been added
  • The migration steps are described in the following section

What changes might users need to make in their application due to this PR? (migration steps)

Two types of DataSources now exist:

  • DirectoryDataSource (extended by GzDataSource, Bzip2DataSource, ZstdDataSource, XZDataSource): considers files in a directory
  • AbstractArchiveDataSource (extended by ZipDataSource, TarDataSource): considers files in an archive

Directory DataSources consider the following parameters:

  • A directory (where the files are located)
  • A base name (start of the file names to consider)
  • A source format (end of the file names to consider, excluding the compression format)
  • A compression format (files to use through the datasource should have this format)
  • An observer

Archive DataSources consider the following parameters:

  • A directory (where the archive is located)
  • A base name (start of the file names to consider)
  • A source format (end of the file names to consider, excluding the compression format)
  • An archive format (format of the archive: zip or tar)
  • A compression format (compression of the archive: zip, gz, xz, etc.)
  • The archive file
  • An observer

The parameter sourceFormat corresponds to the extension of the files the users want to consider in their datasources. For example, if sourceFormat == ".xiidm", only files ending with ".xiidm" will be considered. The value can be anything. If the parameter is empty, it won't be considered in the file listing.

Therefore, multiple classes have changed. Please update your code if you used them directly:

Old classes          New classes
Bzip2FileDataSource  Bzip2DataSource
FileDataSource       DirectoryDataSource
GzFileDataSource     GzDataSource
(none)               TarDataSource
XZFileDataSource     XZDataSource
ZipFileDataSource    ZipDataSource
ZstdFileDataSource   ZstdDataSource

Note: the constructors of the new classes differ from the old ones, so you will have to adapt your code depending on which DataSource you use.

ReadOnlyDataSource, ResourceDataSource and MultipleReadOnlyDataSource now extend AbstractReadOnlyDataSource.

The methods DataSourceUtil.createDataSource() have been reworked to be more consistent with each other and to provide more variations depending on the parameters you might want to use. Be careful regarding the parameters you use:

  • If you use only one String parameter, it will be considered as a fileName
  • If you use two, they will be considered as a baseName and a sourceFormat
  • If you use three, they will be considered as an archiveName, a baseName and a sourceFormat

For instance, new FileDataSource(path, "fileName") could be replaced by one of the following:

  • new DirectoryDataSource(path, "fileName", "");
  • DataSourceUtil.createDataSource(path, "fileName", "").

Other information:

rolnico added 8 commits March 14, 2024 14:51
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
@rolnico rolnico self-assigned this Mar 28, 2024
rolnico added 12 commits March 28, 2024 15:06
@rolnico rolnico marked this pull request as ready for review April 15, 2024 12:03
@rolnico rolnico changed the title [WIP] Datasources Datasources Apr 15, 2024
Member Author

rolnico commented Apr 15, 2024

Should solve #121 (except successive compressions)

Member Author

rolnico commented Apr 15, 2024

Might solve #989 with some more documentation (will be done once the modifications are verified and accepted)

Signed-off-by: Geoffroy Jamgotchian <geoffroy.jamgotchian@rte-france.com>
Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
rolnico added 14 commits June 17, 2024 15:37
…Datasource when possible

Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
# Conflicts:
#	cgmes/cgmes-conversion/src/test/java/com/powsybl/cgmes/conversion/test/export/SteadyStateHypothesisExportTest.java
rolnico added 6 commits June 21, 2024 11:19
…ncyWithDataSource parameter + remove checkConsistencyWithDataSource parameter in newInputStream

Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
…ArchiveDataSource

Signed-off-by: Nicolas Rol <nicolas.rol@rte-france.com>
@jonenst jonenst mentioned this pull request Jun 26, 2024
5 tasks

Member Author

rolnico commented Jun 27, 2024

With the latest modifications, running the following command in a folder containing the files from the archive MicroGrid.zip should create a new JSON file with the network from MicroGrid.xml (so it does not use the CGMES importer):
itools convert-network --input-file MicroGrid.xml --output-format JIIDM --output-file test.json

Contributor

jonenst commented Jul 8, 2024

When trying to output to a zip, we previously had

$ itools convert-network --input-file /tmp/foo.xiidm  --output-file /tmp/bar.zip --output-format XIIDM
Generating file /tmp/bar.zip:bar.xiidm... # the zip file contains a file named "bar.xiidm"

now we have

$ itools convert-network --input-file /tmp/foo.xiidm  --output-file /tmp/bar.zip --output-format XIIDM
Generating file /tmp/bar.zip:.xiidm... # the zip file contains a file named ".xiidm"

Contributor

flo-dup commented Aug 9, 2024

PR split into parts (see #3101, #3102, #3103)

@flo-dup flo-dup closed this Aug 9, 2024
Labels
Breaking Change API is broken

6 participants