Reading.py Improvement suggestions for Maintainability #116

FelixMau · 2023-07-19T12:57:43Z

I am currently debugging a faulty data package, and some parts of reading.py are hard for me to understand. I am therefore collecting improvement suggestions here.

Replacing the 51-line dict comprehension with a normal loop, and moreover replacing the resource function, would make it easier to understand what is happening here.
I therefore suggest changing lines 185-236 to:

# Uses chain and groupby from itertools; package, empty, parse and data are
# assumed to be defined as in the existing reading.py code.
data["elements"] = {}
for e in (package.get_resource("elements") or empty).read(keyed=True):
    # Names of the nodes this element is connected to.
    inputs = [p.strip() for p in e["predecessors"].split(",") if p]
    outputs = [s.strip() for s in e["successors"].split(",") if s]
    # One (index, node, parameter, value) tuple per edge and edge parameter.
    triples = []
    for parameter, value in parse(e.get("edge_parameters", "{}")).items():
        for i, source_target in enumerate(inputs + outputs):
            triples.append((i, source_target, parameter, value))
    # Collect the parameters of each edge, keyed by (index, node) so that the
    # edges[i, source] lookups below find them.
    edges = {}
    for group, grouped_triples in groupby(sorted(triples), key=lambda triple: triple[:2]):
        group_edges = {}
        for _, _, parameter, value in grouped_triples:
            if value is not None:
                group_edges[parameter] = value
        edges[group] = group_edges
    data["elements"][e["name"]] = {
        "name": e["name"],
        "inputs": {source: edges[i, source] for i, source in enumerate(inputs)},
        "outputs": {
            target: edges[i, target]
            for i, target in enumerate(outputs, len(inputs))
        },
        # Node parameters first; entries from data["components"] override them.
        "parameters": dict(
            chain(
                parse(e.get("node_parameters", "{}")).items(),
                data["components"].get(e["name"], {}).items(),
            )
        ),
        "type": e["type"],
    }
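
To make the grouping step easier to follow, here is a toy run of it outside reading.py. It is only a sketch: the node names and parameter values are made up, and the edge parameters are given directly as a dict instead of going through `parse`. It groups by the (index, node) pair, since that is the key the `inputs` and `outputs` lookups use.

```python
from itertools import groupby

# Hypothetical stand-ins for one row of the "elements" resource.
inputs = ["bus_a"]
outputs = ["bus_b"]
edge_parameters = {"efficiency": 0.9, "capacity": None}

# One (index, node, parameter, value) tuple per edge and parameter.
triples = [
    (i, node, parameter, value)
    for parameter, value in edge_parameters.items()
    for i, node in enumerate(inputs + outputs)
]

# groupby only merges adjacent items, so sort first. Sorting on the first
# three fields avoids comparing values of different types (None vs. float).
ordered = sorted(triples, key=lambda t: t[:3])

edges = {}
for key, group in groupby(ordered, key=lambda t: t[:2]):
    # Keep only parameters that are actually set.
    edges[key] = {p: v for _, _, p, v in group if v is not None}

element_inputs = {src: edges[i, src] for i, src in enumerate(inputs)}
element_outputs = {
    tgt: edges[i, tgt] for i, tgt in enumerate(outputs, len(inputs))
}

print(element_inputs)   # {'bus_a': {'efficiency': 0.9}}
print(element_outputs)  # {'bus_b': {'efficiency': 0.9}}
```

Every edge gets every parameter here, and `None` values are dropped, which matches the `if value is not None` filter in the suggestion.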

A second suggestion: replace the assert statements with explicit error handling.
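
A minimal sketch of what that could look like. The exception class `DataPackageError` and the helper `require_field` are hypothetical names for illustration, not existing reading.py code, and the check itself is only an example of the pattern. The main difference to `assert` is that the check still runs when Python is started with `-O`, and the message names the offending element.

```python
class DataPackageError(ValueError):
    """Hypothetical exception type for malformed data package rows."""


def require_field(row, field):
    """Return row[field], raising a descriptive error if it is missing.

    Unlike `assert field in row`, this check is not stripped by `python -O`.
    """
    if field not in row:
        name = row.get("name", "<unnamed>")
        raise DataPackageError(
            f"Element {name!r} is missing required field {field!r}."
        )
    return row[field]


# Usage with a made-up row:
element_type = require_field({"name": "gt", "type": "conversion"}, "type")
```

Callers can then catch `DataPackageError` (or `ValueError`) and report which row of the data package is faulty, instead of getting a bare `AssertionError`.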

nailend · 2023-07-26T21:52:49Z

Same for components

nailend · 2023-07-26T21:59:41Z

@gnn did you write these parts, and do you have any objections to this?

nailend · 2024-02-29T08:19:04Z

@Bachibouzouk reading.py is the main bottleneck. It's mostly a few quite complex dict-comprehensions, difficult to debug but faster than loops. For performance and simplicity gains, we would need to rework this file.
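
How large the speed difference between a comprehension and a loop is can be measured before reworking anything. A rough sketch with `timeit`, using made-up rows rather than a real data package, so the numbers only indicate the order of magnitude on a given machine:

```python
import timeit

# Toy data standing in for the rows of a resource (hypothetical).
rows = [{"name": f"e{i}", "type": "bus"} for i in range(1000)]


def with_comprehension():
    return {r["name"]: {"name": r["name"], "type": r["type"]} for r in rows}


def with_loop():
    result = {}
    for r in rows:
        result[r["name"]] = {"name": r["name"], "type": r["type"]}
    return result


# Both variants must build the same dict before timing them is meaningful.
same_result = with_comprehension() == with_loop()

t_comp = timeit.timeit(with_comprehension, number=200)
t_loop = timeit.timeit(with_loop, number=200)
print(f"same result: {same_result}")
print(f"comprehension: {t_comp:.4f}s, loop: {t_loop:.4f}s")
```

Running the same comparison on the actual reading.py code paths would show whether readability costs anything noticeable there.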

nailend mentioned this issue Jul 26, 2023

Write Datapackage.json sedos-project/data_adapter_oemof#31

Closed

7 tasks

nailend assigned FelixMau Jul 26, 2023
