Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading.py Improvement suggestions for Maintainability #116

Open
FelixMau opened this issue Jul 19, 2023 · 3 comments
Open

Reading.py Improvement suggestions for Maintainability #116

FelixMau opened this issue Jul 19, 2023 · 3 comments
Assignees

Comments

@FelixMau
Copy link
Collaborator

I am currently working on debugging a faulty data package. Trying to understand some parts of the reading.py is challenging for me. Therefore, I am collecting some improvement suggestions here.

  • Replacing the 51 Lines Dict comprehension with a normal loop and moreover replacing the resource function would make it easier to understand what is happening here.
    Therefore I am suggesting changing the lines 185-236 to:
data["elements"] = {}
for e in (package.get_resource("elements") or empty).read(keyed=True):
    inputs = [p.strip() for p in e["predecessors"].split(",") if p]
    outputs = [s.strip() for s in e["successors"].split(",") if s]
    triples = []
    for parameter, value in parse(e.get("edge_parameters", "{}")).items():
        for i, source_target in enumerate(inputs + outputs):
            triple = (i, source_target, parameter, value)
            triples.append(triple)
    edges = {}
    for group, grouped_triples in groupby(sorted(triples), key=lambda triple: triple[0]):
        group_edges = {}
        for _, source_target, parameter, value in grouped_triples:
            if value is not None:
                group_edges[parameter] = value
        edges[group] = group_edges
    element = {
        e["name"]: {
            "name": e["name"],
            "inputs": {source: edges[i, source] for i, source in enumerate(inputs)},
            "outputs": {target: edges[i, target] for i, target in enumerate(outputs, len(inputs))},
            "parameters": dict(chain(parse(e.get("node_parameters", "{}")).items(), data["components"].get(e["name"], {}).items())),
            "type": e["type"],
        }
    }
    data["elements"].update(element)
@nailend
Copy link
Collaborator

nailend commented Jul 26, 2023

Same for components

@nailend
Copy link
Collaborator

nailend commented Jul 26, 2023

@gnn did you write this parts and do you have any objections to this?

@nailend
Copy link
Collaborator

nailend commented Feb 29, 2024

@Bachibouzouk reading.py is the main bottleneck. It's mostly a few quite complex dict-comprehensions, difficult to debug but faster than loops. For performance and simplicity gains, we would need to rework this file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants