-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[develop < T541-FL] Import and export strategies #215
Conversation
* bump to 2.5 * try 2.2 * change ubuntu to 20.04 * revert to MG 2.3.0 * bump to mg 2.5.0 and update qm signatures
There are some things that will be added soon:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lot of work done here, good job! I checked mostly export, import and translators, didn't yet check tests. Will do that in next run
graph = transporter.export() | ||
""" | ||
|
||
def __init__(self, graph_type: str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use graph_type as enum? This is easier to follow on different places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the method has to receive string since it is called directly by the user but I added comparison with the enum so I think it is good now.
gqlalchemy/vendors/memgraph.py
Outdated
@@ -35,17 +35,10 @@ | |||
) | |||
from gqlalchemy.vendors.database_client import DatabaseClient | |||
from gqlalchemy.graph_algorithms.query_modules import QueryModule | |||
from gqlalchemy.memgraph_constants import MG_HOST, MG_PORT, MG_USERNAME, MG_PASSWORD, MG_ENCRYPTED, MG_CLIENT_NAME, MG_LAZY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use here same techinque I mentioned in export folder in graph_transporter.py
from gqlalchemy.transformations.translators.dgl_translator import DGLTranslator | ||
from gqlalchemy.transformations.translators.nx_translator import NxTranslator | ||
from gqlalchemy.transformations.translators.pyg_translator import PyGTranslator | ||
from gqlalchemy.memgraph_constants import MG_HOST, MG_PORT, MG_USERNAME, MG_PASSWORD, MG_ENCRYPTED, MG_CLIENT_NAME, MG_LAZY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And same here for constants as in graph_transporeter.py
elif self.graph_type == "nx": | ||
self.translator = NxTranslator(default_node_label, default_edge_type, host, port, username, password, encrypted, client_name, lazy) | ||
else: | ||
raise ValueError("Unknown import option. Currently supported are DGL, PyG and Networkx.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Currently supported options are: DGL,..."
if features is not None: | ||
graph.edges[edge_triplet].data[feature_name] = features | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for edge_triplet, features_dict ... for feature_name, features ... translated_features = ... //it makes sense here to rename variable if features is None: //this way you are not indenting lines so much continue graph.edges[edge_triplet].data[feature_name] = translated_features
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
for source_node_id, dest_node_id, eid in zip(source_nodes, dest_nodes, eids): | ||
# Handle properties | ||
source_node_properties, dest_node_properties, edge_properties = {}, {}, {} | ||
# Copy source node properties | ||
source_node_properties = dict(map(lambda pair: (pair[0], to_cypher_value(pair[1][source_node_id])), node_src_label_properties.items())) | ||
source_node_properties[DGL_ID] = int(source_node_id) | ||
# Copy destination node properties | ||
dest_node_properties = dict(map(lambda pair: (pair[0], to_cypher_value(pair[1][dest_node_id])), node_dest_label_properties.items())) | ||
dest_node_properties[DGL_ID] = int(dest_node_id) | ||
# Copy edge features | ||
edge_properties = dict(map(lambda pair: (pair[0], to_cypher_value(pair[1][eid])), etype_properties.items())) | ||
edge_properties[DGL_ID] = int(eid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand this correctly, you are going over source_nodes, dest_nodes and eids. And for all those pairs, node_src_label_properties which I persume represents properties keys, is same for all source nodes? Why is that?
Couldn't it happen that some of nodes in source nodes have extra property? Or this is not stored at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes so node_src_label_properties
and node_dest_label_properties
are used in DGL
for storing properties. They save for each node type an array, where each element of the array contains data about one node. DGL
stores only numeric properties and I have imposed the restriction that only properties which are on all nodes, can be translated to DGL
and PyG
because both frameworks don't allow different situation.
self, | ||
graph: nx.Graph, | ||
host: str = "127.0.0.1", | ||
port: int = 7687, | ||
username: str = "", | ||
password: str = "", | ||
encrypted: bool = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is host
, port
, username
, password
mentioned here again? Because we have it in init, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this method was from old times so I forgot to change this.
args=( | ||
process_queries, | ||
host, | ||
port, | ||
username, | ||
password, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And why are we here sending again host, port, username and password, don't we have it in init?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes same as above, it is old method.
if iter_node_label == node_label: | ||
for property_name, property_values in iter_node_properties.items(): | ||
if property_name != NUM_NODES: | ||
node_properties[property_name] = property_values[node_id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can reduce indenting here:
if iter_node_label != node_label: continue for property_name, property_values in iter_node_properties.items(): if property_name == NUM_NODES: continue node_properties[property_name] = property_values[node_id]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep
Just a general docs comment. I can see that you changed the location of the transformations. This can affect the current GQLA docs and it should be checked where and fix it. For example, it will probably change something here: https://memgraph.com/docs/gqlalchemy/how-to-guides/networkx. Besides that, please make sure to document at least something regarding the new features included in the next release. GQLA docs need to be regenerated and fixed, but by adding new content we at least don't lose knowledge and can help users. |
Yes, yes I have to start writing docs and editing existing transformation docs. Thanks for the ping! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments, overall good job
@@ -0,0 +1,68 @@ | |||
# Copyright (c) 2016-2022 Memgraph Ltd. [https://memgraph.com] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2023
|
||
class GraphTransporter(Transporter): | ||
"""Here is a possible example for using this module: | ||
>>> transporter = GraphTransported("dgl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GraphTransporter typo
default_node_label="NODE", | ||
default_edge_type="RELATIONSHIP", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put these 2 as well under constants.py
) -> None: | ||
"""Initializes GraphTransporter. It is used for converting Memgraph graph to the specific graph type offered by some Python package (PyG, DGL, NX...) | ||
Here is a possible example for using this module: | ||
>>> transporter = GraphTransported("dgl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo GraphTransporter
@@ -0,0 +1,13 @@ | |||
# Copyright (c) 2016-2022 Memgraph Ltd. [https://memgraph.com] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just don't forget to change everywhere 2023. I'm not sure if this can be done automatically but for now we can do it manually.
"""Produce cypher queries for data saved as part of thePyG graph. The method handles both homogeneous and heterogeneous graph. | ||
The method converts 1D as well as multidimensional features. If there are some isolated nodes inside the graph, they won't get transferred. Nodes and edges | ||
created in Memgraph DB will, for the consistency reasons, have property `pyg_id` set to the id they have as part of the PyG graph. Note that this method doesn't insert anything inside the database, | ||
it just creates cypher queries. To insert queries the following code can be used: | ||
>>> memegraph = Memgraph() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
watch for line length
default_node_label="NODE", | ||
default_edge_type="RELATIONSHIP", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extract
|
||
for row in rel_results: | ||
row_values = row.values() | ||
# print(f"Row values: {row_values}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete print
@@ -32,7 +32,7 @@ exclude = ''' | |||
''' | |||
|
|||
[tool.poetry.dependencies] | |||
python = "^3.7" | |||
python = "^3.8" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this on purpose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@katarinasupe do we have any implications on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it is on purpose because of poe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Josipmrden I think all should be fine.
@as51340 - Data from Memgraph can now be imported from and exported to |
Description
Import and export to DGL, PyG and Nx.
Docs PR: memgraph/docs#693
Pull request type
Please delete options that are not relevant.
Related issues
Checklist:
######################################
Reviewer checklist (the reviewer checks this part)
######################################