How should features be fed in during model training? #53
Comments
Has anyone gotten one of these approaches working yet? I'm currently trying the first one.
The HAN model uses the function dgl.metapath_reachable_graph, which can split the whole heterogeneous graph.
"""This model shows an example of using dgl.metapath_reachable_graph on the original heterogeneous
graph.
Because the original HAN implementation only gives the preprocessed homogeneous graph, this model
could not reproduce the result in HAN as they did not provide the preprocessing code, and we
constructed another dataset from ACM with a different set of papers, connections, features and
labels.
"""
import dgl
import torch
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn.pytorch import GATConv
class SemanticAttention(nn.Module):
def __init__(self, in_size, hidden_size=128):
super(SemanticAttention, self).__init__()
self.project = nn.Sequential(
nn.Linear(in_size, hidden_size),
nn.Tanh(),
nn.Linear(hidden_size, 1, bias=False),
)
def forward(self, z):
w = self.project(z).mean(0) # (M, 1)
beta = torch.softmax(w, dim=0) # (M, 1)
beta = beta.expand((z.shape[0],) + beta.shape) # (N, M, 1)
return (beta * z).sum(1) # (N, D * K)
class HANLayer(nn.Module):
"""
HAN layer.
Arguments
---------
meta_paths : list of metapaths, each as a list of edge types
in_size : input feature dimension
out_size : output feature dimension
layer_num_heads : number of attention heads
dropout : Dropout probability
Inputs
------
g : DGLGraph
The heterogeneous graph
h : tensor
Input features
Outputs
-------
tensor
The output feature
"""
def __init__(self, meta_paths, in_size, out_size, layer_num_heads, dropout):
super(HANLayer, self).__init__()
# One GAT layer for each meta path based adjacency matrix
self.gat_layers = nn.ModuleList()
for i in range(len(meta_paths)):
self.gat_layers.append(
GATConv(
in_size,
out_size,
layer_num_heads,
dropout,
dropout,
activation=F.elu,
allow_zero_in_degree=True,
)
)
self.semantic_attention = SemanticAttention(
in_size=out_size * layer_num_heads
)
self.meta_paths = list(tuple(meta_path) for meta_path in meta_paths)
self._cached_graph = None
self._cached_coalesced_graph = {}
def forward(self, g, h):
semantic_embeddings = []
if self._cached_graph is None or self._cached_graph is not g:
self._cached_graph = g
self._cached_coalesced_graph.clear()
for meta_path in self.meta_paths:
self._cached_coalesced_graph[
meta_path
] = dgl.metapath_reachable_graph(g, meta_path)
for i, meta_path in enumerate(self.meta_paths):
new_g = self._cached_coalesced_graph[meta_path]
semantic_embeddings.append(self.gat_layers[i](new_g, h).flatten(1))
semantic_embeddings = torch.stack(
semantic_embeddings, dim=1
) # (N, M, D * K)
return self.semantic_attention(semantic_embeddings) # (N, D * K)
class HAN(nn.Module):
def __init__(
self, meta_paths, in_size, hidden_size, out_size, num_heads, dropout
):
super(HAN, self).__init__()
self.layers = nn.ModuleList()
self.layers.append(
HANLayer(meta_paths, in_size, hidden_size, num_heads[0], dropout)
)
for l in range(1, len(num_heads)): # HAN层
self.layers.append(
HANLayer(
meta_paths,
hidden_size * num_heads[l - 1],
hidden_size,
num_heads[l],
dropout,
)
)
self.predict = nn.Linear(hidden_size * num_heads[-1], out_size) # 线性层
def forward(self, g, h):
for gnn in self.layers:
h = gnn(g, h)
return self.predict(h)
|
Please refer to this attachment,
https://github.com/dmlc/dgl/blob/master/examples/pytorch/han/model_hetero.py
, as well as this webpage.
@Sunning2118 OK, thank you very much!!
A question for discussion: a heterogeneous graph is assembled from several homogeneous graphs, but each constituent homogeneous graph involves a different number of nodes, so the features the model needs differ between them. How, then, should features be fed in during model training?
1. Renumber the nodes of each subgraph independently, so that each subgraph takes its own distinct feature matrix.
2. Pad each subgraph with the nodes it lacks so that every subgraph has the same node set, and give every node a self-loop (to satisfy DGL's requirements); each homogeneous graph is then distinguished only by its own set of edges.