Not found recurrent layer in model files #6
Comments
As I mentioned in issue #4, the current network only contains a CNN backbone; the fully connected (FC) layers act as the decoder. I tested on a dataset containing only digits, and the results show that the CNN (encoder) + FC (decoder) architecture performs better than CNN (encoder) + [RNN + FC] (decoder). I believe the same holds for Chinese character datasets; I have not tested the current architecture on English character datasets. If you want the recurrent variant, you can modify the model as follows:

```python
import torch.nn as nn
import torch.nn.functional as F


class BLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(BLSTM, self).__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, bidirectional=True)
        self.out = nn.Linear(hidden_size * 2, output_size)

    def forward(self, input):
        rnn_output, _ = self.rnn(input)   # [seq_len, batch, hidden_size * 2]
        output = self.out(rnn_output)     # [seq_len, batch, output_size]
        return output


class CRNN(nn.Module):
    def __init__(self, features, meta):
        super(CRNN, self).__init__()
        self.features = nn.Sequential(*features)
        self.avgpool = nn.AdaptiveAvgPool2d((1, None))
        self.encoder = BLSTM(meta['output_dim'], meta['hidden_dim'], meta['hidden_dim'])
        self.decoder = BLSTM(meta['hidden_dim'], meta['hidden_dim'], meta['hidden_dim'])
        self.classifier = nn.Linear(meta['hidden_dim'], meta['num_classes'])
        self.meta = meta

    def forward(self, x):
        # x -> features
        out = self.features(x)            # [batch, channels, h, w]
        # features -> pool -> flatten -> encoder/decoder -> classifier -> log-softmax
        out = self.avgpool(out)           # [batch, channels, 1, w]
        # permute to [w, batch, channels, 1]; contiguous() is required before
        # view() because permute() returns a non-contiguous tensor
        out = out.permute(3, 0, 1, 2).contiguous().view(out.size(3), out.size(0), -1)
        out = self.classifier(self.decoder(self.encoder(out)))
        out = F.log_softmax(out, dim=2)   # [seq_len, batch, num_classes]
        return out
```
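To see why `forward` produces a CTC-ready sequence, it helps to trace the tensor shapes by hand. The helper below is a torch-free sketch that only does shape bookkeeping for the pool/permute/view steps; the name `trace_shapes` and the example dimensions are my own illustration, not part of the repo:

```python
def trace_shapes(batch, channels, height, width):
    """Track tensor shapes through CRNN.forward (shape bookkeeping only)."""
    feat = (batch, channels, height, width)   # backbone feature map
    pooled = (batch, channels, 1, width)      # AdaptiveAvgPool2d((1, None))
    permuted = (width, batch, channels, 1)    # permute(3, 0, 1, 2)
    seq = (width, batch, channels * 1)        # view(w, batch, -1)
    return feat, pooled, permuted, seq


# Example: a backbone emitting a 512-channel, 2x25 feature map
shapes = trace_shapes(batch=4, channels=512, height=2, width=25)
print(shapes[-1])  # the RNN input is [seq_len, batch, feature]: (25, 4, 512)
```

The width axis becomes the sequence axis, so each column of the pooled feature map is one time step for the BLSTM (and, ultimately, one frame for CTC decoding).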
I looked through the network briefly, and it seems there are no recurrent layers such as a Bi-LSTM.
Is this repo another implementation of CRNN? I only see several CNN backbones and fully connected layers, but no RNN layers.