06.understand_sentiment: question about the sentiment analysis example #804

freefreesea · 2019-09-06T08:57:08Z

def inference_program(word_dict):
    data = fluid.layers.data(
        name="words", shape=[1], dtype="int64", lod_level=1)

    dict_dim = len(word_dict)
    pred = dynamic_rnn_lstm(data, dict_dim, CLASS_DIM, EMB_DIM, LSTM_SIZE)
    return pred

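Why is the shape of the data layer created here [1]?
In the earlier examples I looked at, the shape of a data layer always stood for the dimensionality of the input data.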

seiriosPlus · 2019-09-06T09:55:05Z

You can take a look at the official documentation:
https://www.paddlepaddle.org.cn/documentation/docs/zh/1.5/api_cn/layers_cn/io_cn.html#data

freefreesea · 2019-09-09T07:02:16Z

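What I mean is that when the IMDB data is read in, its dimension is clearly the length of the text, not 1, so why can it be declared this way? I checked the imdb reader, and the data it returns is clearly 128-dimensional, so why not use [128]?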

freefreesea · 2019-09-09T07:03:07Z

Is it required by the LSTM network, or is there some other reason? I'm stuck at this point and it's fairly urgent.

freefreesea · 2019-09-09T07:04:22Z

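This example uses the imdb dataset with an RNN/LSTM model to perform sentiment analysis as a classification task. But on this line shape=1, and the documentation gives no explanation for it.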

seiriosPlus · 2019-09-09T07:08:03Z

This line is the input code for inference. It is easier to understand when read together with the two lines that follow it:

data = fluid.layers.data(
        name="words", shape=[1], dtype="int64", lod_level=1)
emb = fluid.layers.embedding(
        input=data, size=[input_dim, emb_dim], is_sparse=True)

The variable-length word IDs are mapped to low-dimensional word vectors; in other words, the input is a sequence of word IDs.
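To illustrate the lookup described above, here is a minimal plain-Python sketch. It does not use Paddle; `make_embedding_table` and `embed` are hypothetical helpers, and the sizes 10 and 4 are made-up stand-ins for `input_dim` and `emb_dim`. It only shows that each word ID is a single integer that selects one row of an `[input_dim, emb_dim]` table, whatever the sentence length is.

```python
import random


def make_embedding_table(input_dim, emb_dim, seed=0):
    """Build a toy [input_dim, emb_dim] table of random floats (illustration only)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(emb_dim)]
            for _ in range(input_dim)]


def embed(word_ids, table):
    """Look up one table row per word ID."""
    return [table[i] for i in word_ids]


table = make_embedding_table(input_dim=10, emb_dim=4)
sentence = [3, 7, 7, 1, 0]  # a variable-length sequence of word IDs
vectors = embed(sentence, table)
print(len(vectors), len(vectors[0]))  # 5 4
```

The number of output rows follows the sentence length, while the per-element input stays a single ID, which is why the declared shape does not need to mention the sentence length.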

freefreesea · 2019-09-09T07:14:04Z

([5146, 43, 71, 6, 1092, 14, 0, 878, 130, 151, 5146, 18, 281, 747, 0, 5146, 3, 5146, 2165, 37, 5146, 46, 5, 71, 4089, 377, 162, 46, 5, 32, 1287, 300, 35, 203, 2136, 565, 14, 2, 253, 26, 146, 61, 372, 1, 615, 5146, 5, 30, 0, 50, 3290, 6, 2148, 14, 0, 5146, 11, 17, 451, 24, 4, 127, 10, 0, 878, 130, 43, 2, 50, 5146, 751, 5146, 5, 2, 221, 3727, 6, 9, 1167, 373, 9, 5, 5146, 7, 5, 1343, 13, 2, 5146, 1, 250, 7, 98, 4270, 56, 2316, 0, 928, 11, 11, 9, 16, 5, 5146, 5146, 6, 50, 69, 27, 280, 27, 108, 1045, 0, 2633, 4177, 3180, 17, 1675, 1, 2571], 0)

freefreesea · 2019-09-09T07:15:05Z

This is the first sample read from the reader. What does "ID sequence" mean, and why is the shape 1? I still don't follow.

freefreesea · 2019-09-09T07:15:46Z

As I understand it, once a sample has been converted into an ID sequence, shouldn't its shape be the number of IDs in it?

freefreesea · 2019-09-09T07:16:02Z

I'm not sure where my understanding went wrong. Any pointers would be much appreciated.

seiriosPlus · 2019-09-09T07:22:03Z

shape=[1] means each input element is 1-dimensional, i.e. a single word ID; lod_level=1 means these elements form a variable-length sequence.

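Feeding this variable-length sequence of IDs into the embedding layer yields one word vector per ID, i.e. a tensor of size=[batch_size, emb_dim], where the first dimension here counts the IDs that were fed in.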
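The relationship between shape=[1] and lod_level=1 can be sketched in plain Python. This is only an approximation of how a level-1 LoD tensor is commonly described (all sequences concatenated into rows of one ID each, plus offsets marking where each sequence starts and ends); `to_lod_batch` is a hypothetical helper, not a Paddle API, and the three short sequences are made up for illustration.

```python
def to_lod_batch(sequences):
    """Flatten variable-length ID sequences into rows of shape [1] plus offsets."""
    flat = []
    offsets = [0]
    for seq in sequences:
        flat.extend([wid] for wid in seq)  # each row holds exactly one word ID
        offsets.append(len(flat))          # end position of this sequence
    return flat, offsets


# Three sentences of different lengths (3, 2 and 4 word IDs).
batch = [[5146, 43, 71], [6, 1092], [14, 0, 878, 130]]
flat, offsets = to_lod_batch(batch)
print(len(flat), len(flat[0]))  # 9 1
print(offsets)                  # [0, 3, 5, 9]
```

Under this layout the sentence lengths live in the offsets rather than in the declared shape, so a 121-word review and a 30-word review can share the same shape=[1] input.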

freefreesea · 2019-09-09T07:25:37Z

Got it, thank you!

freefreesea · 2019-09-09T08:56:17Z

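Hello, I have one more question. Based on this code I wrote a modified reader that loads my own local data, about 400,000 samples. During the test stage it fails with: Assertion label >= 0 && label < feature_size_ failed (The label is out of the range. 5 However, if I take only the first 100 of the 400,000 samples, the error does not occur.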
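The assertion requires every label to satisfy 0 <= label < feature_size_. One possible cause, offered only as a guess, is that the full dataset contains labels outside the number of classes the network was built with, while the first 100 samples happen not to. A minimal sketch for scanning reader output before training follows; `find_bad_labels` is a hypothetical helper, the samples are made up, and `class_dim=2` is an assumption standing in for the example's CLASS_DIM.

```python
def find_bad_labels(samples, class_dim):
    """Return (index, label) pairs whose label is outside [0, class_dim)."""
    return [(i, label) for i, (_, label) in enumerate(samples)
            if not (isinstance(label, int) and 0 <= label < class_dim)]


# Each sample mirrors the reader format shown earlier: (word_ids, label).
samples = [([12, 7, 3], 0), ([4, 9], 1), ([8, 1, 1, 2], 5)]
bad = find_bad_labels(samples, class_dim=2)
print(bad)  # [(2, 5)]
```

If such a scan reports offending samples, either the labels need remapping into [0, class_dim) or the class count of the network needs to match the data.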

AIpioneer assigned seiriosPlus Oct 9, 2019
