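The lecture notes on n-gram models say that every sentence needs a start marker and an end marker (<start>, <end>) added before it is processed. Taking a bigram model as the example, the following two sentences become:
(1). <start> I am Sam <end>
(2). <start> Sam I am <end>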
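To make the setup concrete, here is a minimal Python sketch of the padding step for the bigram case, assuming whitespace tokenization. The names `padded_bigrams` and `bigram_counts` are made up for illustration and do not come from the lecture notes.

```python
from collections import Counter

START, END = "<start>", "<end>"

def padded_bigrams(sentence):
    """Split on whitespace, add the two markers, and return adjacent token pairs."""
    tokens = [START] + sentence.split() + [END]
    return list(zip(tokens, tokens[1:]))

corpus = ["I am Sam", "Sam I am"]

# Count every bigram over the padded sentences.
bigram_counts = Counter(bg for s in corpus for bg in padded_bigrams(s))

for bigram, count in sorted(bigram_counts.items()):
    print(bigram, count)
```

With the markers in place, the first word of each sentence has a bigram context (`<start>`), and the last word is followed by an explicit `<end>` event.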
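Specifically, I have three questions:
(1). For the end marker <end>, the text explains that it is needed "To make the bigram grammar a true probability distribution. Without an end-symbol, the sentence probabilities for all sentences of a given length would sum to one. This model would define an infinite set of probability distributions, with one distribution per sentence length." I don't really understand this. Is there a more intuitive explanation, or some reference material I could read?
(2). For the start marker <start>, the text says it is there "to give us the bigram context of the first word." Does the start marker not play a role in the probability distribution the way the end marker does?
(3). For a general n-gram model, do we need to add n-1 start markers and n-1 end markers to each sentence, or is one of each enough?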
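The quoted claim in (1) can be checked numerically on the two example sentences. The sketch below is only an illustration under stated assumptions: maximum-likelihood bigram estimates with no smoothing, a toy vocabulary of {I, am, Sam}, and hypothetical helper names (`train_bigram`, `sentence_prob`, `mass_by_length`). It sums the probability of every possible word sequence of each length, once without <end> and once with it.

```python
from collections import Counter, defaultdict
from itertools import product

CORPUS = [["I", "am", "Sam"], ["Sam", "I", "am"]]
START, END = "<start>", "<end>"

def train_bigram(corpus, use_end):
    """Maximum-likelihood bigram probabilities, with or without the end marker."""
    counts = defaultdict(Counter)
    for sent in corpus:
        tokens = [START] + sent + ([END] if use_end else [])
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return {
        prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
        for prev, nxt in counts.items()
    }

def sentence_prob(model, words, use_end):
    """Product of bigram probabilities; unseen bigrams get probability 0."""
    tokens = [START] + list(words) + ([END] if use_end else [])
    p = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        p *= model.get(prev, {}).get(cur, 0.0)
    return p

def mass_by_length(use_end, max_len):
    """Total probability of all word sequences of length 1..max_len."""
    model = train_bigram(CORPUS, use_end)
    vocab = sorted({w for s in CORPUS for w in s})
    return [
        sum(sentence_prob(model, ws, use_end) for ws in product(vocab, repeat=n))
        for n in range(1, max_len + 1)
    ]

no_end = mass_by_length(use_end=False, max_len=4)
with_end = mass_by_length(use_end=True, max_len=8)
print("without <end>, mass per length:", no_end)
print("with <end>, mass per length:   ", with_end)
print("with <end>, total over lengths 1..8:", sum(with_end))
```

Without <end>, every length gets a total mass of 1 in this toy corpus (each word happens to have an observed successor), so the sums over lengths 1 to 4 already add up to 4: one separate distribution per length. With <end>, the mass is shared across lengths, and the running total stays below 1 and moves toward 1 as longer sentences are included.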
Any help would be greatly appreciated.