Could the author explain the design principles of the Dou Dizhu DQN? #13

Open
peterwangx opened this issue Aug 9, 2020 · 0 comments

Comments

@peterwangx

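Hello, I read the paper that accompanies this project. What data does the input layer of the naive DQN take, and how many dimensions does it have? I also read another paper, on deeprocket, which describes the Dou Dizhu features roughly as follows:
State number   State content                         Action played
0              (empty)                               0:33
1              0:33;1:44;2:JJ                        0:AA
2              0:33;1:44;2:JJ;0:AA;1:PASS;2:PASS     0:66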

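0:33;1:44;2:JJ means that the landlord (player 0) plays a pair of 3s, the player after the landlord (player 1) plays a pair of 4s, and the player before the landlord (player 2) plays a pair of Js.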
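To make the notation concrete, here is a minimal sketch of how I read such a state string. The function name `parse_history` is my own and does not come from either paper; it only splits the string into (player, cards) moves and says nothing about how the network actually encodes them.

```python
def parse_history(state):
    """Split a history string such as '0:33;1:44;2:JJ' into (player, cards) pairs.

    Hypothetical helper: the format is taken from the example above,
    the function itself is not part of either paper.
    """
    if not state:
        return []  # state 0: nothing has been played yet
    moves = []
    for item in state.split(";"):
        player, cards = item.split(":", 1)  # '1:PASS' -> ('1', 'PASS')
        moves.append((int(player), cards))
    return moves


history = parse_history("0:33;1:44;2:JJ;0:AA;1:PASS;2:PASS")
```

Each tuple holds the player index (0 is the landlord) and the cards played, with PASS kept as a plain string.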
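Even after reading both papers, I am still unclear about what data is fed to the input layer. How many input nodes are there?
How many hidden layers are there, how many nodes does each layer have, and why those numbers?
I assume the output layer represents the cards to play now, but how is that association established?
During training, the output Q-values are driven toward the target Q-values, so how are the target Q-values computed?
I see that your paper assigns weights to card categories. Did you obtain the target Q-values by playing according to these weight rules?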
Category                               Weight
None
Solo                                   MaxCard - 10
Pair                                   MaxCard - 10
Trio                                   MaxCard - 10
Sequential Solos                       MaxCard - 10 + 1
Sequential Pairs                       MaxCard - 10 + 1
Sequential Trios Take None             MaxCard - 10 + 1
Sequential Trios Take One              MaxCard - 10
Sequential Trios Take Two              MaxCard - 10
Sequential Trios Series Take One       (MaxCard - 3 + 1) / 2
Sequential Trios Series Take Two       (MaxCard - 3 + 1) / 2
Bomb                                   MaxCard - 3 + 7
Four Take Two Solos                    (MaxCard - 3) / 2
Four Take Two Pairs                    (MaxCard - 3) / 2
Nuke                                   20
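To check that I am reading the table correctly, this is how I would write the weights as code. It is only a sketch: `category_weight` and `WEIGHTS` are my own names, `max_card` is assumed to be an integer rank (I do not know the exact rank scale), and None is left out because I cannot read its weight from the table.

```python
# Weight formulas transcribed from the table above.
# Assumption: max_card is an integer rank; the rank scale is not stated here.
WEIGHTS = {
    "Solo": lambda m: m - 10,
    "Pair": lambda m: m - 10,
    "Trio": lambda m: m - 10,
    "Sequential Solos": lambda m: m - 10 + 1,
    "Sequential Pairs": lambda m: m - 10 + 1,
    "Sequential Trios Take None": lambda m: m - 10 + 1,
    "Sequential Trios Take One": lambda m: m - 10,
    "Sequential Trios Take Two": lambda m: m - 10,
    "Sequential Trios Series Take One": lambda m: (m - 3 + 1) / 2,
    "Sequential Trios Series Take Two": lambda m: (m - 3 + 1) / 2,
    "Bomb": lambda m: m - 3 + 7,
    "Four Take Two Solos": lambda m: (m - 3) / 2,
    "Four Take Two Pairs": lambda m: (m - 3) / 2,
    "Nuke": lambda m: 20,  # constant, independent of max_card
}


def category_weight(category, max_card=0):
    """Return the weight of a move, given its category and highest card rank."""
    return WEIGHTS[category](max_card)
```

For example, under this reading `category_weight("Bomb", 10)` gives 10 - 3 + 7 = 14.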
Those are the questions I have after reading your paper. Could you explain these principles?
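For reference on the target Q-value question, the textbook DQN target as I understand it is y = r + gamma * max over a' of Q(s', a'), with y = r at a terminal state. Below is a minimal sketch of that standard formula. I do not know whether this project computes its targets this way or derives them from the weight table, and `dqn_target`, `gamma`, and `done` are my own names.

```python
def dqn_target(reward, next_q_values, gamma=0.99, done=False):
    """Standard one-step DQN target: r + gamma * max_a' Q(s', a').

    Textbook formula only; not confirmed to be what this project uses.
    next_q_values: Q-value estimates for the legal actions in the next state.
    """
    if done or not next_q_values:
        return reward  # terminal state: no bootstrapped future value
    return reward + gamma * max(next_q_values)


y = dqn_target(1.0, [0.5, 2.0], gamma=0.5)
```

If the project does use the weight table, I would guess it enters through the reward or through the choice of candidate actions, but that is only my speculation.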
