"DOM-Q-NET: Grounded RL on Structured Language" International Conference on Learning Representations (2019). Sheng Jia, Jamie Kiros, Jimmy Ba. [arxiv] [openreview]
Trained multitask agent: https://www.youtube.com/watch?v=eGzTDIvX4IY
Facebook login: https://www.youtube.com/watch?v=IQytRUKmWhs&t=2s
You need to install Selenium and the Chrome driver (ChromeDriver) that Selenium uses.
- Clone this repo.
- Download the MiniWoB++ environment from the original repo, https://github.com/stanfordnlp/miniwob-plusplus, and copy the miniwob-plusplus/html folder to miniwob/html in this repo.
- The html folder can in fact be stored anywhere, as long as you do one of the following:
  - Set the environment variable `WOB_PATH` to `file://"your-path-to-miniwob-plusplus"/html/miniwob`. E.g. "your-path-to-miniwob-plusplus" is "/h/sheng/DOM-Q-NET/miniwob".
  - Directly modify `base_url` on line 33 of instance.py to `"your-path-to-miniwob-plusplus"/html/miniwob`. In my case, `base_url='file:///h/sheng/DOM-Q-NET/miniwob/html/miniwob/'`.
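Both options above end up supplying a `file://` URL that points at the `html/miniwob` folder. The following is a minimal sketch of how such a URL could be resolved. `resolve_base_url` and its `default_root` argument are hypothetical names used for illustration, not code from instance.py, and the sketch assumes that `WOB_PATH`, when set, already holds the full `file://` URL as described above.

```python
import os
from pathlib import Path


def resolve_base_url(default_root="miniwob"):
    """Return a file:// URL for the MiniWoB++ html/miniwob folder.

    Hypothetical helper for illustration; not part of this repo.
    """
    env_url = os.environ.get("WOB_PATH")
    if env_url:
        # WOB_PATH is assumed to be a full URL such as
        # file:///some/path/html/miniwob
        return env_url.rstrip("/") + "/"
    # Otherwise fall back to a local copy at <default_root>/html/miniwob.
    local_dir = Path(default_root).resolve() / "html" / "miniwob"
    return local_dir.as_uri() + "/"
```

Normalizing to a single trailing slash keeps the result in the same shape as the `base_url` example given above.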
Experiment launch files are stored under runs. For example,

    cd runs/hard2medium9tasks/
    sh run1.sh

will launch an 11-task multitask experiment (social-media, search-engine, login-user, enter-password, click-checkboxes, click-option, enter-dynamic-text, enter-text, email-inbox-delete, click-tab-2, navigation-tree).
Item | Maximum number of items |
---|---|
DOM tree leaves (action space) | 160 |
DOM tree | 200 |
Instruction tokens | 16 |
Attribute | Max vocabulary | Embedding dimension |
---|---|---|
Tag | 100 | 16 |
Text (shared with instructions) | 600 | 48 |
Class | 100 | 16 |
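As a rough sanity check on the attribute table above, the sizes imply the number of embedding parameters computed below. This is only a sketch: it assumes each attribute uses one plain lookup table of shape (max vocabulary, embedding dimension) with no extra rows, and the names `EMBEDDING_SPECS` and `embedding_param_count` are illustrative rather than taken from the repo.

```python
# (max vocabulary, embedding dimension) per attribute, copied from the table.
EMBEDDING_SPECS = {
    "tag": (100, 16),
    "text": (600, 48),  # shared with instruction tokens
    "class": (100, 16),
}


def embedding_param_count(specs):
    """Sum vocab * dim over all attribute embedding tables."""
    return sum(vocab * dim for vocab, dim in specs.values())


total = embedding_param_count(EMBEDDING_SPECS)
print(total)  # 32000
```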
- Unknown (UNK) tokens
These are assigned a random vector so that the cosine similarity with the text attribute can yield 1.0 for a direct alignment.
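To illustrate the idea, the sketch below gives each unknown token a fixed random vector and checks that the same unknown token appearing on both sides (instruction and text attribute) scores a cosine of 1.0, while two different unknown tokens generally do not. The `UnkEmbedder` class, its default dimension, and the seed are assumptions made for illustration; the repo's actual handling may differ.

```python
import math
import random


class UnkEmbedder:
    """Assigns each unseen token a fixed random vector (illustrative only)."""

    def __init__(self, dim=48, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.vectors = {}

    def embed(self, token):
        # Draw a vector the first time a token is seen, then reuse it.
        if token not in self.vectors:
            self.vectors[token] = [
                self.rng.gauss(0.0, 1.0) for _ in range(self.dim)
            ]
        return self.vectors[token]


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


embedder = UnkEmbedder()
# Same unknown token in the instruction and in a DOM text attribute.
same = cosine_similarity(embedder.embed("xqzt"), embedder.embed("xqzt"))
# Two different unknown tokens get independent random vectors.
diff = cosine_similarity(embedder.embed("xqzt"), embedder.embed("plugh"))
```

Here `same` comes out as 1.0 up to floating-point error, whereas `diff` is typically close to 0 because independent random vectors in 48 dimensions are nearly orthogonal.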
Credit to Dopamine for the implementation of prioritized replay used in dstructs/dopamine_segtree.py.