<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<meta http-equiv="x-ua-compatible" content="ie=edge">
<title>CS275 - Artificial Life Project</title>
<!-- Font Awesome -->
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css">
<!-- Bootstrap core CSS -->
<link href="css/bootstrap.min.css" rel="stylesheet">
<!-- Material Design Bootstrap -->
<link href="css/mdb.min.css" rel="stylesheet">
<!-- Your custom styles (optional) -->
<link href="css/style.css" rel="stylesheet">
</head>
<body>
<main class="mt-5">
<h2 class="mb-5 font-weight-bold text-center">Simulation of Predator Prey Dynamics Using Deep Reinforcement Learning</h2>
<div class="container">
<hr class="my-5">
<!--Section: Report-->
<section id="group-detail" >
<style>
table {
border: 1px solid black;
border-collapse: collapse;
}
th, td {
border: 1px solid black;
text-align: left;
padding: 10px 5px;
overflow: hidden;
word-break: normal;
}
</style>
<!-- Heading -->
<h4 class="my-4 font-weight-bold">Group Details</h4>
<table class="table table-hover">
<thead>
<tr>
<th>Name</th>
<th>UID</th>
<th>Email</th>
</tr>
</thead>
<tbody>
<tr>
<td>Anoosha Sagar</td>
<td>605028604</td>
<td>anoosha.sagar@cs.ucla.edu</td>
</tr>
<tr>
<td>Maithili Bhide</td>
<td>104943331</td>
<td>maithili.bhide@cs.ucla.edu</td>
</tr>
<tr>
<td>Rahul Dhavalikar</td>
<td>205024839</td>
<td>rahul.dhavalikar@cs.ucla.edu</td>
</tr>
<tr>
<td>Akshay Sharma</td>
<td>504946035</td>
<td>akshaysharma23@cs.ucla.edu</td>
</tr>
</tbody>
</table>
</section>
</div>
<div class="container">
<div class="container">
<hr class="my-5">
<!--Section: Abstract-->
<section id="abstract" >
<!-- Heading -->
<h4 class="my-4 font-weight-bold">Abstract</h4>
<!--Grid row-->
<div class="row d-flex justify-content-center mb-4">
<p style="text-align:justify">
In this project we simulate predator-prey dynamics in a multi-agent environment where the predators and the prey evolve together. Each agent is assigned a deep reinforcement learning model (a DQN) trained using TensorFlow to determine its actions. The goal of each agent is to maximize its own reward in this mixed cooperative and competitive environment. We use OpenAI Gym to simulate the environment and have explored scenarios with varying numbers of agents (e.g., 1v1, 1v2, 2v1). We also introduced additional artefacts into the environment, such as obstacles and food, to observe how the dynamics of our predator-prey agents change. The environment returns a list of observations based on the actions performed by the agents in a constrained space. From our simulations of various scenarios, we observe that the green agents adopt strategies such as splitting up to confuse the red agent and using their higher speed effectively to gather food. Similarly, the red agents evolve triangular formations and man-to-man marking strategies to corner the green agents and earn rewards for colliding with them.
</p>
</div>
</section>
</div>
<div class="container">
<hr class="my-5">
<!--Section: Overview-->
<section id="overview" >
<!-- Heading -->
<h4 class="my-4 font-weight-bold">Overview</h4>
<!--Grid row-->
<div class="row">
<div class="col-md-6 mb-4" align="center">
<img src="img/simple-scenario.png"/>
<figcaption>An Example of a Simple Scenario with 2 Green Agents and 1 Red Agent</figcaption>
</div>
<div class="col-md-6 mb-4" align="center">
<img src="img/complex-scenario.png"/>
<figcaption>An Example of a Complex Scenario with 2 Green Agents, 1 Red Agent, 1 Obstacle, and 2 Food Items</figcaption>
</div>
<p style="text-align:justify">
In the predator-prey environment, slower (red) agents chase faster (green) adversaries. In addition, there are food items (blue) that the green agents wish to consume and obstacles (black) that the green agents can use to distract and confuse the red agents. Multiple agents can exist on either team, and the goal of each agent is to maximize its own reward in this mixed cooperative and competitive setting. The environment returns a list of states, one per agent, once each agent performs an action. An agent&rsquo;s state consists of (1) the agent&rsquo;s position and velocity, (2) the relative position of any landmarks or food, and (3) the relative positions of all other agents. Agents receive a negative reward if they leave the arena, and the environment is then reset with the next episode beginning in a random configuration. This trains our agents to operate within the confines of the environment. An episode also ends and is reset after 3000 steps. Episodes do not end when the green and red agents collide; instead, we keep track of the number of collisions in the episode.
</p>
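The per-agent state described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the helper name <code>make_observation</code> and the concatenation order are assumptions for illustration.

```python
import numpy as np

MAX_STEPS = 3000  # episode length from the description above

def make_observation(agent_pos, agent_vel, landmark_positions, other_agent_positions):
    """Build one agent's state: its own position and velocity, plus the
    positions of landmarks/food and other agents relative to this agent."""
    rel_landmarks = [lm - agent_pos for lm in landmark_positions]
    rel_agents = [p - agent_pos for p in other_agent_positions]
    return np.concatenate([agent_pos, agent_vel] + rel_landmarks + rel_agents)

# Example: one agent at the origin, one food item, one other agent.
obs = make_observation(
    agent_pos=np.array([0.0, 0.0]),
    agent_vel=np.array([0.1, 0.0]),
    landmark_positions=[np.array([0.5, 0.5])],
    other_agent_positions=[np.array([-0.3, 0.2])],
)
print(obs.shape)  # (8,) -> 2 pos + 2 vel + 2 per landmark + 2 per other agent
```

The environment then collects one such vector per agent into the list of states it returns each step.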
<p style="text-align:justify">
All red agents receive a positive reward if they successfully intercept any green agent, and the green agents receive a corresponding negative reward when they are intercepted. To help training converge faster, we also enabled an L2 penalty based on the Euclidean distance between agents. Red agents are given a negative reward proportional to their distance from the green agents, which prevents them from drifting aimlessly during the early stages of training; green agents are given a positive reward proportional to their distance from the red agents, which similarly speeds up their training. Green agents additionally receive a positive reward based on their proximity to food and a higher positive reward for actually eating it. There is also a negative reward for green agents at the boundary of the arena, so that they do not exit the arena easily. The distance-based negative reward for a red agent depends not only on its own distance from the green agents but also on the distances of the other red agents from them. Similarly, when any one of the red agents collides with a green agent, all red agents receive the same positive reward. This kind of reward system, where the actions of one agent affect the rewards of the other agents of its kind, leads to the formation of cooperative strategies, as discussed later and as is evident from the simulations.
</p>
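The shared predator reward and the boundary penalty can be sketched as follows. This is an illustrative sketch only: the coefficient values, collision radius, and function names are assumptions, not the values used in our training runs.

```python
import numpy as np

def red_reward(red_positions, green_positions, collision_radius=0.1,
               collision_bonus=10.0, shape_coef=0.1):
    """Shared predator reward: every red agent is shaped by the summed
    distances of ALL red agents to the green agents (negative L2 shaping),
    and all reds share the same bonus when any one of them catches a green."""
    reward = 0.0
    for r in red_positions:
        for g in green_positions:
            dist = np.linalg.norm(r - g)
            reward -= shape_coef * dist       # L2 distance shaping
            if dist < collision_radius:
                reward += collision_bonus     # bonus shared by all reds
    return reward

def boundary_penalty(pos, limit=1.0, coef=10.0):
    """Negative reward for an agent that strays past the arena edge."""
    excess = np.maximum(np.abs(pos) - limit, 0.0)
    return -coef * float(np.sum(excess))
```

Because every red agent receives the same `red_reward`, improving one red agent's positioning raises the return of the whole team, which is what drives the cooperative formations described above.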
<p style="text-align:justify">
The action space consists of a list of actions, one per agent. Each agent&rsquo;s action is described by 5 numbers that indicate whether the agent should stay put or move up, down, left, or right (this format is specified by the OpenAI Gym environment).
</p>
</div>
</section>
<div class="container">
<hr class="my-5">
<!--Section: Gallery-->
<section id="videos">
<!-- Heading -->
<h4 class="my-4 font-weight-bold">Video Demos</h4>
<!--Grid row-->
<div class="row">
<!--Carousel Wrapper-->
<div id="video-carousel-example2" class="carousel slide carousel-fade" data-ride="carousel">
<!--Indicators-->
<!--/.Indicators-->
<!--Slides-->
<div class="carousel-inner" role="listbox">
<!-- First slide -->
<div class="carousel-item active">
<!--Mask color-->
<div class="view">
<!--Video source-->
<video class="video-fluid" controls poster="img/poster-ddqn-1v1.png">
<source src="videos/ddqn-1v1-f.mp4" type="video/mp4" />
</video>
</div>
</div>
<!-- /.First slide -->
<!-- Second slide -->
<div class="carousel-item">
<!--Mask color-->
<div class="view">
<!--Video source-->
<video class="video-fluid" controls poster="img/poster-ddqn-1v2.png">
<source src="videos/ddqn-1v2-f.mp4" type="video/mp4" />
</video>
</div>
</div>
<!-- /.Second slide -->
<!-- Third slide -->
<div class="carousel-item">
<!--Mask color-->
<div class="view">
<!--Video source-->
<video class="video-fluid" controls poster="img/poster-ddqn-2v1.png">
<source src="videos/ddqn-2v1-f.mp4" type="video/mp4" />
</video>
</div>
</div>
<!-- /.Third slide -->
<!-- First slide -->
<div class="carousel-item">
<!--Mask color-->
<div class="view">
<!--Video source-->
<video class="video-fluid" controls poster="img/poster-ddqn-complex-1v1.png">
<source src="videos/ddqn-complex-1v1-f.mp4" type="video/mp4" />
</video>
</div>
</div>
<!-- /.First slide -->
<!-- Second slide -->
<div class="carousel-item">
<!--Mask color-->
<div class="view">
<!--Video source-->
<video class="video-fluid" controls poster="img/poster-ddqn-complex-1v2.png">
<source src="videos/ddqn-complex-1v2-f.mp4" type="video/mp4" />
</video>
</div>
</div>
<!-- /.Second slide -->
<!-- Third slide -->
<div class="carousel-item">
<!--Mask color-->
<div class="view">
<!--Video source-->
<video class="video-fluid" controls poster="img/poster-ddqn-complex-2v1.png">
<source src="videos/ddqn-complex-2v1-f.mp4" type="video/mp4" />
</video>
</div>
</div>
<!-- /.Third slide -->
</div>
<!--/.Slides-->
<!--Controls-->
<a class="carousel-control-prev" href="#video-carousel-example2" role="button" data-slide="prev">
<span class="carousel-control-prev-icon" aria-hidden="true"></span>
<span class="sr-only">Previous</span>
</a>
<a class="carousel-control-next" href="#video-carousel-example2" role="button" data-slide="next">
<span class="carousel-control-next-icon" ></span>
<span class="sr-only">Next</span>
</a>
<!--/.Controls-->
</div>
<!--Carousel Wrapper-->
</div>
<!--Grid column-->
</div>
<!--Grid column-->
</div>
<!--Grid row-->
<br><p>If the videos do not play in your browser, here is a <a href="videos/" target="_blank">link to the videos folder</a>.</p>
</section>
<!--Section: Gallery-->
<div class="container">
<hr class="my-5">
<!--Section: Report-->
<section id="report" >
<!-- Heading -->
<a href="Report.pdf" target="_blank"><h4 class="my-4 font-weight-bold">Link to the report</h4></a>
</section>
</div>
<div class="container">
<hr class="my-5">
<!--Section: Code-->
<section id="code" >
<!-- Heading -->
<h4 class="my-4 font-weight-bold">Link to the code</h4>
<br><a href="code/" target="_blank">Code Folder</a><br><br>
<b>Quick Links:</b>
<br><a href="code/ddqn.py" target="_blank">Our Deep Q-Network</a>
<br><a href="code/ddqn_run.py" target="_blank">Our main file</a>
<br><a href="code/multiagent/scenarios/simple_tag_1v1.py" target="_blank">Sample scenario file (Simple 1v1)</a>
<br><br>
<b>Additional Links:</b>
<br>For any help, refer to the <a href="README.md" target="_blank">README</a>
<br><br>Link to our GitHub repository: <a href="https://github.com/rahul-dhavalikar/dqn-predator-prey-dynamics" target="_blank">GitHub - Simulation of Predator-Prey Dynamics using Deep Reinforcement Learning</a>
<br><br>
</section>
</div>
</div>
</main>
<!-- SCRIPTS -->
<!-- JQuery -->
<script type="text/javascript" src="js/jquery-3.2.1.min.js"></script>
<!-- Bootstrap tooltips -->
<script type="text/javascript" src="js/popper.min.js"></script>
<!-- Bootstrap core JavaScript -->
<script type="text/javascript" src="js/bootstrap.min.js"></script>
<!-- MDB core JavaScript -->
<script type="text/javascript" src="js/mdb.min.js"></script>
</body>
</html>