<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Norvig Web Data Science Award</title>
<meta name="description" content="">
<meta name="author" content="">
<!-- Le styles -->
<link href="assets/css/bootstrap.css" rel="stylesheet">
<link href="assets/css/nbwsa-2014.css" rel="stylesheet">
<!-- Le HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-36109664-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body>
<!-- Navbar
================================================== -->
<div class="navbar navbar-inverse navbar-fixed-top">
<div class="navbar-inner">
<div class="container">
<button type="button" class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="brand" href="./index.html">Norvig Award</a>
<div class="nav-collapse collapse">
<ul class="nav">
<li class="">
<a href="index.html">Home</a>
</li>
<li class="">
<a href="learnmore.html">Learn more</a>
</li>
<li class="">
<a href="apply.html">Apply</a>
</li>
<li class="active">
<a href="gettingstarted.html">Getting started</a>
</li>
<li class="">
<a href="submittingresults.html">Submit results</a>
</li>
<li class="">
<a href="faq.html">FAQ</a>
</li>
</ul>
</div>
</div>
</div>
</div>
<div class="jumbotron masthead">
<div class="container">
<h1>Norvig Web Data Science Award</h1>
<p>show what you can do with 3 billion web pages<small><br/>by <a href="http://www.surfsara.nl"><img alt="SURFsara" src="assets/images/sara.logo.png"></a> and <a href="http://www.commoncrawl.org"><img src="assets/images/commoncrawl-small.png" alt="CommonCrawl"></a></small></p>
</div>
</div>
<div class="container">
<div class="span9">
<section id="runvm">
<div class="page-header">
<h1><a href="vm.html">Step 1: Setting up your development environment &gt;&gt;</a></h1>
</div>
<p>We provide a virtual machine that you can download and run on your own laptop or PC.
It includes all the tools you need to hack on the Common Crawl dataset.</p>
<hr class="soften" />
</section>
<section id="examples">
<div class="page-header">
<h1><a href="examples.html">Step 2: Get a feel for Hadoop and for the data &gt;&gt;</a></h1>
</div>
<p>This step gives you a feel for how Hadoop processes the Common Crawl data,
using a few MapReduce examples and one Pig example.</p>
<hr class="soften" />
</section>
<section id="hacking">
<div class="page-header">
<h1><a href="hacking.html">Step 3: Start hacking &gt;&gt;</a></h1>
</div>
<p>Now that you have run a few examples, it's time to start hacking on <em>your</em> ideas!</p>
<hr class="soften" />
</section>
<section id="finalrun">
<div class="page-header">
<h1><a href="finalrun.html">Step 4: Do a final run on all data &gt;&gt;</a></h1>
</div>
<p>Done enough testing and ready to get down to the real work?
Find out here how to do a run on the complete dataset.</p>
<hr class="soften" />
</section>
</div>
<div class="span2" id="sponsors">
<ul class="thumbnails">
<li>
<a href="http://www.surfsara.nl" class="thumbnail">
<img src="assets/images/sara.logo.png" alt="SURFsara">
</a>
</li>
<li>
<a href="http://www.commoncrawl.org" class="thumbnail">
<img src="assets/images/cc.logo.png" alt="Common Crawl">
</a>
</li>
<li>
<a href="http://www.github.com" class="thumbnail">
<img src="assets/images/github.logo.png" alt="GitHub">
</a>
</li>
</ul>
</div>
</div>
<!-- Footer
================================================== -->
<footer class="footer">
<div class="container">
<p class="pull-right"><a href="#">Back to top</a></p>
<p>Design adapted from <a href="http://www.twitter.com">Twitter</a>’s <a href="http://twitter.github.com/bootstrap">Bootstrap</a> page</p>
</div>
</footer>
<script type="text/javascript" src="http://platform.twitter.com/widgets.js"></script>
</body>
</html>