-
Notifications
You must be signed in to change notification settings - Fork 37
/
chapter1.Rmd
680 lines (499 loc) · 22.2 KB
/
chapter1.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
---
title_meta : Chapter 1
title : Intro to basics
description : "In this chapter, you will take your first steps with R. You will learn how to use the console as a calculator and how to assign variables. You will also get to know the basic data types in R. Let's get started!"
attachments :
slides_link: https://s3.amazonaws.com/assets.datacamp.com/course/introduction_to_r/slides/ch1_slides_v2.pdf
--- type:VideoExercise lang:r xp:50 skills:1 key:1a1ba28cd5
## Meet R
*** =video_link
//player.vimeo.com/video/144351865
*** =video_hls
//videos.datacamp.com/transcoded/732_intro_to_r/v1/hls-ch1_1.master.m3u8
--- type:NormalExercise lang:r xp:100 skills:1 key:5714863ba8
## Your first R script
In the script on the right you should type R code to solve the exercises. When you hit the _Submit Answer_ button, every line of code in the script is interpreted and executed by R and you get a message that indicates whether or not your code was correct. The output of your submission is shown in the R console.
You can also execute R commands straight in the console. When you type in the console, your submission will not be checked for correctness! Try, for example, to type in `3 * 4` and hit Enter. R should return `[1] 12`.
*** =instructions
In the script, add another line of code that calculates the sum of 6 and 12, and hit the _Submit Answer_ button.
*** =hint
Simply add a line of R code that calculates the sum of 6 and 12, just like the example in the sample code!
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
3 + 4
```
*** =solution
```{r}
3 + 4
6 + 12
```
*** =sct
```{r}
test_output_contains("18", incorrect_msg = "Make sure to add a line of R code that calculates the sum of 6 and 12.")
success_msg("Awesome! See how the console shows the result of the R code you submitted?")
```
--- type:NormalExercise lang:r xp:100 skills:1 key:ab37530088
## Documenting your code
Adding comments to your code is extremely important to make sure that you and others can understand what your code is about. R makes use of the `#` sign to add comments, just like Twitter!
Comments are not run as R code, so they will not influence your result. For example, _Calculate 3 + 4_ in the script on the right is a comment and is ignored during execution.
*** =instructions
Add another comment in the script on the right, _Calculate 6 + 12_, at the appropriate location.
*** =hint
Simply add the line `# Calculate 6 + 12` above the R code that calculates 6 + 12.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
# Calculate 3 + 4
3 + 4
6 + 12
```
*** =solution
```{r}
# Calculate 3 + 4
3 + 4
# Calculate 6 + 12
6 + 12
```
*** =sct
```{r}
test_output_contains("7", incorrect_msg = "Do not remove the code that calculates 3 + 4.")
test_student_typed("# Calculate 3 + 4", not_typed_msg = "Do not remove the comment for the code that calculates 3 + 4.")
test_output_contains("18", incorrect_msg = "Do not remove the code that calculates 6 + 12.")
test_student_typed(c("# Calculate 6 + 12", "# calculate 6 + 12", "#Calculate 6 + 12", "#calculate 6 + 12",
"# Calculate 6+12", "# calculate 6+12", "#Calculate 6+12", "#calculate 6+12"),
not_typed_msg = "Make sure to add the comment: `# Calculate 6 + 12`")
success_msg("Great! Looks better, doesn't it?")
```
--- type:NormalExercise lang:r xp:100 skills:1 key:9d8a3d0b88
## R as a calculator
In its most basic form R can be used as a scientific calculator. Consider the following arithmetic operators:
- Addition: `+`
- Subtraction: `-`
- Multiplication: `*`
- Division: `/`
- Exponentiation: `^`
- Modulo: `%%`
The last two might need some explaining:
- The `^` operator raises the number to its left to the power of the number to its right: for example `3^2` equals 9.
- The modulo returns the remainder of the division of the number to the left by the number on its right, for example 5 modulo 3 or `5 %% 3` equals 2.
*** =instructions
- Type `2^5` in the script to calculate 2 to the power 5.
- Type `28 %% 6` to calculate 28 modulo 6.
- Click _Submit Answer_ and have a look at the R output in the console.
*** =hint
Another example of the modulo operator: `9 %% 2` equals `1`.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
# Addition
5 + 5
# Subtraction
5 - 5
# Multiplication
3 * 5
# Division
(5 + 5) / 2
# Exponentiation
# Modulo
```
*** =solution
```{r}
# Addition
5 + 5
# Subtraction
5 - 5
# Multiplication
3 * 5
# Division
(5 + 5) / 2
# Exponentiation
2 ^ 5
# Modulo
28 %% 6
```
*** =sct
```{r}
msg <- "Do not remove the examples that have already been coded for you!"
test_output_contains("5 + 5", incorrect_msg = msg)
test_output_contains("5 - 5", incorrect_msg = msg)
test_output_contains("3 * 5", incorrect_msg = msg)
test_output_contains("(5 + 5)/2", incorrect_msg = msg)
test_output_contains("2^5", incorrect_msg = "Have another look at the exponentiation. Read the instructions carefully.")
test_output_contains("28 %% 6", incorrect_msg = "Have another look at the use of the `%%` operator. Read the instructions carefully.")
success_msg("Nice one!")
```
--- type:MultipleChoiceExercise xp:50 skills:1 key:9d8819fb2e
## R's pros and cons
As Filip explained in the video, there are things that make R the awesome and immensely popular language that it is today. On the other hand, there are also aspects about R that are less attractive. Which of the following statements are true regarding this statistical programming language developed by Ihaka and Gentleman in the nineties?
1. As opposed to SAS and SPSS, R is completely open-source.
2. R is open-source, but it's hard to share your code with others since R uses a command-line interface.
3. It typically takes a long time for new and updated R packages to be released and made available to the public.
4. R is easy to use, but this comes at the cost of limited graphical abilities.
5. R works well with large data sets, if the code is properly written and the data fits into the working memory.
*** =instructions
- statements (1) and (2) are correct; the others are false.
- statements (1) and (4) are correct; the others are false.
- statements (1) and (5) are correct; the others are false.
- statements (2) and (4) are correct; the others are false.
- statements (3) and (5) are correct; the others are false.
*** =hint
Remember that your data has to fit in the working memory for R to be able to process it.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sct
```{r}
msg1 = "Remember that the fact that R uses a command-line interface, does not make it hard to share code. On the contrary, sharing your results becomes very straightforward because you can easily share R scripts."
msg2 = "R is the perfect tool for creating neat and insightful visualizations. Try again."
msg3 = "Great! Head over to the next exercise and get your hands dirty!"
msg4 = "R uses a command-line interface, which makes it very easy to share one's code. Also, R is very suitable for creating visualizations. Try again."
msg5 = "It's fairly straightforward to write, maintain and share R packages. Try again."
test_mc(3, feedback_msgs = c(msg1, msg2, msg3, msg4, msg5))
```
--- type:NormalExercise lang:r xp:100 skills:1 key:6b6fb4974c
## Variable assignment (1)
A variable allows you store a value or an object in R. You can then later use this variable's name to easily access the value or the object that is stored within this variable. You use `<-` to assign a variable:
```
my_variable <- 4
```
*** =instructions
Complete the code in the editor such that it assigns the value 42 to the variable `x` in the editor. Click 'Submit Answer'. Notice that when you ask R to print `x`, the value 42 appears.
*** =hint
Look at how the value 4 was assigned to `my_variable` in the exercise's assignment. Do the exact same thing in the editor, but now assign 42 to the variable `x`.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
# Assign the value 42 to x
x <-
# Print out the value of the variable x
x
```
*** =solution
```{r}
# Assign the value 42 to x
x <- 42
# Print out the value of the variable x
x
```
*** =sct
```{r}
test_error()
test_object("x",
undefined_msg = "Make sure to define a variable <code>x</code>.",
incorrect_msg = "Make sure that you assign the correct value to <code>x</code>.")
success_msg("Good job! Notice that R does not print the value of a variable to the console when you do the assignment. <code>x <- 42</code> did not generate any output, because R assumes that you will be needing this variable in the future. Otherwise you wouldn't have stored the value in a variable in the first place, right? Proceed to the next exercise!")
```
--- type:NormalExercise xp:100 skills:1 key:a5b8028834
## Variable assignment (2)
Suppose you have a fruit basket with five apples. You want to store the number of apples in a variable with the name `my_apples`.
*** =instructions
- Using `<-`, assign the value 5 to `my_apples` below the first comment.
- Type `my_apples` below the second comment. This will print out the value of `my_apples`.
- After clicking _Submit Answer_, have a look at the console: the number 5 is printed, so R now links the variable `my_apples` to the value 5.
*** =hint
Remember that if you want to assign a number or an object to a variable in R, you can make use of the assignment operator `<-`. Alternatively, you can use `=`, but `<-` is widely preferred in the R community.
*** =pre_exercise_code
```{r}
```
*** =sample_code
```{r}
# Assign the value 5 to the variable called my_apples
# Print out the value of the variable my_apples
```
*** =solution
```{r}
# Assign the value 5 to the variable called my_apples
my_apples <- 5
# Print out the value of the variable my_apples
my_apples
```
*** =sct
```{r}
test_object("my_apples", incorrect_msg = "Have you correctly assigned 5 to `my_apples`? Write `my_apples <- 5` on a new line in the script.")
test_output_contains("my_apples", incorrect_msg = "Have you explicitly told R to print out the `my_apples` variable to the console? Simply type `my_apples` on a new line.")
success_msg("Great! You could also use `=` for variable assignment, but `<-` is typically preferred.")
```
--- type:NormalExercise lang:r xp:100 skills:1 key:a0cb1bea96
## Variable assignment (3)
Every tasty fruit basket needs oranges, so you decide to add six oranges. You decide to create the variable `my_oranges` and assign the value 6 to it. Next, you want to calculate how many pieces of fruit you have in total. Since you have given meaningful names to these values, you can now code this in a clear way:
```
my_apples + my_oranges
```
*** =instructions
- Assign to `my_oranges` the value 6.
- Add the variables `my_apples` and `my_oranges` and have R simply print the result.
- Combine the variables `my_apples` and `my_oranges` into a new variable `my_fruit`, which is the total amount of fruits in your fruit basket.
*** =hint
`my_fruit` is just the sum of `my_apples` and `my_oranges`. You can use the `+` operator to sum the two and `<-` to assign that value to the variable `my_fruit`.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
# Assign 5 to my_apples
my_apples <- 5
# Assign 6 to my_oranges
# Add my_apples and my_oranges: print the result
# Add my_apples and my_oranges: assign to my_fruit
```
*** =solution
```{r}
# Assign 5 to my_apples
my_apples <- 5
# Assign 6 to my_oranges
my_oranges <- 6
# Add my_apples and my_oranges: print the result
my_apples + my_oranges
# Add my_apples and my_oranges: assign to my_fruit
my_fruit <- my_apples + my_oranges
```
*** =sct
```{r}
test_object("my_apples", incorrect_msg = "Do not change the assignment of the `my_apples` variable!")
test_object("my_oranges")
test_output_contains("my_apples + my_oranges",
incorrect_msg = "The output does not contain the result of adding `my_apples` and `my_oranges` (second instruction). Try again.")
test_object("my_fruit")
success_msg("Nice one! The great advantage of doing calculations with variables is reusability. If you just change `my_apples` to equal 12 instead of 5 and rerun the script, `my_fruit` will automatically update as well.")
```
--- type:NormalExercise lang:r xp:100 skills:1 key:6192f64167
## The workspace
If you assign a value to a variable, this variable is stored in the workspace. It's the place where all user-defined variables live. The command [`ls()`](http://www.rdocumentation.org/packages/base/functions/ls) lists the contents of this workspace.
```
a <- 1
b <- 2
ls()
```
The first two lines create the variables `a` and `b`. Calling [`ls()`](http://www.rdocumentation.org/packages/base/functions/ls) now shows you that `a` and `b` are in the workspace.
You can also remove variables from the workspace. You do this with [`rm()`](http://www.rdocumentation.org/packages/base/functions/rm). `rm(a)`, for example, would remove `a` from the workspace again. `rm(list = ls())`, which is used in the beginning of your script, clears everything from the workspace.
*** =instructions
- Create a variable, `horses`, equal to 3, and a variable `dogs`, equal to 7.
- List the contents of your workspace with [`ls()`](http://www.rdocumentation.org/packages/base/functions/ls) to see that indeed, these two variables are in there.
*** =hint
All you need is a combination of [`ls()`](http://www.rdocumentation.org/packages/base/functions/ls) and [`rm()`](http://www.rdocumentation.org/packages/base/functions/rm) at the right time. Give it a try and let the feedback messages guide you.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
# Clear the entire workspace
rm(list = ls())
# Create the variables horses and dogs
# List the contents of your workspace
```
*** =solution
```{r}
# Clear the entire workspace
rm(list = ls())
# Create the variables horses and dogs
horses <- 3
dogs <- 7
# Inspect the contents of the workspace again
ls()
```
*** =sct
```{r}
test_student_typed("rm(list = ls())", not_typed_msg = "Do not remove the line `rm(list = ls())`.")
test_object("horses")
test_object("dogs")
test_output_contains('c("dogs", "horses")',
incorrect_msg = "Make sure to inspect the objects in your workspace after creating `horses` and `dogs`.")
success_msg("Awesome! You can now build up and inspect your workspace, great!")
```
--- type:VideoExercise lang:r xp:50 skills:1 key:9f9019501e
## Basic Data Types
*** =video_link
//player.vimeo.com/video/138173888
*** =video_hls
//videos.datacamp.com/transcoded/732_intro_to_r/v1/hls-ch1_2.master.m3u8
--- type:NormalExercise lang:r xp:100 skills:1 key:1866cdd202
## Discover Basic Data Types
To get started, here are some of R's most basic data types:
- Decimal values like `4.5` are called **numerics**.
- Natural numbers like `4L` are called **integers**. Integers are also numerics.
- Boolean values (`TRUE` or `FALSE`) are called **logical**. Capital letters are important here; `true` and `false` are not valid.
- Text (or string) values are called **characters**.
Note how the quotation marks on the right indicate that `"some text"` is of type character.
*** =instructions
Change the value of the:
- `my_numeric` variable to `42`.
- `my_character` variable to `"forty-two"`. Note that the quotation marks indicate that `"forty-two"` is a character.
- `my_logical` variable to `FALSE`.
*** =hint
Replace the values in the script with the values that are provided in the exercise.
```
my_numeric <- 42
```
assigns the value 42 to the variable `my_numeric`.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
# What is the answer to the universe?
my_numeric <- 42.5
# The quotation marks indicate that the variable is of type character
my_character <- "some text"
# Change the value of my_logical
my_logical <- TRUE
```
*** =solution
```{r}
# What is the answer to the universe?
my_numeric <- 42
# The quotation marks indicate that the variable is of type character
my_character <- "forty-two"
# Change the value of my_logical
my_logical <- FALSE
```
*** =sct
```{r}
test_object("my_numeric",
incorrect_msg = "Make sure that you assign the correct value to `my_numeric.`")
test_object("my_character",
incorrect_msg = paste("Make sure that you assign the correct value to `my_character`.",
"Do not forget the quotes and beware of capitalization! R is case sensitive!"))
test_object("my_logical",
undefined_msg = "Please make sure to define a variable `my_logical`.",
incorrect_msg = "Make sure that you assign the correct value to `my_logical`.")
success_msg("Great work! Continue to the next exercise.")
```
--- type:NormalExercise lang:r xp:100 skills:1 key:c52153af0b
## Back to Apples and Oranges
Common knowledge tells you not to add apples and oranges. But hey, that is what you just did! The `my_apples` and `my_oranges` variables both contained a number in the previous exercise. The `+` operator works with numeric variables in R.
However, if you try to add a numeric and a character string, R will complain.
*** =instructions
- Click _Submit Answer_ and read the error message. Make sure you understand why this did not work.
- Adjust `my_oranges <- "six"` such that R knows you have 6 oranges and thus a fruit basket with 11 pieces of fruit. Click _Submit Answer_ again.
*** =hint
You have to assign the numeric value `6` to the `my_oranges` variable instead of the character value `"six"`. Notice how the quotation marks are used to indicate that `"six"` is a character.
*** =pre_exercise_code
```{r}
# no pec
```
*** =sample_code
```{r}
# Assign a value to the variable my_apples and print it out
my_apples <- 5
my_apples
# Assign a value to the variable my_oranges and print it out
my_oranges <- "six"
my_oranges
# New variable that contains the total amount of fruit
my_fruit <- my_apples + my_oranges
my_fruit
```
*** =solution
```{r}
# Assign a value to the variable my_apples and print it out
my_apples <- 5
my_apples
# Assign a value to the variable my_oranges and print it out
my_oranges <- 6
my_oranges
# New variable that contains the total amount of fruit
my_fruit <- my_apples + my_oranges
my_fruit
```
*** =sct
```{r}
test_object("my_apples", incorrect_msg = "Don't change the code that assigns 5 to `my_apples`.")
test_object("my_oranges", incorrect_msg = "Change the assignment of the `my_oranges` variable such that the code runs without errors.")
test_object("my_fruit",
undefined_msg = "Please make sure to define a variable `my_fruit`.",
incorrect_msg = "Make sure that you assign the correct value to `my_fruit`.")
test_output_contains("my_fruit", incorrect_msg = "The output does not contain the result of adding `my_apples` and `my_oranges`.")
success_msg("Awesome, keep up the good work!")
```
--- type:MultipleChoiceExercise lang:r xp:50 skills:1 key:7806ca24d2
## What's that data type?
When you added the variables containing `5` and `"six"`, you got an error due to a mismatch in data types. You can avoid such embarrassing situations by checking the data type of a variable beforehand:
```
class(my_var)
```
In the workspace (you can see what it contains by typing [`ls()`](http://www.rdocumentation.org/packages/base/functions/ls) in the console), some variables have already been defined. Which statement concerning these variables are correct?
*** =instructions
- `a`'s class is `integer`, `b` is a `character`, `c` is a `boolean`.
- `a`'s class is `character`, `b` is a `character` as well, `c` is a `logical`.
- `a`'s class is `numeric`, `b` is a `string`, `c` is a `logical`.
- `a`'s class is `numeric`, `b` is a `character`, `c` is a `logical`.
*** =hint
You can find out the data type of the `a` variable for example by typing `class(a)`. You can do similar things for `b` and `c`.
*** =pre_exercise_code
```{r}
a <- 42
b <- "forty-two"
c <- FALSE
```
*** =sct
```{r}
msg1 <- "`boolean` is not the class for logical values. Try again."
msg2 <- "`a` is of the class `numeric`, give it another go."
msg3 <- "`string` is not a class in R. `character` is!"
msg4 <- "Nice one. Let's step it up a notch and start coercing variables!"
test_mc(correct = 4, feedback_msgs = c(msg1, msg2, msg3, msg4))
```
--- type:NormalExercise lang:r xp:100 skills:1 key:c75fe45544
## Coercion: Taming your data
As Filip explained in the video, coercion to transform your data from one type to another is perfectly possible. Next to the [`class()`](http://www.rdocumentation.org/packages/base/functions/class) function and the `is.*()` functions, you can use the `as.*()` functions to force data to change types.
Take this example:
```
var <- "3"
var_num <- as.numeric(var)
```
`var`, a character string, is converted into a numeric using [`as.numeric()`](http://www.rdocumentation.org/packages/base/functions/numeric). The resulting numeric is stored as `var_num`.
*** =instructions
- Convert `var`, a logical, to a character. Assign to resulting character string to the variable `var_char`.
- Inspect the class of `var_char` by using [`class()`](http://www.rdocumentation.org/packages/base/functions/class) on it.
*** =hints
Use the [`as.character()`](http://www.rdocumentation.org/packages/base/functions/character) function to convert `var` to a character.
*** =pre_exercise_code
```{r}
```
*** =sample_code
```{r}
# Definition of var
var <- TRUE
# Convert var to a character: var_char
# Display the class of var_char
```
*** =solution
```{r}
# Definition of var
var <- TRUE
# Convert var to a character: var_char
var_char <- as.character(var)
# Display the class of var_char
class(var_char)
```
*** =sct
```{r}
test_error()
msg <- "Do not remove or change the definition of the variable `var`."
test_object("var", undefined_msg = msg, incorrect_msg = msg)
test_function("as.character", "x",
not_called_msg = "Make sure to call the function [`as.character()`](http://www.rdocumentation.org/packages/base/functions/character) to convert `var` to a character.",
incorrect_msg = "Have you passed the correct variable to the function [`as.character()`](http://www.rdocumentation.org/packages/base/functions/character)?")
test_object("var_char")
test_function("class", "x",
not_called_msg = "Make sure to call the function <code>class()</code> to inspect the class of <code>var_char</code>.",
incorrect_msg = "Have you passed the correct variable to the function <code>class()/<code>?")
success_msg("Bellissimo!")
```