Just as with R, you can run Python in .Rmd files! Here we import our libraries:
import pandas as pd
Let’s use pd.read_csv to load the Titanic data and view the top of the data set:
titanic = pd.read_csv("data/titanic.csv")
titanic.head()
## pclass survived ... body home.dest
## 0 1 1 ... NaN St Louis, MO
## 1 1 1 ... NaN Montreal, PQ / Chesterville, ON
## 2 1 0 ... NaN Montreal, PQ / Chesterville, ON
## 3 1 0 ... 135.0 Montreal, PQ / Chesterville, ON
## 4 1 0 ... NaN Montreal, PQ / Chesterville, ON
##
## [5 rows x 14 columns]
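The printed head elides most of the 14 columns with "...". A minimal sketch of how one might list every column name and widen the display follows. The `demo` frame is a hypothetical stand-in built from the few values visible in the output above, so the snippet runs without data/titanic.csv. With the real data you would call the same methods on `titanic`.

```python
import pandas as pd

# Hypothetical stand-in: only the columns and first three rows visible in the output above.
demo = pd.DataFrame({
    "pclass": [1, 1, 1],
    "survived": [1, 1, 0],
    "body": [float("nan"), float("nan"), float("nan")],
    "home.dest": [
        "St Louis, MO",
        "Montreal, PQ / Chesterville, ON",
        "Montreal, PQ / Chesterville, ON",
    ],
})

# List every column name, including any that head() would hide behind "...".
column_names = list(demo.columns)
print(column_names)

# (number of rows, number of columns)
print(demo.shape)

# Lift the column limit for this one display only, leaving global options untouched.
with pd.option_context("display.max_columns", None):
    print(demo.head())
```

Using `pd.option_context` rather than `pd.set_option` keeps the wider display local to one print, which may be preferable in a document with many chunks.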
Here I pass the Python data frame to ggplot2 in an R chunk using the py$data_frame syntax (in this case, py$titanic) to make my scatterplot (what can I say, I love ggplot2 …).
ggplot2::ggplot(py$titanic, aes(x = age, y = fare)) +
geom_point()
## Warning: Removed 264 rows containing missing values (geom_point).
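The warning says that rows with missing values were dropped from the plot. On the Python side, a rough way to anticipate that count is to count rows where either plotted column is missing. The sketch below assumes that a point is dropped when `age` or `fare` is NaN. The `demo` frame and its values are hypothetical, used only so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical toy data; the real values live in data/titanic.csv.
demo = pd.DataFrame({
    "age": [29.0, float("nan"), 2.0, 30.0],
    "fare": [211.34, 151.55, float("nan"), 151.55],
})

# A point needs both coordinates, so flag rows where either one is missing.
missing_either = demo[["age", "fare"]].isna().any(axis=1)

# Number of rows that could not be drawn as points.
n_dropped = int(missing_either.sum())
print(n_dropped)
```

Running the same two lines on `titanic` should give a number comparable to the one in the warning, if the assumption about how points are dropped holds.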
Here we find the destination of the first passenger:
first_dest = titanic["home.dest"][0]
The destination of the first passenger is St Louis, MO.
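One detail about the lookup above: `titanic["home.dest"][0]` selects by index label, which matches the first row only because `read_csv` gives the frame a default 0, 1, 2, ... index. A small sketch of the difference, using a hypothetical two-row `demo` frame with the destinations shown earlier:

```python
import pandas as pd

# Hypothetical two-row frame reusing destinations from the output above.
demo = pd.DataFrame(
    {"home.dest": ["St Louis, MO", "Montreal, PQ / Chesterville, ON"]}
)

# With the default index, label 0 and position 0 refer to the same row.
by_label = demo["home.dest"][0]
by_position = demo["home.dest"].iloc[0]

# After sorting, the row labelled 0 is no longer first; .iloc[0] still means "first row".
reordered = demo.sort_values("home.dest")
first_after_sort = reordered["home.dest"].iloc[0]
label_zero_after_sort = reordered["home.dest"].loc[0]

print(by_label, "|", by_position)
print(first_after_sort, "|", label_zero_after_sort)
```

If the data frame might be filtered or sorted before the lookup, `.iloc[0]` is the safer way to ask for the first passenger.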