*: Include bikeshare example database in manual #596
Changes from 13 commits
c307e1d
5ab1ad1
2f7e082
59c99b7
4aa33e8
7e95f32
4c42f25
cd58abe
3dbc96d
2db74db
21b43e2
080dfb1
4778a70
4fde9e6
e914564
6d09db8
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
--- | ||
title: Bikeshare Example Database | ||
summary: Install the Bikeshare Example Database | ||
--- | ||
|
||
# Bikeshare Example Database | ||
|
||
Examples used in the TiDB manual use [System Data](https://www.capitalbikeshare.com/system-data) from | ||
Capital Bikeshare, released under the [Capital Bikeshare Data License Agreement](https://www.capitalbikeshare.com/data-license-agreement). | ||
|
||
## Downloading all data files | ||
|
||
The system data is available [for download in .zip files](https://s3.amazonaws.com/capitalbikeshare-data/index.html) organized per year. Downloading and extracting all files requires approximately 3GB of disk space. The following bash script downloads all files for the years 2010-2017: | ||
|
||
``` | ||
Review comment: Pls add
||
|
||
Review comment: Remove this blank line.
||
mkdir -p bikeshare-data && cd bikeshare-data | ||
|
||
for YEAR in 2010 2011 2012 2013 2014 2015 2016 2017; do | ||
wget https://s3.amazonaws.com/capitalbikeshare-data/${YEAR}-capitalbikeshare-tripdata.zip | ||
unzip ${YEAR}-capitalbikeshare-tripdata.zip | ||
done; | ||
|
||
Review comment: Remove this blank line, too.
||
``` | ||
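A more defensive variant of the download loop above could look like the following. This is only a sketch: the helper names `tripdata_url` and `download_tripdata` are hypothetical, and it assumes every year in the range follows the same URL pattern as the script above.

```shell
# Hypothetical helper: print the download URL for one year's archive,
# following the URL pattern used in the loop above.
tripdata_url() {
  echo "https://s3.amazonaws.com/capitalbikeshare-data/${1}-capitalbikeshare-tripdata.zip"
}

# Hypothetical helper: download and extract every year in the given range.
# Archives that already exist are not downloaded again, and the function
# stops at the first failure. A partially downloaded archive should be
# deleted before rerunning.
download_tripdata() {
  local first="$1" last="$2" year zip
  for year in $(seq "$first" "$last"); do
    zip="${year}-capitalbikeshare-tripdata.zip"
    [ -f "$zip" ] || wget "$(tripdata_url "$year")" || return 1
    unzip -o "$zip" || return 1
  done
}

# Usage (not run here; needs network access, wget and unzip):
#   mkdir -p bikeshare-data && cd bikeshare-data && download_tripdata 2010 2017
```

Skipping existing archives makes the script safe to rerun after an interrupted download of a later year.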
|
||
## Load data into TiDB | ||
Review comment: MySQL or TiDB?
Reply: I agree it should be TiDB - I've changed it.
||
|
||
The system data can be imported to TiDB using the following schema: | ||
|
||
``` | ||
CREATE DATABASE bikeshare; | ||
USE bikeshare; | ||
|
||
CREATE TABLE trips ( | ||
trip_id bigint NOT NULL PRIMARY KEY auto_increment, | ||
duration integer not null, | ||
start_date datetime, | ||
end_date datetime, | ||
start_station_number integer, | ||
start_station varchar(255), | ||
end_station_number integer, | ||
end_station varchar(255), | ||
bike_number varchar(255), | ||
member_type varchar(255) | ||
); | ||
``` | ||
You can import files individually using the example `LOAD DATA` command here, or import all files using the bash loop below:
Review comment: indivudally -> individually
||
|
||
``` | ||
LOAD DATA LOCAL INFILE '2017Q1-capitalbikeshare-tripdata.csv' INTO TABLE trips | ||
FIELDS TERMINATED BY ',' ENCLOSED BY '"' | ||
LINES TERMINATED BY '\r\n' | ||
IGNORE 1 LINES | ||
(duration, start_date, end_date, start_station_number, start_station, | ||
end_station_number, end_station, bike_number, member_type); | ||
``` | ||
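The statement above can also be generated per file from the shell. The following is a minimal sketch: the helper name `load_data_sql` is hypothetical, and it assumes file names contain no single quotes.

```shell
# Hypothetical helper: print the LOAD DATA statement for one CSV file.
# The heredoc keeps the quotes and the \r\n terminator literal, so no extra
# shell escaping is needed. Assumes the file name contains no single quotes.
load_data_sql() {
  cat <<EOF
LOAD DATA LOCAL INFILE '$1' INTO TABLE trips
FIELDS TERMINATED BY ',' ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(duration, start_date, end_date, start_station_number, start_station,
end_station_number, end_station, bike_number, member_type);
EOF
}

# Usage (not run here; requires a reachable TiDB server):
#   load_data_sql 2017Q1-capitalbikeshare-tripdata.csv | mysql --local-infile=1 bikeshare
```

Depending on the client's defaults, the `--local-infile=1` option may be needed for `LOAD DATA LOCAL` to be accepted.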
|
||
### Import all files | ||
|
||
To import all `*.csv` files into TiDB in a bash loop: | ||
|
||
``` | ||
for FILE in *.csv; do | ||
echo "== $FILE ==" | ||
mysql bikeshare -e "LOAD DATA LOCAL INFILE '${FILE}' INTO TABLE trips FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);" | ||
done; | ||
``` |
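After the import, one way to sanity-check the result is to compare the number of data rows in the CSV files with the row count of the `trips` table. The helper below is a sketch with a hypothetical name (`count_csv_rows`). It assumes each file has exactly one header line, ends with a newline, and has no line breaks inside quoted fields.

```shell
# Hypothetical helper: count data rows (lines minus the header) in the CSV
# files passed as arguments. Assumes one header line per file, a trailing
# newline, and no line breaks inside quoted fields.
count_csv_rows() {
  local total=0 rows file
  for file in "$@"; do
    rows=$(tail -n +2 "$file" | wc -l)
    total=$((total + rows))
  done
  echo "$total"
}

# Usage (not run here; requires a reachable TiDB server):
#   count_csv_rows *.csv
#   mysql bikeshare -N -e "SELECT COUNT(*) FROM trips;"
```

If the two numbers differ, the output of the import loop for each file is a reasonable place to look for skipped rows or warnings.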
Review comment: Please delete the extra spaces both before and after the sentence "Downloading and extracting all files requires approximately 3GB of disk space." Leave just the one necessary space.