Unable to Load Airline Data using RSQLite #63

Open
dhjessicajeong opened this issue Feb 12, 2020 · 4 comments

@dhjessicajeong

I'm trying to load the airlines data and am unable to load the full data -- I'm guessing the link in the etl_extract.R file is broken?

I've demonstrated below what the resulting airlines dataset looks like after loading it using the airlines and etl packages. It manages to successfully get the carrier of the flight, but gives no information on the actual flight (i.e., flight delays, departure and arrival times, cancellation status, etc.).

```r
reprex::reprex({
  suppressPackageStartupMessages(library(mosaic))
  suppressPackageStartupMessages(library(airlines))
  suppressPackageStartupMessages(library(RSQLite))
  airlines <- etl("airlines")
  airlines %>%
    etl_create(years = 2017, months = 12) %>%
    etl_cleanup()

  class(airlines)
  summary(airlines)
  src_tbls(airlines)
})
```
According to the vignette for airlines, after running the etl_create() function, the resulting database should include four tables: planes, airports, carriers, and flights. (Linked here).

However, the reprex example above creates a database with only one table, carriers. Looking into this a little closer, I noticed that the link used in the etl_extract.R file does not work. Are there any suggestions for loading the data or troubleshooting this error?

Thanks in advance for any guidance!
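
For anyone else troubleshooting this, a quick way to see which of the four tables named in the vignette actually made it into the database is to compare them against what the connection reports. The snippet below is only a sketch: it uses an in-memory SQLite database with a single made-up `carriers` row as a stand-in for the real file, and the `expected` vector is taken from the vignette's list of tables.

```r
library(DBI)

# Stand-in for the database that etl("airlines") creates (assumption: an
# in-memory SQLite database holding only a `carriers` table, as reported above).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "carriers", data.frame(carrier = "XX", name = "Example Air"))

# Tables the vignette says should exist after etl_create().
expected <- c("airports", "carriers", "flights", "planes")

present <- dbListTables(con)
missing_tables <- setdiff(expected, present)

# Row count for each table that does exist, to spot empty tables.
row_counts <- vapply(present, function(tbl) {
  sql <- paste0("SELECT COUNT(*) AS n FROM ", dbQuoteIdentifier(con, tbl))
  as.numeric(dbGetQuery(con, sql)$n)
}, numeric(1))

dbDisconnect(con)

print(missing_tables)
print(row_counts)
```

To run the same check against the real data, replace `":memory:"` with the path to the `.sqlite3` file that `etl()` prints when it creates the database, and drop the `dbWriteTable()` line.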

@nicholasjhorton
Collaborator

When I try to run the reprex I get the same output (and a number of warnings and errors).

airlines <- etl("airlines")
No database was specified so I created one for you at:
/var/folders/1x/rkngly0d3lzczzmqdt5zvccm0000gn/T//RtmpmWb7FT/filec9945119ecce.sqlite3
airlines %>%
  etl_create(years = 2017, months = 12) %>%
  etl_cleanup()
Could not find schema initialization script
Parsed with column specification:
cols(
  Code = col_character(),
  Description = col_character()
)
Parsed with column specification:
cols(
  X1 = col_double(),
  X2 = col_character(),
  X3 = col_character(),
  X4 = col_character(),
  X5 = col_character(),
  X6 = col_character(),
  X7 = col_double(),
  X8 = col_double(),
  X9 = col_double(),
  X10 = col_double(),
  X11 = col_character(),
  X12 = col_character(),
  X13 = col_character(),
  X14 = col_character()
)
Warning: 353 parsing failures.
row col expected actual file
6982 X10 a double \N '/private/var/folders/1x/rkngly0d3lzczzmqdt5zvccm0000gn/T/RtmpmWb7FT/raw/airports.dat'
6983 X10 a double \N '/private/var/folders/1x/rkngly0d3lzczzmqdt5zvccm0000gn/T/RtmpmWb7FT/raw/airports.dat'
6984 X10 a double \N '/private/var/folders/1x/rkngly0d3lzczzmqdt5zvccm0000gn/T/RtmpmWb7FT/raw/airports.dat'
6985 X10 a double \N '/private/var/folders/1x/rkngly0d3lzczzmqdt5zvccm0000gn/T/RtmpmWb7FT/raw/airports.dat'
6986 X10 a double \N '/private/var/folders/1x/rkngly0d3lzczzmqdt5zvccm0000gn/T/RtmpmWb7FT/raw/airports.dat'
.... ... ........ ...... ......................................................................................
See problems(...) for more details.

Error: Columns 13, 14 cannot have NA as name

src_tbls(airlines)
[1] "carriers"
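
The parsing failures above all show `\N` where a double was expected in column X10 of airports.dat. One possible way to silence them, assuming the file is read with readr, is to declare `\N` as a missing-value marker through the `na` argument. The sketch below uses a made-up two-row file with hypothetical column names, not the package's actual extraction code, and I have not confirmed that it has any bearing on the separate "Columns 13, 14 cannot have NA as name" error.

```r
library(readr)

# Hypothetical miniature of airports.dat: no header row, `\N` marks a missing value.
path <- tempfile(fileext = ".dat")
writeLines(c(
  '1,"Airport A",5282,10',
  '2,"Airport B",\\N,\\N'
), path)

airports <- read_csv(
  path,
  col_names = c("id", "name", "altitude", "timezone"),  # illustrative names
  col_types = cols(
    id = col_double(),
    name = col_character(),
    altitude = col_double(),
    timezone = col_double()
  ),
  # Treat the literal two-character string \N as NA, alongside readr's defaults.
  na = c("", "NA", "\\N")
)

print(airports)
```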

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] RSQLite_2.2.0 airlines_0.2.2.9015 etl_0.3.8.9000 mosaic_1.5.0.9001 Matrix_1.2-18
[6] mosaicData_0.17.0 ggformula_0.9.3 ggstance_0.3.3 ggplot2_3.2.1 lattice_0.20-38
[11] dplyr_0.8.4

loaded via a namespace (and not attached):
[1] nlme_3.1-144 fs_1.3.1 usethis_1.5.1 bit64_0.9-7 lubridate_1.7.4
[6] devtools_2.2.1 httr_1.4.1 rprojroot_1.3-2 tools_3.6.0 backports_1.1.5
[11] R6_2.4.1 DBI_1.1.0 lazyeval_0.2.2 colorspace_1.4-1 withr_2.1.2
[16] tidyselect_1.0.0 gridExtra_2.3 prettyunits_1.1.1 processx_3.4.1 leaflet_2.0.3
[21] bit_1.1-15.1 curl_4.3 compiler_3.6.0 cli_2.0.1 rvest_0.3.5
[26] xml2_1.2.2 desc_1.2.0 ggdendro_0.1-20 mosaicCore_0.6.0 scales_1.1.0
[31] readr_1.3.1 callr_3.4.2 stringr_1.4.0 digest_0.6.23 pkgconfig_2.0.3
[36] htmltools_0.4.0 sessioninfo_1.1.1 dbplyr_1.4.2 fastmap_1.0.1 htmlwidgets_1.5.1
[41] rlang_0.4.4 rstudioapi_0.11 shiny_1.4.0 farver_2.0.3 generics_0.0.2
[46] crosstalk_1.0.0 magrittr_1.5 Rcpp_1.0.3 munsell_0.5.0 fansi_0.4.1
[51] lifecycle_0.1.0 stringi_1.4.5 yaml_2.2.1 MASS_7.3-51.5 pkgbuild_1.0.6
[56] blob_1.2.1 grid_3.6.0 promises_1.1.0 ggrepel_0.8.1 crayon_1.3.4
[61] splines_3.6.0 hms_0.5.3 ps_1.3.0 pillar_1.4.3 pkgload_1.0.2
[66] glue_1.3.1.9000 downloader_0.4 remotes_2.1.0 vctrs_0.2.2 tweenr_1.0.1
[71] httpuv_1.5.2 testthat_2.3.1 gtable_0.3.0 purrr_0.3.3 polyclip_1.10-0
[76] tidyr_1.0.2 assertthat_0.2.1 ggforce_0.3.1 mime_0.9 xtable_1.8-4
[81] broom_0.5.4 later_1.0.0 tibble_2.1.3 memoise_1.1.0 ellipsis_0.3.0

@mrouhana22

Are there any updates on this front? I am also unable to load the full data as the link has expired.

@nicholasjhorton
Collaborator

@beanumber do you know if there is an alternative source for the flight data? Or is the package no longer effectively usable?

@nicholasjhorton
Collaborator

Is this the same link that needs updating (as for nycflights13)?

tidyverse#50
