Analyze a large real-world data set of flight records and use the advanced features of ggplot2 to produce graphs of this data.
In previous modules, the data used in your projects came from small built-in R data sets or was generated from probability distributions. We are now ready to use real-world data rather than manufactured data. The nycflights13 package contains data collected on domestic flights out of New York City in 2013. Its flights table contains 336,776 records, each with 19 variables, which makes it far larger than any data set used so far.
Analyze the nycflights13 data to determine which day of the week has the longest average delay time. Do the delay times vary by airport? Do they vary by day of the week?
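One possible starting point for this analysis is sketched below. It is not the required solution, and it rests on several assumptions that the assignment does not state: "delay" is taken to mean departure delay (the dep_delay column, in minutes), "airport" is taken to mean the origin airport (the origin column), rows with no recorded departure delay are dropped, and the lubridate package is used to derive the weekday from the year, month, and day columns.

```r
library(nycflights13)  # provides the flights table
library(dplyr)
library(lubridate)
library(ggplot2)

# Assumption: "delay" means departure delay (dep_delay, in minutes).
# Rows with a missing dep_delay are dropped before averaging.
flights_wd <- flights %>%
  filter(!is.na(dep_delay)) %>%
  mutate(weekday = wday(make_date(year, month, day), label = TRUE))

# Mean delay for each day of the week, longest first
by_weekday <- flights_wd %>%
  group_by(weekday) %>%
  summarise(mean_delay = mean(dep_delay)) %>%
  arrange(desc(mean_delay))

# Mean delay for each origin airport and day of the week
by_airport_weekday <- flights_wd %>%
  group_by(origin, weekday) %>%
  summarise(mean_delay = mean(dep_delay), .groups = "drop")

# One panel per origin airport, one bar per weekday
p <- ggplot(by_airport_weekday, aes(x = weekday, y = mean_delay)) +
  geom_col() +
  facet_wrap(~ origin) +
  labs(x = "Day of week", y = "Mean departure delay (minutes)")

print(by_weekday)
print(p)
```

Swapping dep_delay for arr_delay would answer the same questions for arrival delays. Which measure to use is a choice you should state in your write-up.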
If you have not done so already, use the directions given in Chapter 13 of this module’s readings to install the nycflights13 data set and the R tidyverse package.
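If the reading is not at hand, a typical installation from CRAN looks something like the sketch below. It assumes a default CRAN mirror is configured, and the directions in the reading take precedence wherever they differ.

```r
# Install each package only if it is not already available
for (pkg in c("tidyverse", "nycflights13")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}

library(tidyverse)
library(nycflights13)

# Quick check: the flights table should have 336,776 rows and 19 columns
dim(flights)
```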