Plot One
# Load the plotting libraries used by the final plots
library(ggplot2)
library(gridExtra)

theme_set(theme_minimal(12))

# Box plot of fare per person for each booking subclass
plot1 <- qplot(x = Bkg_Subclass, y = farePerPerson,
               data = cleared_data, geom = 'boxplot',
               color = Bkg_Subclass) +
  coord_cartesian(ylim = c(0, 15000)) +
  ggtitle('by Booking Subclass') +
  xlab('Booking Subclass') +
  ylab('Fares (in AUD)') +
  theme(legend.position = 'none')

# Bar chart of booking counts per subclass, filled by season
plot2 <- qplot(Bkg_Subclass, data = cleared_data, fill = Season) +
  ggtitle('Number of bookings by booking subclass') +
  xlab('Booking Subclass') +
  ylab('Number of bookings')

# Stack the bar chart above the box plot
grid.arrange(plot2, plot1, ncol = 1)
Description One
Passengers prefer to buy the lowest-priced tickets; however, the median net revenue of those tickets is also the lowest. Season 2 makes up the largest share of bookings in almost every booking subclass, so passengers are most likely to travel in Season 2.
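The counts, medians, and seasonal shares read off the two plots can also be checked numerically. The following is a minimal sketch in base R, assuming the column names used in the plotting code (Bkg_Subclass, farePerPerson, Season). The toy data frame and all of its values are made up for illustration and stand in for cleared_data.

```r
# Toy stand-in for cleared_data (hypothetical values, for illustration only)
toy <- data.frame(
  Bkg_Subclass  = c("V", "V", "V", "L", "L", "J"),
  farePerPerson = c(400, 500, 600, 700, 900, 5000),
  Season        = c("Season 2", "Season 2", "Season 1",
                    "Season 2", "Season 1", "Season 2")
)

# Median fare per person for each booking subclass
median_fare <- aggregate(farePerPerson ~ Bkg_Subclass, data = toy, FUN = median)

# Number of bookings per subclass
booking_counts <- table(toy$Bkg_Subclass)

# Share of each season within every subclass (each row sums to 1)
season_share <- prop.table(table(toy$Bkg_Subclass, toy$Season), margin = 1)

print(median_fare)
print(booking_counts)
print(round(season_share, 2))
```

Swapping farePerPerson for netPerPerson in the aggregate() call would give the median net revenue per subclass instead, provided that column is present in the data frame being summarised.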
Plot Two
theme_set(theme_minimal(12))

# Box plot of fare per person for each booking-creation month
plot1 <- qplot(x = month_pnr_create, y = farePerPerson,
               data = cleared_data, geom = 'boxplot',
               fill = month_pnr_create) +
  coord_cartesian(ylim = c(0, 2500)) +
  ggtitle('by month (create booking)') +
  xlab('Month (create booking)') +
  ylab('Fares (in AUD)') +
  theme(legend.position = 'none')

# Density of fare per person, one curve per booking-creation month
plot2 <- qplot(farePerPerson, data = cleared_data,
               color = month_pnr_create, geom = 'density') +
  coord_cartesian(xlim = c(0, 2000)) +
  guides(color = guide_legend(title = 'Month (create booking)', reverse = FALSE)) +
  xlab('Fare/person (AUD)') +
  ylab('Density') +
  ggtitle('Density of Fare/person (AUD) by month (create booking)')

# Stack the box plot above the density plot
grid.arrange(plot1, plot2, ncol = 1)
Description Two
These plots show the relationship between the month in which a booking was created and the fare. In peak periods the fare rises and its variance increases. The mean fare stays under AUD 1000 throughout the year, except in January, and the fare distribution is positively skewed in every month.
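The statements about the mean, the variance, and the skew can be backed by a per-month summary table. The sketch below uses base R only and assumes the column names from the plotting code (month_pnr_create, farePerPerson). The skewness helper is a hypothetical function written for this example (the third standardised moment), and the toy values are invented.

```r
# Hypothetical helper: sample skewness as the third standardised moment
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Toy stand-in for cleared_data (hypothetical values, for illustration only)
toy <- data.frame(
  month_pnr_create = factor(c("Jan", "Jan", "Jan", "Jan",
                              "Feb", "Feb", "Feb", "Feb"),
                            levels = c("Jan", "Feb")),
  farePerPerson    = c(600, 700, 800, 3000, 300, 350, 400, 1500)
)

# Split the fares by month and summarise each group
by_month <- split(toy$farePerPerson, toy$month_pnr_create)
monthly <- data.frame(
  month    = names(by_month),
  mean     = sapply(by_month, mean),
  median   = sapply(by_month, median),
  variance = sapply(by_month, var),
  skew     = sapply(by_month, skewness)
)

print(monthly)
```

A positive skew value, or a mean that sits above the median, is consistent with the long right tail visible in the density curves.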
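Plot Three
theme_set(theme_minimal(16))

# Scatter plot of fare against net revenue per person, colored by booking subclass.
# A single jittered point layer is drawn so that each booking appears only once.
ggplot(remove_NA_net,
       aes(x = netPerPerson, y = farePerPerson, color = Bkg_Subclass)) +
  geom_point(alpha = 0.5, position = 'jitter') +
  ggtitle('Fare by Net revenue and booking subclass') +
  theme(plot.title = element_text(size = 16))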
Description Three
This graph shows the relationship between fare and net revenue by booking subclass. As expected, the higher the fare, the higher the net revenue. There are also two distinct lines on the graph, which suggests that at least two formulas are used to calculate the fare from net revenue in different situations.
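One way to probe the two lines is to look at the ratio of fare to net revenue: if each line is roughly a straight line through the origin, the ratios should cluster around two values. The sketch below is only an illustration under that assumption. It uses the column names from the plot (netPerPerson, farePerPerson), while the toy data, the ratio and group columns, and the slopes 1.2 and 1.5 are all invented.

```r
# Toy stand-in for remove_NA_net: two made-up pricing rules
# (fare = 1.2 * net and fare = 1.5 * net), for illustration only
toy <- data.frame(
  netPerPerson  = c(100, 200, 300, 400, 100, 200, 300, 400),
  farePerPerson = c(120, 240, 360, 480, 150, 300, 450, 600)
)

# Ratio of fare to net revenue for every booking
toy$ratio <- toy$farePerPerson / toy$netPerPerson

# Distinct ratios after rounding; two values here means two lines
distinct_ratios <- sort(unique(round(toy$ratio, 2)))
print(distinct_ratios)

# Fit a line through the origin separately within each ratio group
toy$group <- factor(round(toy$ratio, 2))
slopes <- sapply(split(toy, toy$group), function(d) {
  coef(lm(farePerPerson ~ 0 + netPerPerson, data = d))[[1]]
})
print(slopes)
```

On real bookings the ratios would not be exact, so a histogram of the ratio column would be a more practical check than counting distinct values; two clear peaks would support the two-formula reading.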
Reflection
The data set contains booking information on almost 90 thousand transactions from around 2015. I started by understanding the individual variables in the data set and created a linear model to predict the net revenue of a ticket, and then explored interesting questions and leads as I continued to make observations on plots. Eventually, I explored the number of passengers and the net revenue and fare per passenger across many variables.

At first, I wondered why so many bookings were created in February and October; it is because tickets are cheapest in these two months. It is easy to understand that bookings are mainly in classes V/L/S/M/K, which are economy classes with the lowest median fare. There was also strong demand for travel originating from Hong Kong, which benefited from the weakness of the euro and the Australian dollar in the first half of 2015. This reflects strong demand on regional routes, particularly in economy class; economy class demand was also strong on long-haul routes. For this reason, I strongly recommend adding flights to the popular destinations over the peak period. Using larger aircraft such as the Boeing 777-300ER on the popular daily flights would also increase capacity.


