Big Data in Aviation Research and Practice


Anming Zhang
University of British Columbia

April 2018

Big data is a term that describes the voluminous and complex data – both structured and unstructured – that inundate a business on a day-to-day basis. But the sheer amount of data available is less important than how organizations elect to use them. Big data should be analyzed for insights that lead to better decisions and strategic business moves.

In the aviation sector, large-scale data on, e.g., passengers, cargo, airports, routes, countries, aircraft, seat classes, and fares can be captured, stored, and processed. Subsets of these data are now increasingly available from open or semi-open sources. Analytics on these data can improve decision making in, among other areas, airline pricing (revenue management), demand forecasting, route profitability analysis and optimization, and aviation safety. While academic research on big data’s applications to this sector is rapidly emerging, the majority of the published papers appear to focus on technical/computer science aspects rather than empirical managerial usage.

In this column, I illustrate two recent big data applications that focus on policy/business strategy implications:

1. Airport catchment: The catchment area of an airport is traditionally determined either by distance or by the legally defined locale (city/county) in which the airport is situated. For passengers, airport accessibility is a measure of the cost of reaching an airport using available transportation systems (Reggiani et al., 2015). This cost depends not only on distance but also on the availability and quality of land-side transportation systems. In a recent study, Sun et al. (2017) attempted to estimate airport accessibility based on big data. Using scalable techniques for storing and indexing big data, and computing free-flow road travel time as well as public transit time from grid cells to an airport, they converted the raw data into a “product”: an estimate of the airport’s catchment area. They illustrated the method by applying it to China, which involves more than 10 million grid cells with population data.

One implication of Sun et al.’s analysis is for airport competition (Lijesen and Behrens, 2017). Effective competition for an airport may come not only from airports in the same city/county, but also from nearby airports in “neighbour cities/counties”, as judged by relevant travel times. In other words, their analysis has useful implications for how one should define a “multiple airport region” (Reggiani et al., 2015). Another important implication relates to the evaluation of airport expansion. For instance, when a new airport is constructed (and is thus added to the existing airport network), its economic benefit to the local community from increased air connectivity depends on its land-side accessibility relative to that of other airports in the multiple airport region, as well as to that of alternative transport infrastructure, such as high-speed rail stations.
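To make the idea of a travel-time-based catchment concrete, here is a minimal sketch in Python. It is not the method of Sun et al. (2017): as a loudly simplified stand-in for road and transit travel times, it assumes straight-line (great-circle) distance travelled at a constant speed. The airport codes, the speed, the time threshold, and all function names are hypothetical and chosen only for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def travel_time_min(cell, airport, speed_kmh=60.0):
    """ASSUMPTION: straight-line distance at a constant speed stands in
    for the free-flow road or transit travel time used in real studies."""
    dist = haversine_km(cell["lat"], cell["lon"], airport["lat"], airport["lon"])
    return dist / speed_kmh * 60.0


def catchment_population(cells, airports, max_minutes=120.0):
    """Assign each grid cell to its fastest-to-reach airport and sum the
    population of the cells that lie within max_minutes of that airport."""
    totals = {a["code"]: 0 for a in airports}
    for cell in cells:
        times = {a["code"]: travel_time_min(cell, a) for a in airports}
        nearest = min(times, key=times.get)
        if times[nearest] <= max_minutes:
            totals[nearest] += cell["population"]
    return totals


# Hypothetical airports and population grid cells (illustrative values only).
airports = [
    {"code": "AAA", "lat": 0.0, "lon": 0.0},
    {"code": "BBB", "lat": 0.0, "lon": 2.0},
]
cells = [
    {"lat": 0.0, "lon": 0.1, "population": 1000},
    {"lat": 0.0, "lon": 1.9, "population": 2000},
    {"lat": 0.0, "lon": 1.2, "population": 500},
    {"lat": 10.0, "lon": 1.0, "population": 700},  # too far from both airports
]
print(catchment_population(cells, airports))
```

In this toy setting a cell belongs to an airport's catchment regardless of city or county boundaries, which is the point made above about competition from airports in neighbouring jurisdictions. Replacing travel_time_min with observed road or transit times would move the sketch closer to the kind of analysis described.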

2. Air connectivity: While the land accessibility discussed above concerns the accessibility of an airport to passengers outside the airport network, air accessibility concerns the connectedness of an airport to other airports within the network. A recent study by Zhu et al. (2017) examined the air connectivity of intercity passenger transportation in China. It used data from several open or semi-open sources. For instance, the data on flight frequencies were obtained from the website of FEEYO, the largest online platform specializing in air passenger service and flight data analysis in China. The flight information also contains stops and on-time performance rates. The authors analysed the data and measured connectivity using a dynamic weighted model, which has its roots in the NetScan model first developed by Veldhuis (1997) for airports. With the big data, Zhu et al. (2017) were able to extend the model beyond air connectivity to city connectivity, which includes both air and rail connectivity. They considered both the quality and the quantity of the connections of the two transport modes (in a follow-up study, the authors further allow for inter-modal connections), and obtained several results that have important policy implications.
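The general idea of weighting connections by both quantity and quality can be sketched as follows. This is only an illustration, not the dynamic weighted model of Zhu et al. (2017) or the full NetScan model: it assumes that each connection carries a weekly frequency and a quality weight between 0 and 1, and that a city's connectivity is the sum of frequency times quality over its air and rail connections. The class, field names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    origin: str
    destination: str
    mode: str              # "air" or "rail"
    weekly_frequency: int  # quantity of service
    quality: float         # ASSUMED scale: 0 (worst) to 1 (e.g. nonstop, punctual)


def city_connectivity(connections):
    """Sum frequency * quality over all connections leaving each city,
    pooling air and rail services into a single score."""
    scores = {}
    for c in connections:
        if not 0.0 <= c.quality <= 1.0:
            raise ValueError(f"quality must be in [0, 1], got {c.quality}")
        scores[c.origin] = scores.get(c.origin, 0.0) + c.weekly_frequency * c.quality
    return scores


# Hypothetical services (illustrative values only).
services = [
    Connection("CityA", "CityB", "air", 14, 1.0),   # nonstop flights
    Connection("CityA", "CityC", "air", 7, 0.5),    # one-stop flights, discounted
    Connection("CityA", "CityB", "rail", 20, 0.8),  # high-speed rail
    Connection("CityB", "CityA", "air", 14, 1.0),
]
print(city_connectivity(services))
```

Under these assumptions, a city with many low-quality connections can score below a city with fewer high-quality ones, which is why quantity alone is a poor measure of connectivity.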

These studies demonstrate that emerging data science/technologies – i.e., (open) big data in conjunction with well-designed algorithms – empower researchers to perform accessibility analysis more efficiently and at an unprecedented level of accuracy and scale.

They have also opened numerous avenues for future research. Future studies could investigate accessibility beyond free-flow conditions, for instance by taking into account real travel times as recorded in historical traffic reports or even by online services. Another potential avenue is to combine the above applications into a more complete big data-based accessibility/connectivity measure.

Lijesen, M. and C. Behrens (2017), “The spatial scope of airline competition,” Transportation Research Part E: Logistics and Transportation Review, 99, 1-13.

Reggiani, A., P. Nijkamp and D. Lanzi (2015), “Transport resilience and vulnerability: The role of connectivity,” Transportation Research Part A: Policy and Practice, 81, 4-15.

Sun, X., S. Wandelt and A. Zhang (2017), “Comparative accessibility of Chinese airports and high-speed railway stations: A high-resolution, yet scalable, framework based on Open Data,” available at SSRN-id3033166.

Veldhuis, J. (1997), “The competitive position of airline networks,” Journal of Air Transport Management, 3(4), 181-188.

Zhu, Z., A. Zhang and Y. Zhang (2017), “Connectivity of intercity passenger transportation in China: A multi-modal and network approach,” Journal of Transport Geography, in press.
