DORAS | DCU Research Repository

Extracting airline emission KPIs from sustainability reports using large language models (LLMs)

Martín-Domingo, Luis (ORCID: 0000-0003-2052-5712), Fernandez, Jaime B. (ORCID: 0000-0001-9774-3879), Efthymiou, Marina (ORCID: 0000-0001-8611-5973) and Intizar Ali, Muhammad (ORCID: 0000-0002-0674-2131) (2025) Extracting airline emission KPIs from sustainability reports using large language models (LLMs). Transportation Research Interdisciplinary Perspectives, 33, p. 101599. ISSN 2590-1982

Abstract
The extraction of environmental Key Performance Indicators (KPIs) from airline sustainability reports is essential for assessing environmental sustainability metrics and regulatory compliance within the European aviation sector. Manual extraction from extensive, unstructured documents is laborious and often inconsistent. This study systematically investigates the potential of advanced Large Language Models (LLMs), specifically GPT-4.0, o3-mini, and DeepSeek R1, to automate the extraction of emissions-related KPIs from the 2023 sustainability reports of 16 publicly traded European airline groups. Utilizing the Perplexity platform, the research contrasts manual expert extraction with automated approaches, exploring various models, prompt strategies, and data formats. Results indicate that the accuracy of LLM extraction depends significantly on prompt specificity. Attempts to extract data from unstructured documents without guidance yielded low accuracy. However, incorporating explicit KPI terms into prompts increased accuracy from below 30% to above 70%. The format of the data source was also influential, with HTML formats producing superior extraction results compared to PDFs. Despite ongoing challenges in standardizing data and extracting precise KPI metrics, the findings demonstrate that LLMs can substantially streamline environmental, social and governance (ESG) data collection when prompt engineering and source standardization are prioritized. This study represents a novel, interdisciplinary approach by combining advances in LLMs with expertise in ESG analysis within the aviation sector, offering empirical benchmarking of LLM performance in real-world regulatory contexts. Recommendations for LLM integration into ESG analysis workflows are provided, and future research directions for advancing automation in sustainability reporting are discussed.
Metadata
Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: LLMs; ESG; KPIs; GHG Emissions
Subjects: Business > Commerce; Business > Industries
DCU Faculties and Centres: DCU Faculties and Schools > DCU Business School; Research Institutes and Centres > INSIGHT Centre for Data Analytics
Publisher: Elsevier Ltd
Official URL: https://www.sciencedirect.com/science/article/pii/...
Copyright Information: Authors
ID Code: 31517
Deposited On: 10 Sep 2025 10:17 by Gordon Kennedy. Last Modified 10 Sep 2025 10:17
Documents

Full text available as:

PDF: 1-s2.0-S2590198225002787-main.pdf (2MB), Creative Commons: Attribution 4.0