Stapleton, Adam
ORCID: 0000-0003-1233-211X
(2025)
Explainable Machine Learning for Knowledge Discovery in Environmental Science. PhD thesis, Dublin City University.
Abstract
With climate change increasingly impacting communities around the globe, understanding the Earth system has never been more vital. The many petabytes of data generated by remote sensing efforts such as satellite observations and ground-based measurement networks have made the fields of environmental science and Earth system science ripe for the application of machine learning (ML). The fundamental question that drives the research presented in this thesis is how ML models can be meaningfully integrated into environmental science workflows to enhance our understanding and modelling of complex Earth system processes. This thesis demonstrates, through three applications, the utility of machine learning for modelling complex systems in environmental science. In addition, these works demonstrate that explanations of the models enable the discovery of new relationships within these systems purely from data, with little or no prior knowledge. The first application is the partitioning of evapotranspiration into its components, evaporation and transpiration, by predicting the evaporation component from data where only the total evapotranspiration is measured. This application, which uses data from four wetlands in California, served as a preliminary study that introduces the methodological framework used in the other studies. A key finding uncovered by simple model explanations is that methane flux, a feature whose relationship with evapotranspiration is not generally examined, may contribute to further understanding of biophysical processes. The second application is the local-scale modelling of the atmospheric boundary layer height in central Amazonia. This study found that gradient-boosted ensemble models using all available features performed best.
A modified recursive feature elimination algorithm identified minimal sets of 5–7 surface measurements sufficient for accurate boundary layer height prediction, demonstrating potential for wider spatial monitoring using cost-effective sensors. The study revealed previously unrecognized variables that strongly contributed to boundary layer height predictions, such as deeper soil temperature measurements (40 cm). The final application is the large-scale modelling of gross primary productivity in primary forest in the Amazon basin, utilising clustering methods to separate similar regions and to understand the complex relationships between climatic variables and gross primary productivity in those regions. Shapley explanations enabled a deeper understanding of the relationships discovered by the machine learning models, in particular identifying vulnerable peripheral regions with increased variability in gross primary productivity under the influence of deforestation and degradation pressures.
Metadata
| Item Type: | Thesis (PhD) |
|---|---|
| Date of Award: | 3 December 2025 |
| Refereed: | No |
| Supervisor(s): | Roantree, Mark and Eichelmann, Elke |
| Uncontrolled Keywords: | Environmental science, climate change, environmental modelling, XAI, Amazon rainforest, evapotranspiration, atmospheric boundary layer, biosphere-atmosphere exchange |
| Subjects: | Computer Science > Machine learning |
| DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing; DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
| Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 License. |
| Funders: | Research Ireland through the Research Ireland Centre for Research Training in Machine Learning (18/CRT/6183) |
| ID Code: | 32168 |
| Deposited On: | 14 Apr 2026 14:16 by Adam Stapleton. Last Modified 14 Apr 2026 14:16 |
Documents
Full text available as: PDF (48MB), Creative Commons: Attribution-NonCommercial-No Derivative Works 4.0