Jaillant, Lise ORCID: 0000-0002-2680-4571 and Caputo, Annalina ORCID: 0000-0002-7144-8545 (2022) Unlocking digital archives: cross-disciplinary perspectives on AI and born-digital data. AI & Society. ISSN 0951-5666
Abstract
Co-authored by a Computer Scientist and a Digital Humanist, this article examines the challenges faced by cultural heritage
institutions in the digital age, which have led to the closure of the vast majority of born-digital archival collections. It
focuses particularly on cultural organizations such as libraries, museums and archives, used by historians, literary scholars
and other Humanities scholars. Most born-digital records held by cultural organizations are inaccessible due to privacy,
copyright, commercial and technical issues. Even when born-digital data are publicly available (as in the case of web
archives), users often need to physically travel to repositories such as the British Library or the Bibliothèque Nationale de
France to consult web pages. Given enough sample data to train on, AI, and more specifically machine learning
algorithms, can improve and ease access to digital archives by learning to perform complex human tasks. These range
from providing intelligent support for searching the archives to automating tedious and time-consuming tasks. In this
article, we focus on sensitivity review as a practical solution to unlock digital
archives that would allow archival institutions to make non-sensitive information available. This promise of greater
accessibility comes with potential pitfalls and risks: inherent errors, "black box" approaches that make the algorithm
inscrutable, and risks related to bias and to fake or partial information. Our central argument is that AI can
deliver its promise to make digital archival collections more accessible, but it also creates new challenges - particularly in
terms of ethics. In the conclusion, we stress the importance of fairness, accountability and transparency in the process of
making digital archives more accessible.
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Uncontrolled Keywords: | Digital Humanities; Born-Digital Archives; Privacy; Copyright; Sensitivity Review; Ethics |
Subjects: | Computer Science > Artificial intelligence |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > ADAPT |
Publisher: | Springer Verlag |
Official URL: | https://doi.org/10.1007/s00146-021-01367-x |
Copyright Information: | © 2022 The Authors. |
Funders: | Arts and Humanities Research Council Grant number AH/V002341/1, Irish Research Council under Grant Agreement No. IRC/ V002341/1, Science Foundation Ireland through the SFI Research Centres Programme, European Regional Development Fund (ERDF) through Grant # 13/RC/2106_P2. |
ID Code: | 26788 |
Deposited On: | 22 Mar 2022 13:10 by Annalina Caputo. Last Modified 20 Apr 2023 17:34 |
Documents
Full text available as:
PDF
Creative Commons: Attribution 4.0 (679kB)