The 28th International Conference on Theory and Practice of Digital Libraries

TPDL 2024

LJUBLJANA, SLOVENIA
SEPTEMBER 24 - 27



TPDL 2024 is an in-person event

The IMPACT Workshop is free for all participants, but registration for the conference is required.

About TPDL

TPDL is an international forum focused on digital libraries and their associated technical, practical, and social issues. The conference encompasses the many meanings of the term “digital libraries,” embracing the whole spectrum of the LAM (Library, Archive, and Museum) community; operational information systems with all manner of digital content; new means of selecting, collecting, organizing, and distributing digital content; and theoretical models of information media, including document genres and electronic publishing.

In 2024, TPDL is expanding its scope to prominently include Document Analysis/Recognition and Information Retrieval, acknowledging the vital role of those research areas in the creation (by means of digitization and information extraction from heterogeneous sources), access, discovery, and dissemination of digital content. This includes exploring innovative approaches to document image analysis and recognition, search algorithms, data retrieval, user engagement, and personalized content delivery within digital libraries, making these two areas central themes for this year’s conference.


Participants. Representatives from academia, cultural heritage institutions, government, industry, research communities, research infrastructures, and others are invited to participate in this annual conference. The conference draws from various research areas including computer science, information science, data science, librarianship, archival science and practice, museum studies and practice, technology, social sciences, cultural heritage, digital humanities, and the scientific communities.

Location. TPDL 2024 is hosted by the National and University Library of Slovenia (NUK) and will take place at the Best Western Premier Hotel Slon, Ljubljana, from 24 to 27 September 2024. This is an in-person event. This choice does not exclude the possibility of following talks online, but authors of accepted papers are strongly encouraged to come and present in person. We aim to encourage discussion formally after a paper presentation and informally during social events and coffee breaks.

A selection of the best papers will be invited to submit an extended version to the International Journal of Digital Libraries.

This event is supported by the Europeana Research Community, Mikrografija, and the Ministry of Culture of Slovenia, and is held in cooperation with the Coalition for Networked Information.

Keynotes

Christine L. Borgman

Distinguished Research Professor and Presidential Chair in Information Studies, Emerita
Department of Information Studies, University of California, Los Angeles
https://christineborgman.info/ 
christine.borgman@ucla.edu

Title: Libraries, Digital Libraries, and Data: 30 years of Development in Central Europe

Abstract: The late 1980s and early 1990s were turbulent times in Central Europe, as the Berlin Wall and the Kremlin fell, and war erupted in Yugoslavia. Lifting the ‘Iron Curtain’ revealed crumbling infrastructure, both physical and technological. Inside most of the elegant national, university and public library buildings of Central Europe were card catalogs and minimal information technology. The few online catalogs were based on local technologies and served local communities. Digital library development in Central Europe was most advanced in the former Yugoslavia, and of these states, Slovenia was most advanced in cataloging and networking. Thirty years hence, as Slovenia celebrates the 250th anniversary of the NUK, this is an opportune moment to reflect on the evolution of digital libraries. In the early 1990s, concerns focused on technical standards such as MARC and Unicode, shared catalogs, implementing modern systems, networking, developing local knowledge and expertise, and retrofitting old buildings. Merging, migrating, and sustaining access to bibliographic records were core challenges. Concerns of today focus on distributed access to digitized and born-digital resources, interoperability, and open access to knowledge. Creating, managing, exploiting, and sustaining access to research data and to cultural heritage materials are among the challenges ahead. This talk will explore lessons learned from 1990s work to automate libraries in Central and Eastern Europe that inform theory and practice in digital libraries for the 21st century.

Bio: Christine L. Borgman conducts research in scientific data practices and information policy. Her publications in information studies, computer science, communication, and law include more than 250 journal articles, scholarly papers, and three award-winning books from MIT Press: Big Data, Little Data, No Data: Scholarship in the Networked World (2015); Scholarship in the Digital Age: Information, Infrastructure, and the Internet (2007); and From Gutenberg to the Global Information Infrastructure: Access to Information in a Networked World (2000). For most of the 1990s, Prof. Borgman was engaged in digital library research, policy, and practice in Central and Eastern Europe. She was a Fulbright Scholar in Hungary, conducted research for the World Bank and other agencies, and served as a Board Member of the Open Society Institute’s Regional Library Program. A Fellow of the American Association for the Advancement of Science and the Association for Computing Machinery, she has held visiting posts at Oxford, Harvard, and several European institutions. Professor Borgman is a member of the Library of Congress Scholars Council and the Board of Directors of the Electronic Privacy Information Center. Her honors and awards include the Paul Evan Peters Award from the Coalition for Networked Information, Association for Research Libraries, and EDUCAUSE; Award of Merit and the Research in Information Science Award, both from the Association for Information Science and Technology; and a Legacy Laureate of the University of Pittsburgh.

Maarten de Rijke

University of Amsterdam, Amsterdam, The Netherlands

m.derijke@uva.nl

Title: Searching for Climate Impact

Abstract: Climate change is a far-reaching, global phenomenon that will impact many aspects of our society. The evidence base for observed climate impacts is expanding, and the wider climate literature is growing exponentially. Systematic reviews and systematic maps offer structured ways to collectively identify and describe this evidence while maintaining transparency, attempting to ensure comprehensiveness and reduce bias. The IPCC Working Group II (WGII) assesses the impacts of climate change, from a world-wide to a regional view of ecosystems and biodiversity, and of humans and their diverse societies, cultures and settlements.

In the talk I will start by explaining what the information need is that the IPCC WGII has. I will then describe where we are with the main methodology that we have for addressing this need, systematic reviews. I will discuss what the available resources are, which include technologies and repositories that might be put to work in this setting. Finally, I will highlight some of the challenges that come with putting these technologies to work for IPCC assessments.

Bio: Maarten de Rijke is a Distinguished University Professor of Artificial Intelligence and Information Retrieval at the University of Amsterdam. His research is focused on designing and evaluating trustworthy technology to connect people to information, particularly search engines, recommender systems, and conversational assistants. He is also a co-founder and the scientific director of the Innovation Center for Artificial Intelligence (ICAI), a national collaboration between academic, industrial, governmental, and societal stakeholders aimed at talent development, research, and impact in AI.

Maud Ehrmann

Digital Humanities Laboratory, École Polytechnique Fédérale de Lausanne (EPFL)

https://people.epfl.ch/maud.ehrmann
maud.ehrmann@epfl.ch

Title: Media Archives Across Borders – The Impresso Projects

Abstract: The availability of newspaper and radio archives in machine-readable formats has improved preservation and accessibility, and has opened up new opportunities for automatic processing and exploration. In the case of newspapers in particular, text and image processing techniques are now being used to enrich collections with semantic annotations, enabling deeper content exploration. Despite these advances, current digital portals still fall short of meeting the needs of historical research. Exploration frameworks remain fragmented, confined to digital archive silos with country-based institutional portals, and digital media silos, where enrichments and exploration capabilities are typically limited to a single language and media type. Moreover, these portals often offer only passive exploration of static collections, whereas historical research requires iterative comparison and association of multiple objects of study. As digital tools increasingly shape all phases of historical research, historians are also calling for new methods and tools to critically analyse data, tools, and interfaces.

Impresso - Media Monitoring of the Past is an interdisciplinary research project that aims to pioneer new approaches to exploring media archives for historical research. In its first phase (2017-2020), the project developed a scalable infrastructure for Swiss and Luxembourg newspapers, featuring a powerful search interface that helped popularise the use of text mining-based enrichment for the retrieval and exploration of newspaper articles - now almost a standard practice. The second phase, starting in 2023, broadens the scope and envisions a comprehensive connection between media archives, aiming to enable the joint exploration of historical newspaper and radio content across temporal, linguistic, and national boundaries in order to support data-driven historical research in transmedia and transnational perspectives.

This talk will introduce Impresso 2 and review the evolution from the first to the second project. We will discuss the specific challenges to connecting newspaper and radio from legal, processing, historical, and design perspectives, our efforts to adapt text mining and exploration tools to historical material derived from different modalities, and our approach to conducting comparative and data-driven historical research using semantically enriched sources, accessible through both graphical and API-based interfaces.

Bio: Maud Ehrmann is a research scientist and lecturer at the EPFL Digital Humanities Laboratory in Lausanne. She holds a PhD in Computational Linguistics from the University of Paris 7 Denis Diderot, and her research interests span natural language processing and digital humanities, with a special focus on historical document processing, information extraction and knowledge representation. With backgrounds in both NLP and the humanities, she particularly enjoys working in interdisciplinary contexts and often acts as an intermediary between computer scientists, humanities scholars, engineers and representatives of cultural heritage institutions. In recent years, she has focused on content mining of historical newspapers with, among others, the Impresso - Media Monitoring of the Past projects and the HIPE evaluation campaigns.

Workshop: the IMPACT Workshop

From Theory to Practice: Integrating State-of-the-art Digitization Research Into Day-to-day Digitization Practices and Operations

Audience. International digitization community, which IMPACT inclusively defines as anyone who self-identifies through personal interest, professional practice or research expertise in digitization of historical (textual) materials.

Program of the Workshop, September 24

09:00 - 09:30
Registration
09:30 - 09:40
Welcome and Introduction
09:40 - 10:00
Sustaining and Sharing Digitization Knowledge: An Impact White Paper
10:00 - 10:45
From Theory to Practice: Responding to the White Paper Challenges
How to create a workflow in the SSH Open Marketplace? 
A Collections as Data Workflow
10:45 - 11:15
Coffee
11:15 - 11:45
Developing, optimising, and sustaining flexible infrastructures for digitisation
Towards a joined up Digitisation Workflow at the British Library
Andrew Longworth, Digitisation Workflow Manager, The British Library, UK
11:45 - 12:45
Creation of a general digitisation workflow in the SSH Open Marketplace
Step 1: Discussion of the digitisation workflow at the British Library & challenges identified in the Impact White Paper
  • What are the most interesting/challenging steps?
  • Which are the most relevant / important for you and your organisation?
12:45 - 13:00
Wrap-up
13:00 - 14:00
Lunch
14:00 - 15:30
Creation of a general digitisation workflow in the SSH Open Marketplace
Step 2: What is already available in the Marketplace?
  • Finding items related to the workflow steps
  • Identify what is missing
  • Adding new/updating items in the Marketplace
15:30 - 16:00
Coffee
16:00 - 17:30
Creation of a general digitisation workflow in the SSH Open Marketplace
Step 3: Preparing an outline for the digitisation workflow
  • What could (some) steps of the workflow be?
  • Using/assigning related items identified in step 2 to each step
  • Adding new/updating items in the Marketplace
17:30 - 17:45
Wrap-up

Existing Workflows, Tools and Datasets in the SSH Open Marketplace

Ideas Extracted from the IMPACT White Paper

Access the full white paper here: Sharing and Sustaining Digitisation Knowledge: an IMPACT White Paper

  • From ‘legacy digitization’ to ‘state-of-the-art’: bridging the gap.
  • Documenting digitization workflows.
  • Establishing a common interdisciplinary research agenda for cultural heritage, digital humanities, as well as computer and data science, on digitization.
  • The white paper explores four key themes: digitization quality and standards; datafication of digitized cultural heritage; digitization workflows and tools; and digitization community management.
  • Quality, usefulness and the coverage of the collections.
  • Moreover, with the arrival of initiatives such as GLAM Labs, Collections as Data, Data Spaces, Collaborative Cloud for Cultural Heritage, Open Science and EOSC the need for (high-quality) digitization and datafication of cultural heritage has significantly increased (Candela, 2023).
  • With the emergence of the Common European Data Space for Cultural Heritage and the European Collaboration Cloud for Cultural Heritage, there is an urgent need to further optimize, upscale and embed the use of these technologies into the day-to-day operations of cultural heritage institutions in combination with building the necessary capacities and to proactively contribute to such initiatives. Previous work has addressed the current understanding of what a European data space for cultural heritage should provide and takes as an inspirational example the work of innovation labs in galleries, libraries, archives and museums (Dobreva, et al., 2022).
  • Underdeveloped infrastructures for support of access and re-use of the digitized material.
  • “Copyright and licensing of these digitized objects.”
  • Developing, optimizing and sustaining a flexible infrastructure for digitization.

Keynote speakers:

Sally Chambers

Sally Chambers recently joined The British Library as Head of Research Infrastructures Services. She combines this role with her work as a member of the Board of Directors of DARIAH, the Digital Research Infrastructure for the Arts and Humanities, where she focuses on the development of sustainable services and FAIR dataset portfolios in the context of the European Open Science Cloud (EOSC), the common European Data Space for Cultural Heritage and the Cultural Heritage Cloud. Her previous roles include: Digital Humanities Research Coordinator at the Ghent Centre for Digital Humanities, Ghent University in Belgium; DATA-KBR-BE project coordinator at KBR, Royal Library of Belgium, a project which facilitates data-level access to KBR’s digitised and born-digital collections for digital humanities research; and Secretary-General of DARIAH-EU, based in the Göttingen Centre for Digital Humanities, Germany. She has been an active participant in the international Galleries, Libraries, Archives and Museums (GLAM) Labs community, and is a co-author of Open a GLAM Lab. This workshop is organised in the framework of the IMPACT Centre of Competence for Digitisation where Sally has been since June 2021.

Gustavo Candela

Gustavo Candela is a Lecturer in Computer Science at the University of Alicante. His main areas of research interest are the Semantic Web and Collections as data. He holds a PhD in Computer Science from the University of Alicante. He has authored several publications concerning Linked Open Data, Collections as data and Jupyter Notebooks in libraries. He is involved in the International GLAM Labs Community and co-authored the "Open a GLAM Lab" book (https://glamlabs.pubpub.org/). He is part of the Management Board of the Impact Centre of Competence (https://www.digitisation.eu/). He completed several research fellowships at the National Library of Scotland, the Institute of Literary Research of the Polish Academy of Sciences and the Poznan Supercomputing and Networking Center.

Tomasz Parkoła

Tomasz Parkoła is the Head of the Digital Libraries and Knowledge Platforms Department at Poznań Supercomputing and Networking Center (Poznań, Poland), where he manages research & development teams responsible for digital humanities infrastructure (http://ehum.psnc.pl/en/main-page/), products and services for digital libraries and cultural heritage (https://dingo.psnc.pl/) as well as the Europeana-accredited Polish metadata aggregator FBC (https://fbc.pionier.net.pl/). He has been involved in national and international research and development projects with main themes on data access & processing, long-term preservation, digitization workflows, data aggregation & interoperability and e-infrastructures (e.g. IMPACT, SCAPE, SSHOC, European DS4CH, Dariah.lab Poland). He is a board member in the IMPACT Centre of Competence, a co-head of DARIAH-EU's Virtual Competence Centre on E-infrastructure and a Product Board member of the Open Preservation Foundation. He was a programme committee member for iPRES, DATeCH and DARIAH AE. He is an author or co-author of several dozen scientific and popular science publications. He has PMP®, UX-PM, PMI ACP, ITIL v4 Foundation and ITIL DPI professional certifications.

Organization

Program Chairs

Apostolos Antonacopoulos

University of Salford,
United Kingdom

Annika Hinze

University of Waikato,
New Zealand

Benjamin Piwowarski

CNRS / Sorbonne Université, France

Nicholas Vanderschantz

University of Waikato,
New Zealand

Short Paper Chairs

Mickaël Coustaty

Associate professor in computer sciences,
L3i laboratory, La Rochelle Université, France

Francesco Gelati

Head of Section Digital Services and Internal Consultancy, University Archives, Universität Hamburg, Germany

Giorgio Maria Di Nunzio

Associate Professor in Computer Engineering, IIIA Lab, University of Padova, Italy

Organization Committee

Ines Vodopivec

Deputy Director for Library Operations and Services, NUK

Zoran Krstulović

Head of Digital Library of Slovenia, NUK

Alenka Kavčič Čolić

Senior Researcher, NUK

Bakir Toskić

Head of Information Technology services, NUK

Damjana Vovk

Head of Education, Development and Counselling, NUK

Nina Bricelj

Head of Director's Cabinet, NUK

Mojca Šavnik

Digitization coordinator,
Digital Resources Department, NUK

Venue

TPDL 2024 will be held at the Best Western Premier Hotel Slon in Ljubljana, the capital city of Slovenia.

Best Western Premier Hotel Slon
  • Slovenska cesta 34
  • 1000 Ljubljana
  • Slovenia

About the Host

The National and University Library of Slovenia (NUK) is organizing TPDL 2024 as one of the major conferences in a series of national and international events that our institution is holding to mark its 250th anniversary in 2024. By organizing TPDL 2024 in Ljubljana, we intend to celebrate not only our history but also the digitized and born-digital written cultural heritage that has grown ever stronger over the last two decades.

NUK is the Slovenian national library, the central state library and the university library of the University of Ljubljana, the national aggregator of e-content for Europeana and home of the Digital Library of Slovenia. As a GLAM institution, it collects, documents, preserves and archives the written cultural and scientific heritage of the Slovenian nation. It provides ready access to the knowledge and culture of past and present Slovenian generations. In collaboration with national and international libraries, it enables access to the world’s written cultural and scientific heritage. In the process of creating new knowledge, it helps its users to search, select, evaluate and use information resources in different formats, forms and languages. Its collections and services support the scholarly and scientific work of the University of Ljubljana and other higher education institutions.

NUK is also a research institution that hosts a research and IT development center. It is a center of knowledge aimed at the lifelong education of the Slovenian people and at raising their cultural and educational level and information literacy skills. Through research, development and educational activities in the fields of librarianship, information science and book history, the library actively co-shapes the Slovenian library system and makes significant contributions to theoretical and practical knowledge of library and information science.

Aerial view of the National and University Library and the surrounding buildings.

About Ljubljana

Ljubljana is the capital and largest city of Slovenia, and its cultural, scientific and administrative center. It has a rich history which visitors can explore through its landmarks, such as Ljubljana Castle, the medieval city center or the Roman archaeological sites. The city has a lively atmosphere and is full of open-air events, music and markets, and it features a diverse culinary scene with local, Mediterranean and international cuisine. There are numerous art galleries and museums to enjoy, including the National Gallery, Modern Gallery, Cukrarna, National Museum, Ethnographical Museum and the City Museum, along with important archives: the National Archive, City Archive, and Historical Archive.

Arriving in Ljubljana

Travel information

By airplane: fly to Ljubljana Airport; the airports in Trieste, Zagreb, and Venice are also close. All these airports have shuttles to Ljubljana city center.
Book your official TPDL airport shuttle here!

By train or bus (e.g. FlixBus): international connections are available.

Accommodation

Ljubljana has become a major conference and congress city with hotels on every corner. Because the city center is small, all hotels and possible venues are within a 5-minute walk.

Best Western Premier Hotel Slon,
4-star, 144 EUR/night

City Hotel Ljubljana,
4-star, 144 EUR/night

Ibis Styles Ljubljana Centre,
3-star, 100 EUR/night

Occidental Ljubljana,
4-star, 130 EUR/night

Contact
Ines Vodopivec, PhD.
+386 41 251 400
tpdl2024@nuk.uni-lj.si
  • National and University Library
  • Turjaška ulica 1
  • 1000 Ljubljana
  • Slovenia