Forum Numerica - Marieke van Erp: Data Sciencing Big Old Archives: Computational Humanities Research One Interdisciplinary Collaboration at a Time


September 19
3 pm-4 pm
Inria Research Center, room Euler bleu and online on Webex

 

A joint WIMMICS (i3S-Inria) / Forum Numerica seminar

Abstract

The mass digitisation of historical archives has brought humanities research into the realm of big data. However, working with historical data is not for the faint of heart: data can be incomplete, warped through digitisation artefacts, or difficult to understand as language and society have changed. But it can also be immensely rewarding as it provides a window on the past that can help us understand today's society better. In this talk, I will discuss how my team is adapting NLP tools designed for contemporary data to historical use cases and how we deal with gaps in data, quality issues and other challenges to make computational methods better suited to humanities use cases.

About the speaker

Marieke van Erp is a Language Technology and Semantic Web expert engaged in interdisciplinary research. She holds a PhD in computational linguistics from Tilburg University and has worked on many (inter)national interdisciplinary projects such as the FP7 NewsReader project, the H2020 Odeuropa project and the Dutch Research Council’s CLARIAH project. Since 2017, she has been leading the Digital Humanities Research Lab at the Royal Netherlands Academy of Arts and Sciences Humanities Cluster. She is one of the founders and scientific directors of the Cultural AI Lab, a collaboration between 8 research and cultural heritage institutions in the Netherlands aimed at the study, design and development of socio-technological AI systems that are aware of the subtle and subjective complexity of human culture. In January 2023, she was awarded an ERC Consolidator project that will investigate how language and semantic web technologies can improve the creation of knowledge graphs supporting humanities research.
http://dhlab.nl