Deep Dive: Data provenance in the age of AI


In this era of unprecedented technological innovation, we find ourselves at a crossroads where artificial intelligence not only reshapes industries but also redefines the essence of data itself. As AI systems become increasingly integral to our daily lives, from self-driving cars and virtual personal assistants to medical diagnoses and financial predictions, understanding and ensuring the trustworthiness of the data underpinning these systems has never been more critical.
Throughout this event, experts will share their insights to give a comprehensive view of the present and future of data provenance. Paul Groth will present "Provenance for Data-Centric AI", followed by a discussion of how we are facing these challenges. We will also hear from Bas Testerink on how the National Police Lab AI is approaching this topic.



Room 002, Marinus Ruppert building, University of Utrecht

Leuvenlaan 21, 3584 CE Utrecht, The Netherlands


To attend this event, please complete one of the following registration forms:

  • to attend in person, please complete this form.
  • to attend online, please complete this form.



Provenance for Data-Centric AI – Paul Groth

It is increasingly recognized that data is a central challenge for AI systems, whether training an entirely new model, discovering data for a model, or applying an existing model to new data. Given this centrality of data, there is a need for new tools that help data teams create, curate and debug datasets in the context of complex machine learning pipelines. Further, it is imperative that the results of AI systems document what data they used and how they used it. In this talk, I discuss how core data provenance techniques can support these tasks. Additionally, I show how to use emerging data provenance standards (e.g. the Content Authenticity Initiative) to document the provenance of the results of AI systems.


About ICAI Deep-Dive:
ICAI Deep-Dive is a community meetup that focuses on technical questions and facilitates in-depth discussion in a specific field of AI where multiple labs are working. Community members and labs share the common challenges they face in research. The aim is to stimulate knowledge exchange, create new insights, and build bridges by sharing lessons learned and experiences and by having an open discussion.
The series aims to find solutions to the common challenges and issues in the community.




19 Oct 2023


15:00 - 17:30



