ICAI Interview with Martijn Kleppe: gaining insights much quicker by combining AI and Humanities

Martijn Kleppe is a trained historian who collaborates with AI scientists. Kleppe: ‘When I was doing research as a historian, I could analyze twenty books in a month. For the computer this is a matter of seconds.’

Martijn Kleppe

Martijn Kleppe is one of the founding members of Cultural AI Lab and the head of the Research Department of the KB National Library of the Netherlands.

The Cultural AI Lab is a joint effort of the heritage institutions Rijksmuseum, the Institute for Sound and Vision and the KB, and the knowledge institutions CWI, KNAW, UvA, VU and TNO.

What problems within the humanities can be solved with AI?

‘The biggest challenge that we face within the humanities is scale. There is a shift happening right now within historical research from close reading, which we have done for centuries, to distant reading, where we use a computer to detect patterns in all sorts of humanities data: books, newspapers, television programs, artworks, social media, et cetera. This is the biggest opportunity that we have right now, but also the biggest challenge, because we have to rely on other competences to make this kind of research possible.’
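To make the idea of distant reading concrete, here is a minimal sketch of what such pattern detection can look like in practice: counting how often a term appears per year across a large digitized corpus instead of reading each text. The folder layout and file naming are invented for illustration and do not reflect any actual KB dataset.

```python
# Minimal illustration of "distant reading": instead of reading each text,
# count how often a term appears per year across a large corpus.
# Assumes a hypothetical folder of plain-text files named "<year>_<id>.txt".
from collections import Counter
from pathlib import Path


def term_frequency_by_year(corpus_dir: str, term: str) -> Counter:
    counts = Counter()
    for path in Path(corpus_dir).glob("*.txt"):
        year = path.name.split("_")[0]  # e.g. "1885_0001.txt" -> "1885"
        text = path.read_text(encoding="utf-8", errors="ignore").lower()
        counts[year] += text.count(term.lower())
    return counts


if __name__ == "__main__":
    # Hypothetical corpus directory and search term.
    for year, n in sorted(term_frequency_by_year("newspapers", "fotografie").items()):
        print(year, n)
```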

How much interest in AI is there from the cultural world?

‘It is really gaining momentum right now. There is an ecosystem evolving with partners from the culture and media domain – like the Rijksmuseum and the National Archive – and from the creative industry, who are interested in applying AI in their services or processes. And several members of our lab recently founded the working group ‘Culture & Media’ within the National AI Coalition with all sorts of cultural and media partners.’

Do the humanities also influence AI?

‘Yes, it creates new academic research questions. Most of the algorithms within the AI domain have been trained on new, high-quality data. But the datasets of the heritage institutions contain data that is 200 years old. Digitized newspapers from the end of the nineteenth century with very low quality, for example. And newspapers from colonial Indonesia and Suriname with a completely different vocabulary. Bringing those kinds of datasets into the AI domain offers new perspectives and questions on polyvocal data. How can we handle these kinds of data and how can we improve them? My experience so far is that the computer scientists involved in our project love these new questions.’

At the ICAI meetup on July 8, the lab will speak about contentious words in cultural heritage collections. How does the lab approach this?

‘Handling issues like that is the essence of our research. We try to answer the question: how can you detect bias in the descriptions of artifacts in museum collections? And can you also help the museum by suggesting other kinds of words? Especially last year, with the Black Lives Matter movement, we saw how relevant these questions are. How do we deal with the past? It is a technical, but also really a societal and ethical challenge.’
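One simple way to picture this kind of tooling is a script that flags contentious terms in object descriptions and proposes alternatives. The sketch below is not the lab's actual method: the term list and suggested alternatives are invented placeholders, assuming a hand-curated lexicon rather than a learned model.

```python
# Minimal sketch of flagging contentious terms in museum object descriptions.
# The lexicon and alternatives below are hypothetical placeholders.
import re

CONTENTIOUS_TERMS = {  # hypothetical lexicon: term -> suggested alternatives
    "primitive": ["early", "traditional"],
    "exotic": ["non-European", "unfamiliar to the maker's audience"],
}


def flag_description(description: str) -> list[dict]:
    findings = []
    for term, alternatives in CONTENTIOUS_TERMS.items():
        # Match whole words only, case-insensitively.
        for match in re.finditer(rf"\b{re.escape(term)}\b", description, re.IGNORECASE):
            findings.append({
                "term": match.group(0),
                "position": match.start(),
                "suggestions": alternatives,
            })
    return findings


print(flag_description("A primitive mask from an exotic culture."))
```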

Book from the KB collection

As a historian, what attracts you in AI?

‘I have to ask new types of questions. Instead of doing source criticism on books, newspapers or television programs, I now have to do source criticism on algorithms. At the KB we have the Delpher.nl platform, for example, where you can search millions of digitized texts from Dutch newspapers, books and magazines. In order to search efficiently, you have to understand the basics of the algorithm behind it. What I also really like about AI research is the teamwork. Traditionally, humanities scholars are more solitary researchers. But to be able to collaborate with other disciplines you have to be vulnerable and develop yourself.’

Has the AI research ever given you a new perspective on heritage collections?

‘Yes, for sure. During my PhD research I wrote an article about the first moment in time when a Dutch photograph was published in a Dutch newspaper. That research was based on manually going through a selection of newspapers. But then, four years later, I participated in a research project with a historian who was one of the first to apply computer vision to historical newspapers. He ran an algorithm through all the newspapers and immediately said: ‘Martijn, you were completely wrong. The first photograph was published earlier than the moment you mention in your paper.’ That was fantastic. Science is always about gaining new insights. When I did my research, those techniques did not exist yet. We now gain insights much quicker than we did before.’

On July 8, 2021, Cultural AI Lab will talk about making AI ‘culturally aware’ during the Lunch at ICAI Meeting. Want to join? Sign up here.

ICAI Interview with Sebastian Schelter: Tackling the data management questions for machine learning

One of the major problems in machine learning right now is managing real world data. Sebastian Schelter: ‘In the real world you have new data every day, every minute or every second. A small mistake is easily made. And then that small mistake can have devastating consequences for the model.’

Sebastian Schelter is Lab Manager of AIRLab Amsterdam and has a Joint Appointment position. He is funded by both Ahold Delhaize and the University of Amsterdam. The AI for Retail (AIR) Lab Amsterdam is a joint UvA-Ahold Delhaize industry lab.

At the next Lunch at ICAI meeting on June 17, AIRLab will talk about the challenges in machine learning. What is the biggest challenge?

Schelter: ‘There are some classical machine learning problems in retail and e-commerce that we address at AIRLab, such as forecasting and recommendation. But by far the biggest problem right now is the data management question for machine learning. A lot of these problems relate to handling the data and occur when you build real systems and real applications around machine learning algorithms. This is something that is often overlooked when people talk about AI. It’s one of the biggest issues in machine learning right now, but nobody likes to talk about it.’

Why don’t people talk about this?

‘Because it is a very difficult problem. It is hard to study in academia, because academics don’t have access to real systems. One of the advantages of AIRLab is that we can look at these problems because we collaborate with companies. That’s a unique situation. And another reason is that this problem lies at the intersection of the data management and the machine learning communities. Problems at the intersections of fields are always more difficult to study because you have to bring together people from different areas of expertise.’

How does AIRLab try to tackle this problem?

‘In the real world you have new data every day, every minute or every second. And the data might change because the world changes and systems change. So very often, before you can actually feed data into a machine learning model, you have to preprocess it, join together different datasets, clean it, filter it, and convert the format. A small mistake is easily made. And then that small mistake can have devastating consequences for the model. So what we are doing is building tools to make it easier for data scientists to find these problems and fix them.’
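To make the kind of problem this tooling catches more tangible, here is a minimal sketch of automated checks that could run on each new batch of data before it reaches a model. It is not AIRLab's actual tooling; the column names and thresholds are invented for illustration.

```python
# A minimal sketch (not AIRLab's actual tooling) of automated checks that run
# before a data batch is fed into a model, catching "small mistakes" early.
import pandas as pd


def validate_batch(df: pd.DataFrame) -> list[str]:
    errors = []
    # Schema check: the model expects exactly these (hypothetical) columns.
    expected = {"product_id", "store_id", "units_sold", "price"}
    missing = expected - set(df.columns)
    if missing:
        errors.append(f"missing columns: {sorted(missing)}")
        return errors
    # Completeness check: key fields may not be null.
    for col in ("product_id", "units_sold"):
        if df[col].isna().any():
            errors.append(f"null values in {col}")
    # Range checks: negative sales or prices hint at a broken upstream join or filter.
    if (df["units_sold"] < 0).any():
        errors.append("negative units_sold")
    if (df["price"] <= 0).any():
        errors.append("non-positive price")
    return errors


batch = pd.DataFrame({"product_id": [1, 2], "store_id": [10, 11],
                      "units_sold": [5, -1], "price": [2.5, 3.0]})
print(validate_batch(batch))  # -> ['negative units_sold']
```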

You have had a Joint Appointment position since February 2020. How is this working out?

‘I’m one day per week at Ahold and four days at UvA. Within this setup I get the advantages of both worlds. I’ve been working in similar setups for a longer time already, going back and forth between academia and industry. I like research, but I also like to build things that get used in the real world. I think it is very valuable for computer scientists to take a step out of the lab. Then you find a lot of interesting problems that you wouldn’t have found otherwise.’

What are the challenges that come with this position?

‘You need to bring together the business and the academic side. You have to look for problems that are academically important and interesting, but that also have a business value in a certain amount of time.’

Where do you see AIRLab in three years?

‘The PhD students have already started to write great research papers. We’ve had some really good results recently, which I can’t tell you about yet. I’m convinced that by the end of the five-year period of AIRLab, we will have developed a set of technologies and solutions that have real world impact and will be used by our partner companies.’

On June 17, 2021, AIRLab Amsterdam will address the data management challenges for machine learning during the Lunch at ICAI meeting. More info and sign up here.

ICAI: Challenges and ambitions from two perspectives

Three years ago a couple of scientific researchers had an idea: stimulating AI talent in the Netherlands with bottom-up innovation and a lot of relevant stakeholders. ICAI was born. Now ICAI is growing significantly. It has 24 labs and is aiming at 40 to 50 labs by the end of 2022. What has ICAI achieved so far? And what are the challenges and ambitions? A view on ICAI from the perspective of lab manager Elvan Kula and scientific director Maarten de Rijke.

Elvan Kula, Lab Manager and PhD student at AI for Fintech Research lab (a collaboration of TU Delft and ING)

“It would be a nice opportunity to perform studies across ICAI labs.”

Accomplishments

‘The first year of our lab was focused on bootstrapping AI for Fintech Research, which involved setting up the tracks, hiring people, organizing publicity and establishing awareness within ING. We have successfully set up a lab with 10 PhD students. In the beginning we had to make the stakeholders at ING aware of our lab and of how we contribute to the company. Now, a year later, I feel like we have reached the sweet spot within the organisation where the stakeholders come to us with their own research ideas and ask if they can collaborate.’

Organizational challenge

‘As a lab manager I bring many different groups of collaborators together. I identify research opportunities, in close collaboration with engineers, researchers, students and professors from ING and TU Delft. One of the main challenges is managing important logistical aspects of the research projects in our lab. While research can be very unpredictable, our stakeholders at ING do want to have a clear plan on the deliverables and the timeline. As the lab manager, I plan and manage these expectations to deal with the unpredictability in research.’

Future challenge

‘One of the main challenges of our lab is scaling AI’s impact across the bank. The current research projects in our lab focus on standalone use cases that create impact for a specific team or department at ING. In the upcoming years, we want to work towards diffusing and scaling AI throughout the bank. Achieving results at scale requires us to deal with some technical challenges related to legacy systems and the fragmentation of data.’

Innovation

‘The main advantage of doing research within industry is that you get access to real world problems and large amounts of real world data. It allows us to do research that has practical applications and that is truly impactful. In the context of ING, there are close to 14 million customers and 15,000 engineers in more than 600 teams. We have the opportunity to do research that helps thousands of people. Another benefit is that we work closely with a lot of people at ING who have a lot of experience in the world of industry and business. We can learn a lot from them and they learn a lot from us.’

Expectations

‘Advances in AI are redefining the way the financial services sector is using data analytics and new technologies. With millions of customers and thousands of employees worldwide, the expectation is that AI will play an increasingly important role in ING’s business and operations. As the lab manager, I want to continue to strengthen the partnership between ING and TU Delft to support the ongoing transformation of the bank.’

Focus of ICAI

‘It would be a nice opportunity to perform studies across ICAI labs. The PhD students in our lab work on a range of topics, such as software analytics, data integration and fairness in machine learning, that are relevant to other companies as well. It would be very interesting to replicate our research in other ICAI labs and see how our findings relate to the results obtained at other companies.’


Maarten de Rijke, Co-founder and Scientific Director of ICAI

“What I want to leave behind is the attitude that making relevant technological progress is a shared responsibility.”

Accomplishments

‘I’m really proud of the energy that ICAI has been able to generate and continues to generate. We are a supersmall team and we want to stay small. But by now, there are 24 labs and over 150 researchers throughout the country involved. And it’s all their initiative. We just facilitate it. The labs have created new ways of working, new ways of tackling problems and new types of teams. ICAI has a minimal but important set of requirements. First: Take care of your talent, the PhD students. And second: Take care of your environment, so share the knowledge and publish openly. And people do that in really creative ways. With training programs for professionals for example. Or with big industrial lab setups.’

Organizational challenge

‘Our dream, based on bottom-up innovation, has been an experiment. You think up a format and you adjust it as you go along. The challenges had to do with: how big should this be? How can we manage this? How do we organize communication? How do we make sure that it’s as open as we want it to be, while also providing enough benefits for early stage investors?’

Future challenge

‘The first step within ICAI was to get the resources so that we can attract and train talent. Now we’re at the stage that we need to think about how we can retain the talent. As our first labs begin to graduate their PhD students, we want those PhD students to find their next step somewhere in the country. For that we have created the Launch Pad. We want to open up the window so that they all see that plenty of interesting opportunities are nearby: in industry, at start-ups, NGOs, government, academia. Their talent and expertise are needed everywhere.’

Innovation

‘The new thing that we do with ICAI is making innovation and high-risk investment a shared responsibility. What we want to change is that companies, too, invest in high-risk, early-stage research at a very low technology readiness level. Innovation is not just something governments need to think about. We all need to think about this. This whole development is also about making sure we have enough capability and capacity to build and come up with innovative solutions to tough problems in the Netherlands. It’s about building and maintaining a decent level of technological autonomy. And that goes against big developments of the last decade that were focused on outsourcing. But I think that for this kind of AI technology, where so many answers are still unknown, you have to experiment yourself. Because you have to have enough know-how and talent. Otherwise you’re going to end up making big mistakes.’

Focus of ICAI

‘We have always had a strong focus on technological and economic impact. We still continue to have that focus, but in the long run our goals aren’t technological. They aren’t economic. They are societal. Our technological ambitions are aligned with the United Nations Sustainable Development Goals (SDGs). These societal goals are extremely hard and require high-risk investments by all stakeholders involved: government, industry, knowledge institutes, society. This has been in the ICAI DNA from the start. Our labs on AI and health or AI for retail or agriculture obviously contribute to these SDGs. But the same holds true for our labs that focus on AI for better machine perception, with less data and higher precision. And we recently opened our Responsible AI Lab, the Civic AI Lab and the Cultural AI Lab. That’s where we’re heading. Empowering teams of private and public partners to help address those big challenges with the knowledge and talent that we develop.’

Text: Reineke Maschhaupt

ICAI Interview with Jesse Scholtes: Making an impact in the real world

Program manager of FAST LAB, Jesse Scholtes, makes sure the collaboration between the researchers of TU Eindhoven and their five industry partners runs smoothly. Scholtes: ‘My main role is to manage the expectations and create a win-win situation.’

FAST LAB (new Frontiers in Autonomous Systems Technology) has joined ICAI this week. The lab is in its fourth year of research and creates smart industrial mobile robots that can deal with sudden obstacles in environments like farms, airports and oil & gas sites. The researchers of Eindhoven University of Technology work together with the industry partners Rademaker, ExRobotics, Vanderlande, Lely and Diversey.

Jesse Scholtes

How is it to work with so many different partners in one lab?

Scholtes: ‘Academia and industry are different worlds. Our industry partners want to implement the technology into their products as soon as possible. They are short-term driven. The academic world wants to come up with the best idea and the best way of solving something. For me it’s important to manage the expectations continuously, be very transparent about what we do and how that will turn into a benefit for our partners.’

How do you make this work?

‘Two things are important. One is the realization by all parties that it’s a shared investment. If the companies developed this research on their own, it would cost them a lot more. The second thing we did from the start was to make sure the involved companies are not each other’s competitors. That’s the biggest prerequisite for success. The companies share the same kind of R&D questions, but are active in different domains. If you take away the potential commercial risk, then people open up, start to talk, share ideas and learn from each other.’

How do you translate that into concrete results?

‘In the first year the researchers spent a lot of time at the industry partners to understand what their challenges are. One of the core things that we have adopted is the end-of-year demonstration. Researchers bring their new ideas, implement them in the systems of the industry partner and test them in a real world environment. Here we can see the results of what we’ve made and we can also steer the project in a different direction if needed.’

What are you most proud of regarding FAST LAB?

‘That we have created a very open, friendly and constructive partnership between our researchers and the partners. They really came together as a team, working on the same topics and helping each other. That cannot be taken for granted.’

Where do you see FAST LAB in the next few years?

‘FAST LAB will continue for one more year. But we are working on a successor with existing partners and probably with some new partners. There is a lot of interest. I hope we can create an ecosystem of companies and university researchers working together and creating this type of win-win situation. As long as we continue to do that, we can continue this cycle and provide much-needed continuity in the development of novel ideas.’

On the ICAI Lunch Meetup of January 21, 2021, Jesse Scholtes will present the FAST LAB. More info and sign up here.

ICAI interview with Georgios Vlassopoulos: Strengthening the friendship between AI and humans

Georgios Vlassopoulos is a PhD student at KPN Responsible AI Lab, located in Den Bosch. He works on explainability systems for AI-models. Vlassopoulos: ‘If we want artificial intelligence predicting stuff, we should be able to explain why it makes a certain decision. Without this transparency, AI could be dangerous.’

Georgios Vlassopoulos

What is your research about?

‘My algorithm tries to explain to the user why the AI-system has made a certain decision. I’ll give an example. KPN uses natural language processing on texts to look for customer complaints. How do you explain to the user why the system has categorized certain texts? What is the decision of the computer based on? In this case, my algorithm tries to learn the semantics that people use in complaints and uses them in the explanation of the model.’

‘You can expand this to many domains. Say that a doctor uses AI to detect cancer. The doctor only sees the prediction of the model, so whether the patient has cancer or not. The patient would be very eager to know why the computer has made a certain decision. With my algorithm, I would teach the system the attributes, like the shape of a tumour, and build an understandable explanation based on these attributes.’

How do you approach your research?

‘Let’s stick to the KPN example. For a large amount of texts the classifier would say: I’m not sure if it’s a complaint or not. I focus on the decision boundary, which is the set of datapoints for which the classifier is completely uncertain. All the classification information is encoded in this decision boundary, which is very complex. My approach is to train a simple model which mimics the behaviour of only a certain part of this complex information. And this can be communicated to the user.’
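As a rough illustration of this idea, the sketch below fits a simple, interpretable model to the behaviour of a complex classifier around a single uncertain point. It is a generic local-surrogate sketch, not Vlassopoulos' exact algorithm; the black-box classifier, perturbation scheme and feature count are all invented for the example.

```python
# Generic sketch (not Vlassopoulos' exact method) of a local surrogate explanation:
# sample points near an uncertain instance, label them with the complex classifier,
# and fit a simple linear model whose weights serve as the explanation.
import numpy as np
from sklearn.linear_model import LogisticRegression


def local_surrogate(black_box_predict, x, n_samples=500, scale=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Perturb the instance to probe the classifier's behaviour around it.
    neighbours = x + rng.normal(0.0, scale, size=(n_samples, x.shape[0]))
    labels = black_box_predict(neighbours)  # black-box predictions (0/1)
    surrogate = LogisticRegression().fit(neighbours, labels)
    return surrogate.coef_[0]  # per-feature importance near x


# Toy black box: classifies on a nonlinear combination of two features.
black_box = lambda X: (X[:, 0] * X[:, 1] > 0).astype(int)
weights = local_surrogate(black_box, np.array([0.05, -0.02]))
print(weights)  # which features drive the decision near this uncertain point
```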

Why is your research different from other methods?

‘The explanations of current popular explaining models can be misleading. When you use these methods on high dimensional data, e.g. images, they treat every pixel as an individual feature. My position is that you cannot build a proper explanation based on pixels. I introduced a different framework that scales well for high dimensional data. And the explanations become more humanlike.’

Why is your research important?

‘In a data-driven world it is very important for AI to become friends with human beings. People should be able to understand why an AI-system makes a certain decision. If a bank classifies its customers with an AI-system on whether they are fit to receive a loan or not, then they should be able to inform the customers why they are accepted or rejected.’

What are the main challenges you face doing this research?

‘It’s like you’re looking for aliens. There is no ground truth. The problem is that you don’t really have an accuracy measure. If we take the medical example, a doctor can say that an explanation from the system is close to his intuition. But how can you prove that this is actually correct? I need to design the experiments carefully and still everything can go wrong. Sometimes I have to repeat an experiment multiple times.’

What are you most proud of?

‘The fact that I have made something that works. And it has a good chance of being published at a top conference. The final answer will come in January 2021. But I’m already proud that high-impact scientists say that my work is good.’

In this month’s Lunch at ICAI Meetup on Transparency and Trust in AI, on December 17, Georgios Vlassopoulos will discuss his research. Sign up here.