Meemoo, the Flemish Institute for Archives, plays a pivotal role in preserving and disseminating the cultural heritage of Flanders, Belgium. Working with its cultural sector partners, meemoo digitises diverse heritage items, from newspapers and glass plates to audiovisual materials and Flemish masterpieces.
Part of meemoo’s mission is to enhance the accessibility and usability of these digitised items by enriching them with metadata labels. Faced with vast volumes of material, meemoo knew such an undertaking would only be feasible with the help of AI. And that’s why they turned to Sopra Steria’s specialist AI and data experts.
With Sopra Steria’s support, they embarked on their first large-scale AI initiative in 2023. This first project used AI to generate metadata that accurately labels millions of people, places, and organisations across 170,000 hours of video and audio content.
To learn more, we spoke to meemoo's archiving manager, Matthias Priem, and Sopra Steria's AI and data science director, Kimberly Hermans.
Could you provide an overview of the GIVE metadata project and its goals?
Matthias: In recent years, we have digitised and archived a vast amount of audiovisual material at meemoo. These digital archives are often difficult to search because they lack descriptions (or metadata labels), but adding these manually is extremely time-consuming.
So, we aimed to apply automatic, AI-driven descriptions to all the audiovisual materials while maintaining the highest ethical and privacy standards. The metadata lets us establish links between names, recognised entities, places and other archives or external sources.
So, can you give an example of how this works?
Matthias: For instance, if a person appears in videos in both parliament archives and museum archives, such as the opening of an exhibition, we will use the same label to identify them. Using Wikidata links, we can further enrich the content by linking to that person's political party or birthplace.
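The linking idea can be pictured with a small sketch: if every appearance carries the same shared identifier, videos from different archives can be grouped and enriched from one external record. This is only an illustration under assumed data structures. The `Appearance` class, the archive names, and the identifier `Q_PLACEHOLDER_1` are hypothetical and are not meemoo's actual schema or a real Wikidata ID.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Appearance:
    archive: str    # partner archive the video belongs to (hypothetical names)
    video_id: str   # hypothetical video identifier
    entity_id: str  # shared label; a placeholder, not a real Wikidata ID


# Hypothetical external facts, keyed by the same shared identifier.
external_facts = {
    "Q_PLACEHOLDER_1": {
        "name": "Example Politician",
        "party": "Example Party",
        "birthplace": "Example Town",
    },
}

appearances = [
    Appearance("parliament", "debate_001", "Q_PLACEHOLDER_1"),
    Appearance("museum", "exhibition_opening_007", "Q_PLACEHOLDER_1"),
]


def index_by_entity(items):
    """Group appearances by their shared entity identifier."""
    by_entity = defaultdict(list)
    for item in items:
        by_entity[item.entity_id].append(item)
    return dict(by_entity)


index = index_by_entity(appearances)
for entity_id, hits in index.items():
    facts = external_facts.get(entity_id, {})
    archives = sorted({hit.archive for hit in hits})
    print(facts.get("name", entity_id), "appears in:", ", ".join(archives))
```

Because both appearances share one identifier, a search for that person returns the parliament and museum videos together, along with whatever external facts are attached to the identifier.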
How did meemoo navigate the ethical dimensions of such a project?
Matthias: Being a government-funded organisation operating in the ethically charged cultural heritage space, we sought legal advice and established checklists to identify and mitigate risks. Additionally, we formed an ethics committee and conducted workshops to ensure the exploration of ethical questions from various professional perspectives.
What about legal issues?
Matthias: We made sure the work is compliant with GDPR and found the data protection impact assessment (DPIA) a useful touchstone to check this against. Alongside that, we investigated whether there are biases in the models we used and what we could or could not do with regard to face recognition. Many parties contributed to this process: legal experts, people who would be recognised in the videos, archivists, and legal and ethical consultants. Based on the work done during the project, we created a practical legal and ethical framework to help archivists with the everyday use of AI, such as determining whether a person can be added to the reference set for face recognition.
Kimberly, why was AI necessary in this project? How does your expertise fit in?
Kimberly: The sheer scale of the task made AI necessary. Within this project's scope alone, there were 170,000 hours of footage across 127 archives. It would have taken a human over 19 years to listen to all the audio, let alone transcribe and annotate it. So, we used speech recognition to convert spoken words into computer-readable, time-coded text. We employed entity recognition on the transcripts to tag individuals, locations, and organisations. And we applied face recognition technology to tag the faces of specific people in the public domain.
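The 19-year figure can be checked with simple arithmetic. The sketch below assumes nonstop listening, 24 hours a day, which is an assumption and not something stated in the interview. The 8-hour-day variant is added only for comparison.

```python
# Back-of-the-envelope check of the listening-time estimate.
# Assumption (not stated in the interview): nonstop playback, 365 days a year.
TOTAL_HOURS = 170_000  # footage in scope, as quoted in the interview


def years_of_listening(total_hours: float, hours_per_day: float = 24.0) -> float:
    """Return the years needed to play back total_hours at hours_per_day."""
    return total_hours / (hours_per_day * 365)


nonstop_years = years_of_listening(TOTAL_HOURS)
workday_years = years_of_listening(TOTAL_HOURS, hours_per_day=8)

print(f"Nonstop: {nonstop_years:.1f} years")          # Nonstop: 19.4 years
print(f"8 hours per day: {workday_years:.1f} years")  # 8 hours per day: 58.2 years
```

At a more realistic eight hours of listening per day, the same material would take one person roughly 58 years, before any transcription or annotation.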
How did Sopra Steria assist meemoo on the technical side?
Kimberly: Our team helped by researching and assessing commercially available AI tools that met the stringent quality standards required. We facilitated discussions with vendors and built the necessary pipelines to connect meemoo's data to the platforms. Regarding face recognition, our team created a custom-made tool based on open-source models because there was no appropriate solution on the market. However, the challenge wasn't just in the algorithms; the hard part was integrating them into an end-to-end solution that worked in the specific context of cultural heritage.
This sector has demanding requirements, ranging from high ethical standards and complex copyright and digital rights issues to interoperability and scalability. This necessitated collaboration with experts from various fields, including archivists, historians, ethics experts, and more. Everyone's expertise was vital in making the AI solution successful.
How did Sopra Steria address potential biases in the facial detection system?
Kimberly: During working groups, feedback highlighted the need to investigate the risks of potential biases. However, instead of exploring every possible bias, we took a practical approach. Our analysis indicated that, in this case, facial detection and recognition were most likely to introduce unwanted biases. Therefore, we precisely defined what aspects needed validation. The group selected 20 people from the reference set, and we validated whether the system correctly detected and recognised those individuals across 30 videos.
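A validation of this kind can be sketched as a per-person check of detection and recognition rates. The record format, person labels, and results below are hypothetical and chosen only to illustrate the idea. They are not the project's data, and the interview does not describe the metrics actually used.

```python
from collections import defaultdict

# Hypothetical annotations: (video_id, person, detected, recognised).
# Each row is one known appearance of a reference-set person in a video.
appearances = [
    ("video_01", "person_a", True, True),
    ("video_01", "person_b", True, False),
    ("video_02", "person_a", True, True),
    ("video_02", "person_b", False, False),
    ("video_03", "person_a", True, True),
]


def per_person_rates(rows):
    """Return {person: (detection_rate, recognition_rate)} over all appearances."""
    counts = defaultdict(lambda: [0, 0, 0])  # total, detected, recognised
    for _video, person, detected, recognised in rows:
        counts[person][0] += 1
        counts[person][1] += int(detected)
        # A face can only count as recognised if it was detected first.
        counts[person][2] += int(detected and recognised)
    return {p: (d / n, r / n) for p, (n, d, r) in counts.items()}


rates = per_person_rates(appearances)
for person, (det, rec) in sorted(rates.items()):
    print(f"{person}: detected {det:.0%}, recognised {rec:.0%}")
```

Large gaps between individuals in such a table would be a prompt to look more closely at whether the system performs unevenly across groups of people.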
Matthias, what were the outcomes of the GIVE project?
Matthias: The results were outstanding. We identified 3.3 million faces and around 6.5 million named entities and transcribed 560 million words in different languages. In other words, we accomplished a task that would have taken humans decades in just a few months. Additionally, we developed an interface with restricted access for partner organisations, creating a secure space to explore the data and its possibilities. Our partners are enthusiastic about the outcome, too. In 2023 alone, when the project was live, we received three prestigious awards. We will set up further AI projects to enrich Flemish heritage in the coming years.
To learn more about the technologies used in the project, including the custom-built facial recognition tool, watch the video conversation between Matthias and Kimberly.
Discover why AI is nothing without you
At Sopra Steria, we believe AI’s true potential is unlocked with human collaboration. By blending human creativity with advanced AI technology, we empower people to address society’s most pressing challenges—from combating disease to mitigating climate change—while helping our clients achieve their digital transformation goals.
We emphasize critical thinking and education to ensure AI upholds core human values like respect and fairness, minimizing ethical risks. Together, we’ll create a future where AI inspires positive impact and enhances human brilliance. That's why we believe that AI is nothing without you!