GIVE update: digitisation and high-speed metadata creation
26 Apr 2023
The GIVE project (Coordinated Initiative for Flemish Heritage Digitisation) has made rapid progress over recent months. The project, which involves mass digitisation and metadata enrichment, will end later this year. Read about how we’re preparing thousands of newspapers, glass plates, masterpieces and hours of archival content for future (re)use.
To find out what’s happening behind the scenes, follow the project on our social media channels. You can expect updates on milestones and digitisation processes, and learn about how we’re adding and enriching metadata automatically.
Knowyourcarrier.com for photos launched!
Have you already used our identification tool, knowyourcarrier.com? This website has been up and running for identifying old video and audio content since 2018. But now you can also go there to find out about your photographic materials, get tips on how to preserve and digitise them, and discover whether they have any heritage value. This expansion to include photographs has been made possible with help from photographic experts and is part of the GIVE project.
The digitisation process is running smoothly
The process of immortalising Flemish masterpieces in 2D, 3D and gigapixel has been underway for some time. At the end of last year, we let you know that the digitisation phase for newspapers, glass plates and masterpieces on paper and parchment was about to begin. Four months later, our digitisation partners are now busy digitising, photographing and scanning. We have made great progress already, but there is still a lot more work to do.
Newspapers on track
Around a quarter of the total number of newspapers have already been digitised. We’re consistently monitoring the quality of each page to ensure it meets our strict requirements, and so far, this has always been the case.
Flemish masterpieces: a big challenge
Digitisation partner GMS finished configuring a custom-made digitisation set-up for the Flemish masterpieces on paper and parchment last week. This set-up ensures we can digitise a wide range of valuable artefacts as safely as possible. The handwriting of Guilliam Caudron from the Aalst city archives was the first to go in front of the lens. We will then relocate the set-up to digitise the remaining 39 valuable masterpieces on paper and parchment in ten other locations.
The photography and 3D scanning of paintings, prints and sculptures from museums and churches is no longer in its infancy: we have already digitised over 80% of all the works. But it remains a challenging task. Some valuable artefacts are suspended as high as five metres, while others require assistance from a professional art handling firm.
We ventured into unfamiliar territory in this project: how do you make a 3D copy of a sculpture? We are now happy to be in a position to share the knowledge we gained.
Registration in its final phase
We’re digitising an impressively large volume of content in our Primeur newspaper project (together with Flanders Heritage Library) and the GIVE glass plates project. Carefully preparing and registering all these thousands of newspapers and glass plates is an important intermediate step to ensuring a smooth mass digitisation process later on. We started this process in February last year, and are now on the final stretch. We are right on track to safely transport the final newspapers and photos.
We would once again like to thank all of our content partners and colleagues who have dedicated months to this painstaking work!
Interested in damage registration? Flanders Heritage Library conducted research into the state of historical newspapers in Flemish institutions. They confirm the importance of newspaper digitisation on the basis of five case studies.
Enriching audio and video content with metadata: where do we stand?
We store a vast amount of digitised and born-digital content in the meemoo archive system, but missing or inadequate annotations make it difficult to search, so it cannot easily be reused. Manually adding metadata to thousands of hours of video and audio is not realistic, which is why we’re turning to methods within the realm of artificial intelligence and machine learning.
Detecting and recognising faces in videos
Many ethical issues arise when you let a face recognition application loose on a large amount of content. We are therefore building a robust framework together with Knowledge Centre Data & Society and several stakeholders. A second session in January provided a lot more relevant input, including on the role of our content partners. More details to follow at the end of the project!
Converting audio and video into searchable transcripts
In addition to matching faces to names, speech recognition (speech-to-text or STT) and named-entity recognition (NER) are also on the agenda as ways of creating metadata. What does this entail? An external service converts audio files into ready-made text (transcripts), from which we can later extract relevant names of places, people, organisations and other entities.
For speech recognition, we decided to purchase an existing service and launched a public tender. We assessed all the proposed solutions that we received in terms of price and quality. To keep the comparison objective, we checked the output of each option against content that we had transcribed manually. We ultimately selected Speechmatics as our partner and are currently putting the finishing touches on integrating their service into our architecture.
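For readers curious about what such a comparison against manual transcripts can look like in practice, here is a minimal sketch. It assumes word error rate (WER) as the quality measure, which is a common choice for speech recognition but is not confirmed as the metric used in the tender. The function name and the sample sentences are hypothetical and serve only as illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate of an automatic transcript.

    Assumption: WER is used here purely as an illustration; the article
    does not state which quality metric was applied in the tender.
    WER = (substitutions + deletions + insertions) / words in reference.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # Word-level edit distance, computed row by row to keep memory small.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row.append(
                min(
                    previous_row[j] + 1,                      # deletion
                    current_row[j - 1] + 1,                   # insertion
                    previous_row[j - 1] + substitution_cost,  # substitution or match
                )
            )
        previous_row = current_row

    return previous_row[-1] / len(ref_words)


# Hypothetical example: a manual transcript and one automatic transcript.
manual = "the archive holds old newspapers"
automatic = "the archive hold newspapers"
print(word_error_rate(manual, automatic))
```

A lower score means the automatic transcript is closer to the manual one. Scoring every candidate service on the same set of manually transcribed fragments gives figures that can be compared directly.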
A few final adjustments, and the three applications (face recognition, speech recognition and named-entity recognition) will be able to start enriching 160,000 hours of audio and 120,000 hours of video from the meemoo archive system...
What next?
The GIVE project will be completed by the end of 2023. We will then gradually make the digitised masterpieces, newspapers and glass plates accessible on our own platforms, and on our participating partners’ platforms if they wish. We’re also preparing the generated metadata so that it can be re-used easily and efficiently.