From physical to digital archive: VIAA’s end-to-end reporting
Imagine you could follow the route a tape or cassette takes from the very beginning – tucked away in a cardboard box – to the satisfying end – when the content is stored neatly and securely with metadata on the servers. It’s called end-to-end reporting (from start to finish), and at meemoo we’ve turned it into a real art form.
Digitisation always starts with the content partner registering its items in our system. Every item is given its own unique code – a PID or persistent identifier – which is used throughout the entire process. After being digitised, the files are sent to a meemoo data centre, where they’re imported into the archive system via the ingest process.
This whole process generally runs very smoothly, but something can go wrong at every stage, which is why we use lots of control mechanisms to detect possible faults (such as files being digitised but not sent, files being only partially sent, or metadata not being entered correctly).
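The three fault types mentioned above can be illustrated with a minimal sketch. It is not meemoo's actual control mechanism: the `Item` record, the byte-count comparison and the `REQUIRED_FIELDS` list are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical list of metadata fields that must be filled in.
REQUIRED_FIELDS = ("title", "content_partner")


@dataclass
class Item:
    """One registered item, identified by its PID (hypothetical record layout)."""
    pid: str
    expected_bytes: int              # size reported after digitisation
    received_bytes: Optional[int]    # size that arrived at the data centre, None if nothing arrived
    metadata: dict = field(default_factory=dict)


def detect_faults(item: Item) -> list:
    """Return a list of fault labels for one item; an empty list means no fault found."""
    faults = []
    if item.received_bytes is None:
        faults.append("not_sent")            # digitised but never delivered
    elif item.received_bytes < item.expected_bytes:
        faults.append("partially_sent")      # delivery is incomplete
    missing = [name for name in REQUIRED_FIELDS if not item.metadata.get(name)]
    if missing:
        faults.append("metadata_incomplete:" + ",".join(missing))
    return faults


complete = Item("pid-001", 1000, 1000, {"title": "Tape 1", "content_partner": "Partner A"})
truncated = Item("pid-002", 1000, 400, {"title": "Tape 2", "content_partner": "Partner A"})
unsent = Item("pid-003", 1000, None, {"title": ""})
```

Running `detect_faults` on every item in a delivery would give a list of PIDs that need following up, with a reason for each.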
The real challenge, however, is performing these checks at such a large scale: new files are digitised every day, and a single delivery in a digitisation project can easily consist of thousands of files. There are also often several digitisation projects running at the same time. The archive infrastructure components therefore need to be developed in a way that makes this kind of reporting as simple as possible. Checking files as they’re imported and feeding the results into visual reporting tools quickly gives us insight into anything that goes wrong in the process.
The graph below, for example, shows the situation at the time of writing, early 2019: there are currently 232 files that have been digitised but not yet correctly archived; some have failed, and others are still in transit to the VIAA data centre. VIAA is responsible for following up on these faults. We make sure that the failed files are picked up again, and keep a close eye on the others to make sure they’re actually delivered.
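The kind of count behind such a graph can be sketched in a few lines. The status labels (`archived`, `failed`, `in_transit`) and the sample data are hypothetical and are not taken from the actual archive system.

```python
from collections import Counter


def summarise_open_files(status_by_pid: dict) -> Counter:
    """Count, per status, the files that are digitised but not yet archived."""
    return Counter(
        status for status in status_by_pid.values() if status != "archived"
    )


# Toy data for illustration only; real deliveries contain thousands of files.
sample = {
    "pid-001": "archived",
    "pid-002": "failed",
    "pid-003": "in_transit",
    "pid-004": "in_transit",
}

summary = summarise_open_files(sample)
open_total = sum(summary.values())  # number of files still needing attention
```

The per-status counts are what a reporting tool would plot, while the total corresponds to the single headline figure of files not yet correctly archived.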
Separate from the import reports, there’s also a second check when the digitisation project ends to verify that all the digitised files have been successfully archived at VIAA.
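In essence, this closing check is a reconciliation between two lists of PIDs. The sketch below assumes both lists are available as plain Python collections; the function name and the sample identifiers are hypothetical.

```python
def missing_from_archive(digitised_pids, archived_pids):
    """Return the PIDs that were digitised but cannot be found in the archive."""
    return sorted(set(digitised_pids) - set(archived_pids))


# Toy example: three items were digitised, two were archived.
digitised = ["pid-001", "pid-002", "pid-003"]
archived = ["pid-003", "pid-001"]

still_missing = missing_from_archive(digitised, archived)
```

An empty result means the project can be closed; any PID left in the list still needs to be traced.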
A lot of data warehouse work was carried out in 2018 to simplify this kind of reporting even further. The data warehouse brings together all the relevant data from the whole process: registering, digitising, archiving and publishing via the Archive for Education. It will make our checks much more efficient, and opens the door to further automation in the future.