Why do repositories quickly become so complex? One answer is simply scope creep – repositories have roles in dissemination, research information management and curation and, facing these three ways, it is inevitable that the demands placed upon them mushroom. Without wanting to open up any arguments around SOA or RESTful approaches, one answer is to go back to Cliff Lynch’s 2003 description of the institutional repository as “a set of services that a university offers to the members of its community for the management and dissemination of digital materials created by the institution and its community members”. This approach seems to be having a revival.
The California Digital Library (CDL) is charged with providing an environment that enables those at the University of California to curate digital assets effectively. Rather than adopting a single solution, they have pioneered an approach based on “micro-services”. In this approach, the fundamental curatorial functions are disaggregated and provided by a managed and defined set of discrete services. They claim that this increases the flexibility of the environment and its ability to exploit changing technologies, and that it allows the environment to develop sufficient complexity to deal with evolving demands without becoming baroque. The approach has also been adopted at Northwestern University and Penn State in the US. The topic was of considerable interest at the recent Scholarly Infrastructure Technical Summit (SITS) meeting.
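To make the idea of disaggregation concrete, here is a minimal sketch in Python. It is not CDL's code or API: the names mint_identifier, compute_fixity, Storage, ingest and audit are all hypothetical, chosen only to show three curatorial functions (identity, fixity, storage) living as separate pieces that a thin layer composes.

```python
import hashlib
import uuid

# Hypothetical micro-services: each does one curatorial job
# and knows nothing about the others.


def mint_identifier() -> str:
    """Identity service: return a new opaque identifier."""
    return uuid.uuid4().hex


def compute_fixity(content: bytes) -> str:
    """Fixity service: return the SHA-256 digest of the content."""
    return hashlib.sha256(content).hexdigest()


class Storage:
    """Storage service: keep bytes under an identifier (in memory, for illustration)."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, identifier: str, content: bytes) -> None:
        self._objects[identifier] = content

    def get(self, identifier: str) -> bytes:
        return self._objects[identifier]


# Composition layer: curation workflows are assembled from the services above.

def ingest(content: bytes, storage: Storage) -> dict[str, str]:
    """Mint an identifier, store the content, and record its digest."""
    identifier = mint_identifier()
    storage.put(identifier, content)
    return {"id": identifier, "sha256": compute_fixity(content)}


def audit(record: dict[str, str], storage: Storage) -> bool:
    """Re-compute the digest of the stored content and compare it with the record."""
    return compute_fixity(storage.get(record["id"])) == record["sha256"]
```

Because ingest and audit depend only on what each service accepts and returns, any one of them (say, the in-memory Storage) could in principle be swapped for a different implementation without touching the rest, which is the flexibility being claimed for the approach.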
It’s an approach followed in several current projects, including Hydra. The discussion at the SITS meeting seemed to focus in part on the degree to which such micro-services can stand alone, as some of the CDL ones arguably do, or instead depend on assumptions about the environment in which they will be used, as in Hydra (which builds on Fedora). In reporting on the SITS meeting, Dave Challis notes that “I’m not convinced the specs for these are well defined enough for general purpose use yet”. There may be useful lessons from initiatives such as the e-Framework on the circumstances in which such definitions are feasible.
Relatedly, perhaps, it was interesting to hear Chuck Humphrey (Head of the Data Library, University of Alberta) speak at the recent SPARC Repositories conference. He described the approach being taken in Canada, where a distributed OAIS environment is being established from discrete services deployed across the country. Previous JISC work such as the Sherpa DP and PRESERV projects explored some of these options a few years ago, and the lessons from that work may be worth revisiting in the light of the micro-services discussions.
There is probably some further learning to be done about what constitutes a viable, usable and sustainable micro-service and, with real examples out there now to use, there is a chance that people’s experiences of providing and using them will be shared.