A repository of university committee papers
The KCL Committee Zone project is one of the Start Up and Enhancement projects in the JISC repositories and preservation programme. The project is drawing to a close and has developed a repository to store the agendas, minutes and papers that are produced for the various committees of King’s College London.
The project held a dissemination event on the 10th where the repository was demonstrated. I think a few points from their demonstration are worth highlighting.
- The metadata scheme chosen is a reduced set of the e-Government Metadata Standard, which is based on qualified Dublin Core.
- The repository uses only 11 metadata fields for the items it stores. This means that the deposit process fits on one screen and is not a great burden on the committee administrators who will be depositing. This lightweight approach is the result of working very closely with the committee administrators, using focus groups and one-to-one sessions.
- The metadata collected provides contextual information rather than supporting search, since the repository already has a powerful full-text search.
- Advocacy for this type of repository is still important. KCL Committee Zone has achieved good buy-in so far because of the way it has involved the committee administrators in the project, but more advocacy is still required. This was highlighted by someone from the audience who pointed out that their similar project had struggled for buy-in.
The other speakers at the dissemination event came from the BSI and from Islington Council; both were using complex document management systems to manage their committees. These presentations were very interesting as both seemed to focus strongly on the services offered to their staff and on fitting or improving existing workflows. Both were using commercial content management systems, and it seems that repository work in the HE environment could benefit from studying the workflow tools that such systems can offer.
Harvesting usage data?
I was talking with a researcher the other day who said that, despite his institution mandating deposit of research papers in his institutional repository, he didn’t comply, preferring to deposit in an international subject repository. Naturally, I asked him ‘why?’. He said that it was because he wanted each of his papers to be in one, and only one, place on the web, so that he could get accurate download statistics for it. Obviously, we’re aware in the JISC IE team of the various arguments on this topic, and we’ve funded a piece of work to look at the practical ways in which subject and institutional repositories might work together, which could address this issue among others. We’ve also funded various projects on repository statistics, such as ‘Interoperable Repository Statistics’ (which has developed a tool that repository managers can use to analyse and share statistics) and an ongoing small piece of work on harmonising article-level usage data formats. There is also MESUR and other projects in this space.
However, in the real world, copies of some research papers are likely to be at various places on the web, and we wondered whether a tool could be built that combined fuzzy matching to identify copies that were probably the same paper, some means of querying the servers on which they sat to get download data, and a reliable way of then aggregating that data into some acceptable statistics. Is that an important use case? Is it feasible to build something that addresses it?
What’s the relationship (if any) with name authority services (see the JISC pilot Names project) or persistent identifiers (see the JISC Resourcing Identifier Interoperability for Repositories - RIDIR demonstrator)?
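To make the fuzzy matching and aggregation idea above a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record fields (`title`, `host`, `downloads`), the function names and the 0.9 similarity threshold are all hypothetical, and it assumes the download counts have already been obtained from each server somehow, which is the hard part and is not shown. Matching on titles alone is a simplification; a real tool would probably also compare authors, dates or identifiers.

```python
import re
from difflib import SequenceMatcher


def normalise(title: str) -> str:
    """Lowercase the title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def same_paper(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Guess whether two records describe the same paper (title similarity only)."""
    ratio = SequenceMatcher(None, normalise(a["title"]), normalise(b["title"])).ratio()
    return ratio >= threshold


def aggregate_downloads(records: list, threshold: float = 0.9) -> list:
    """Group records that look like copies of one paper and sum their downloads."""
    groups = []
    for rec in records:
        for group in groups:
            # Compare against the first copy seen for this group.
            if same_paper(group["copies"][0], rec, threshold):
                group["copies"].append(rec)
                group["downloads"] += rec["downloads"]
                break
        else:
            # No existing group matched, so start a new one.
            groups.append(
                {"title": rec["title"], "copies": [rec], "downloads": rec["downloads"]}
            )
    return groups


# Hypothetical records: the same paper held in two places, plus an unrelated one.
records = [
    {"title": "Harvesting Usage Data from Repositories", "host": "institutional", "downloads": 120},
    {"title": "Harvesting usage data from repositories.", "host": "subject", "downloads": 340},
    {"title": "Name Authority Services: A Survey", "host": "institutional", "downloads": 15},
]

for group in aggregate_downloads(records):
    hosts = ", ".join(copy["host"] for copy in group["copies"])
    print(group["title"], "|", group["downloads"], "downloads |", hosts)
```

Even this toy version shows where the open questions sit: the threshold decides how many false matches are tolerated, and the summed figure is only as trustworthy as the individual servers' counting methods are comparable.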
Bringing repositories to the attention of university senior managers
There are two new JISC briefing papers on repositories. One is concerned with the benefits of managing and sharing learning objects, the other with managing and sharing research outputs.
JISC and UUK are sending these papers to senior managers in universities next week. The papers should arrive on desks on Monday 16th of June. With any luck, the briefing papers will pique some interest in repositories or at least make sure the concept is familiar to senior managers.
This may represent an opportunity for capitalising on this familiarity or interest with further advocacy directed at senior managers about repository services, policies or projects.
The recipients are likely to be:
- Vice Chancellors,
- DVC Academic,
- DVC Research,
- University Secretary,
- Deans of Schools
Plus some of:
- Records Manager,
- Dean of the Graduate Research School,
- Director of ICT Systems,
- Director of Library & Information Services,
- Director Academic Enterprise,
- Principal Lecturer Pathfinder E-Learning (central post)
The briefing papers can be found on the JISC website:
Learning objects: http://www.jisc.ac.uk/publications/publications/elearningrepositoriesbpv1.aspx
Research: http://www.jisc.ac.uk/publications/publications/researchrepositoriesbpv1.aspx
ORE@JISC
With the release of the beta OAI-ORE specification this week, I thought it was worth highlighting some of the JISC work in the UK that is contributing to this initiative. Two short projects are looking to experiment with ORE and feed back into its development. The FORESITE project at Liverpool, run by Rob Sanderson, has produced ORE resource map descriptions of the JSTOR collection (1.8 million full text articles), and will also ORE-enable the DSpace repository platform, depositing the JSTOR-ORE collection into DSpace using the SWORD protocol. The Theorem project, based at Cambridge and run by Jim Downing, is looking at etheses, both representing ‘ideal’ born-digital theses as ORE resource maps, and looking at workflows around these. This project is working closely with the Integrated Content Environment (ICE) developed by Peter Sefton at the University of Southern Queensland, Australia, to create an authoring and management environment that produces and handles chemistry theses as born-digital objects, with live links to data, and so on. This work complements an international project led in the UK by Chris Awre, and involving partners from the UK, Netherlands, Germany and Denmark, which is looking to get some international agreement on a complex object format for theses, drawing from the ORE specifications, but building on specifications currently used, such as x-metadiss in Germany. Given the relative simplicity of doctoral theses – they have limited versioning issues for example – and the pressing need in many countries to automate the thesis workflow, it may be that theses become an early ORE adopter.
Using Repositories for Learning and Teaching: Can we find a recipe for success?
I attended a JISC Repositories and Preservation Programme meeting, but for a change I was able to sit back and learn rather than run around stressed as the entire event was designed and organised by DRaW, one of the projects in the start up and enhancement strand of the programme.
This was the first of 6 programme meetings that will be delivered by the projects rather than programme managers and, in my opinion, it was a roaring success.
You can read summaries of the day produced by Nick Sheppard of the Leeds Met Repository project and Julian Beckton of the Lirolem project. The day started with 6 quickfire introductions to the Lirolem, Circle, DRaW, YSJ Digirep and Faroes projects and an overview of the issues with learning and teaching repositories from Andrew Rothery (of the University of Worcester) and Phil Barker (of JISC CETIS). We also had two impromptu introductions to the POCKET and Edspace projects. It was interesting to note that all of these projects were adopting a different approach to the implementation of a repository:
- Julian Beckton presented the Lirolem project. They have developed a collaborative working space for architects that has a deposit facility straight into the repository, and the repository itself has an interesting interface for displaying compound objects.
- Steve Burholt presented the Circle project. It has made the repository itself largely invisible and has focused on building interfaces for specific purposes, such as basic search and a really nice search and reuse interface to the repository from inside their VLE.
- Sarah Hayes talked about the DRaW project. They are focusing on consulting with staff to produce a service that they would find useful as they have had a learning and teaching repository for 3 years but have not experienced significant usage. This compares badly to the research repository that has been in place for less than a year and is getting much heavier usage.
- Helen Westmancoat from the YSJ Digirep project has focused on a number of specific collections in their institution. This has had the benefit of promoting widespread interest amongst the other parts of the institution.
- Dave Millard spoke about the Faroes project. They have adopted a very lightweight approach, focused on hosting an academic's learning assets and providing web 2.0 features on the platform. They have chosen a minimal approach to metadata and shifted the focus of the repository from metadata to the object.
- Sarah Malone got up to talk about POCKET and how it is turning existing learning materials into Open Content using the Open University’s OpenLearn platform.
- Debra Morris introduced Edspace, which is a large institutional exemplar project that is aiming to develop a sustainable solution that is firmly embedded in institutional culture and infrastructure.
The afternoon session focused on using the experiences of the delegates to try and prepare a list of recommendations for people implementing a learning and teaching repository. The outcomes of this discussion will be turned into a document that can be shared. This will complement the structured guidelines for starting a learning and teaching repository produced by the CD-LOR project.
The slides from the event, the audio recordings and the recommendations will all be available from the DRaW website in due course.
I learned an awful lot at the event and it was really gratifying that the tone of discussions became more optimistic as the day wore on. From my conversations with delegates the day was really useful to them and I am looking forward to the remaining events in this series (see more details in this earlier post). I think that this type of event will complement the more traditional JISC programme meeting in a way that is beneficial to JISC programme objectives and to the projects.
Research data curation
Last year, following the Digital Curation Conference in Washington DC, JISC and the Andrew W. Mellon Foundation hosted an international workshop to discuss and suggest where the international priorities are for research and development work supporting academic research data curation. It’s taken a while for the notes to become available, for which I apologise, but here they are:
Priorities for research data curation workshop 2007
(I realise this is a PDF file, which won’t please everyone, but converting from MS Word shrank the file size by over an order of magnitude)
The starting point for the workshop was a recognition that, while research data orients largely by (sub)discipline, the way in which infrastructure is developed and funded is often oriented nationally, or even around institutions. Some way is needed to square these two. I have to confess that, on the day, I wasn’t sure we’d made a lot of progress, but in drafting the notes I changed my mind somewhat. Certainly, Peter Murray-Rust seemed to identify the academic department infrastructure as a key point where intervention could serve both that department and the wider goal of data curation and sharing. The photos of flip chart diagrams are perhaps not easy to read or understand, but suggest a distinctive place for libraries and repositories.
Greg Crane’s Perseus project anticipated some of the topics that were covered later - notably how to design an infrastructure that is sustainable and yet adaptive - and there are a few ideas on this in the notes. There are also a few ideas about how the problem space might be broken down so that an international approach can be taken, though this remains difficult. With luck and effort, JISC’s and other UK ‘data’ work will join up with that in the US (e.g. the NSF Datanet programme), Australia (Australian National Data Service), etc, and these notes will help us do that.
Many thanks to the workshop participants, listed at the end of the notes.
ReStore workshop
I attended a very interesting workshop for the ReStore project last week. The project is run by Southampton’s ESRC National Centre for Research Methods and is investigating the use of a repository to host and maintain orphan web resources.
The problem that the project is addressing is that research projects produce very useful web resources, but when the project funding stops, the maintenance of the resources often stops too. This means that the resources start to decay, broken links flourish and the usefulness of the resource deteriorates quickly.
ReStore aims to address this problem by accepting suitable resources after a review process and then hosting and curating the sites with a mixture of automated and manual processes.
The project is funded by ESRC and aims to produce a prototype repository that curates a few web resources that have been produced by other ESRC projects.
The workshop was chiefly concerned with introducing the project and discussing some of the major issues such as technical challenges, IPR and sustainability. The presentations from the day can be downloaded from the project website: http://www.ncrm.ac.uk/restore/slides/. These include some mockups of the proposed system and an overview of the proposed review and curation process.
The project’s work on development of a long-term strategy for ESRC in sustaining on-line resources will be very relevant to JISC.
The technical challenges in hosting a range of resources that may all use different software and hardware are significant and it may be better in the short term to use Amazon Web Services or a similar service to host the sites and avoid a large hardware bill.
The costs of preserving research data
There’s a new report on the JISC website, authored by Neil Beagrie, Julia Chruszcz and Brian Lavoie. It looks at how much it costs to preserve research data and, perhaps as importantly, how institutions and others could calculate this. There are lots of reasons why this report is likely to have an impact - looking after research data is potentially costly, and yet it is important that - as a community - we make reasonable decisions about what should be preserved and how. Perhaps unsurprisingly (at least for those who already do this for a living), it seems the cost of ingesting the data forms the largest cost in the curation lifecycle, but at the same time the evidence shows that correcting badly ingested data later is even more costly, so the figures probably suggest that there is a positive cost/benefit calculation here. There is potential for developing the methodology here into a tool, and there could also be potential for some join-up with the Data Audit Framework.
Repositories and Preservation Programme Synthesis
We are proposing to undertake a synthesis of the repositories and preservation programme which will support action. This means that the outputs need to be targeted at decision makers with additional information for those that will have to implement the decisions.
We have taken as a starting point the idea that decision makers are most likely to take note of what we are saying if repositories or preservation address problems that they are already worried about, and that many of these will stem from government, funding council or similar policies which they have to implement.
We have identified policies, decision makers who are concerned with them and ways in which we think that repositories or preservation can help.
We are aware that there will be other policies out there that we should be considering, that there may be other ways in which repositories or preservation could help and there may be other people we need to address.
We would very much welcome comments and thoughts on our thinking so that we can take it forward and start the synthesis.
Please comment either by posting comments or by email to Tom Franklin who is leading on this (tom@franklin-consulting.co.uk).
Research
The Research Excellence Framework is of concern to many at the moment including senior managers, research managers, researchers and librarians. We believe that it is likely that institutional repositories will make collection of the relevant information easier and cheaper and will support whatever metrics are likely to be selected. It is also possible that open access repositories will lead to research being found more easily and therefore cited more widely. This also supports increasing research recognition.
Funding mandates from funding bodies such as research councils and Wellcome can be addressed through the use of required repositories (such as UK PubMed Central), but also through the use of suitable institutional repositories that support things like embargo periods.
Community and business engagement requires that information is made accessible to those who might make effective use of it. Institutional repositories may assist here.
Teaching and learning
Cost reduction may be achieved through better sharing of learning materials, including learning objects. This will be of interest to both managers and teachers, who then need to implement and make use of repositories, but contributors will also have to think about using appropriate standards. Integration with the VLE would also enable the most current version of materials to be easily accessible.
Quality assurance of courses, especially franchised courses (for instance between a university and FE colleges), is of concern to senior managers and teachers and could be supported by making learning resources available across the group through use of repositories.
Many institutions and their managers are concerned with retaining control over the IPR of their learning materials. Institutional repositories for learning objects offer one way of controlling access effectively.
Information services and libraries
All managers and staff are concerned with meeting their legal and contractual requirements, including self-deposit / open access and being able to enforce embargoes. Institutional repositories can help with these issues.
Help wanted
Are these the most important drivers?
Are there other drivers that we should consider?
Have we correctly identified the key audiences who can help to identify these things?
Posted by: Tom Franklin
Repositories Support Project (RSP) Workshop
I attended a very useful workshop last week which was run by the Repositories Support Project (RSP). About 50 people were there representing around 30 organisations and there were presentations on the following initiatives:
JULIET
RoMEO
OpenDOAR
ROAR
The Depot
JORUM
EThOS
OAIster and BASE
Intute Repository Search
There was also time for some discussion and this highlighted a few issues that might be worth flagging up.
Bill Hubbard (RSP) commented on the lack of success in the U.S. of the Open Access Mandate at the National Institutes of Health (see Open Access news article for background: http://www.earlham.edu/~peters/fos/2007/12/oa-mandate-at-nih-now-law.html). There appears to be only a 5% compliance rate at the moment so that obviously hasn’t worked! Bill made the point that this clearly reinforces the notion that the most important factor in improving repository deposit rates is not telling people ‘they must’, but ensuring that deposit is an integral part of the scholarly workflow.
(Obviously it’s not all about quantity. The material in these repositories has to be high quality, and JISC is commissioning some work that will investigate techniques to help determine the quality of that deposited material.)
Another point Bill made … It’s worth remembering that the amount of research that should be going into repositories is very substantial. 6 out of 7 UK Research Councils have an archiving policy, and 36 out of 38 Russell Group/1994 universities (which account for more than 80% of HE sector research done in the UK) have repositories.
It was good to see some of the repository stats reporting tools that are available in ROAR (Registry of Open Access Repositories - http://roar.eprints.org/)
One of the repository manager participants at the event said that she had recently had a conversation with an academic who was much more impressed with the information about repositories that he could see in OpenDOAR and ROAR than he was with the idea that his own institution had a fully operational and well stocked DSpace repository. We talked about the quality of advocacy materials that were available for repository managers to ‘sell’ their systems and wondered if more could be done.
Some other issues/comments from participants …
* JULIET & RoMEO were very useful resources. More should be done to develop APIs for both of these so that information could be embedded into institutional repository (IR) interfaces.
* The diversity of information and resources for HE IR managers was confusing. There should be a ‘one stop shop’.
* SWORD looks really interesting. Multiple deposit could improve the versioning problem where 4 different authors of a single paper are all putting separate (and potentially different) copies into their IRs.
* Is Intute more important for librarians than academics?
* The focus on colour-coded Open Access types (Green/White/Gold etc.) is confusing and unhelpful.
* Perhaps when talking about copyright issues, there should be more information about what IS possible rather than what isn’t. Copyright is not an issue that a lot of people want to engage with and some clear enabling advice would be good.
* IR managers on the whole had not started to grapple with preservation issues in a methodical way.
These are just some of the notes I jotted down and the RSP will be reporting on the workshop in detail. But a very useful session - highly recommended for anyone in the repository field - particularly those who are fairly new to the area.
Forthcoming events - http://www.rsp.ac.uk/events/