The JISC Preservation of Web Resources Workshop (PoWR)

The first JISC-PoWR workshop took place on Friday (27th June 2008) at Senate House Library, University of London, and was attended by over 30 people from a wide range of professional groupings, including the Web management and records management communities. The workshop was entitled ‘Preservation of Web Resources: Making a Start’ and looked at how delegates could begin to include Web resources in their preservation strategy. There was much interest in the case study presented by the University of Bath, which illustrated the differing perspectives held by the Web and records management communities. Bringing together these communities is something the project is seeking to address. The main presentations are now available for download:
http://jiscpowr.jiscinvolve.org/2008/06/30/workshop-1-resources-available/

Posted by: Neil Grindley

Using Repositories for Learning and Teaching: Can we find a recipe for success?

I attended a JISC Repositories and Preservation Programme meeting, but for a change I was able to sit back and learn rather than run around stressed, as the entire event was designed and organised by DRaW, one of the projects in the start up and enhancement strand of the programme.

This was the first of 6 programme meetings that will be delivered by the projects rather than programme managers and, in my opinion, it was a roaring success.

You can read summaries of the day produced by Nick Sheppard of the Leeds Met Repository project and Julian Beckton of the Lirolem project. The day started with 6 quickfire introductions to the Lirolem, Circle, DRaW, YSJ Digirep and Faroes projects and an overview of the issues with learning and teaching repositories from Andrew Rothery (of the University of Worcester) and Phil Barker (of JISC CETIS). We also had two impromptu introductions to the POCKET and Edspace projects. It was interesting to note that each of these projects was adopting a different approach to the implementation of a repository.

The afternoon session focused on using the experiences of the delegates to try and prepare a list of recommendations for people implementing a learning and teaching repository. The outcomes of this discussion will be turned into a document that can be shared. This will complement the structured guidelines for starting a learning and teaching repository produced by the CD-LOR project.

The slides from the event, the audio recordings and the recommendations will all be available from the DRaW website in due course.

I learned an awful lot at the event and it was really gratifying that the tone of discussions became more optimistic as the day wore on. Judging from my conversations with delegates, the day was really useful to them, and I am looking forward to the remaining events in this series (see more details in this earlier post). I think that this type of event will complement the more traditional JISC programme meeting in a way that is beneficial to JISC programme objectives and to the projects.

Research data curation

Last year, following the Digital Curation Conference in Washington DC, JISC and the Andrew W. Mellon Foundation hosted an international workshop to discuss and suggest where the international priorities lie for research and development work supporting academic research data curation. It’s taken a while for the notes to become available, for which I apologise, but here they are:
Priorities for research data curation workshop 2007

(I realise this is a PDF file, which won’t please everyone, but converting from MS Word shrank the file size by over an order of magnitude.)

The starting point for the workshop was a recognition that, while research data orients largely by (sub)discipline, the way in which infrastructure is developed and funded is often oriented nationally, or even around institutions. Some way is needed to square these two. I have to confess that, on the day, I wasn’t sure we’d made a lot of progress, but in drafting the notes I changed my mind somewhat. Certainly, Peter Murray-Rust seemed to identify the academic department infrastructure as a key point where intervention could serve both that department and the wider goal of data curation and sharing. The photos of flip chart diagrams are perhaps not easy to read or understand, but suggest a distinctive place for libraries and repositories.

Greg Crane’s Perseus project anticipated some of the topics that were covered later - notably how to design an infrastructure that is sustainable and yet adaptive - and there are a few ideas on this in the notes. There are also a few ideas about how the problem space might be broken down so that an international approach can be taken, though this remains difficult. With luck and effort, JISC’s and other UK ‘data’ work will join up with that in the US (eg the NSF Datanet programme), Australia (Australian National Data Service), etc, and these notes will help us do that.

Many thanks to the workshop participants, listed at the end of the notes.

ReStore workshop

I attended a very interesting workshop for the ReStore project last week. The project is run by Southampton’s ESRC National Centre for Research Methods and is investigating the use of a repository to host and maintain orphan web resources.

The problem that the project is addressing is that research projects produce very useful web resources, but when the project funding stops the maintenance of those resources often stops too. This means that the resources start to decay, broken links flourish and the usefulness of the resource deteriorates quickly.

ReStore aims to address this problem by accepting suitable resources after a review process and then hosting and curating the sites with a mixture of automated and manual processes.

The project is funded by ESRC and aims to produce a prototype repository that curates a few web resources that have been produced by other ESRC projects.

The workshop was chiefly concerned with introducing the project and discussing some of the major issues such as technical challenges, IPR and sustainability. The presentations from the day can be downloaded from the project website: http://www.ncrm.ac.uk/restore/slides/. These include some mockups of the proposed system and an overview of the proposed review and curation process.

The project’s work on development of a long-term strategy for ESRC in sustaining on-line resources will be very relevant to JISC.

The technical challenges in hosting a range of resources that may all use different software and hardware are significant and it may be better in the short term to use Amazon Web Services or a similar service to host the sites and avoid a large hardware bill.

Repositories Support Project (RSP) Workshop

I attended a very useful workshop last week which was run by the Repositories Support Project (RSP). About 50 people were there representing around 30 organisations and there were presentations on the following initiatives:
JULIET
RoMEO
OpenDOAR
ROAR
The Depot
JORUM
EThOS
OAIster and BASE
Intute Repository Search

There was also time for some discussion and this highlighted a few issues that might be worth flagging up.

Bill Hubbard (RSP) commented on the lack of success in the U.S. of the Open Access Mandate at the National Institutes of Health (see Open Access news article for background: http://www.earlham.edu/~peters/fos/2007/12/oa-mandate-at-nih-now-law.html). There appears to be only a 5% compliance rate at the moment so that obviously hasn’t worked! Bill made the point that this clearly reinforces the notion that the most important factor in improving repository deposit rates is not telling people ‘they must’, but ensuring that deposit is an integral part of the scholarly workflow.

(Obviously it’s not all about quantity; the material in these repositories has to be high quality, and JISC is commissioning some work that will investigate techniques to help determine the quality of that deposited material.)

Another point Bill made … It’s worth remembering that the amount of research that should be going into repositories is very substantial. 6 out of 7 UK Research Councils have an archiving policy, and 36 out of 38 Russell Group/1994 universities (which account for more than 80% of HE sector research done in the UK) have repositories.

It was good to see some of the repository stats reporting tools that are available in ROAR (Registry of Open Access Repositories - http://roar.eprints.org/)

One of the repository manager participants at the event said that she recently had a conversation with an academic who was much more impressed with the information about repositories that he could see in OpenDOAR and ROAR than he was with the idea that his own institution had a fully operational and well stocked DSpace repository. We talked about the quality of advocacy materials that were available for repository managers to ‘sell’ their systems and wondered if more could be done.

Some other issues/comments from participants …

* JULIET & RoMEO were very useful resources. More should be done to develop APIs for both of these so that information could be embedded into institutional repository (IR) interfaces.

* The diversity of information and resources for HE IR managers was confusing. There should be a ‘one stop shop’.

* SWORD looks really interesting. Multiple deposit could improve the versioning problem where 4 different authors of a single paper are all putting separate (and potentially different) copies into their IRs.

* Is Intute more important for librarians than academics?

* The focus on colour-coded Open Access types (Green/White/Gold etc.) is confusing and unhelpful.

* Perhaps when talking about copyright issues, there should be more information about what IS possible rather than what isn’t. Copyright is not an issue that a lot of people want to engage with and some clear enabling advice would be good.

* IR managers on the whole had not started to grapple with preservation issues in a methodical way.

These are just some of the notes I jotted down and the RSP will be reporting on the workshop in detail. But a very useful session - highly recommended for anyone in the repository field - particularly those who are fairly new to the area.
Forthcoming events - http://www.rsp.ac.uk/events/

Case studies galore

As part of the Repositories Support Project’s session for Repository managers at the splendid Open Repositories 08, the conference organisers collected a load of case histories from repository managers in the US and Europe.

The case histories have been made available on the Repositories Support Project’s website. They cover a variety of different repositories in a variety of different settings. Some of them are short and some are long but they are all an interesting read.

As far as I can see these are useful in a number of ways:

Posted by: Andy McGregor

Is this an effective development community?

The information environment, and repositories in particular, were highlighted by Sir Ron Cooke (JISC chair), in his opening keynote at the JISC conference. (See the online conference proceedings.)

He described the vision of a national e-infrastructure supporting the “body of knowledge” at the centre. He told delegates that “[his] nightmare is the challenge of the super-abundance of digital data” and stressed the importance of positioning our repositories very carefully in this landscape of abundant information. From a seemingly different perspective, the closing keynote by Angela Beesley described the work of the Wikimedia Foundation, which includes Wikipedia but also other interesting projects I had not heard of before. Their vision is of open access, of making as much knowledge as possible available to the world. Their solution is less about infrastructure and more about mass, scalable workflows. Her answer to “can you trust user-generated content?” was a refreshingly firm “no. but you can trust the process”.

So how do we develop a layer of scholarly information (for research, learning and teaching) where individuals can find, use and share trusted information, supported by an agile infrastructure provided by institutions, publicly funded shared services, commercial services and Wikipedia? It’s a heady mix. I took heed of Ron’s warning that “it’s often easier to have the vision than to have the stamina to battle against institutional inertia or even resistance”.

I think that’s the key challenge for us now, in the world of digital libraries and e-infrastructure. How do we ensure that we’re building firm foundations instead of castles in the sky? How do we avoid going down routes that are technically interesting but offer no tangible benefits to staff and students in institutions?

An important part of the answer is in how we, as a development community, work together to make sure we’re doing the right sorts of things in the right way in the right order. This was the focus of the Rapid Community Building session I went to in the afternoon. The Users and Innovation Development Model marries up the requirements analysis process with the development process to encourage constant sense-checking and quality assurance. We need this on a grand scale if we’re to continue developing in the right direction. The Emerge project is about sharing ideas to support this virtuous cycle and the overall impression I had was of creative chaos! Not everyone wants to work in the Web 2.0 way. But perhaps if every cluster of developers has an enthusiastic communicator then the community will get more of the benefits sooner.

I’ll finish with a quote and a question.

Quote, with thanks to George Roberts in the community building session:
“Much of what works is already there” Cooperrider and Srivastva (1987)

Question … Is it true? How do we review what works? How do we address the gaps? The IE team really wants to hear from projects how we can improve the development cycle, from identifying useful projects through to embedding outputs. What sorts of things can we all do to make this process work better?

Open Repositories 2008, and Web Science

Having missed most of the presentations at the Open Repositories conference in Southampton this week, my reflections on the event have been prompted more by the Southampton-MIT collaboration described by Wendy Hall and Nigel Shadbolt before the conference dinner, the Web Science research centre. Initiatives such as this (see also, for example, the Oxford Internet Institute), with their focus on an interdisciplinary understanding of the web as a ‘first class’ research object, are particularly timely. I was struck by a potential parallel: an Australian colleague had earlier told me how researchers and technologists there are working to create an interdisciplinary research data resource on breast cancer; on the one hand there are similarities (the alliance of researchers and technologists brought to bear on a research topic), but there are differences. It could be argued (though not all would agree) that the web is a product of social interaction in a way that breast cancer is not. That is, technology (including the infrastructure underpinning science) is ‘social relations made concrete’. Incidentally, those social relations include interventions by those studying or evaluating science practice, so that ‘web science’ is a reflexive undertaking in a way that the study of breast cancer is not (again, not all would agree, see the ‘Science Wars’ entry on Wikipedia).

Some examples from the conference: Johan Bollen presented the outstanding and topical LANL MESUR work on metrics. Fifteen years ago, Steve Woolgar* alerted us to the social, as well as the academic, reasons for the persistence of citation metrics as a tool in research evaluation. This mash-up of social and academic relations is likely now to be embedded in a technical infrastructure for research, so that it is important that the social aspects are well understood before that infrastructure is ‘fixed’. For example, should we be concerned about the potential of this infrastructure for surveillance?

One of the most successful parts of the conference was the ‘Repository Challenge’ (and I don’t just say that because JISC sponsored it!). Some 19 teams of developers competed in building potentially useful tools from existing services and components. It’s perhaps telling that many of those shortlisted focused on ingest, getting material more easily into repositories. In particular, the aim seemed to be to make ingest invisible. What does this say about the relations between the repository community and scientists?

Finally, I was struck by the number of times within a single evening that the conversation turned to the key role played by seemingly rather prosaic aspects of university organisation. Two examples: (i) Talking with a developer who wanted to use Amazon’s S3 storage services, the almost insurmountable obstacle was the difficulty in getting access to an institutional credit card (the only means of payment). (ii) Talking with a scientist-informatician, it became clear that the move in the UK to a single pay spine for all those working in a university does not mean that the boundaries around traditional academic disciplines are any less rigid – he wants to employ people with both science and informatics skills, but they have no comfortable home within the set of university roles as currently defined.

The range of these concerns, from research evaluation policy to whether or not a developer can use the departmental credit card, shows that ‘web science’ (as a practice and a topic for research) operates on a broad front, not all of which is especially elevated. If we’re to appreciate the ways these interconnect, then the need for some insight from disciplines such as anthropology seems obvious.

*Woolgar, S. (1991) Beyond the citation debate: towards a sociology of measurement technologies and their use in science policy. Science and Public Policy, 18(5), 319-326.

Posted by: Neil Jacobs

Host your own programme meeting

JISC regularly holds meetings for the people involved in the projects funded under a particular programme. These programme meetings are popular. Well, some parts are: the networking parts are popular, while the parts where we discuss JISC objectives and reporting are less so.

Because the Repositories and Preservation Programme is very large (circa 80 projects) and addresses a variety of themes, it is very difficult for JISC to design programme meetings that meet the networking and sharing needs of all the project staff while still achieving those pesky JISC objectives.

To help with this difficulty we decided to offer extra funding to project staff to enable them to host their own programme meetings. These meetings would be free from JISC interference (unless desired) and would be based around themes chosen by the project staff. The only restriction we placed on the meetings was that they had to include case studies and networking.

We got a healthy response to this idea and as a result the following meetings will be funded:

20th May 2008 - Differences between research repositories and repositories for learning and teaching purposes - DRaW project (University of Worcester).

June 2008 - From VLE to Repository: How do we do it? - CURVE project (Coventry University).

June 2008 - Digital Curation and Preservation Projects Forum – Placing Ourselves in the Bigger Picture - The preservation strand of the Repositories and Preservation Programme.

September 2008 - The Impact of Organisational Culture on Repository Growth and Development - Embed project (Cranfield University).

November 2008 - Advocacy issues in populating institutional repositories - BURP project (University of Bradford).

November or December 2008 - Demonstrating and exploiting repository value - NECTAR (University of Northampton) and WRAP (University of Warwick) projects.

All dates may be subject to change.

The meetings are primarily for staff working on projects funded under the Repositories and Preservation Programme; however, some people from outside may be invited and any spare capacity will be opened up to the wider community.

Further details will be blogged in due course.

Posted by: Andy McGregor