Category Archives: data

The value and impact of the British Atmospheric Data Centre (BADC)

Jisc[i], in partnership with NERC[ii], has commissioned work to examine the value and impact of the British Atmospheric Data Centre (BADC). Charles Beagrie Ltd, the Centre for Strategic Economic Studies at Victoria University, and the British Atmospheric Data Centre are pleased to announce key findings from the forthcoming study. The study will be available for download on 30th September at:

Key findings:

The study shows the benefits of integrating qualitative approaches exploring user perceptions and non-economic dimensions of value with quantitative economic approaches to measuring the value and impacts of research data services.

The measurable economic benefits of BADC substantially exceed its operational costs. Users reported a very significant increase in research efficiency as a result of using BADC data and services, estimated to be worth at least £10 million per annum.

The value of the increase in return on investment in data resulting from the additional use facilitated by the BADC was estimated to be between £11 million and £34 million over thirty years (net present value) from one year’s investment – effectively, a 4-fold to 12-fold return on investment in the BADC service.
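The net-present-value framing above can be illustrated with a short calculation. This is a minimal sketch: the annual benefit and discount rate below are hypothetical figures chosen purely for illustration, and do not reproduce the study's actual model, cash flows or parameters.

```python
# Illustrative net-present-value (NPV) calculation.
# All figures are hypothetical, not taken from the BADC study.

def npv(cash_flows, discount_rate):
    """Discount a series of annual benefits back to present value.

    cash_flows[t-1] is the benefit received at the end of year t.
    """
    return sum(cf / (1 + discount_rate) ** t
               for t, cf in enumerate(cash_flows, start=1))

# e.g. a constant hypothetical benefit of £1.5m per year over 30 years,
# discounted at an assumed 3.5% per annum
annual_benefit = 1.5e6
present_value = npv([annual_benefit] * 30, discount_rate=0.035)
print(f"NPV over 30 years: £{present_value:,.0f}")
```

Dividing such a present value by the one-off cost of a year's investment gives the kind of benefit-to-cost multiple (here, 4-fold to 12-fold) that the study reports.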

The qualitative analysis also shows strong support for the BADC, with many users and depositors aware of the value of the services for them personally and for the wider user community.

For example, the user survey showed that 81% of the academic users who responded reported that BADC was very or extremely important for their academic research, and 53% of respondents reported that it would have a major or severe impact on their work if they could not access BADC data and services.

Surveyed depositors cited long-term preservation of the data, and its dissemination being targeted to the academic community, as the most beneficial aspects of depositing data with the BADC; both were rated as a high or very high benefit by around 76% of respondents.

The study engaged the expertise of Neil Beagrie of Charles Beagrie Ltd and Professor John Houghton of Victoria University, to examine indicators of the value of digital collections and services provided by the BADC.

The findings of this study are relevant to the community attending the conferences below, hence the announcement.

13th EMS Annual Meeting & 11th European Conference on Applications of Meteorology (ECAM) | 09 – 13 September 2013 | Reading, United Kingdom

2013 European Space Agency Living Planet Symposium

The British Atmospheric Data Centre (BADC)
The BADC, based at the STFC Rutherford Appleton Laboratory in the UK, is the Natural Environment Research Council’s (NERC) Designated Data Centre for the Atmospheric Sciences. Its role is to assist UK atmospheric researchers to locate, access, and interpret atmospheric data and to ensure the long-term integrity of atmospheric data produced by NERC projects. There is also considerable interest from the international research community in BADC data holdings.


Performance and Measurement in Libraries

In his article in the New York Times, Robert Crease wrote:

We look away from what we are measuring, and why we are measuring, and fixate on the measuring itself.

For libraries, so used to collecting, managing and analysing various sets of data and metrics, this is a critical point.

It is also a sentiment that kicked off the 10th Northumbria conference on Performance Measurement in Libraries held in York earlier this week.

Elliot Shore from ARL (Association of Research Libraries) spoke about the need for libraries to take heed of this advice: to focus on the ‘fit’ of what we’re measuring.

This fit, as Shore calls it, has been evolving over the past 10 years as the role and presence of the library has changed. The digital environment, changing technologies and the expectations of users mean that what was once important to measure and capture may no longer have the same urgency.

This focus on what should be measured – and how it impacts on the role and shape of the library – was developed in a great talk by Margie Jantti of the University of Wollongong in Australia.

Margie talked about the constant flow of information and data that her staff (relationship managers) get from researchers and academic staff, which is used to tailor services and focus resources on priority services. This has seen the library develop expertise in publication support for researchers.

The large knowledgebase of data the library collects on its users enables it to punch far above its weight, helping develop a fast, agile and world-class library team.

Finally, one thing that emerged from a majority of the presentations during the conference was the increasing recognition that data and metrics from inside or about the library are no longer enough. The field from which data and metrics are harvested is growing, reaching ever further beyond the library: into the teaching and learning space, through to research, registry and student services, and beyond.

The idea that library performance measurement requires only data from the library – or from its immediate vicinity – is no longer tenable.

So, it was against this background that the Library Analytics and Metrics Project (LAMP) presented at the conference.

We gave some of the background to the project (where it has come from and the work that has led us to this point) and provided an overview of the work so far, including how you can get involved and follow the progress of the project.

For me, what’s really interesting is that LAMP has the potential to bring in data from across the institution (and beyond) to help inform decision making and how and where resources are allocated. It also takes away the burden of collecting the data and provides the space for libraries to act on the data, and to think strategically about what they want and should be measuring and analysing.

The conference was also useful in bringing to my attention LibQual, and the potential for LAMP to work with that data too (although this may be something for further down the development pipeline).

You can find a link to our presentation here. At the end are some ways that you and your library can get involved – so do feel free to get in touch.


Observing the Web

Like a lot of people, when I think about it, or when I’m reminded about it, I understand that the Web is a place where someone is always watching what you do. I understand that … but then I think, well … the Web is such a huge beast; such a vast ocean; such a giant metropolis where the comings and goings of individuals are insignificant. How and why would anyone notice what I’m looking at and which links I’m clicking on?

Then up pops Tom Barnett from Switch Concepts Ltd. at a meeting yesterday to tell us that ‘Google has a file the size of an encyclopedia on everyone in this room.’

Hmmm … that’s not a particularly comfortable idea for someone to put in your head. I start to feel a vague sense of paranoia creeping through my mind.

And then I think, c’mon Neil, pull yourself together! Google really doesn’t care who you are. They just want to put things in your line of sight that are more rather than less likely to get you to open your wallet and part with your wages!!

Such were the thoughts that were buzzing around my head yesterday at an event organised by the Web Science Trust.

The meeting was entitled ‘Observing the Web’ and the purpose was to highlight some of the work that the Web Science Trust and their partners and collaborators are doing to build a global network of Web Observatories providing an open analytics environment to drive new forms of Web research. We went round the room doing introductions and Dame Wendy Hall ended up branding us a ‘motley crew’. Academics, industry players, not-for-profits, technologists, funders, charities, a lawyer. (Quite a respectable looking motley crew in the very smart surroundings of the Royal Society I might add). But ‘motley crew’ felt about right for a topic and a collaborative, academic, open activity that is still exploring the territory and testing new ground. Presumably in contrast to the well-resourced, sophisticated and highly developed (but opaque) methods employed by the corporate observers of the Web (Facebook, Amazon, Google, Microsoft, Yahoo etc.).

The point of all of this ‘observing’ is not to try and take account of every little bit of data and content on the web, but rather to understand what the aggregated use of the Web can tell us; how trends and fashions and changes of behaviour in relation to the Web might illuminate aspects of our society and culture, both now and for future students and researchers.

This was all of great interest to Jisc. We are currently working with the British Library, the Oxford Internet Institute and the Institute of Historical Research on an initiative that aligns very well with the notion of the Web Observatory.

The Big Data project and the AADDA project are both using a copy of the Internet Archive’s collection of UK domain websites collected over the period 1996-2010, to examine new ways to engage with the web at domain level, and to develop new forms of research that leverage the scale of the web. As the name of the Oxford project says … it’s all about using ‘Big Data’.

This was work that emerged from an influential JISC-funded report commissioned in 2010 – Researcher Engagement with Web Archives.

As we heard at the meeting, the academic observatory is a very different proposition to the corporate observatory and comes with enormous challenges, including: interoperability (how do we link observatories?); access (aside from Twitter, which of the big corporates will let us use their data?); privacy (will people feel spied upon?); and sustainability (what is the business model?).

A fascinating meeting and a big topic. There will be more discussion in early May at the ACM Web Science Meeting in Paris.



Show us something cool

Recently the library, museum and archive world has taken to experimenting with open data with a vengeance. It seems an interesting new dataset is released under an open licence most weeks.

There are many motivations behind these data releases but one of the major ones is the hope that someone else will think of something cool to do with the data (to mangle a Rufus Pollock quote).

Well, all you someone elses are in luck. The JISC Discovery programme and the DevCSI project are running a competition to see what clever people can do with this open data.

The rules of the competition are laid out in detail on the Discovery site but in essence all that’s needed to enter the competition is to develop something using one of 10 recommended datasets. You can use other datasets too but you have to do it in conjunction with one or more of the 10 datasets listed on the Discovery site.

I’m probably revealing my nerdy librarian hand here but the 10 datasets are really rich and exciting:

  • Library data from the British Library, Cambridge and Lincoln;
  • Archives data from the National Archives and the Archives Hub;
  • Museum data from the Tyne and Wear Museums collections;
  • English Heritage places data;
  • Circulation data from a few UK university libraries;
  • The musicnet codex;
  • And search data from the OpenURL router service.

Details on all of these are listed on the Discovery site.

There are 13 prizes to be won so there is every incentive to enter even if you are somehow able to resist the siren call of all that exciting data!

The competition is open now and closes on the 1st of August.

Exploiting institutional activity data

JISC is currently funding a range of projects that investigate how data stored in institutional systems can be mined to gain insights into the way that university services are operating and use those insights to improve the services. These projects are spread across two programmes, the activity data programme and the business intelligence programme. There are a few other projects working in similar areas spread across other programmes.

Last week we took the opportunity to bring most of these projects together to discuss their various approaches and to think about what else JISC can do to help universities make the most of their activity data.

We started the event with lightning talks from each project attending. I suspect faithfully listing all of those projects here would make for a gruelling reading experience, so instead I’ll group them by the broad motivations the projects are pursuing. Some projects fall under more than one category. I have linked to the presentations the projects gave; where I do not have the slides, I have linked to the project website.

Some projects are reusing data about user behaviour to provide new or enhanced user experiences

Some are mining data to gain insights into behaviour of people or systems in the institution to allow better resource allocation and intervention at crucial periods
LIDP, Supporting institutional decision making, Bringing corporate data to life, Lumis, IN-GRiD, Retain, Student Engagement Traffic Lighting

Others are visualising data to explore its meaning
LIDP, AGtivity, Bringing corporate data to life

A few are thinking about how various silos of data can be brought together to allow them to be mined for insight
UCIAD, Supporting institutional decision making, Bringing corporate data to life

A couple are looking at national services to help institutions explore data more easily or to enable reuse of existing data in novel ways
Using openURL router data, JUSP

The projects cover a vast range of areas from libraries to student management to environmental monitoring. Despite this breadth there are some common issues. These are the issues that jumped out at me on the day:

  • Not all institutions have people with the technical and statistical skills required to manipulate and analyse data.
  • A lot of these datasets are large, which raises issues of how to store and manipulate the data and how to decide what to retain.
  • Institutions might need to take an institution-wide strategic approach to deciding what data should be collected, how it should be exploited and by whom. There also need to be long-term strategic approaches to data exploitation in departments like libraries.
  • Working across different silos of activity data is a problem we are only just beginning to face.
  • Language is an issue. Throughout this blog post I have used the term activity data, but this is a generic term; we need to be clearer about what type of data we are talking about.

To help ensure that others can benefit from the lessons the projects learn, there are synthesis projects for the activity data programme and the business intelligence programme. The purpose of the synthesis projects is to gather information on key issues and turn it into advice and guidance that anyone in the HE sector can use to inform the way they do things at their institution. Infonet are producing a business intelligence infokit for their synthesis project. The University of Manchester and Sero Consulting are producing the activity data synthesis. You can read the activity data synthesis blog, which talks about progress so far, and can also see a mindmap that describes the areas their final website will cover.

We ended the day with a discussion of the possible ways that JISC could look to address some of these issues. We produced a long list of very good ideas. We also had a go at prioritising which were the most pressing or valuable. Our top 6 ideas were:

  1. Developing guidance for institutions on taking a strategic approach to exploiting activity data
  2. Addressing the need for new skills for exploiting activity data both from a technical perspective and from a statistical skills perspective
  3. Establishing clear definitions for terms used – this could include a simple glossary and use of examples to illustrate the terms
  4. Developing a culture of exploiting data in institutions
  5. Exploring what’s involved in ensuring data is easy to reuse
  6. Studying behaviour and how it relates to usage patterns

JISC won’t be able to address all of these issues straight away, so my colleague Myles Danson and I will have to decide which to focus on. Comments and advice would be very welcome indeed!

After the meeting, Mathieu from the UCIAD project wrote a very interesting blog post about the need to take a user-centred approach to activity data, so that’s something we’ll need to consider too.

Data Management Policy – An Interview with Paul Taylor

Dr. Paul Taylor works at the University of Melbourne and has just finished a two-week secondment in the UK with the JISC-funded EIDCSR (Embedding Institutional Data Curation Services in Research) project based in Oxford. This is an approximate transcript of a quick 5-minute interview between Paul and Neil Grindley (JISC Information Environment Programme Manager).

Hi Paul, thanks for sparing the time out of a very busy schedule … what role do you have in the EIDCSR project?

Thanks Neil … I’m here to help them come up with a draft policy for the management of research data and records. It’s something we’ve had in place at the University of Melbourne since 2005 and we’ve just completed a revision of the policy to hopefully help make it a little more useful for researchers.

Tell us a little bit more about how that policy has been developed at the University of Melbourne and the reactions to it from researchers and data managers.

As I said, we’ve had policy in place since 2005, and early this year we were asked to work out how compliant we were with it, on the basis that if you have a policy and no-one pays any attention to it, it’s probably not much use keeping it there! Not surprisingly, we found out that most people weren’t compliant and also didn’t really know that the policy was there. We’re hoping that was the reason they weren’t compliant, rather than any sort of animosity against policies in general – but that’s still to be determined.

We reviewed the policy for two reasons. Firstly, to try and make it of more use to researchers (there are limits to that, because when you are writing a policy to go across the institution it has to contain really high-level principles about the management of research data; if you get too specific you rule large populations out and then people pay even less attention to it than they did before). Secondly, it’s to get some attention and a bit of refocus on the data management area. There are a lot of things happening at the university at the moment in terms of the services the university intends to provide for its researchers, and some other changes in the Australian environment. We’re hoping to lock the high-level principles away in policy documentation and focus on keeping the guidance, information and support materials up to date and relevant for researchers.

The sustainability of keeping that guidance and information for researchers up to date is a real issue. Capturing their feedback and working it back into future iterations of those materials (and ultimately the policy documentation) is a desirable outcome but also a big challenge isn’t it?

Yes, it is.

How do you think that the policy that you’ve developed in Melbourne transposes to the University of Oxford?

That’s a good question … one of the things that we’ve learnt from the 2005 version of the policy is that it’s not enough to have the central policy on its own. There needs to be some kind of localisation of the policies, and so with this new version of our policy we’ll be asking faculties to come up with their own enhancements so that it makes more sense to their researchers, and then probably get departments to do the same thing. I’d imagine the same sort of system could work at Oxford, but it would be a little more complex given the number of people that would need to be involved in coming up with these localised versions of the policy. The hope is that there will be a trickle-down effect from the high-level policies, which have a practical influence on the way that researchers go about managing data.

In the meetings that I’ve had since I’ve been here, there have been some excellent examples of data managers and data management researchers (I guess you’d call them) working closely (one-on-one) with researchers, who have come up with some excellent and novel solutions. I think the more that can happen – a sort of resourcing at the coal face – the more likelihood there is of high-level principles trickling down to meet some of the very local one-on-one researcher-based developments. At that stage, perhaps there would be a general improvement in the management of research data across the institution.

One of the things I’ve heard a lot from people is the need for it to be a federated system. A lot of the departmental research groups have come up with their own systems for managing their own research data. Anything new that is provided centrally from the university has to try and complement those processes rather than take them over. That wouldn’t work well here (in Oxford) and it wouldn’t work in Melbourne. It would tend to antagonise people rather than improve the situation.

Yes … that principle of embedding existing processes and workflows into broader policy initiatives is an important concept for institutions grappling with these kinds of issues at the moment. Thanks very much Paul.


University of Melbourne – Policy on the Management of Research Data and Records (2005)

Review of Policy on the Management of Research Data and Records (2009)

EIDCSR Project (Embedding Institutional Data Curation Services in Research)