Research Data Sharing without barriers…get involved?

Research Data Alliance (RDA) – second plenary meeting – Washington DC – 16th – 18th September.

Many readers of this blog will already know about the Research Data Alliance, but I guess there will also be a lot of people who don’t. I am using this post as an introduction to the RDA, having been in Washington DC this week to attend the organisation’s second plenary meeting.

With all of the interest in, and some urgency around, research data publishing, management and re-use at government, university and disciplinary level, and of course with an eye on research being global, there is a need to join the data up with shared practices, standards, policies and infrastructure. That’s where the RDA comes in.

The RDA has been formed building on initiatives such as DataONE in the US, the Australian National Data Service, and the initiatives across Europe (such as the Jisc research data activity) that take place in many member states and have collectively informed the EC’s direction on research data infrastructure as part of the forthcoming Horizon 2020. It has been formed to address the ‘joining-up’ challenges and to build a global community that can contribute to shared practice and, ultimately, a more sustainable way to build the infrastructure and the intersections required to support data-driven research and innovation.

The founding members from the funding agencies are the US National Science Foundation (working also with Chris Greer from NIST), the European Commission and the Australian National Data Service (ANDS). Over the past year these partners have carefully consulted and built a community that is global and encourages bottom-up sharing and agreement. I have been to some prior gatherings, had discussions with Ross Wilkinson from ANDS, Carlos Morais-Pires from the EC, Juan Bicarregui from STFC, and others, and witnessed their planning and progress. In Europe, engagement is overseen by RDA Europe; Norman Wiseman from Jisc is on the Strategic Forum that oversees this on behalf of the Knowledge Exchange (KE), which does a lot of work on research data. It’s a big ask: forming a structure that can collaboratively take on progressing the research data challenge. And I have to say the meeting this week in Washington demonstrated pretty impressive progress.

So, in short, over the past year a set of working groups and interest groups has been formed to work collectively on key issues, and Washington was really the first time they met face to face to develop their work. There was a first plenary meeting from the 18th to the 20th of March 2013 in Gothenburg, Sweden, where the initiative was formally launched and groups started to form their case statements, but in Washington these groups were able to show early outcomes and to form firmer priorities and plans.

So what are they (we) working on? It’s a long list (see https://rd-alliance.org/working-and-interest-groups.html for the current list). Some of the areas that the groups are tackling: metadata and a metadata standards directory; legal interoperability; data citation; a community capability model; persistent identifiers; practical policy; data foundation and terminology; big data and analytics; and more, including interest groups that cover some disciplinary areas such as agriculture, and history and ethnography.

This Alliance is still forming, but from what I experienced in Washington it certainly has a lot of potential and should be an essential vehicle for research data interoperability. In Washington this week, following the group discussions, there was a plenary update from all of the groups highlighting their priorities (given in the grand setting of the US National Academy of Sciences), and Mark Parsons, RDA/US Managing Director, facilitated a discussion on the scope and ways of working. It was a really useful discussion, and one where I think there was consensus that RDA isn’t a standards body but more of a clearing house for best practice, standards and approaches. So if you’re interested, why not join up? I think it is an important initiative that will help to address the organisational, social and technical infrastructure required for real research data sharing. Jisc and the Digital Curation Centre (DCC) are engaged in the initiative and will continue to be so. We will tie in UK activities as best we can so that we can learn from others and also contribute the lessons and emerging practice from the UK, so we get to that utopia… a global research data infrastructure (note: there are many UK participants already).

We will continue to give updates on progress to try and keep people in the loop. But if it is your bag – go ahead and join in the discussions. Currently there are 800 members from over 50 countries, and I can say from having been there this week it’s an impressive crowd…

Yes it is early days – but it’s important and thus far very positive. Looking forward to seeing more progress – I think there will be!

How you can help us to make research administration more efficient

Jisc and CASRAI are piloting the development of a ‘UK chapter’ of the CASRAI dictionary to improve research interoperability.
Get in touch by emailing info@casrai.org if you’d like to keep up-to-date with progress and to contribute your views. More information is below.

The problem we are addressing
Research teams and administrators must retype the same information repeatedly when applying for grants and reporting to funders. Research policy-makers, managers and evaluators are consistently frustrated by an inability to draw meaningful conclusions from a growing mass of disconnected data. The problem and a way to reduce this administrative duplication for the research community are nicely illustrated in the CASRAI video:

The Solution
The solution suggested by CASRAI is to compile a common, international dictionary. The dictionary contains definitions of key terms or information elements that relate to the management of, for example, research grants, CVs or data management plans. It also documents the controlled vocabularies, authoritative lists and identifiers that are relevant for these terms. The dictionary thereby provides the basis for data profiles to ease the exchange of information within and between organisations. As a single, open and unambiguous reference source for data profiles, the CASRAI dictionary can be used by multiple technology suppliers, including those using CERIF or VIVO, thereby forming a basis for interoperability and allowing information to be exchanged smoothly.

The ‘CASRAI approach’ to developing this dictionary and building agreement around key terms is generating more and more interest in the UK research community and we are excited about trialling it here in the UK.

CASRAI-UK
CASRAI and Jisc are piloting three National Working Groups (NWGs) around a number of priority areas identified at the CASRAI-UK summit organised by Jisc and CASRAI last December. These pilot projects are exploring both the methods and the particular content that is the focus of this work.

The people on the working groups (i.e. funders, research managers, standards experts) will identify and document agreements on vocabularies. While these agreements will build on and have defined relationships with an international core, they will reflect UK requirements. A CASRAI analyst will help to develop ‘data profiles’, each defined as a harmonized standard that specifies a subset of information required by the users of an inter-organisational work process.
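
To make the idea of a data profile more concrete, here is a minimal sketch in Python. It assumes that a dictionary is a mapping from term identifiers to agreed definitions and that a profile is a named subset of those terms. The term names, the `DataProfile` class and the `missing_terms` helper are all hypothetical illustrations, not CASRAI's actual schema or tooling.

```python
from dataclasses import dataclass

# Hypothetical shared dictionary: term identifier -> agreed definition.
# These terms are illustrative only, not taken from the CASRAI dictionary.
DICTIONARY = {
    "project_title": "Title of the research project",
    "principal_investigator": "Person leading the project",
    "funder_name": "Organisation funding the project",
    "data_retention_period": "How long research data will be kept",
}


@dataclass(frozen=True)
class DataProfile:
    """A named subset of dictionary terms needed for one work process."""

    name: str
    required_terms: tuple

    def __post_init__(self):
        # A profile may only refer to terms the shared dictionary defines.
        unknown = [t for t in self.required_terms if t not in DICTIONARY]
        if unknown:
            raise ValueError(f"terms not in dictionary: {unknown}")

    def missing_terms(self, record: dict) -> list:
        """Return the required terms that a record does not supply."""
        return [t for t in self.required_terms if not record.get(t)]


# Hypothetical profile for exchanging a data management plan.
dmp_profile = DataProfile(
    name="data_management_plan",
    required_terms=(
        "project_title",
        "principal_investigator",
        "data_retention_period",
    ),
)

record = {
    "project_title": "Crystal structures",
    "principal_investigator": "A. Smith",
}
print(dmp_profile.missing_terms(record))
```

Because every profile draws on the same dictionary, two organisations that exchange a record against the same profile can agree on what each field means without retyping or remapping it.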

This approach can prevent us from reinventing wheels and offer a sustainable home for these agreements, including, for example, the outputs of the Jisc UK Research Information Shared Service project.

Three pilot National Working Groups will focus on:
1. Data Management Plans

2. Organisational Lists

3. Research Reporting

While other areas identified at the CASRAI-UK Summit – Ethics Review and Research Equipment profiles – are also important, we think that these need more discussion before we can convene and set the scope for working groups. For now, these discussions will be continued in a special online forum for each of these topics.

How it works and how you can get involved
Due to the pilot status of the working groups, the CASRAI governance arrangements and membership model will not be applied to its full extent this year. An objective of the pilot is to develop – together with the people participating in the working groups – a mechanism that works appropriately for the UK.

At the same time as we are starting the working groups we are also convening the CASRAI-UK National Review Circle. This group includes a wider group of people that are interested in the progress of CASRAI-UK and is open for anyone interested to join.

The National Working Groups (NWGs) will, in the course of their work, produce drafts, announcements and other outputs. These will be posted to a dedicated forum for the National Review Circle. This wider group can keep up-to-date on progress but also contribute advice and feedback to the NWGs as they evolve new national standards for the UK research community.
The purpose of the National Review Circle is to ensure that the resulting standards are applicable and multi-disciplinary and that valid divergent views are communicated.

Please email info@casrai.org if you are interested in joining the National Review Circle or the discussion forum on Ethics Review or Research Equipment profiles.

Bridging the Divide: The role of libraries in the sciences

http://www.flickr.com/photos/mikolski/3269902495/

While libraries come to terms with new forms of scholarly communication and the technological transformation of the academy, has one academic domain already drifted beyond reach?

Have the sciences already become self-sufficient in their information needs? Are libraries lacking in the services and information resources that scientists require?

In the first of three reports on Research Support Services for Scholars: Chemistry Project, a study being undertaken by Ithaka S+R on UK institutions, it is clear that within chemistry, and arguably the sciences more generally, a growing distance is developing between the everyday work of chemists and the library. As the report makes clear:

“This gap in mutual understanding prevents partnerships from developing between chemists and the library”

While the Chemistry Project is a researcher-centric approach to understanding the scholarly and information needs and requirements of Chemists, this first report update has taken the library and liaison services as a starting point.

The report is based on conversations with research support professionals (mostly liaison librarians) and has some very interesting headlines:

A few things strike me about the findings that have emerged so far from the library discussions:

How do you bridge the divide between the sciences and the services of the library? One potential answer might be that libraries shouldn’t – the relationship that currently exists works for chemists, and libraries need not expend resources on developing unnecessary and unused services.

Are graduate students the answer? There also seems to me to be an implication that something like a ‘hybrid’ researcher/librarian will develop. Is a convergence of subject knowledge and domain expertise going to be the future of library liaison?

Related to the above point is the idea of library services being embedded into the department. In the case of the group-model for chemistry departments and research this could be fruitful.

These interim findings should provide a nice complement (contrast) to the subsequent researcher based conversations and interviews, and it will be interesting to see if there are obvious opportunities for libraries and their engagement with the sciences.

Find out more about this project on the JISC webpages, and find out more about the role of libraries in the digital humanities in this recent post.

Beyond Grid vs Cloud – EGI Community Forum 2012

‘The grid? Shouldn’t they all be doing cloud computing now?’ As a JISC programme manager working with the National Grid Service (NGS) project, I am asked this question more and more often. ‘Absolutely’ and ‘not at all’ is the seemingly contradictory answer I usually give, for instance the other week when I mentioned I would attend the European Grid Infrastructure Community Forum 2012 in Munich.

conference coffee break revelations: EGI delivers -- cloud, grid and cake!

I give this answer because the question originates from a double misunderstanding. The first is about the nature of cloud computing that, despite some marketing claims, is not the answer to everything and in some ways more a new business model than a new technology as such. The cloud is neither the solution for all computing needs, nor is it always the cheapest option – as a recent study commissioned by EPSRC and JISC (PDF) shows. The second misunderstanding relates to branding and the history of research computing. When projects like the National Grid Service were envisaged, grid computing was the dominant paradigm and their names reflect that. These days however, they are looking at a broad range of technologies and business models for providing compute resources, and mostly refer to themselves by their acronyms: NGS and EGI in this case. So at least for the initiated it was no surprise that the EGI conference was as much about the cloud as it was about the grid.

The conference, hosted by the LRZ supercomputing centre and the Technical University of Munich, was a five-day event to bring together members of the research computing community from across and beyond Europe. With several parallel strands and many sessions to attend, I won’t summarise the whole conference but instead pick out a few themes and projects I personally found interesting.

First of all I noticed there was a lot of interest in better understanding the use of e-infrastructures by researchers and, related to that, the impact generated by this. In some ways this is a straightforward task insofar as numbers that are easy to capture and understand can be collected. The EGI for instance now has over 20,000 registered users. You can count the number of cores that can be accessed, monitor the number of compute jobs users run and measure the utilisation of the infrastructure. However, this becomes more difficult when you think of a truly distributed, international infrastructure such as the EGI. Will national funders accept that, while the resources they fund may be used by many researchers, much of that usage originates from abroad? If we want to support excellent research with the best tools available we have to make it as easy as possible for researchers to get access to resources no matter which country they are physically based in. Thinking in terms of large, distributed groups of researchers using resources from all over the world, often concurrently, the task of understanding what impact the research infrastructure has and where that impact occurs (leading to who may lay ‘claim’ to it in terms of justifying the funding) can make your mind boggle. We need business models for funding these infrastructures that don’t erect new national barriers and address these problems from the angle of how to best support researchers.

Business models, not surprisingly, were another theme I was very interested in. Complex enough already, the issue is made even more difficult by commercial vendors now offering cloud nodes that for certain, smaller scale scenarios can compete with high performance computing. How do you fairly compare different infrastructures with different strengths and very different payment models? Will we see a broadly accepted funding model where researchers become customers who buy compute from their own institution or whoever offers the best value? Will we see truly regional or national research clouds compete against the likes of Amazon? What the conference has shown is that there are emerging partnerships between large academic institutions and vendors that explore new ways for joint infrastructure development. One example is a new project called ‘Helix Nebula – the Science Cloud’, a partnership that involves CERN, the European Space Agency and companies like T-Systems and Orange. Such partnerships may have a lot of potential, but finding legal structures that allow projects based in academia to work in a more commercial environment is not always easy. A presentation from the German National Grid Initiative explored some of these problems and also the question of developing sustainable funding models.

In order to develop good models for funding e-infrastructure we also need to understand the costs better. As far as institutional costs are concerned these are mostly hidden from the researchers, whereas the costs of commercial providers are very visible, but not always easy to understand in terms of what exactly it is you get for a certain price per hour. As our cloud cost study shows this is an area where more work needs to be done, and so I was happy to find a European project that aims to address this. e-FISCAL works on an analysis of the costs and cost structures of HTC and HPC infrastructures and a comparison with similar commercial offerings. It already lists a useful range of relevant studies and I hope we will see more solid data emerge over time.

In the commercial/public context I found it interesting to see that some academic cloud projects aim to take on commercial players. BeeHub, for instance, was presented as an academic DropBox. Now, to be fair to the project it aims to be an academic service for file sharing in groups and to address some of the concerns one might have regarding DropBox, but I wonder how successful they will be against such a popular offering.

I was also very interested to learn more about initiatives that address the critical question of training. Usually these are researcher training or more technically focussed programmes, but the EU-funded RAMIRI project offers training and knowledge exchange for people (hoping to be) involved in planning and managing research infrastructures. Because of the increasing complexity of this task in terms of legal, cultural, technical and other issues, better support for those running what are often multi-million projects is highly desirable.

As I cannot end this post without referencing the more technical aspects of research infrastructure let me point you to a project that shows that grid and cloud can indeed live together in harmony. StratusLab is developing cloud technologies with the aim of simplifying and optimizing the use and operation of distributed computing infrastructures and it offers a production level cloud distribution that promises to marry ease of use of the cloud with grid technologies for distributed computing.

To sum up, it is not a question of grid versus cloud. It is about selecting the technologies that are best suited to facilitate great research, and then dealing with the non-technical issues, from training to sustainability and cultural change, that will decide how well we will be able to make use of the potential the technology offers.

New Digital Infrastructure funding call available now

What better way to welcome the freshly rebranded Digital Infrastructure team blog than to announce a new funding call that spans nearly all the activities that the team is involved in.

The call is available now from the JISC site and the deadline for submissions is 12 noon on Monday 21st of November.

The call seeks projects in the following areas:

  • Resource Discovery – up to 10 projects to implement the resource discovery taskforce vision by funding higher education libraries, archives and museums to make open metadata about their collections available in a sustainable way. Funding up to £250,000 is available for this work.
  • Enhancing the Sustainability of Digital Collections – up to 10 projects to investigate and measure how effectively action can be taken to increase the prospects of sustainability for specified digital resources. Funding of up to £500,000 is available for this work.
  • Research Information Management – 3 projects to explore the feasibility and pilot delivery of a national shared service for the reporting of research information from Research Organisations to funders and other sector agencies, to increase the availability of validated evidence of research impact for research organisations, funders and policy bodies, and to formally evaluate JISC-funded activities in the Research Information Management programme and to gather robust evidence of any benefits accruing to the sector from these activities. Funding of up to £450,000 is available for this work.
  • Research Tools – 5 to 10 projects on exploiting technologies and infrastructure in the research process as well as innovating and extending the boundaries to determine the future demands of research on infrastructures. Funding of up to £350,000 is available for this work.
  • Applications of the Linking You Toolkit – Up to 10 projects investigating the implementation and improvement of the ‘Linking You Toolkit’ for the purpose of demonstrating the benefits that management of institutional URLs can bring to students, researchers, lecturers and other University staff. Funding of up to £140,000 is available for this work.
  • Access and Identity Management – 5 to 10 projects investigating the embedding of Access and Identity Management outputs and technological solutions within institutions. Funding of up to £200,000 is available for this work.

As always, JISC programme managers are keen to speak to prospective bidders. We’re always keen to talk ideas through and clarify the finer points of the call document. We have set aside 2 specific days for these conversations, the 26th and 27th of October so if you are considering a bid, please do get in touch to arrange a conversation. If those days aren’t good for you then my team mates and I will be happy to arrange alternative times.

This is always an exciting process for JISC staff as we get to hear lots of exciting ideas, so I’m really looking forward to seeing what you clever people come up with this time.

Understanding the Information Support Needs of Scholars

Today sees the start of a new programme of work that’s exploring Research Support Services for Scholars (RSS4S) being undertaken by Ithaka S+R.

As the blog for the projects makes clear:

This series of discipline-specific projects aims to provide critically needed research about the evolving behavior of scholars to the information support service providers who work with them.

JISC is funding one of the projects in this programme that’s examining Chemistry. This project is attempting to understand the impact that new and innovative practices in this domain are having on traditional service providers that historically supported their work.

The changing information environment has resulted in gaps developing between researcher needs and the services available to support scholars.

Chemistry is a particularly interesting subject to explore: it is one of the hard sciences whose information needs are often seen as difficult for traditional support services, such as the library, to meet. Furthermore, it is a subject area that JISC has a long engagement with, for example through the Chemistry Using Text Annotations and Chemical Laboratory Repository Notebooks projects.

In addition to the JISC-funded Chemistry project, Ithaka have been funded by the National Endowment for the Humanities (NEH) to explore the subject of History.

The intention is to have a number of other disciplines as part of the programme, so that an overview of researchers can be obtained and recommendations can be developed as to how service providers can best adapt to support the information needs of scholars.

What makes this programme of work even more important is that funding bodies in both the UK and US recognise the need for work to explore the evolving research practices and support needs of scholars. With a researcher-centric approach this work will ensure that the requirements of the user (in this case the researcher) are at the heart of support service developments.

Without this evidence the information support service providers who work with researchers, such as academic libraries, digital/data centers and scholarly societies will be unable to ensure that scholars in the UK and internationally are best situated to continue their world class research.

Chemists at Work, The University of Iowa Libraries

To find out more about this programme you can read the project page on the JISC website, or follow the Ithaka project blog.

Glimpse into the Future of Repositories: videos now available!

DevCSI Challenge @ Open Repositories 2011

http://devcsi.ukoln.ac.uk/blog/dev-challenge-or11/

As usual the standard of the entrants was very high and the solutions were diverse. There was also high energy and an infectious buzz in the room during the presentations! See videos at http://devcsi.ukoln.ac.uk/blog/2011/07/29/or11-developer-challenge-videos/

JISC Prize:

First:

“Repository as a Service (RaaS)” – Stuart Lewis, Kim Shepherd, Adam Field, Andrea Schweer, and Yin Yin Latt (University of Auckland, DSpace Committers, EPrints Services and the Library Consortium of New Zealand)

Repository as a Service (RaaS) is the idea that the repository is a commodity which provides a service. In order for current repositories to act like this they need standard interfaces to get data in and out. Once these standard interfaces are in place, the repository becomes a commodity which can be swapped in and out, and the ‘repository service’ can be provided by many repositories or one. The entry demonstrated an Android mobile app that used SWORD to deposit photos into both DSpace and EPrints. Then, using Solr indexes as a common interface for getting access to the items in the repository, a tool called Skylight was demonstrated that could display the repository collections. Identical experiences were provided by both EPrints and DSpace because of the common interfaces in and out. In addition, the repository as a commodity was shown to be useful for providing further services, with examples including translating the content of the repositories using the Microsoft Translation API, and extracting geo-location data from GPS-tagged photos. The idea for RaaS was conceived and worked up during the conference and it demonstrated strong collaboration and agile development.
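
The ‘swappable commodity’ idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the winning entry's code: the `Repository` interface, the `InMemoryRepository` stand-in and the `deposit_photo` helper are hypothetical names, and nothing here speaks real SWORD or queries a real Solr index.

```python
from typing import Protocol


class Repository(Protocol):
    """Hypothetical common interface: one way in, one way out."""

    def deposit(self, item_id: str, metadata: dict) -> None: ...

    def search(self, field: str, value: str) -> list: ...


class InMemoryRepository:
    """Stand-in backend; a real one would wrap DSpace or EPrints."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._items = {}

    def deposit(self, item_id: str, metadata: dict) -> None:
        # Store a copy so later changes by the caller do not leak in.
        self._items[item_id] = dict(metadata)

    def search(self, field: str, value: str) -> list:
        # Return the ids of items whose metadata field matches the value.
        return sorted(
            item_id
            for item_id, metadata in self._items.items()
            if metadata.get(field) == value
        )


def deposit_photo(repo: Repository, item_id: str, title: str) -> None:
    # Client code depends only on the interface, not on the backend.
    repo.deposit(item_id, {"type": "photo", "title": title})


backends = [InMemoryRepository("backend-a"), InMemoryRepository("backend-b")]
for repo in backends:
    deposit_photo(repo, "photo-1", "Conference venue")

# The same query gives the same answer whichever backend is plugged in.
results = [repo.search("type", "photo") for repo in backends]
print(results)
```

The point of the sketch is that `deposit_photo` never needs to know which repository it is talking to, which is what lets one backend be swapped for another.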

JISC Runners up:

“Distributed Research Object Creator” D-ROC Patrick McSweeney and Matt Taylor, University of Southampton

D-ROC is a data-driven interface that collates resources which already exist on the web to tell the story of a piece of research from the research object creator’s perspective. The author uses a tool to explain how resources from web sources like institutional repositories, SlideShare, data repositories, YouTube and other online sources are linked together to make up a full piece of research. Behind the scenes this makes an RDF linked data document which could be reused in a number of ways. For their competition entry Patrick and Matt chose to make a data-driven website which aggregates attention metadata (views, downloads, citation counts) from the various web sources, but they envision far wider-scoped applications for this kind of rich data. One of the key selling points is that a user can immediately see value from their time invested in using the tool. To be able to design a project website in half an hour illustrates the power of the tool. http://blogs.ecs.soton.ac.uk/oneshare/tag/erevnametrics/
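
As a rough illustration of what aggregating attention metadata across sources might involve, here is a small Python sketch. The resource list, the figures in it and the `aggregate_attention` function are invented for the example; D-ROC itself builds an RDF document, which this sketch does not attempt to reproduce.

```python
from collections import Counter

# Hypothetical attention metadata for the resources that make up one
# piece of research. Sources and figures are invented for illustration.
resources = [
    {"source": "institutional repository", "views": 120, "downloads": 45},
    {"source": "slide sharing site", "views": 300},
    {"source": "data repository", "downloads": 12, "citations": 3},
]


def aggregate_attention(resources):
    """Sum each attention metric across all linked resources."""
    totals = Counter()
    for resource in resources:
        for metric, count in resource.items():
            if metric != "source":  # "source" is a label, not a metric
                totals[metric] += count
    return dict(totals)


totals = aggregate_attention(resources)
print(totals)
```

A source that does not report a given metric simply contributes nothing to that total, so sources with different levels of detail can still be combined.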

Microsoft Prize:

“Dynamic Deep Zoom Images and Collections with Djatoka” – Rebecca Sutton Koeser, Emory University Libraries

This entry used the Microsoft Deep Zoom and Pivot applications on top of special image collections in their Fedora repository. This has wider application to other image-based repository collections and it was impressive to see what was achieved in the time constraints of the developer challenge.

Special mention goes to Sam Adams from Cambridge University for his use of the PIVOT tool over the chempound semantic data repository (JISC Clarion project) which allows rich domain access to physical science data.

Special mention goes to Dave Tarrant from Southampton University for using the XBOX Kinect technology to drag and drop items into EPrints. It was very ingenious and entertaining to watch.

Use of SWORD prize:

RaaS – same as above. The project produced a SWORD App for Android mobile devices to allow photos to be deposited from smartphones. The potential for this implementation as a mobile deposit device is fairly extensive, potentially allowing for geo location, orientation, audio, video and stills all to be recorded to an archival location in near real time, or enabling ‘citizen science’ via data collection from thousands of remote devices. http://www.appbrain.com/app/sword-share/org.skylightui.swordshare

Thank you to:

Marketing and other dirty words

I have been thinking a lot recently about how to move beyond the rhetoric of “open equals good” towards identifying where open approaches help us meet key business cases. A notable quote from the Power of Open book launch was that “open isn’t a business model, it’s a part of a business model”. I’m seeing this trend in open educational resources, open access repositories and open innovation. It’s how open source became more mainstream, and we need to be learning from that journey. If we want to see open approaches sustained, we need to get businesslike about how we make the case, however contradictory that might sound.

Earlier this month I spoke at a UKOLN event on metrics and the social web, and the discussion there reinforced the potential of using the web more effectively to underpin our key business goals in further and higher education.

On 26th July I am presenting at the Institutional Web Managers Workshop 2011 and I will be developing this theme further, paying particular attention to the way that web managers can support open access, open educational resources and open social scholarship.

In reflecting on how open access and OER can contribute to the core business cases of universities, I think that activities particularly worthy of more attention include:

  • Profiling academic expertise
  • Supporting REF impact metrics
  • Enhanced research publications
  • Cross-linking open content to open course data
  • Social media listening tools
  • Web analytics and visualisation

My presentation on slideshare: Marketing and other dirty words

Upcoming funding opportunities

My colleagues and I in the digital infrastructure team are currently knee deep in preparations for releasing a number of funding calls at the end of July.

The calls will cover 4 areas:

Outline details of funding amounts and descriptions of the calls can be found on the JISC roadmap of future grant funding calls.

We’re in the final throes of getting the calls ready for release. Questions are very welcome but for now some of the answers may have to be wait and see…

jiscUX: Usability and ‘Learnability’ projects funded

JISC has recently funded a number of projects as part of the grant funding call 01/11: Digital infrastructure: Embedding usability & improving the uptake of resources & tools.

Most projects have just started, and over the next few weeks they will begin publishing their project plans and initial blog posts.

The support project at Southampton has already published their first post, and it gives an excellent overview of the aims and objectives of that project and the resource it is building for the HE sector.

Below are some details about the projects; these will be complemented by a webpage on the JISC site shortly.

If you would like to find out more about each project, and the programme of work then there are a number of presentations from the recent programme meeting available here.

Strand A (Support Project):

Strand B projects (Usability):

Strand C (Learnability and adaptability)
