DevCSI Challenge @ Open Repositories 2011
As usual the standard of the entrants was very high and the solutions were diverse. There was also high energy and an infectious buzz in the room during the presentations! See videos at http://devcsi.ukoln.ac.uk/blog/2011/07/29/or11-developer-challenge-videos/
“Repository as a Service (RaaS)” – Stuart Lewis, Kim Shepherd, Adam Field, Andrea Schweer, and Yin Yin Latt (University of Auckland, DSpace Committers, EPrints Services and the Library Consortium of New Zealand)
Repository as a Service (RaaS) is the idea that the repository is a commodity which provides a service. In order for current repositories to act like this they need standard interfaces to get data in and out. Once these standard interfaces are in place, the repository becomes a commodity which can be swapped in and out, and the ‘repository service’ can be provided by many repositories or one. The entry demonstrated an Android mobile app that used SWORD to deposit photos into both DSpace and EPrints. Then, using Solr indexes as a common interface for getting access to the items in the repository, a tool called Skylight was demonstrated that could display the repository collections. Identical experiences were provided by both EPrints and DSpace because of the common interfaces in and out. In addition, the repository as a commodity was shown to be useful for providing further services; examples included translating the content of the repositories using the Microsoft Translation API, and extracting geo-location data from GPS-tagged photos. The idea for RaaS was conceived and worked up during the conference and it demonstrated strong collaboration and agile development.
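To give a flavour of the geo-location step mentioned above, here is a minimal sketch of the arithmetic involved. It assumes the GPS tags have already been read out of the photo as degrees, minutes and seconds plus a hemisphere letter, which is how EXIF records them. The function name and the sample coordinates are hypothetical, and this is not the code the RaaS team wrote.

```python
from fractions import Fraction


def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere reference
    ('N', 'S', 'E' or 'W') into signed decimal degrees."""
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere reference: {ref!r}")
    # Fractions keep the intermediate sum exact before the final float.
    value = Fraction(degrees) + Fraction(minutes) / 60 + Fraction(seconds) / 3600
    # South and west are negative by convention.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical coordinates, roughly central Auckland.
latitude = dms_to_decimal(36, 51, 0, "S")
longitude = dms_to_decimal(174, 45, 36, "E")
```

The resulting decimal pair is the form most mapping and search services expect, so it could be stored alongside the item as ordinary metadata.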
JISC Runners up:
“Distributed Research Object Creator” D-ROC Patrick McSweeney and Matt Taylor, University of Southampton
D-ROC is a data-driven interface that collates resources which already exist on the web to tell the story of a piece of research from the research object creator’s perspective. The author uses a tool to explain how resources from web sources such as institutional repositories, SlideShare, data repositories, YouTube and other online sources are linked together to make up a full piece of research. Behind the scenes this produces an RDF linked data document which could be reused in a number of ways. For their competition entry Patrick and Matt chose to make a data-driven website which aggregates attention metadata (views, downloads, citation counts) from the various web sources, but they envision far wider applications for this kind of rich data. One of the key selling points is that a user can immediately see value from their time invested in using the tool. To be able to design a project website in half an hour illustrates the power of the tool. http://blogs.ecs.soton.ac.uk/oneshare/tag/erevnametrics/
“Dynamic Deep Zoom Images and Collections with Djatoka” – Rebecca Sutton Koeser, Emory University Libraries
This entry used the Microsoft Deep Zoom and Pivot applications on top of special image collections in their Fedora repository. This has wider application to other image-based repository collections and it was impressive to see what was achieved in the time constraints of the developer challenge.
Special mention goes to Sam Adams from Cambridge University for his use of the PIVOT tool over the chempound semantic data repository (JISC Clarion project) which allows rich domain access to physical science data.
Special mention goes to Dave Tarrant from Southampton University for using the XBOX Kinect technology to drag and drop items into ePrints. It was very ingenious and entertaining to watch.
Use of SWORD prize:
RaaS – same as above. The project produced a SWORD App for Android mobile devices to allow photos to be deposited from smartphones. The potential for this implementation as a mobile deposit device is fairly extensive, potentially allowing for geo location, orientation, audio, video and stills to all be recorded to an archival location in near real time, or to enable ‘citizen science’ via data collection from thousands of remote devices. http://www.appbrain.com/app/sword-share/org.skylightui.swordshare
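For readers who have not seen a SWORD deposit on the wire, the sketch below builds (but does not send) the kind of HTTP POST a SWORD 1.x client makes. The header names follow the SWORD 1.3 profile as best I recall it, so check them against the specification before relying on them. The collection URL, filename and function name are made up for illustration, and the default packaging identifier is an assumption about what a DSpace endpoint would expect.

```python
import urllib.request


def build_sword_deposit(collection_url, package_bytes, filename,
                        packaging="http://purl.org/net/sword-types/METSDSpaceSIP"):
    """Return an unsent POST request carrying a zipped deposit package.

    The packaging default is an assumption; a real client should use a
    format advertised in the repository's SWORD service document.
    """
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": packaging,
    }
    return urllib.request.Request(
        collection_url, data=package_bytes, headers=headers, method="POST"
    )


# Hypothetical endpoint and payload; nothing is sent over the network here.
request = build_sword_deposit(
    "https://repository.example.org/sword/deposit/123456789/1",
    b"placeholder zip bytes",
    "photo-deposit.zip",
)
```

Sending the request (for example with `urllib.request.urlopen`, plus whatever authentication the repository requires) should return an Atom entry describing the newly created item.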
Thank you to:
- University of Texas at Austin for hosting OR11 and supporting DevCSI.
- Microsoft Research for supporting DevCSI
- Mahendra Mahey for organising the event
- Peter Sefton for supporting the event and chairing the presentations and keeping the judges in order
The information environment programme 2009-11 (mercifully shortened to inf11) is drawing to a close and we are starting to reflect on what it has achieved.
We chose to manage this programme as one very broad programme rather than a number of smaller programmes and it has included work on:
- Activity data
- Automatic metadata generation
- Infrastructure for resource discovery
- Repositories – enhancement, take up and embedding and improving deposit
- Linked data
- Scholarly communication
- Rapid Innovation
- Library management systems – includes work on a shared ERM system with SCONUL
- Research Information management
- Developer community
This represents a lot of work that has produced some exciting outputs and interesting results. To try and help people see what outputs and results are relevant to them, we have prepared a list of 27 questions that the programme has addressed or started to address. This was put together by Jo Alcock from Evidence Base who are evaluating the programme.
The programme won’t finish until July so we will continue to add to these questions. If you have any suggestions for things to be included, please let me know.
For our next programme of work we will have 4 separate programmes:
- Information and Library Infrastructure
- Research Management
- Digital Infrastructure Directions
We will be blogging more about these programmes soon.
NOTE TO READER: JISC IS CURRENTLY IN THE PROCESS OF DRAFTING A CALL FOR PROPOSALS TO FURTHER EMBED DEPOSIT TOOLS AND SOLUTIONS INTO THE AUTHOR’S DAY-TO-DAY WORKBENCH. PLEASE SUBSCRIBE TO ONE OF JISC’S MANY FUNDING ANNOUNCEMENT FEEDS FOR FURTHER INFORMATION ON THIS CALL.
Published by: David F. Flanders (JISC Programme Manager)
Just before I sat down to write this post, I quickly went back to have a look at the original SWORD (Deposit API) Project to look up when the first draft specification was published. To my amazement, version 1 was published *exactly* two years to the day before the “Deposit Tool Show & Tell” event, on 12 October 2007. And quite significantly (as you’ll see below), well over twenty different applications and deposit tools have been built atop the SWORD Deposit API since that first 1.0 publication. So, CONGRATULATIONS ON YOUR TWO YEAR ANNIVERSARY SWORD! A little tip of the hat to Rachael Heery, who brought a bunch of us hackers to sit around a table to talk about how deposit could be improved; your focus and drive in this space is missed.
The show (and tell) must, of course, go on. Accordingly, here is the agenda for the day along with the people who attended. The rest of the story is picked up by our blogger-on-the-day Bashera Kahn:
12 October 2009, London, UK. JISC held a one-day Barcamp at the University of London focusing on author deposit tools, ahead of the DSpace User Group Meeting at the University of Gothenburg in Sweden.
The Deposit Show & Tell event is one of the first steps in JISC’s plan to invest £300,000 in sustained improvements to author deposit tools. It followed the September 2009 JISC report into how and why UK researchers publish and disseminate their findings, which provides an excellent contextual backdrop to the challenges facing the architects and users of repositories and deposit tools.
‘DepoST’, as it was tagged, brought together developers and stakeholders from across the UK and Europe who have already broken ground on creating and refining author deposit tools and interfaces.
Several lightning-fast rounds of demonstrations proved that the development space in this area is thriving, with a strong focus on making the deposit process quicker and easier for users authoring research content, from academics to students, librarians to archivists and curators.
JISC’s David F. Flanders stressed in his welcoming address the importance of adding improved ‘feedback loops’ to the deposit process, to provide authors with more information during and after the process than just ‘Okay’.
Flanders mentioned a few patterns he’d observed in the showcased tools which adopted workflows and interactions that would be familiar to users from commonplace computing or online experiences, such as:
- Drag & Drop
- Upload and add, as popularised by the Flickr Uploadr and other such upload tools
- Machine-assisted, e.g. a deposit tool that crawls the user’s HD for files to deposit
- Network drive e.g. a tool that allows the user to ‘map’ the folder containing papers or accompanying media
- Contextual community dashboard which draws on the ancillary information around other researchers in a particular subject area to create a view of the research community around that subject area
- Tools embedded into existing applications, e.g. Microsoft’s Chem4Word project to support the authoring and rendering of semantically-rich chemistry information in Word 2007 documents.
The twenty-some short and fast (“lightning talk”) ‘show and tell’ presentations followed, each with five minutes to SHOW the app and a further five minutes of ‘question and TELL’:
Shown & Told:
- Tool: DepositAir IE Demonstrator
- Works with: SWORD, DSpace
- Platforms/Languages: Adobe AIR, SQLite, Ruby on Rails
- Description: DepositAir is an Adobe AIR application which borrows its look and feel from the Flickr Uploadr. The user drags and drops the files to deposit from the source folder to the application. DepositAir auto-populates metadata fields such as title, ISSN, publisher, author name, and then sends the files and metadata to dspace.swordapp.org.
- Tool: ePrints 3 Upload Handler plugin
- Works with: ePrints, SWORD, Microsoft Word, Microsoft PowerPoint
- Platforms/Languages: OpenXML
- Description: The development roadmap for ePrints 3.2 is focused on a more modular experience with better desktop and cloud integration. The plug-in works with Microsoft Word 2007 and PowerPoint to extract metadata and media during the deposit process. Although the current extraction process is inline, the plan is to make it an unobtrusive background operation.
(3) Pat McSweeney, ePrints project developer, University of Southampton
- Tool: PDFMetaExtractor
- Works with: ePrints
- Platforms/Languages: Java, OO-Perl
- Description: This tool searches the user’s computer for PDFs and then intelligently extracts metadata as well as keywords specified within the document. A known issue is that non-native PDF documents (e.g. those converted from Microsoft Word documents or scanned from paper) may return incomplete information.
(4) Peter Sefton, eScholarship Tech Team Manager, University of Southern Queensland, Australia
- Tool: ICE (Integrated Content Environment)
- Works with: Microsoft Word 2007, OpenOffice, Zotero, WordPress
- Platforms/Languages: Windows, Mac, Ubuntu
- Description: ICE lets you create web and print documents from a word processor. You can use Microsoft Word, or the free OpenOffice.org. Peter demonstrated the ICE toolbar in Word, uploading the document as styled HTML to an ICE server and then publishing to a WordPress blog. The tool is especially useful for thesis supervision, as it allows comments and annotations to be made without changing the content of the document.
(5) Richard Jones, Symplectic Limited
- Tool: Dashboard deposit in ‘Publications’ product
- Works with: DSpace, SHERPA/RoMEO, all major digital repository technologies
- Description: Symplectic’s tools link the Repository module of the Symplectic Publications Management System to digital repositories using all major digital repository technologies. Users can upload full text documents and supporting information directly from the Symplectic Publications interface. Copyright guidance is collected automatically from SHERPA/RoMEO and made available to users. A stand-out feature is that the author provides distribution rights information only if it’s available and/or necessary; the system doesn’t mandate that this information is present.
(6) Alex Strelnikov, UKOLN
- Tool: Email-based deposit plugin for SWORD
- Works with: SWORD
- Description: The premise of this deposit tool is to encourage take-up and use of ‘1-click’ deposit tools by embedding them in trusted and frequently used applications, like email, or Facebook. The user can deposit papers by attaching them to an email and sending to a pre-defined email address. The plugin checks for an attachment, and if found, sends it to an analysis server where metadata is automatically extracted. Future development roadmap includes support of email threads.
(7) Jan Reichelt, Mendeley
- Tool: Mendeley
- Works with: PubMed, CrossRef, Google Scholar, ACM, IEEE and others
- Platforms/Languages: Windows, Mac, Linux
- Description: Described as “Last.fm for research papers”, Mendeley is more a workflow productivity tool than a repository tool. It is a free research management tool for desktop & web which aggregates metadata from all papers added to the Mendeley research network via the Mendeley Desktop software. This indexes and organizes PDF documents and research papers, creating a personal digital bibliography for users. Mendeley has enjoyed take-up from users in highly respected universities around the world, including Stanford, MIT, Cambridge, Harvard, Aachen, Cornell and others. The company is attempting to redefine the space, time-frame and influences by which the ‘impact factor’ of scientific careers can be determined, by analysing discussions around research findings in social networks such as Twitter, Facebook and FriendFeed.
- Tool: The Open Access Repository Junction
- Works with: RoMEO, OpenDOAR, all major repositories
- Description: Known as OA-RJ, this project’s aim is to build on the existing EDINADepot to create a ‘middleware’ interoperability bridge between existing repositories which will act as a deposit broker system. The tool will help authors who are either not associated with an institution, or collaborative researchers from different institutions, to find the right repositories to deposit their work into. The system will automate RoMEO and OpenDOAR lookups, and provide an author disambiguation feature. Although still in development, the Nature Publishing Group is interested in using this tool.
(9) Joe Lambert, University of Southampton
- Tool: Drag&Drop Deposit Tool
- Works with: ePrints
- Platforms/Languages: Mac, Cocoa
- Description: This prototype updater is written with the collaborative author in mind. It tries to address the issue of metadata tools for time-starved academics submitting PDFs to ePrints. The development roadmap suggests an ideal user experience of being able to drag and drop multiple files into the application, which would return a report of all the metadata extracted for the user to check, approve, edit if necessary and then file to the IR.
(10) Viv Cothey, Gloucestershire Archives
- Tool: GAip desktop curation tool
- Works with: SWORD, DSpace
- Platforms/Languages: Perl
- Description: This tool stood out for being one of the few deposit tools to address archive and repository materials which aren’t academic research papers. The Gloucestershire Archives deals with physical materials as well as digital records, and faces the problem of taking “a 100-year view”. The intended user for GAip is an archivist – not the creator or the author. Viv raised the very pertinent issue of trusted storage. (Aside: anyone interested in the issues around long-term digital storage should read/listen to Clay Shirky’s Long Now lecture on digital durability.)
(11) Tim Brody, EPrints WebDav, University of Southampton
- Tool: Map a WebDav or FTP drive directly into ePrints 3.2
- Works with: ePrints
- Description: The ePrints team presented a video walkthrough of this tool, authored by Tim Brody. This solution seems targeted at a technical IR administrator or author, as the interface design is definitely geared to people very familiar with the command line, rather than your standard non-techie academic user. It provides a browsable and searchable folder structure with ‘dropbox’ like import functionality. At present it lacks any automatic metadata harvesting, and requires the user to complete the deposit via a standard ePrints web interface.
(12) Theo Andrew & Fred Howell, The Open Access Repository, EDINA
- Tool: EM-Loader (Extracting Metadata to Load for Open Access Deposit)
- Works with: SWORD, the Depot, PublicationsList.org
- Description: This project, still under development, is a proof-of-concept middleware that links the Depot and PublicationsList.org, a web site for researchers to build a web page listing their publications. EM-Loader’s goal is to make batch deposits easier, by handling multiple queries for metadata from web-based resources like PubMed and Web of Science, and from personal databases such as EndNote, Reference Manager, BibTeX etc. Fred’s annotated presentation, ‘From Swords to Ploughshares’, is available on his site.
- Tool: EasyDeposit configurable deposit client
- Works with: SWORD
- Platforms/Languages: PHP
- Description: The EasyDeposit client is a PHP-powered SWORD deposit client which can be configured to create a custom deposit interface for your repository. In this case, Stuart demonstrated how it can be configured to accept deposits via email using the standard PHP IMAP library to connect to your inbox. It extracts metadata from the sender of the email, the email subject, and the body of the message, which should contain the abstract. The script also adds each email attachment to the deposited item. When the deposit process is completed, the sender receives an email with a URL linking to that record in the repository. The script can also be configured for deposit via Facebook.
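The email-to-metadata mapping described for EasyDeposit can be sketched in a few lines. This is not EasyDeposit’s code (which is PHP and reads from an IMAP inbox); it is an illustration in Python using only the standard `email` package, starting from a raw message that is assumed to have been fetched already. The function name, the dictionary keys and the sample message are all hypothetical.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from email.utils import parseaddr


def deposit_from_email(raw_bytes):
    """Map a raw email onto simple deposit metadata:
    sender -> author, subject -> title, plain-text body -> abstract,
    attachments -> files."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    name, address = parseaddr(str(msg.get("From", "")))
    body = msg.get_body(preferencelist=("plain",))
    abstract = body.get_content().strip() if body is not None else ""
    files = [(part.get_filename(), part.get_content())
             for part in msg.iter_attachments()]
    return {
        "author": name or address,  # fall back to the bare address
        "email": address,
        "title": str(msg.get("Subject", "")),
        "abstract": abstract,
        "files": files,
    }


# Build a hypothetical message in memory instead of connecting to an inbox.
sample = EmailMessage()
sample["From"] = "Ada Researcher <ada.researcher@example.org>"
sample["Subject"] = "A hypothetical paper title"
sample.set_content("This is the abstract.")
sample.add_attachment(b"%PDF-1.4 placeholder", maintype="application",
                      subtype="pdf", filename="paper.pdf")

record = deposit_from_email(sample.as_bytes())
```

A real client would then package `record` for a SWORD deposit and reply to `record["email"]` with the URL of the new item, as the description above outlines.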
(14) Alex Wade, Director for Scholarly Communication, Microsoft External Research
- Tool: WordDeposit
- Works with: Microsoft Word 2007, ArXiv, SWORD
- Platforms/Languages: Windows
- Description: Microsoft’s External Research division is working with several leading academic organisations and researchers to produce workflow support tools. Alex discussed two exciting repository developments. The first is that arXiv now accepts submissions of Microsoft Office Word .docx files and other Office Open XML documents. The second is the company’s hosted self-publishing eJournal Service, currently in alpha, which helps conference chairs handle submissions of papers and subsequently allows them to easily select and share those papers (via SharePoint Server 2007) with one click.
(15) Seb Francois, University of Southampton
- Tool: sWordInbox
- Works with: SWORD, ePrints, WordPress
- Description: Seb demo’d an embeddable remote uploader tool for ePrints, which he developed for the University of Lincoln. It addresses a use case seen more widely as individual researchers maintain their own blogs: it integrates with WordPress and allows users to post their papers to their own blog once the deposit to ePrints is complete. There are still some bugs to work out, not least that embedding a login request into a web page has all the appearance of a phishing attack!
(16) Julian Tenney and Patrick Lockey, Xerte, University of Nottingham
- Tool: Xerte online authoring toolkit and Xpert deposit tool
- Works with: Any LMS or VLE
- Platforms/Languages: Web-based
- Description: This is another of the tools demo’d with a focus on something other than academic research papers. Xerte is an open source suite of tools to rapidly develop richly interactive learning content. Content created in Xerte can be deposited into Xpert, a searchable distributed repository compiled by harvesting content from the publishing institution via RSS feed. The aim is to make learning content available for re-use, re-purposing and adaptation.
(17) James Ballard & Richard Davis, University of London
- Tool: Copyright Licensing Applications using SWORD for Moodle
- Works with: SWORD, Moodle, ePrints, DSpace
- Platforms/Languages: PHP
- Description: Another tool in development with a focus on learning materials, CLASM assists students and academics who deposit through the familiar Moodle interface into a closed repository designed with a librarian’s workflow in mind. CLASM is designed to support better management of CLA licensed materials.
(18) Dan Needham, University of Manchester & Alan Danskin, British Library
- Tool: Names Project
- Works with:
- Description: The last of the tools to focus on something other than deposit workflows, the Names Project is developing a pilot name authority system to address the critical issue of author disambiguation. It uses data from Zetoc, British Library and contextual information from research documents to build a database of all UK research authors which will reliably and uniquely identify individuals and institutions. A public beta API is available for testing and no doubt all eyes will be on the British Library and Mimas to produce what most think will be an invaluable system.