Event Report: Gaining Business Intelligence from User Activity (July 2010)

On 14th July 2010 David Kay from Sero Consulting facilitated the above workshop, which built on the findings of the JISC MOSAIC project.  The event report and presentations from the workshop are now available.

The following are a few reflections from the workshop:

A small number of projects (e.g. MOSAIC, PIRUS2 and the Journals Usage Statistics Portal) are working in the area of activity data. However, the potential for the education sector to derive a wide range of business intelligence from these data is at the moment largely underutilised.  The commercial sector, with the likes of Amazon and Tesco, already exploits this type of data for competitive advantage.  The Economist article “Data, data everywhere” talked about “the trail of clicks that internet users leave behind from which value can be extracted that is becoming the mainstay of the internet economy.”  So how do we take advantage of this area in the education sector?

The presentation from Dave Pattern at Huddersfield University Library demonstrated that business intelligence derived from library circulation data has informed their collection development strategies and allowed them to provide added-value, recommender-type services on top of their library OPAC, which has seen library loans increase.  It was also made clear in the workshop that this is not just about library systems: it also covers administrative systems, virtual learning environments, student registries, research environments, etc.
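To make the recommender idea concrete, here is a minimal sketch of a "people who borrowed this also borrowed" calculation over circulation data. It is not Huddersfield's implementation; the loan records, borrower IDs, titles and the function names `build_co_loans` and `recommend` are all invented for illustration, and a real service would need to handle scale, popularity bias and privacy.

```python
from collections import Counter, defaultdict


def build_co_loans(loans):
    """Count how often each pair of items was borrowed by the same person.

    `loans` is an iterable of (borrower_id, item_id) pairs (hypothetical format).
    """
    items_by_borrower = defaultdict(set)
    for borrower, item in loans:
        items_by_borrower[borrower].add(item)

    co_loans = defaultdict(Counter)
    for items in items_by_borrower.values():
        for item in items:
            for other in items:
                if other != item:
                    co_loans[item][other] += 1
    return co_loans


def recommend(co_loans, item, top_n=3):
    """Return the items most often borrowed alongside `item`."""
    # Sort by descending count, then by title so ties are deterministic.
    ranked = sorted(co_loans.get(item, {}).items(), key=lambda p: (-p[1], p[0]))
    return [other for other, _ in ranked[:top_n]]


# Invented sample data: borrower IDs are already pseudonymous.
loans = [
    ("u1", "Intro to Statistics"), ("u1", "Research Methods"),
    ("u2", "Intro to Statistics"), ("u2", "Research Methods"),
    ("u2", "Academic Writing"),
    ("u3", "Intro to Statistics"), ("u3", "Academic Writing"),
    ("u4", "Research Methods"),
    ("u5", "Intro to Statistics"), ("u5", "Research Methods"),
]

co_loans = build_co_loans(loans)
print(recommend(co_loans, "Intro to Statistics"))
# ['Research Methods', 'Academic Writing']
```

The same co-occurrence counts could also feed stock management, for instance by flagging items that are never borrowed alongside anything else.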

The following were the main reasons why we should be investing in this area:

  1. Resource discovery: improving all the classic information retrieval genres (search, browse, etc.). This can lead to more exploitation of the wider collection and wider reading by students; it may be related to the student experience and hence perhaps to student satisfaction, retention and progression.
  2. Stock management (this relates to business efficiency for, e.g., the library: as above, better exploitation of stock and value for money).
  3. ‘Customer’ profiling, across the institution, to inform the work of support services, academics and strategic management.
  4. Deep log analysis and its various uses, e.g. to inform longer term investment in infrastructure at institutional and national level.
  5. Digital / information literacy – there was a lot of discussion about recommender systems and their effect/contribution to literacy.
  6. Management information: where to invest and where to cut.  Better running of the business.  What contribution can these data make to, e.g., lean processes, organisational development and business process re-engineering?

Discussion around the issues and what we do next:

  1. Technical and analytical skills: getting the data out of the systems and then making sense of them in a consistent way was an issue. Developing institutional capacity, skills, etc. was seen as important in enabling institutions to take advantage of these opportunities, both in existing services and in innovation that gives institutions their competitive advantage.
  2. Interoperability and standards – the PIRUS2 and MOSAIC projects have also focused on interoperability and standards for the repository and library landscape, but more advice and guidance is required in this area. Effective communication with system vendors and publishers was also seen as essential.
  3. Linked to 1 and 2 above, developing the appropriate algorithms and tools for the HE sector.
  4. Data protection issues – It was clear that as long as the data did not have any personally identifiable information attached (names, age, etc.) then it did not contravene the Data Protection Act, but there were concerns that there could be isolated cases where a user might be inadvertently identified through a “mashing up” of different data.  There are also issues of database rights: who owns this type of data? The individual, a collective, a trusted broker, a multinational, a public body?  Advice and guidance on good practice in anonymising data and complying with legal frameworks was seen as vital in this area.
  5. In addition, JISC work needs to uncover and present evidence on the opportunities and risks associated with public and/or commercial solutions.  What is the public interest?  What is the HEI interest, especially at a time when HEIs need to marshal evidence of their value to government?
  6. We also need to present evidence in support of the business case for the use of these data in different contexts in support of the core business of HE.
  7. Accountability – Any work in this area needs to link clearly to HEI-level benefits.  The work needs to be configured to demonstrate, and lead clearly and quickly to, value for money, return on investment, shared services and efficiency.
  8. There was discussion about what a shared service would look like in this area. There would need to be a business case to support it, and there may be a need for a trusted platform to aggregate (linked) data and make it available. What would the implications be for identifiers and vocabularies?
  9. Students and staff need to be aware of the uses to which ‘their’ activity data can be put, and to be literate about the opportunities and risks.
  10. There may also be a requirement for a representative body to ensure that the development of any international infrastructure / practice is undertaken in such a way as to reflect the interests of UK HE.
  11. Any JISC work needs to be iterative, and can be agile because the technology is available; however, it should be carefully considered, as the HEI landscape is complex.  Where we can make quick wins, we should do so, for example by advocating that institutions start to collect these data now, in anticipation of exploiting them at a later date.
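As a small illustration of the anonymisation point in item 4 above, the sketch below shows two common precautions before activity data leave the source system: replacing user IDs with keyed hashes and dropping records from very small groups, where "mashing up" with other data is most likely to identify someone. This is an assumption-laden sketch and not guidance on legal compliance; the field names, the key and the threshold are all hypothetical, and pseudonymised data can still be re-identifiable.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical key: in practice it would be held locally and never shared
# with whoever receives the data.
SECRET_KEY = b"replace-with-a-locally-held-secret"


def pseudonymise(user_id, key=SECRET_KEY):
    """Replace a user ID with a keyed hash so records stay linkable but unnamed."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def strip_identifiers(record, key=SECRET_KEY):
    """Keep only the fields needed for analysis (field names are assumptions)."""
    return {
        "user": pseudonymise(record["user_id"], key),
        "item": record["item"],
        "course": record["course"],
    }


def suppress_small_groups(records, field, min_count=5):
    """Drop records whose value for `field` is shared by fewer than `min_count` records."""
    counts = Counter(r[field] for r in records)
    return [r for r in records if counts[r[field]] >= min_count]


# Invented sample records.
raw = [
    {"user_id": "s1234567", "name": "A. Student", "age": 19,
     "item": "Research Methods", "course": "Psychology"},
    {"user_id": "s7654321", "name": "B. Student", "age": 22,
     "item": "Intro to Statistics", "course": "Psychology"},
    {"user_id": "s1111111", "name": "C. Student", "age": 45,
     "item": "Harmony and Counterpoint", "course": "Music"},
]

cleaned = [strip_identifiers(r) for r in raw]
# A low threshold is used here only because the sample is tiny.
shareable = suppress_small_groups(cleaned, "course", min_count=2)
print(len(shareable))  # 2: the single Music record is suppressed
```

A keyed hash is used instead of a plain hash because anyone who knows the format of student IDs could otherwise hash every possible ID and reverse the mapping.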

It was clear from the workshop that delegates were engaged in this area of development and that they were interested in fully capitalising on the benefits of exploiting activity data.

Building on the relevant projects and the discussions from this workshop, JISC will be releasing a Call in September 2010 which will include a strand on Activity Data.  JISC will also be releasing a Business Intelligence Call.  The two Calls are distinct but will have interesting overlaps, which we will explore and report on throughout the programme.  Further information about the Call can be found on the Information Environment Programme blog.

3 thoughts on “Event Report: Gaining Business Intelligence from User Activity (July 2010)”

  1. Stuart Dempster

    I see that the European Commission is taking the UK Govt to court over data protection/privacy after complaints about behavioural ad targeting. I also see that the digital industry is developing a pan-European self-regulatory framework for behavioural ad targeting to promote transparency in data collection (how, where and why user tracking data is used), as well as setting out consumers’ options.

  2. Phil Barker

    Hi, we at CETIS are involved in some work that seems complementary to this. First, we have an interest in what data we can surface that will help us answer the question of what metadata is really useful (as opposed to potentially useful in someone’s opinion) for describing learning materials. I didn’t see that among your reasons for investigating this area; perhaps with learning materials we are at a different level of maturity in our understanding of how to facilitate resource discovery. This page describes our thinking http://wiki.cetis.ac.uk/Cetiswmd (the event it’s about is full, but look out for reports from it).

    The other angle of interest we have is in tracking the use of open educational resources (OERs). We have many of the same reasons for being interested as you describe: facilitating discovery, justifying expenditure, deciding where to put effort. We have a couple of interesting extra factors too. First, these OERs are free for anyone to copy, so we don’t know where they are going to be; that is, we can’t assume that their use will show up on our own system logs because they might be being used on someone else’s system. Secondly, requiring that people log in before they can access a resource isn’t as open as we would like our systems to be, so we’re limited in the user tracking we can do. You can see our thinking on this aspect at http://wiki.cetis.ac.uk/OER_Tracking

  3. Pingback: Activity data « Briefing Paper for eResearch & IE Call – 10/2010
