Experimenting with the Learning Registry

This post sets out my reflections on the emerging conclusions from the JLeRN Experiment.

Applying a new approach to an old problem

Followers of technology trends will have noticed that the big themes of recent years include cloud storage, big data, analytics, activity streams and social media. Technologists supporting education and research have been using these approaches in a range of ways, finding where they can help solve critical problems and meet unmet needs. Many of these explorations are investigative: they are about getting a grasp of how the technologies work, what the data looks like, where there are organisational or ethical issues that need to be addressed, and what skills we need to develop in order to fully exploit these emerging opportunities.

The Learning Registry has been described by Dan Rehak as “Social Networking for Metadata” (about learning resources). Imagine pushing RSS feeds into the cloud containing the URLs of all the learning resources you can imagine, from museums, from educational content providers, from libraries. This is about web-scale big data. Imagine that cloud also pulling in data about where those URLs have been shared: on Facebook, Twitter, blogs, mailing lists. If you’ve tried out services like topsy.com or bit.ly analytics you’ll know that finding out information about URL shares is possible and potentially interesting. Now imagine being able to interrogate that data, to see meta-trends, to provide a widget next to your content item that pulls down the conversation being had around it. That is the vision of the Learning Registry. Anyone who has been involved with sharing learning materials will recognise the scenario on the left below.

[Image: jokey sketch of the use case for the Learning Registry]

Learning Registry Use Case, Amber Thomas, JISC 2012, CC BY

The Learning Registry is about applying the technologies described above to the problem on the left, by making it possible to mine the network for useful context to guide users.
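The “mining the network for useful context” idea can be sketched in a few lines. This is a toy aggregation only: the event shape below is invented for illustration and is not the actual Learning Registry paradata schema.

```python
from collections import Counter, defaultdict

# Hypothetical usage events of the kind the vision describes: records of
# where a resource URL has been shared or bookmarked around the web.
events = [
    {"url": "http://example.org/lesson1", "verb": "shared", "via": "twitter"},
    {"url": "http://example.org/lesson1", "verb": "shared", "via": "facebook"},
    {"url": "http://example.org/lesson2", "verb": "bookmarked", "via": "blog"},
    {"url": "http://example.org/lesson1", "verb": "bookmarked", "via": "blog"},
]

def activity_summary(events):
    """Roll up raw events into a per-URL picture of the conversation."""
    summary = defaultdict(Counter)
    for event in events:
        summary[event["url"]][event["verb"]] += 1
    return summary

summary = activity_summary(events)
# lesson1 was shared twice and bookmarked once; a widget next to that
# resource could surface exactly this kind of context to users.
```

At web scale the interesting part is not the counting but the collection: getting that stream of events flowing into one queryable place, which is what the Registry proposes to do.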

The Experiment

To explore this potential, the JISC/HEA OER Programme has been funding an experiment to run a Learning Registry “node” in the UK. The growth of openly licensed content, and the political momentum to encourage its use, has been a spur to funding this experiment, though it should be noted that the Learning Registry is not designed purely for open content.

See this useful overview for more detail of the project. It has been running on an open innovation model, sharing progress openly and working with interested people. Headed up by the excellent Sarah Currier, with input from Lorna Campbell and Phil Barker of JISC CETIS, it has, in my view, been a very effective experiment.

Towards the end of the work, on 22nd October 2012, Mimas hosted an expert meeting of the people who have been working with the Learning Registry: services and projects, contributors and consumers, developers and decision-makers. It was a very rich meeting, with participants exchanging details of the ways they have used these approaches, and deep discussions of what we have found.

What follows is my analysis of some of the key issues we have uncovered in this experiment.

Networks and Nodes

The structure of the LR is a fairly flat hierarchy: it can expand infinitely to accommodate new nodes, and nodes can cluster. See the overview for a useful diagram.

What this structure means is that the network can grow easily, and that it does not require a governance model with large overheads. The rules are the rules of the network rather than of a gate-keeping organisation. This is an attractive model where it is not clear with whom the business case lies.

One of the ways of running a node is to use an Amazon Web Services instance. That seems a nice, pure way of running a distributed network; however, university procurement frameworks have yet to adjust to the pricing mechanisms of the cloud. Perhaps in that respect we’re not ready to exploit cloud-based network services quite yet.

More generally, however, I think what we are seeing is a growth in the profile of services that are brokers and aggregators of web content. Not the hosts, or the presentation layers, but services in between, sometimes invisible. JISC has been supporting the development of these sorts of “middleware” or “machine services” from the early days: the terminology changes, but the concept is not new to JISC. What does seem to be developing (and this is my perception) is an appetite for these intermediary services, and the skills to integrate them. Perhaps there is a readiness for a Learning Registry-ish service now.

Another key architectural characteristic is a reliance on APIs. This enables developers to create services to meet particular needs. Rather than a centralised model that collects feature requests from users, it allows a layer of skilled developers to build services around the APIs. The APIs have to be powerful to enable this, though, so getting that first layer of rich API functionality working is key. To that extent the central team has to be fast and responsive to keep up momentum.
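To give a flavour of what “building against the APIs” means: contributors publish JSON “envelopes” describing a resource to a node. The sketch below is a minimal illustration assuming the envelope fields of the Learning Registry 0.23 specification as I understand them (doc_type, resource_locator, payload_placement, resource_data); treat the field names, URLs and the submitter value as assumptions to check against a live node rather than a definitive implementation.

```python
import json

def make_envelope(url, metadata, submitter):
    """Build a hypothetical Learning Registry resource-data envelope.

    Field names are based on my reading of the LR 0.23 envelope spec;
    verify them against an actual node before relying on this shape.
    """
    return {
        "doc_type": "resource_data",
        "doc_version": "0.23.0",
        "resource_data_type": "metadata",   # could also be "paradata"
        "active": True,
        "identity": {"submitter": submitter, "submitter_type": "agent"},
        "resource_locator": url,            # the URL being described
        "payload_placement": "inline",      # metadata travels in the envelope
        "resource_data": metadata,          # schema-agnostic: any description
    }

envelope = make_envelope(
    "http://example.org/lesson1",
    {"title": "Photosynthesis slides", "creator": "A. Teacher"},  # invented
    "Example University",
)
body = json.dumps({"documents": [envelope]})
# body would then be POSTed to a node's publish endpoint; consuming
# services read documents back out via the node's retrieval APIs.
```

The point for the argument above is that everything a third-party developer can do starts and ends with calls like this, which is why the richness and reliability of that first API layer matters so much.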

However, the extent to which the LR is actually a network so far is unclear. There are a handful of nodes, but not enough that we can be sure we are seeing any network effects. The lack of growth in nodes may be because the barrier to setting up a node is perceived to be high; it may simply be too early to tell. But for the purposes of the JLeRN Experiment, my conclusion is that we have not seen the network effects that the LR promises.


Pushing the hardest problems out of sight?

It’s easy to fall into a trap of hoping that one technical system will meet everybody’s needs. The Learning Registry might not be THE answer, but there is something of value in the way that it provides some infrastructure to manage a very complex distributed problem.

However the question raised at the workshop by Sarah Currier in her introduction and again by David Kay in his closing reflections is: does it push some of the challenges out of scope, for someone else to solve? The challenges in question include:

  • Resource description and keywords
  • De-duplication
  • People identifiers
  • Data versioning

To take one problem area: resource description for learning materials. It is very hard to agree on any mandatory metadata beyond Dublin Core. This is partly because of the diversity of resource types and formats: a learning material can be anything, from a photo to a whole website. Within resource types it is possible to have a deeper vocabulary, for example for content-packaged resources that may have a nominal “time” or “level” attached. Likewise, different disciplinary areas not only have specialist vocabularies but also use content in different ways. It is technically possible to set useful mandatory metadata, BUT in practice it is rarely complied with. When we are talking about a diversity of content providers with different motivations, the carrots and sticks are pretty complicated. So users want rich resource description metadata, to aid search and selection, but it is rarely supplied.

The Learning Registry solution is to be agnostic about metadata: it just sucks it all into the big data cloud. It does not mandate particular fields. What it does is offer developers a huge dataset to prod, to model, to shape, and to pull out in whichever way the users want it. Developers can do anything they want with the data AS LONG AS THE DATA EXISTS. If there is not enough data to play with, or not enough consistency between resources, then it is hard to create meaningful services over the data.
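One practical consequence of that agnosticism is that a consuming developer must first probe what the data actually contains. This sketch (with invented records, not real Registry output) counts which description fields the harvested documents actually supply, which is exactly the “is there enough consistency to build on?” question:

```python
from collections import Counter

# Hypothetical harvested metadata payloads: because nothing is mandated,
# each contributor describes resources however they like.
harvested = [
    {"title": "Cell biology slides", "subject": "biology", "level": "intro"},
    {"title": "Mitosis quiz"},  # sparse record: no subject, no level
    {"url": "http://example.org/r3", "keywords": ["genetics"]},  # no title
]

# Count how many records carry each field.
field_coverage = Counter(field for doc in harvested for field in doc)
# Only some records carry the fields a rich search or browse service
# would need, e.g. "title" appears in two records but "level" in one.
```

A service built over this data can only be as good as the coverage such an audit reveals, which is the sense in which the hard problem has been moved rather than solved.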

I said above that the Learning Registry “provides some infrastructure to manage a very complex distributed problem”. But on reflection does it manage that complexity? Or does it just make it manageable by pushing it out of scope? And if it doesn’t enable developers to build useful services for educators, is it successful?

Final Thoughts

These are a selection of the issues that the experiment is surfacing. There are certainly plenty of question marks about the effectiveness of this sort of approach. But I still feel sure that there are aspects of these technologies that we should be applying to meeting our needs in education and research. Certainly, this experiment has overlapped with work in JISC’s Activity Data programme, in our analytics work and in the area of cloud solutions. There is something interesting happening here, some glimpses of more elegant ways of sharing content, maybe even a step change.



4 thoughts on “Experimenting with the Learning Registry”

  1. Sarah Currier

Thanks for writing this great reflection on yesterday’s JLeRN Final Workshop, Amber – it’s really useful (and saves me some work!). I’m going to collate all post-workshop blog posts and tweets on the JLeRN blog in a week or so.

    Re this:
    “Developers can do anything they want with the data AS LONG AS THE DATA EXISTS. If there is not enough data to play with, or not enough consistency between resources, then it is hard to create meaningful services over the data.”

This is true, but I think one of Pat Lockley’s points around this at the workshop was that it’s not quite enough for the data to exist, even in large quantities. The developer needs to have *some idea* of what’s there and how it’s structured, etc.

    This next is my point, not Pat’s: that’s where we start moving towards things like needing to declare what’s there and how it’s structured, and maybe communities of interest (“networks” or “communities” in Learning Registry parlance) needing to reach consensus on certain metadata, or paradata vocabularies, or identifier issues, etc. Or, you know, individual developers or projects taking the data to do something with can maybe spend more of their individual time trying to make it all work, like Pat did with Xpert. The Learning Registry as a technology or approach explicitly doesn’t solve these things but they still need to be solved.

So the delight we feel at the Learning Registry’s ideas and approach and technology, and the prototypes and tools coming out now, is bought at the cost of avoiding the painful things. This may not be a bad thing – it is nice to start with the lovely stuff. And (as I’ve often said) some of us actually like dealing with the icky stuff; please employ them to do so when you’re ready (oops, straying into business cases, business models and governance issues now)!

    I am currently neck deep in trying to make the hideous munge of Jorum’s DSpace metadata, which was once rich and quality enhanced, into something you can build stats and paradata services on, so I may be feeling slightly negative again after yesterday’s high!

  2. Lorna M. Campbell

Excellent synthesis of a complex domain and a wide-ranging meeting, Amber! I particularly like your diagram 🙂

    Can I pick up on some of the issues you raised?

“technically possible to set useful mandatory metadata BUT in practice it is rarely complied with” – I think the issue here is: useful to whom? Metadata that is useful to a resource manager may not be useful to a resource user. Also, who is it that decides what is and isn’t useful? These are non-trivial issues in a domain as complex and diverse as education.

    “the Learning Registry ‘provides some infrastructure to manage a very complex distributed problem’. But on reflection does it manage that complexity? Or does it just make it manageable by pushing it out of scope?” – I would say that the LR approach does help to manage that complexity, though it certainly doesn’t solve all the very thorny problems as you’ve highlighted. I can understand why the LR pushed deduplication, identifiers and versioning out of scope though, these are massive generic problems and I suspect if they had tried to solve them they wouldn’t have got very far.

    “And if it doesn’t enable developers to build useful services for educators, is it successful?” – I think we saw some services yesterday that were certainly useful for developers and technologists, and possibly even educators :}

    To my mind the biggest issues with the LR are maturity and sustainability. This has been a very interesting experiment, but I’m not at all sure what the future holds for the Learning Registry, particularly in the UK.

  3. Steve Midgley

I’m obviously a biased observer, and I really appreciate these thoughts. I think they are right on the money. Learning Registry does try to push some hard problems to the side in an effort to create functional solutions to tangible issues. Whether this is the “right” mix of solution and postponement of solutions is hard to say yet. It depends on adoption and scale, among some technical challenges.

    The three big things to note as you look at Learning Registry down the road:

    1) Gates foundation is funding, via SLC, a thing called Learning Registry Index, which will decompose Learning Registry metadata into discrete bits (think triples) that should make it much easier to interact with metadata bound to different formats. This is scheduled for a demo release in early 2013.

    2) Several large states in the US are specifying Learning Registry in their funded Instructional Improvement Systems, which should drive more adoption both on the supplier and consumer sides of the problem.

3) With the introduction of Schema.org and LRMI, assuming they are widely adopted, I think we have a plausible successor to Dublin Core for describing in a common format many of the things we want to talk about inside the LR networks.

    Thanks for this great write up – it’s really great to see the work you’ve done and the conclusions you reached. Without critical analysis from friends, it’s hard to move forward!

  4. Pingback: Rounding up the JLeRN experiment « The JLeRN Experiment
