Like a lot of people, when I think about it, or when I’m reminded about it, I understand that the Web is a place where someone is always watching what you do. I understand that … but then I think, well … the Web is such a huge beast; such a vast ocean; such a giant metropolis where the comings and goings of individuals are insignificant. How and why would anyone notice what I’m looking at and which links I’m clicking on?

Then up pops Tom Barnett from Switch Concepts Ltd. at a meeting yesterday to tell us that ‘Google has a file the size of an encyclopedia on everyone in this room.’

Hmmm … that’s not a particularly comfortable idea for someone to put in your head. I start to feel a vague sense of paranoia creeping through my mind.

And then I think, c’mon Neil, pull yourself together! Google really doesn’t care who you are. They just want to put things in your line of sight that are more rather than less likely to get you to open your wallet and part with your wages!!

Such were the thoughts that were buzzing around my head yesterday at an event organised by the Web Science Trust (http://webscience.org).

The meeting was entitled ‘Observing the Web’ and the purpose was to highlight some of the work that the Web Science Trust and their partners and collaborators are doing to build a global network of Web Observatories providing an open analytics environment to drive new forms of Web research. We went round the room doing introductions and Dame Wendy Hall ended up branding us a ‘motley crew’: academics, industry players, not-for-profits, technologists, funders, charities, a lawyer. (Quite a respectable-looking motley crew in the very smart surroundings of the Royal Society, I might add.) But ‘motley crew’ felt about right for a topic and a collaborative, academic, open activity that is still exploring the territory and testing new ground, presumably in contrast to the well-resourced, sophisticated and highly developed (but opaque) methods employed by the corporate observers of the Web (Facebook, Amazon, Google, Microsoft, Yahoo etc.).

The point of all of this ‘observing’ is not to try and take account of every little bit of data and content on the web, but rather to understand what the aggregated use of the Web can tell us; how trends and fashions and changes of behaviour in relation to the Web might illuminate aspects of our society and culture, both now and for future students and researchers.

This was all of great interest to Jisc. We are currently working with the British Library, the Oxford Internet Institute and the Institute of Historical Research on an initiative that aligns very well with the notion of the Web Observatory.

The Big Data project (http://www.oii.ox.ac.uk/research/projects/?id=88) and the AADDA project (http://www.history.ac.uk/projects/digital/AADDA) are both using a copy of the Internet Archive’s collection of UK domain websites collected over the period 1996-2010 to examine new ways to engage with the Web at domain level, and to develop new forms of research that leverage the scale of the Web. As the name of the Oxford project says … it’s all about using ‘Big Data’.

This was work that emerged from influential JISC-funded reports commissioned in 2010, among them ‘Researcher Engagement with Web Archives’.

As we heard at the meeting, the academic observatory is a very different proposition to the corporate observatory and comes with enormous challenges, including: interoperability (how do we link observatories?); access (aside from Twitter, which of the big corporates will let us use their data?); privacy (will people feel spied upon?); and sustainability (what is the business model?).

A fascinating meeting and a big topic. There will be more discussion in early May at the ACM Web Science Meeting in Paris.





