Is there a Library shaped black hole in the web? Event summary.

Is there a Library shaped black hole in the web? was the question posed by an OCLC event at the Royal College of Surgeons last week that focused on exploring the potential benefits of using linked data to make library data available to users through the web. For a comprehensive overview of the event, I’ve put together a Storify of tweets here: https://storify.com/LornaMCampbell/oclc-linked-data

Following a truly dreadful pun from Laura J Wilkinson…

Owen Stephens kicked off the event with an overview of linked data and its potential to be a lingua franca for publishing library data.  The benefits that linked data can afford to libraries include improved search, discovery and display of library catalogue record information, improved data quality and data correction, and the ability to work with experts across the globe to harness their expertise.  Owen also introduced the Open World Assumption which, despite the coincidental title of this blog, was a new concept to me.  The Open World Assumption states that

“there may exist additional data, somewhere in the world to complement the data one has at hand”.

This contrasts with the Closed World Assumption which assumes that

“data sources are well-known and tightly controlled, as in a closed, stand-alone data silo.”

Learning Linked Data
http://lld.ischool.uw.edu/wp/glossary/

Traditional library catalogues worked on the basis of the closed world assumption, whereas linked data takes an open world approach and recognises that other people will know things you don’t.  Owen quoted Karen Coyle “the catalogue should be an information source, not just an inventory” and noted that while data on the web is messy, linked data provides the option to select sources we can trust.
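To make the distinction a little more concrete, here is a minimal Python sketch of how the same question might be answered under each assumption.  The catalogue contents and the function names are my own invention for illustration; they are not taken from Owen's talk or from any library system.

```python
from typing import Optional

# Hypothetical catalogue: each title maps to the authors we happen to hold on record.
CATALOGUE = {
    "Example Title": {"Author A"},
}


def wrote_closed_world(title: str, author: str) -> bool:
    """Closed world: anything missing from our own data is treated as false."""
    return author in CATALOGUE.get(title, set())


def wrote_open_world(title: str, author: str) -> Optional[bool]:
    """Open world: a missing statement means 'unknown', not 'false'."""
    if author in CATALOGUE.get(title, set()):
        return True
    # Somebody else's data may still assert this, so we decline to say no.
    return None


print(wrote_closed_world("Example Title", "Author B"))  # False
print(wrote_open_world("Example Title", "Author B"))    # None
```

The only difference is the final return value, but it changes what a catalogue is claiming: "we have no record of this" rather than "this is not the case".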

Cathy Dolbear of Oxford University Press gave a very interesting talk from the perspective of a publisher providing data to libraries and other search and discovery services. OUP provides data to library discovery services, search engines, wiki data, and other publishers.  Most OUP products tend to be discovered through search engines; only a small number of referrals, 0.7%, come from library discovery services.  OUP have two OAI-PMH APIs, but they are not widely used and OUP are very keen to learn why.  The publisher’s requirements are primarily driven by search engines, but they would like to hear more from library discovery services.

Neil Jeffries of the Bodleian Digital Library was not able to be present on the day, but he overcame the inevitable technical hitches to present remotely.  He began by arguing that digital libraries should not be seen as archives or museums; digital libraries create knowledge and artefacts of intellectual discourse rather than just holding information. In order to enable this knowledge creation, libraries need to collaborate, connect and break down barriers between disciplines.  Neil went on to highlight a wide range of projects and initiatives, including VIVO, LD4L, CAMELOT, that use linked data and the semantic web to facilitate these connections. He concluded by encouraging libraries to be proactive and to understand the potential of both data and linked data in their own domain.

Ken Chad posed a question that often comes up in discussions about linked data and the semantic web: why bother?  What’s the value proposition for linked data?  Gartner currently places linked data in the trough of disillusionment, so how do we cross the chasm to reach the plateau of productivity?  This prompted my colleague Phil Barker to comment:

Ken recommended using the Jobs-to-be-Done framework to cross the chasm. Concentrate on users, but rather than just asking them what they want, focus on asking them what they are trying to do and identify their motivating factors – e.g. how will linked data help to boost my research profile?

For those willing to take the leap of faith across the chasm, Gill Hamilton of the National Library of Scotland presented a fantastic series of Top Tips! for linked data adoption which can be summarised as follows:

  • Strings to things aka people smart, machines stupid – library databases are full of strings, people are really smart at reading strings, unfortunately machines are really stupid. Turn strings into things with URIs so machines can read them.
  • Never, ever, ever dumb down your data.
  • Open up your metadata – license your metadata CC0 and put a representation of it into the Open Metadata Registry.  Open metadata is an advert for your collections and enables others to work with you.
  • Concentrate on what is unique in your collections – one of the unique items from the National Library of Scotland that Gill highlighted was the order for the Massacre of Glencoe.  Ahem. Moving swiftly on…
  • Use open vocabularies.

Simples! Linked Data is still risky though: services go down, URIs get deleted and there’s still more playing around than actual doing. However, it’s still worth the risk to help us link up all our knowledge.

Richard J Wallis brought the day to a close by asking how can libraries exploit the web of data to liberate their data?  The web of data is becoming a web of related entities and it’s the relationships that add value.  Google recognised this early on when they based their search algorithm on the links between resources.  The web now deals with entities and relationships, not static records.

One way to encode these entities and relationships is using Schema.org. Schema.org aims to help search engines to interpret information on web pages so that it can be used to improve the display of search results.  Schema.org has two components: an ontology for naming the types and characteristics of resources, their relationships with each other, and constraints on how to describe these characteristics and relationships; and the expression of this information in machine readable formats such as microdata, RDFa Lite and JSON-LD. Richard noted that Schema.org is a form of linked data, but “it doesn’t advertise the fact” and added that libraries need to “give the web what it wants, and what it wants is Schema.org.”
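As a rough illustration of those two components working together, the Python sketch below builds a small JSON-LD description using the Schema.org types Book and Person, and wraps it in the script element a web page would carry.  The title and author values are placeholders, and the helper names are my own rather than anything Richard presented.

```python
import json


def book_jsonld(title, author_name):
    """Describe a book with Schema.org terms (Book, Person, name, author)."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author_name},
    }


def as_script_tag(data):
    """Wrap JSON-LD in the script element used to embed it in an HTML page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values, not a real catalogue record.
record = book_jsonld("An Example Catalogue Record", "A. N. Author")
print(as_script_tag(record))
```

The ontology supplies the vocabulary (Book, Person, author) and JSON-LD supplies the syntax; the same description could equally be expressed as microdata or RDFa Lite.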

If you’re interested in finding out more about Schema.org, Phil Barker and I wrote a short Cetis Briefing Paper on the specification which is available here: What is Schema.org?  Richard Wallis will also be presenting a Dublin Core Metadata Initiative webinar on Schema.org and its applicability to the bibliographic domain on the 18th of November, registration here http://dublincore.org/resources/training/#2015wallis.

ETA  Phil Barker has also written a comprehensive summary of this event over at his own blog, Sharing and Learning, here: A library shaped black hole in the web?


Can open stop the future?

Last week Catherine Cronin brought Alice Marwick’s review of Nathaniel Tkacz’s Wikipedia and the Politics of Openness to my attention and it’s left me with a lot of food for thought.  I haven’t had a chance to read Tkacz’s book yet but there are a couple of points that I’d like to pick up on from the review, and one in particular that relates to the post I wrote recently on Jisc’s announcement that it intended to “retire” Jorum and replace it with a new “App and Content store”: Retire and Refresh: Jisc, Jorum and Open Education.

I tend to shy away from socio-political discussions about the nature of openness as I find that they often become very circular, and very contentious, very quickly.  I do agree with Tkacz and Marwick that openness is inherently political but I certainly don’t believe that openness is intrinsically neoliberal. To my mind this analysis betrays a rather US-centric view of the open world and fails to take into consideration many other global expressions of openness.

If I’m interpreting Marwick correctly, Tkacz also seems to be arguing that openness must necessarily be non-hierarchical, which is an interesting perspective but not one that I wholly buy into.  While I think we need to be aware of the dangers of replicating existing hierarchical power structures in open environments, I think it’s somewhat idealistic to expect open initiatives to flourish without any power structures at all. So yes, there are hierarchical power structures inherent in Wikipedia, but I think there are many more egregious examples of openwashing out there.

The point that really struck me in Marwick’s review was the reference to Jonathan Zittrain’s 2008 book The Future of the Internet – And How to Stop It  in which the author charts the evolution from generative to tethered devices.

“The PC revolution was launched with PCs that invited innovation by others. So too with the Internet. Both were generative: they were designed to accept any contribution that followed a basic set of rules (either coded for a particular operating system, or respecting the protocols of the Internet). Both overwhelmed their respective proprietary, non-generative competitors, such as the makers of stand-alone word processors and proprietary online services like CompuServe and AOL. But the future unfolding right now is very different from this past. The future is not one of generative PCs attached to a generative network. It is instead one of sterile appliances tethered to a network of control.”

The Future of the Internet – And How to Stop It
Jonathan Zittrain

Marwick elaborates on this generative – tethered dichotomy and situates it in our current technology context.

“Those in the former (generative) group allow under-the-hood tinkering, or simply messing with code, are championed by the maker movement, and run on free and open-source software. Tethered devices, on the other hand, are governed by app stores and regulated by mobile carriers: this is the iPhone model….The most successful apps of today, from Uber to Airbnb to Snapchat, are participatory and open only in the sense that anyone is free to use them and generate revenue for their owners.

Most of these apps use proprietary formats, don’t play well with others, make it difficult for users to port their content from one to another, and are resolutely closed-source.”

Open Markets, Open Projects: Wikipedia and the politics of openness
Alice E. Marwick

Now, I’m not sufficiently familiar with Zittrain’s work to know if his thinking is still considered to be current and relevant, but his warnings about a future of closed technologies tethered to a network of control rather amplified the alarm bells that have been ringing in my head since Jisc announced the creation of their App and Content store.  As I mentioned in my previous post, the idea of an App Store sits very uneasily with my conception of open education.  Also, I can’t help wondering what role, if any, open standards will play in the development of the new app store to prevent lock-in to proprietary applications and formats.

Zittrain suggested that developing community ethos is one way to “stop the future” and counter technology lockdown.

“A lockdown on PCs and a corresponding rise of tethered appliances will eliminate what today we take for granted: a world where mainstream technology can be influenced, even revolutionized, out of left field. Stopping this future depends on some wisely developed and implemented locks, along with new technologies and a community ethos that secures the keys to those locks among groups with shared norms and a sense of public purpose, rather than in the hands of a single gatekeeping entity, whether public or private.”

The Future of the Internet – And How to Stop It
Jonathan Zittrain

I absolutely agree that when it comes to the development of education content and technologies we need a community ethos with shared norms and a sense of public purpose, but to my mind it’s increased openness, rather than more locks and keys that will provide this safeguard.  In the past Jisc played an important public role by fostering communities of practice, supporting the development of innovative open technologies and sharing common practice and I sincerely hope that, rather than becoming a single gatekeeper to the community’s education content and applications, it will continue to maintain this invaluable sense of public purpose.

What is schema.org? Technical Briefing Paper

Last week my colleague Phil Barker and I published a new technical briefing paper What is schema.org?

This briefing has been produced as part of our work with the Learning Resource Metadata Initiative (LRMI). LRMI expands schema.org so that it can be used to describe educationally significant characteristics of resources. At a technical level, the first step to understanding LRMI is to understand schema.org.

What is schema.org? describes the schema.org specification for a technical audience. It is aimed at people who may want to use schema.org markup in websites or other tools, and who wish to know more about the technical approach behind schema.org and how to implement it. As such, it has relevance beyond the description of educational resources, and we hope it will be of interest to anyone describing resources on the web. Additional briefings providing an in-depth technical overview of LRMI will follow.

What is schema.org? can be downloaded from the Cetis Publications website here http://publications.cetis.ac.uk/2014/960

About LRMI

The Learning Resource Metadata Initiative is funded by the Bill & Melinda Gates Foundation, and jointly led by Creative Commons and the Association of Educational Publishers, now the 501(c)(3) arm of the Association of American Publishers, with the aim of making it easier to publish, discover and deliver high quality educational resources on the web. With input from a wide range of organisations, from both the open and commercial spheres, involved in publishing, creating and using educational resources, LRMI successfully proposed additions to schema.org (an initiative of Google, Yahoo and Bing) allowing the description of educationally important properties of resources to be marked-up in web pages in a manner that is easily understood by search engines. This enables users to create custom search engines that support the filtering of search results based on criteria such as their match to a specific part of a curriculum, the age of the students, or other relevant characteristics.
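To give a flavour of what this looks like in practice, here is a minimal Python sketch that describes two resources using schema.org properties that came from LRMI (learningResourceType, typicalAgeRange and educationalAlignment) and then filters them by age and curriculum topic.  The resources, the "Example Curriculum" framework and the function names are all hypothetical, and a real search service would be considerably more forgiving about how age ranges are written.

```python
# Hypothetical resource descriptions using schema.org / LRMI properties.
RESOURCES = [
    {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": "Hypothetical fractions worksheet",
        "learningResourceType": "worksheet",
        "typicalAgeRange": "9-11",
        "educationalAlignment": {
            "@type": "AlignmentObject",
            "alignmentType": "teaches",
            "educationalFramework": "Example Curriculum",
            "targetName": "Fractions",
        },
    },
    {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": "Hypothetical algebra video",
        "learningResourceType": "video",
        "typicalAgeRange": "14-16",
        "educationalAlignment": {
            "@type": "AlignmentObject",
            "alignmentType": "teaches",
            "educationalFramework": "Example Curriculum",
            "targetName": "Algebra",
        },
    },
]


def suits_age(resource, age):
    """Check an age against a typicalAgeRange written as 'low-high'."""
    low, high = (int(part) for part in resource["typicalAgeRange"].split("-"))
    return low <= age <= high


def filter_resources(resources, age, topic):
    """Keep resources that suit the age and align to the given curriculum topic."""
    return [
        r for r in resources
        if suits_age(r, age) and r["educationalAlignment"]["targetName"] == topic
    ]


matches = filter_resources(RESOURCES, 10, "Fractions")
print([r["name"] for r in matches])  # ['Hypothetical fractions worksheet']
```

Because the educational characteristics are stated as structured properties rather than buried in free text, the filter has something reliable to test against.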

CEN Learning Technologies Workshop Online Consultation Meeting

I’ve been working in the domain of learning technology and interoperability standards for over ten years now, and during that time I’ve worked with a number of standards bodies including IMS, BSI, DCMI, IEEE and ISO SC36.  However, the first standards body I ever worked with was the CEN/ISSS Learning Technologies Workshop when I joined the Taxonomies and Vocabularies Project as an independent expert in 2001.

The European Committee for Standardisation’s Information Society Standardisation System (CEN/ISSS) was formed in 1997 to provide a focus for CEN’s ICT standards activities.  There are currently 12 formal Technical Committees and around 17 less formal Workshops in the domain of ICT.

CEN describe the purpose of the Workshops as follows:

“This open process aims at bridging the gap between industrial consortia that produce de facto standards with the limited participation of interested parties, and the formal European standardization process which produces standards through consensus under the authority of CEN member bodies. CEN WSs are flexible structure that benefits from the openness and consensus that are key values of CEN.”

The outputs of CEN Workshops are published as CEN Workshop Agreements (CWAs) which are disseminated freely and openly.    CWAs can be regarded as pre-standardisation documents, reports and specifications.  Within the Learning Technology Workshop, some of these CWAs are produced as a result of EU funded projects, but others have been produced on a voluntary basis by members of the workshop.

Although I haven’t personally been involved with the Workshop for a number of years, Cetis has remained an active participant and several of my Cetis colleagues regularly attend meetings and contribute to Workshop activities.  Simon Grant has been a notably active Workshop participant in recent years, during which time he was involved in the development of the Integrating Learning Outcomes and Competences (InLOC) specification which was published as CWA 16655 1-3 in July this year.  The InLOC CWA documents can be downloaded from CEN here.

Having been involved in a number of standards bodies over the years, I have always found the Learning Technology Workshop to be admirably open and inclusive.   The Workshop is open to all interested parties and any member can propose or contribute to an activity, provided they have the time and resources to participate.   Although there has been limited financial support for EU funded projects, a huge amount of valuable work has been carried forward on a voluntary basis by members of the Workshop.  And perhaps most importantly, the outputs of the Workshop, the CWAs, have been made available freely and openly.

The development of open learning technology standards is largely driven by the goodwill of technical experts and educational practitioners who contribute their time, energy and expertise on a voluntary basis, and I have always felt that it is important for standards bodies to promote open development processes and to disseminate open outputs.  Consequently it’s of considerable concern that a serious disagreement has arisen between CEN and the Learning Technology Workshop regarding the free and open dissemination of the Workshop’s CWAs.  In a nutshell it appears that CEN have suggested that only CWAs produced by EU funded projects can be made available free of charge.   This position suggests that CWAs produced by unfunded voluntary projects should no longer be published as free open documents and calls into question the viability of the Workshop in its current form.

As this issue remains unresolved, the latest meeting of the Learning Technology Workshop has been cancelled and the Workshop Chairs have called a public online consultation meeting on the 15th of October to discuss the future of the Workshop.  If you have any interest or involvement in the creation of open standards and specifications within Europe, or a commitment to the development of open education technology, I would strongly encourage you to attend.

Minutes and documents relevant to this issue can be found on the Workshop wiki here: http://wiki.teria.no/display/cenwslt/

See in particular:

Minutes of the 54th CEN WS/LT Meeting June 25th Brussels

Summary of  meeting between the CEN Management and the WS/LT Chairs 24th September