Cultural Heritage Sparks

I recently went along to the first meeting of the Digital Cultural Heritage Research Network here at the University of Edinburgh. The aim of the network is to

“bring together colleagues from across the University to establish a professional network for researchers investigating digital cultural heritage issues, seeking to include perspectives from diverse disciplines including design, education, sociology, law, cultural studies, informatics and business. Partners from the cultural heritage sector will play a key role in the network as advisors and collaborators.”

About DCHRN

Anyone who follows this blog will know that I have a bit of a thing about opening access to digital cultural resources, so I was pleased to be able to contribute a lightning talk on digital cultural heritage and open education. This was one of an eclectic series of lightning talks that covered a wide range of topics. I live tweeted the event, and Jen Ross has collated tweets from the day in a Storify here: Digital Cultural Heritage Research Network, Workshop 1, and has also written a recap of the workshop here: Recap of Workshop 1: Cultural Heritage Sparks.

My EDINA colleague Lisa Otty kicked off the day talking about the Keepers Extra Project which aims to highlight the value of the Keepers Registry of archiving arrangements for electronic journals. Lisa noted that only 17% of journals are archived in the Keepers Registry and asked the very pertinent question “do we trust publishers with the stewardship of electronic journals?” I think we all know the answer to that question.

I confess I rehashed a previous presentation on the comparative dearth of openly licensed cultural heritage collections in Scotland, which allowed me to refer for the millionth time to Andrew Prescott’s classic blog post Dennis the Paywall Menace stalks the Archives. This time, however, I was able to add a couple of pertinent tweets from the Digging Into Data Round Three Conference that took place in Glasgow earlier in the week.

One lightning talk that was particularly close to my heart was by Glyn Davis, who spoke about the openness, or lack thereof, of gallery and museum content, and reflected on his experience of running the Warhol MOOC. Glyn noted that license restrictions often prevent copyright images from being used in online teaching and learning. However, many of the students who participated in the Warhol MOOC understood little about copyright restrictions and simply expected to be able to find and reuse images via Google, so lots of discussion about open access was required as part of the course.

Other highlights included Jen Ross’s talk on Artcasting, a project which is exploring how digital methods can be used inventively and critically to reimagine complex issues. The project has built an app which engages audiences by allowing them to capture images and decide where to send them in space and time, while also retrieving data for evaluation. Bea Alex introduced the impressive range of projects from the Language Technology Group, including historical text projects, which aim to use text mining to enrich textual metadata with geodata from the Edinburgh Geo Parser. Stephen Allen spoke about the MOOC the National Museums of Scotland created to run in parallel with their Photography – A Victorian Sensation exhibition. The museum now hopes to reuse content from future exhibitions for more MOOCs. Rebecca Sinker presented a fascinating keynote on Tate’s research-led approach to digital programming, which prompted an interesting discussion on how people engage with art now that so much of it is available online. Angelica Thumala spoke all too briefly about her research exploring emotional attachment and experience of books in different modalities, and left us with one of the loveliest quotes of the day:

“Books are constant companions, people carry them around and develop physical and emotional attachments to them.”

The workshop ended with four group discussions focussing on issues raised by participants: openness and preservation; participation and interpretation; semantic web and curation; and how DCHRN can create a sustainable interdisciplinary network. These and other issues will be picked up in the next workshop, Research that matters – playing with method, planning for impact, which takes place in March.

DCHRN is coordinated by:

  • Dr Jen Ross, Digital Education
  • Dr Claire Sowton, Digital Education
  • Professor Sian Bayne, Digital Education
  • Professor James Loxley, Literatures, Languages and Cultures
  • Professor Chris Speed, Design Informatics

On a side note, it’s a while since I’ve done a lightning talk and I’d forgotten how difficult it is to put together such a short presentation. Seriously, it took me most of an afternoon to put together a five-minute talk, which really is a bit ridiculous. Seems like I’m not the only one who struggles with short presentations though. When I moaned about this on Twitter, a lot of people replied agreeing that the shorter the presentation, the more preparation is required. Martin Weller reminded me of the quote “If I had more time, I would have written a shorter letter”, while Kevin Ashley invoked Jeremy Bentham, who was allegedly happy to give a two-hour speech on the spot, but required three weeks’ notice for a fifteen-minute talk. I guess I’m with Bentham on that one!

Erinma Ochu: Crowd Sourcing for Community Development

Earlier this week I went along to an event at the National Museum of Scotland run by the University of Edinburgh’s Citizen Science and Crowdsourcing group. There were some fascinating projects and initiatives on display, but the highlight of the event was undoubtedly Erinma Ochu’s engaging and thought-provoking public lecture on Crowd Sourcing for Community Development.

Erinma Ochu

Erinma outlined the benefits that amateurs can bring to scientific research: they can help to validate data, fill in gaps in data collected by scientists, bring interesting new perspectives and, if they are not overly trained, they may be better able to spot patterns in data that scientists might miss. However, Erinma also reminded us of the reciprocal aspects of citizen science. Citizen science should involve scientists serving the community, not just volunteers collecting data for research. It’s important to balance social and scientific value; the community building process is as important as the data product. We have a responsibility to make spaces in which social inclusion and engagement can happen. I particularly liked Erinma’s focus on citizen science as a learning opportunity; projects should give something back to the people who contribute the data and help them to learn. Along the way Erinma introduced some fascinating and inspiring projects including Turing’s Sunflowers, Farm Hack and Manchester City of Science Robot Orchestra.

For a more comprehensive overview of Erinma’s talk, I’ve created a Storify of tweets here: Crowd Sourcing for Community Development Storify, and Erinma’s slides are available on Slideshare here.

Is there a Library shaped black hole in the web? Event summary.

Is there a Library shaped black hole in the web? was the question posed by an OCLC event at the Royal College of Surgeons last week that focused on exploring the potential benefits of using linked data to make library data available to users through the web. For a comprehensive overview of the event, I’ve put together a Storify of tweets here: https://storify.com/LornaMCampbell/oclc-linked-data

Following a truly dreadful pun from Laura J Wilkinson…

Owen Stephens kicked off the event with an overview of linked data and its potential to be a lingua franca for publishing library data. The benefits that linked data can afford to libraries include improved search, discovery and display of library catalogue record information, improved data quality and data correction, and the ability to work with experts across the globe to harness their expertise. Owen also introduced the Open World Assumption which, despite the coincidental title of this blog, was a new concept to me. The Open World Assumption states that

“there may exist additional data, somewhere in the world to complement the data one has at hand”.

This contrasts with the Closed World Assumption which assumes that

“data sources are well-known and tightly controlled, as in a closed, stand-alone data silo.”

Learning Linked Data
http://lld.ischool.uw.edu/wp/glossary/

Traditional library catalogues worked on the basis of the closed world assumption, whereas linked data takes an open world approach and recognises that other people will know things you don’t. Owen quoted Karen Coyle, “the catalogue should be an information source, not just an inventory”, and noted that while data on the web is messy, linked data provides the option to select sources we can trust.
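To make the distinction a little more concrete, here is a minimal Python sketch of how the two assumptions answer the same question differently. This is my own illustration rather than anything presented on the day, and the catalogue entries, the "Jane Example" name and the function names are all made up for the purpose.

```python
from typing import Optional

# A statement is a (subject, property, value) triple.
# These entries are illustrative only, not real catalogue data.
LOCAL_CATALOGUE = {
    ("Waverley", "author", "Walter Scott"),
}


def holds_closed_world(facts, triple) -> bool:
    """Closed world: anything we have not recorded is treated as false."""
    return triple in facts


def holds_open_world(facts, triple) -> Optional[bool]:
    """Open world: a missing statement is unknown (None), not false,
    because someone else may hold data that complements ours."""
    return True if triple in facts else None


question = ("Waverley", "illustrator", "Jane Example")  # hypothetical statement
print(holds_closed_world(LOCAL_CATALOGUE, question))  # False
print(holds_open_world(LOCAL_CATALOGUE, question))    # None

# Under the open world approach, merging in another source can settle the question.
other_source = {question}
print(holds_open_world(LOCAL_CATALOGUE | other_source, question))  # True
```

The point of returning None rather than False is simply that "we don't know" and "it isn't so" are different answers, which is the difference between an information source and an inventory.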

Cathy Dolbear of Oxford University Press gave a very interesting talk from the perspective of a publisher providing data to libraries and other search and discovery services. OUP provides data to library discovery services, search engines, Wikidata, and other publishers. Most OUP products tend to be discovered by search engines; only a small number of referrals, 0.7%, come from library discovery services. OUP have two OAI-PMH APIs but these are not widely used, and OUP are very keen to learn why. The publisher’s requirements are primarily driven by search engines, but they would like to hear more from library discovery services.
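For anyone unfamiliar with OAI-PMH, a harvesting request is just an HTTP GET with a verb and a few parameters. The sketch below builds a ListRecords request for Dublin Core records. The verb and the metadataPrefix and from parameters are part of the OAI-PMH protocol, but the base URL is a placeholder I have invented, not one of OUP's actual endpoints.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only, not a real OUP service.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    from_date, if given, limits the harvest to records changed
    on or after that date (YYYY-MM-DD).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


print(list_records_url(BASE_URL))
print(list_records_url(BASE_URL, from_date="2015-11-01"))
```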

Neil Jeffries of the Bodleian Digital Library was not able to be present on the day, but he overcame the inevitable technical hitches to present remotely.  He began by arguing that digital libraries should not be seen as archives or museums; digital libraries create knowledge and artefacts of intellectual discourse rather than just holding information. In order to enable this knowledge creation, libraries need to collaborate, connect and break down barriers between disciplines.  Neil went on to highlight a wide range of projects and initiatives, including VIVO, LD4L, CAMELOT, that use linked data and the semantic web to facilitate these connections. He concluded by encouraging libraries to be proactive and to understand the potential of both data and linked data in their own domain.

Ken Chad posed a question that often comes up in discussions about linked data and the semantic web; why bother?  What’s the value proposition for linked data?  Gartner currently places linked data in the trough of disillusionment, so how do we cross the chasm to reach the plateau of productivity?  This prompted my colleague Phil Barker to comment:

Ken recommended using the Jobs-to-be-Done framework to cross the chasm. Concentrate on users, but rather than just asking them what they want, focus on asking them what they are trying to do and identify their motivating factors – e.g. how will linked data help to boost my research profile?

For those willing to take the leap of faith across the chasm, Gill Hamilton of the National Library of Scotland presented a fantastic series of Top Tips! for linked data adoption which can be summarised as follows:

  • Strings to things, aka people smart, machines stupid – library databases are full of strings. People are really smart at reading strings; unfortunately, machines are really stupid. Turn strings into things with URIs so machines can read them.
  • Never, ever, ever dumb down your data.
  • Open up your metadata – license your metadata CC0 and put a representation of it into the Open Metadata Registry.  Open metadata is an advert for your collections and enables others to work with you.
  • Concentrate on what is unique in your collections – one of the unique items from the National Library of Scotland that Gill highlighted was the order for the Massacre of Glencoe.  Ahem. Moving swiftly on…
  • Use open vocabularies.

Simples! Linked Data is still risky though; services go down, URIs get deleted and there’s still more playing around than actual doing, however it’s still worth the risk to help us link up all our knowledge.
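The first of Gill's tips, strings to things, can be sketched in a few lines of Python. This is only an illustration of the idea under my own assumptions: the lookup table and the example.org URIs are invented, and in practice you would reconcile names against an established open vocabulary or authority file rather than a hand-made dictionary.

```python
# Hypothetical authority table: several name strings, one identified "thing".
# The URIs are placeholders and do not resolve to anything.
AUTHORITY = {
    "scott, walter": "https://example.org/person/walter-scott",
    "walter scott": "https://example.org/person/walter-scott",
}


def normalise(label):
    """Lower-case the label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def string_to_thing(label, authority=AUTHORITY):
    """Return the URI for a name string, or None if we cannot identify it."""
    return authority.get(normalise(label))


# Two different strings that a person reads as the same author
# resolve to one URI that a machine can follow.
print(string_to_thing("Scott, Walter"))
print(string_to_thing("  Walter   SCOTT "))
print(string_to_thing("Unknown Author"))  # None: leave the string alone rather than guess
```

Returning None for anything unrecognised keeps the original string intact, which fits with the tip about never dumbing down your data.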

Richard J Wallis brought the day to a close by asking how can libraries exploit the web of data to liberate their data?  The web of data is becoming a web of related entities and it’s the relationships that add value.  Google recognised this early on when they based their search algorithm on the links between resources.  The web now deals with entities and relationships, not static records.

One way to encode these entities and relationships is using Schema.org. Schema.org aims to help search engines to interpret information on web pages so that it can be used to improve the display of search results. Schema.org has two components: an ontology for naming the types and characteristics of resources, their relationships with each other, and constraints on how to describe these characteristics and relationships; and the expression of this information in machine readable formats such as microdata, RDFa Lite and JSON-LD. Richard noted that Schema.org is a form of linked data, but “it doesn’t advertise the fact” and added that libraries need to “give the web what it wants, and what it wants is Schema.org.”
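As a rough illustration of what this looks like in practice, here is a small Python sketch that describes a book using Schema.org terms and serialises it as JSON-LD for embedding in a web page. Book, Person, name, author and inLanguage are Schema.org terms; the function names and the example record are my own and are not taken from Richard's talk.

```python
import json


def book_as_jsonld(name, author_name, language="en"):
    """Describe a book with Schema.org types and properties as a JSON-LD dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": name,
        # The author is an entity in its own right, not just a string.
        "author": {"@type": "Person", "name": author_name},
        "inLanguage": language,
    }


def as_script_tag(doc):
    """Wrap the JSON-LD in the script element used to embed it in HTML."""
    body = json.dumps(doc, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


record = book_as_jsonld("Waverley", "Walter Scott")
print(as_script_tag(record))
```

Note that the author is expressed as a related Person entity rather than a plain text field, which is the "entities and relationships, not static records" point in miniature.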

If you’re interested in finding out more about Schema.org, Phil Barker and I wrote a short Cetis Briefing Paper on the specification which is available here: What is Schema.org? Richard Wallis will also be presenting a Dublin Core Metadata Initiative webinar on Schema.org and its applicability to the bibliographic domain on the 18th of November, registration here: http://dublincore.org/resources/training/#2015wallis.

ETA: Phil Barker has also written a comprehensive summary of this event over at his own blog, Sharing and Learning, here: A library shaped black hole in the web?

Open Silos? Open data and OER

“Open silos” might seem like a contradiction in terms, but this was one of the themes that emerged during last week’s  Open Knowledge Open Education Working Group call which focused on Open Data as Open Educational Resources. We heard from a number of initiatives including the Creating Value from Open Data project led by Universities UK and the Open Data Institute which is exploring how open data can support the student experience and bring about tangible benefits for UK higher education institutions, and Open Data as OER, led by Javiera Atenas and Leo Havemann, which is gathering case studies on the use of real world open data in educational contexts.

While the benefits of open data are widely recognised in relation to scientific and scholarly research, open data also has considerable value in the context of teaching and learning.  Many governments, non-governmental organisations and research centres are already producing large volumes of open data sets that have the potential to be used as open educational resources. Scenario based learning involving messy, real world data sets can help students to develop critical data literacy and analytical skills. And perhaps more importantly, as Javiera pointed out, working with real world open data  from real governments and communities doesn’t just help students to develop data literacy skills, it also helps to develop citizenship.

“It’s important to collaborate with local communities to work on real problems so that students can help their communities and society to improve social and political elements of their daily lives.”
~ Javiera Atenas

ETA: Javiera and Leo are collecting case studies about pedagogical uses of open data across the world. If you have a case study you would like to add, you can join the project’s idea-space here: Open Data as Open Educational Resources idea-space.

Tim Coughlan of the Open University also spoke about his experience of using open data to teach introductory programming to undergraduates. Using open data introduces an invaluable element of realism and complexity as the data is flawed and inconsistent.  Students come up against challenges that it would be difficult to introduce artificially and, as a result, they learn to deal with the kind of problems they will encounter when they get real programming jobs.

Marieke Guy, co-ordinator of the Open Education Working Group, had a similar experience of learning to work with open data:

“Authenticity is critical. You get a new level of understanding when you work with data and get your hands dirty.”
~Marieke Guy

Towards the end of the meeting there was an interesting discussion on the effect of Research Council mandates on open data and open education. Although open access, open education and open data have all made significant progress in recent years, there has been a tendency for these domains to progress in parallel with little sign of convergence. Research Council mandates may have had a positive impact on open access and open research data; however, the connection has yet to be made to open education, and as a result we have ended up with “open silos”. Indeed, open access mandates may even have a negative impact on open education, as institutions focus their efforts and resources on meeting these requirements rather than on getting their teaching and learning materials online and sharing open educational resources. So while it’s great that institutions are now thinking about how they can link their open research data with open access scholarly works, we also need to focus some attention on linking open data to open education. There’s no simple solution to breaking down the barriers between these “open silos”, but exploring the converging and competing cultures of open knowledge, open source, open content, open practice, open data and open access is just one of the themes we’ll be focusing on at the OER16 conference at the University of Edinburgh next year, so I hope you’ll be able to come and join us.