Archive for api

Web 2.0 and the Digital Humanities

What Digital Humanities tools could take from Web 2.0:

Give users tools to visualise and network their own data. And make it easy.

A good example is Last.FM. You run a program they give you that uploads the data about the songs you listen to, as you are listening to them. You can then see stats about your listening habits, and are linked with people with similar listening habits. The key thing is that you don’t have to do extra work.

Another example is LibraryThing, which makes it easy to visualise and network data about your book collection. It can’t be as automatic as Last.FM, but it does let you import any file you might happen to have with ISBNs in it.
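
To give a sense of how little an "import any file with ISBNs in it" feature needs, here is a minimal Python sketch. It is not LibraryThing's actual importer, and the function names are made up for illustration. It assumes ISBNs appear as runs of digits, optionally hyphenated, and keeps only those that pass the standard ISBN-10 or ISBN-13 check digit.

```python
import re

# Anything that looks like a run of digits, hyphens and a possible X check digit.
CANDIDATE = re.compile(r"\b[\dXx-]{10,17}\b")


def is_valid_isbn(code):
    """Return True if code (no hyphens, upper-case) has a valid ISBN check digit."""
    if re.fullmatch(r"\d{9}[\dX]", code):
        # ISBN-10: weights 10 down to 1, X counts as 10, total divisible by 11.
        total = sum((10 - i) * (10 if ch == "X" else int(ch))
                    for i, ch in enumerate(code))
        return total % 11 == 0
    if re.fullmatch(r"\d{13}", code):
        # ISBN-13: alternating weights 1 and 3, total divisible by 10.
        total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                    for i, ch in enumerate(code))
        return total % 10 == 0
    return False


def extract_isbns(text):
    """Pull every distinct valid ISBN out of arbitrary text, in order of appearance."""
    found = []
    for match in CANDIDATE.finditer(text):
        code = match.group().replace("-", "").upper()
        if is_valid_isbn(code) and code not in found:
            found.append(code)
    return found
```

The user hands over whatever file they already have, and the tool does the work of finding the useful bits.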

Compare this to a Digital Humanities project: The Reading Experience Database, which aims to accumulate records of reading experiences. They ask that if you come across any reading experiences in your research, you note them down, and submit them to the database with their online form (there are two – a 4 page form and a shorter one page form if you can’t be bothered with 4 pages of forms).
I’m not out to disparage the RED here – in many ways it is a fine endeavour. But I do want to criticise the conceptual model of how it accumulates data:
It requires that you, as a researcher, do your normal work, and then go and fill in (ideally) 4 pages of web forms for every reading experience that you have found (and possibly already documented elsewhere). Do you like filling out forms? I don’t. Worst of all, you don’t get any kind of access to the data – yours, or anyone else’s (you just have to trust they will eventually get around to coding a search page).
This doesn’t help you to do your work now.

Which brings me to my next point…

Harness the self-interest of your users

You need them to use you, so make it worth their while. Don’t ask for their help, help them!

One problem, I think, is when projects start from a research interest. They want to gather data on that topic, so they ask other researchers to help them by filling out web forms.

A better approach to gathering data, I suggest, is to help the user with their own research interests as a first priority. The guy that built del.icio.us, interestingly, said that he primarily wanted his users to tag bookmarks with the keywords that suited them best personally, to tag out of pure self interest. The network effect of their tagging is a huge side benefit, but it doesn’t need to be the reason that people use del.icio.us. The end result is something more anarchic, more used, and more useful than something like dmoz.org.

del.icio.us doesn’t say “I’m interested in French Renaissance Poetry, please fill in these forms.” It gives you a tool to keep track of your bookmarks. It lets you import bookmarks you already have, and it lets you export your data too.

Have an API

You don’t know what you’ve got until you give it away.

SOAP is good, but it doesn’t even need to be that complicated. Make sure that search results are retrievable through a URL and presented as semantic XHTML, and your data is already much more sharable (listen to Tom Coates’ presentation on the Web of Data). (Base4 has an interesting post arguing that the approach to URLs is the defining characteristic of Web 2.0.)

Sharing data in a machine-readable and retrievable format is the most important feature. It lets other people build features for you.
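
As a rough illustration of what "retrievable through a URL and presented as semantic XHTML" buys an outside developer, here is a minimal Python sketch using only the standard library. Everything specific in it is an assumption made up for the example: the `example.org` address, the `reader` and `year` parameters, and the `record`, `reader` and `title` class names. It does not describe any real project's markup.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def search_url(base, **params):
    """Build a bookmarkable search URL from plain query parameters."""
    return base + "?" + urlencode(sorted(params.items()))


def parse_results(xhtml_text):
    """Turn a results page marked up with class names into a list of dicts."""
    root = ET.fromstring(xhtml_text)
    records = []
    for li in root.iter(XHTML_NS + "li"):
        if li.get("class") != "record":
            continue
        record = {}
        for span in li.iter(XHTML_NS + "span"):
            # The class attribute doubles as the field name.
            record[span.get("class")] = (span.text or "").strip()
        records.append(record)
    return records


# Made-up sample page standing in for what a project might serve (hypothetical markup).
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
<body><ul>
<li class="record"><span class="reader">Reader A</span> <span class="title">Title A</span></li>
<li class="record"><span class="reader">Reader B</span> <span class="title">Title B</span></li>
</ul></body></html>"""
```

A third party would fetch the address returned by `search_url(...)` and pass the response body to `parse_results`. No SOAP toolkit is needed, because the page a human reads is already the data feed.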

Back in March, Dan Cohen lamented the lack of non-commercial APIs suitable for the humanities hacker. And it’s odd – humanities scholarship is a community that you would think would want to facilitate access to and reuse of its data – and the only useful APIs Dan Cohen could find (from programmableweb.org) were from the Library of Congress and the BBC. (It’s not quite as bad as that: commercial APIs are potentially useful too, and there’s also COPAC for querying UK research libraries, and of course Wikipedia.)
There are a ton of digital projects stored away in repositories, such as are provided by the AHDS, but few are much more accessible or usable in their digital form than in print.
I read that the ESTC is going to be made freely available through the British Library’s website later this year – imagine the historical mashups that could be done – the information that could be mined and visualised – if they would provide a developers’ API.

Embrace the chaos of knowledge

The exciting thing about the folksonomy approach of tagging, and the user creation and maintenance of knowledge of Wikipedia, is that they have shown that a bottom-up method of knowledge representation can be more powerful and more accurate than traditional top-down methods.
It’s a messy, flawed, pragmatic, flexible, useful, and realistic system for representing knowledge.

What do you think?

Some projects already do, and have done, some of these things for quite some time (please comment with examples!).

Perhaps it is wrong to try to apply lessons from commercial/mainstream web apps too closely to digital humanities projects, which, after all, have different aims and priorities?
There are also different types of projects (some more like resources, others more like tools?), some of which might find these points inappropriate.

What other principles (and web trends) do you think digital humanities projects should be thinking about?


www.openacademia.org

openacademia is an initiative to collect, share, publish and manage bibliographical information, the Semantic Web-way.

Information about scientific publications is often maintained by individual researchers. Reference management software such as EndNote and Bibtex help researchers to maintain a personal collection of bibliographic references. (These are typically references to one’s own publications and also those that have been read and cited by the researcher in his own work.)

Just had a quick look at openacademia.org. Wonderful – and a great use of the SIMILE Timeline.


Amazon Historical Pricing

The Amazon Historical Pricing web service gives developers programmatic access to over three years of actual sales data for books, music, videos, and DVDs (as sold by third-party sellers on Amazon.com). Sellers can use Amazon Historical Pricing to make informed decisions on pricing and purchasing.

Unfortunately it seems to be limited to getting 10 items at a time:

* Returns pricing information for up to ten items per request

Which will limit the potential for this to be used by book historians as a kind of barometer for cultural or economic change.
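
One way to live with a ten-item cap is to split a longer list of items into batches and issue one request per batch. The Python sketch below only does the splitting. It makes no call to the Amazon service, whose request format isn't shown here, and `batches` is a hypothetical helper name. Any rate limits on the service would still apply, so this eases the restriction rather than removing it.

```python
def batches(items, size=10):
    """Yield consecutive slices of items, each at most size long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example: 25 placeholder item identifiers become requests of 10, 10 and 5.
item_ids = ["item-%d" % n for n in range(25)]
request_groups = list(batches(item_ids))
```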
