Thursday, July 19, 2007

Collaborative Cataloging

This week, two new projects that attempt to provide a framework for collaborative cataloging came to my attention. The first, Freebase, describes itself as "a global knowledge base: a structured, searchable, writeable and editable database built by a community of contributors, and open to everyone," and my impression is that it's trying to build a grass-roots version of the Semantic Web. I just got an invite to try the alpha version today, so I'll post more about it when I've had a chance to experiment. For now, the O'Reilly Radar piece from March 2007 has a bit of information.

The second project is the Open Library project from the Internet Archive. Their vision: "Imagine a library that collected all the world's information about all the world's books and made it available for everyone to view and update." Pretty ambitious!

The big announcement came on Aaron Swartz's Raw Thought. For those folks who don't know who Aaron is, the summary at LISNews.org is hilarious:


What are you supposed to feel about Aaron Swartz? He co-authored RSS, served on the W3C's RDF Core Working Group, helped the wonderful John Gruber design the amazing Markdown, and developed and gave away software like rss2email that many of us use every day... and then he graduated high school.


Open Library got mentioned on several blogs and lists. Jessamyn posted about it on librarian.net, saying "[Open source cataloging is] a weird juxtaposition, the idea of authority and the idea of a collaborative project that anyone can work on and modify," and quite a few other blogs picked up the discussion.

I thought the best discussions happened on non-librarian blogs, frankly, particularly on Slashdot, where Swartz popped up to explain the vision. Deborah Richman posted the following blurb to the Search Engine Watch blog:


So who is quietly trying to solve your search and discovery problem? Librarians. This week, a new searching mechanism was announced by the OpenLibrary project, with the audacious goal of providing information about every book on the planet. No ordinary catalog here, as OpenLibrary relies on the considered librarianship of everyone who uses or contributes to it.
As usual, librarians are experimenting with access, resources and usability. We’re happy to follow their lead. In this case, it’s digital librarian and archivist Brewster Kahle, who started the Wayback Machine and has been thinking about open access for years. Yet almost no one heard about this effort, and it’s pretty interesting!
http://blog.searchenginewatch.com/blog/070718-032552



You can't buy publicity like this!

So what is Open Library? Essentially, it's a wiki (built on infogami). Slashdotters compared it (repeatedly) to IMDb and Project Gutenberg, but I think it's more like a WorldCat.org where anyone can edit. Like WorldCat, it's based on authority records, primarily from the Library of Congress; records for books digitized as part of the Open Content Alliance project have been added to those.

There are some major hurdles to overcome. As any cataloger knows, the records at WorldCat are hardly perfect; what happens when authority control goes Wikipedia? How do you deal with editions? Can the records be FRBRized? How do you prevent vandalism?
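To make the editions question a little more concrete, here is a minimal sketch of what "FRBRizing" might look like at its crudest: collapsing flat edition records under a shared work. Everything in it is an assumption for illustration. `EditionRecord`, `work_key`, and `group_into_works` are hypothetical names, the sample records are made up, and the matching rule (normalized author plus title) is far simpler than anything a real catalog, or Open Library itself, would need.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EditionRecord:
    """Hypothetical flat catalog record; the field names are illustrative only."""
    title: str
    author: str
    publisher: str
    year: int


def normalize(text):
    """Case-fold and collapse runs of whitespace so trivial variants match."""
    return " ".join(text.casefold().split())


def work_key(record):
    """Naive work-level key: normalized author and title.

    This is an assumption, not a real FRBR algorithm. It ignores
    translations, subtitles, variant author names, and so on.
    """
    return (normalize(record.author), normalize(record.title))


def group_into_works(records):
    """Group edition records under the work key they share."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record)
    return dict(works)


# Made-up sample records, for illustration only.
records = [
    EditionRecord("Moby Dick", "Melville, Herman", "Example Press", 1900),
    EditionRecord("moby  dick", "Melville, Herman", "Sample Books", 2000),
    EditionRecord("Typee", "Melville, Herman", "Example Press", 1910),
]

works = group_into_works(records)
for key, editions in works.items():
    print(key, [edition.year for edition in editions])
```

Even this toy version shows where the trouble starts: two people typing the same author's name differently would split one work in two, which is exactly the job authority control is supposed to do.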

Despite these questions, I think this is pretty exciting stuff. I'm always interested in how information forms evolve, and I tend to think wikis in general are the next evolution of the book (see, for example, the WikiBooks project); they're far better for collaboration than other forms of editing. This could be the next evolution of cataloging, particularly when we can start plugging some web services into it. Hmmm...
