Monday, March 30, 2009

LibraryThing at Computers In Libraries 2009

LibraryThing, your favorite makers of libraries in computers, will be at Computers in Libraries this week. We'll be passing out free stuff and showing off our new LibraryThing for Libraries feature, so if you're at CIL, stop by booth 214 and say hi. Unfortunately, we're rhino-less this time, but we do have T-shirts and laptop stickers (and Tim).

Our new feature allows our catalog enhancements to run even on items that don't have an ISBN. Check it out in action on this 1948 edition of Tom Jones, or this 1937 edition of David Copperfield.

There's no ISBN on those items, but our code is still smart enough to load the right tags and recommendations info. It uses a combination of our new What Work API and the LibraryThing Connector (the JavaScript that powers LTFL) to pull title and author information out of the catalog's HTML and then match it against our system. This new feature should help our academic libraries in particular, since they tend to have a lot of older pre-ISBN books.

Tuesday, March 10, 2009

New API: What work?

I've added a small but nifty new API that provides a sure-fire way of connecting any site's book data to LibraryThing.

The "What work?" API takes an ISBN and/or the book title and author and returns the LibraryThing work number, with link URL in XML.

It's a very forgiving algorithm—these all lead to my wife's The Mermaids Singing.
In sum, if you can't connect your data to LibraryThing now, you aren't trying!

If there's interest, I can add a JSON version.


*You need to provide either an ISBN (ISBN10 or 13; with dashes or not) or a title and author. Authors can be in last-first (preferred) or first-last (fine). You can omit the author and tack it onto the title, e.g., "Huckleberry Finn / Twain, Mark." It's very forgiving about punctuation, capitalization and so forth. It doesn't make wild guesses, but it makes sensible ones.
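
To illustrate the accepted input shapes, here is a small sketch that only prepares strings in those forms. The helper names (cleanIsbn, toLastFirst, titleWithAuthor) are hypothetical and not part of the API, and the name-flipping rule is deliberately naive: it assumes the final word is the surname.

```javascript
// Hypothetical helpers illustrating the input forms described above.
// None of these names come from LibraryThing; they only shape strings.

// Strip dashes and spaces from an ISBN.
function cleanIsbn(isbn) {
  return String(isbn).replace(/[\s-]/g, "");
}

// Turn a first-last name into the preferred last-first form.
// Names that already contain a comma are returned unchanged.
// Assumption: the final word is the surname.
function toLastFirst(author) {
  const name = author.trim();
  if (name.includes(",")) return name;
  const parts = name.split(/\s+/);
  if (parts.length < 2) return name;
  const last = parts.pop();
  return `${last}, ${parts.join(" ")}`;
}

// Tack the author onto the title, separated by " / ".
function titleWithAuthor(title, author) {
  return `${title.trim()} / ${toLastFirst(author)}`;
}

console.log(cleanIsbn("0-7653-4462-9"));
console.log(titleWithAuthor("Huckleberry Finn", "Mark Twain"));
```

Since the API is described as forgiving, this kind of tidying is probably optional; it mainly helps keep your own data consistent.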

Monday, August 04, 2008

API to Common Knowledge

In case you don't subscribe to the main blog, there's a development there of interest to readers of this blog: We've unwrapped a free public API to all our Common Knowledge data—series, fictional places, characters, author educational histories, etc.

I'd love to see some of this data appear in library catalogs. The series coverage is really quite excellent.

At one point I made a series widget for LibraryThing for Libraries--listing other members of the series--but I didn't deploy it. There was some concern that LT's series data would fight with the libraries' own series data. If an LTFL library wants to use it, however, let me know.

Monday, July 07, 2008

LibraryThing JSON-based books API

Over on the main blog I posted news about the new LibraryThing JSON-based books API (see here). The new API, which supplements our works API, comes with a small library of functions to manipulate it--all open source.

The API should interest libraries, since there are a couple of cool things they can do with it. For example, libraries that use LibraryThing to showcase new or selected titles (a very popular thing) should, with a few tweaks, be able to create a widget that links into their OPAC, not to Amazon or whomever.

I'll probably write some basic functions to change linking along these lines, if someone doesn't do it for me first...

Tuesday, March 25, 2008

First cut: Works JSON API

I've finished a simple JavaScript/JSON API to LibraryThing's core work information. In structure and implementation the API resembles Google's recent Book Search API, but for LibraryThing.

Purpose. The API is designed to help libraries and others to add links to LibraryThing when LibraryThing has a book, and omit them when we don't. It's an easy conditional-linking system.

But the API returns other work information too, including the number of copies, number of reviews and average rating (with rating image). It comes with a simple function to insert the data where appropriate, but you can funnel this information to functions of your own devising.

Scope. This is an API to work information. Once I've worked through the kinks here, I plan to release a member API, allowing members to do clever things with their data. For example, members will be able to make their own widgets, not just rely on ours.

How it works. The basic mode of operation is to insert a script as follows:
<script src="http://www.librarything.com/api/json/workinfo.js?ids=*******"></script>
The ******* is a placeholder for the ISBNs (or other identifiers) you want to look up on LibraryThing, separated by commas. NOTE: This script should be placed at the bottom of the page.

For example, the JSON API Test includes one ISBN-10, one ISBN-13, one LCCN and one OCLC number.
<script src="http://www.librarything.com/api/json/workinfo.js?ids=0066212898,9780520042728,99030698,ocn8474750911"></script>
The script returns a hunk of JavaScript, including both the simple function and the JSON hash with all the book data. The hash is sent to a function of your choosing, or the simple LT_addLibraryThinglinks by default. To name another callback function add &callback= and the function name to the URL.
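
If you build these URLs in code rather than by hand, something like the following would do. This is only a sketch: the base URL and the ids and callback parameters come from the post, while buildWorkinfoUrl and the callback name showWorks are made up for illustration.

```javascript
// Base URL taken from the script tag shown above.
const WORKINFO_URL = "http://www.librarything.com/api/json/workinfo.js";

// Build the script URL for a list of identifiers.
// `callback` is optional; when omitted, the API's default
// LT_addLibraryThinglinks function receives the data.
function buildWorkinfoUrl(ids, callback) {
  const idList = ids.map((id) => encodeURIComponent(id)).join(",");
  let url = `${WORKINFO_URL}?ids=${idList}`;
  if (callback) {
    url += `&callback=${encodeURIComponent(callback)}`;
  }
  return url;
}

const ids = ["0066212898", "9780520042728", "99030698", "ocn8474750911"];
console.log(buildWorkinfoUrl(ids));
console.log(buildWorkinfoUrl(ids, "showWorks")); // showWorks is a hypothetical callback
```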

The function LT_addLibraryThinglinks looks for elements (DIVs, SPANs, etc.) with the ID "LT_xxx" where xxx is one of your identifiers. If LibraryThing has a work, it adds "(See on LibraryThing)", with link. If not, it does nothing.

Here's the JavaScript returned for the URL above:

LT_addLibraryThinglinks(
{
"0066212898":
{"id":"0066212898","type":"isbn","work":"3702986","link":"http:\/\/www.librarything.com\/work\/3702986","copies":"105","reviews":"7","rating":8.33,"rating_img":"http:\/\/www.librarything.com\/pics\/ss8.gif"},
"9780520042728":
{"id":"9780520042728","type":"isbn","work":"44723","link":"http:\/\/www.librarything.com\/work\/44723","copies":"92","reviews":"3","rating":8.47,"rating_img":"http:\/\/www.librarything.com\/pics\/ss8.gif"},
"99030698":
{"id":"99030698","type":"lccn","work":"32155","link":"http:\/\/www.librarything.com\/work\/32155","copies":"345","reviews":"10","rating":7.8,"rating_img":"http:\/\/www.librarything.com\/pics\/ss8.gif"},
"ocn8474750911":
{"id":"ocn8474750911","type":"oclc","work":"4161224","link":"http:\/\/www.librarything.com\/work\/4161224","copies":"1","reviews":"0","rating":0,"rating_img":""}}
);
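
A callback of your own receives the same hash. Here is a minimal sketch of one that turns the data into plain text lines instead of touching the page. The function name summarizeWorks is hypothetical, and the post does not say what an unknown identifier looks like in the hash, so the sketch assumes such an entry simply lacks a "link" and skips it.

```javascript
// Hypothetical callback; it would be named in the URL via &callback=summarizeWorks.
function summarizeWorks(works) {
  const lines = [];
  for (const id of Object.keys(works)) {
    const info = works[id];
    // Assumption: identifiers LibraryThing does not know have no "link".
    if (!info || !info.link) continue;
    lines.push(
      `${id} (${info.type}): ${info.copies} copies, ` +
      `${info.reviews} reviews, rating ${info.rating} -> ${info.link}`
    );
  }
  return lines;
}

// Two entries copied from the response shown above.
const sample = {
  "0066212898": {
    "id": "0066212898", "type": "isbn", "work": "3702986",
    "link": "http://www.librarything.com/work/3702986",
    "copies": "105", "reviews": "7", "rating": 8.33,
    "rating_img": "http://www.librarything.com/pics/ss8.gif"
  },
  "ocn8474750911": {
    "id": "ocn8474750911", "type": "oclc", "work": "4161224",
    "link": "http://www.librarything.com/work/4161224",
    "copies": "1", "reviews": "0", "rating": 0, "rating_img": ""
  }
};

console.log(summarizeWorks(sample).join("\n"));
```

Note that copies and reviews arrive as strings while rating is a number, so convert them before doing arithmetic.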
More later. It's 2:48am and I need to get to bed. There's much more to say, of course.

Friday, February 15, 2008

ThingISBN adds LCCNs, OCLC numbers

ThingISBN, our popular ISBN-based API, supports and returns data for two more identifiers: LCCN and OCLC.

At core, ThingISBN (blogged before here and here) takes an ISBN and returns a simple XML list of other ISBNs, corresponding to other "editions" of the work, e.g.,
http://www.librarything.com/api/thingISBN/0590353403
Now, if you add &allids=1 to the ISBN, the XML will include relevant LCCN and OCLC numbers, e.g.,
http://www.librarything.com/api/thingISBN/0590353403&allids=1
You can also feed ThingISBN either of the new numbers, e.g.,
http://www.librarything.com/api/thingISBN/lccn97039059
http://www.librarything.com/api/thingISBN/ocm37975719
If you feed it an LCCN or an OCLC number you don't need to add "&allids=1" to get back these identifiers.
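
The URL patterns above can be wrapped in one small helper. A sketch only: the base URL, the lccn and ocm prefixes and the &allids=1 suffix are copied from the examples, while the function name thingIsbnUrl and its options are invented here.

```javascript
// Base URL taken from the examples above.
const THINGISBN_URL = "http://www.librarything.com/api/thingISBN/";

// Prefixes as they appear in the example URLs.
const PREFIXES = { isbn: "", lccn: "lccn", oclc: "ocm" };

// Build a ThingISBN URL for an ISBN, LCCN or OCLC number.
function thingIsbnUrl(id, { kind = "isbn", allIds = false } = {}) {
  if (!Object.prototype.hasOwnProperty.call(PREFIXES, kind)) {
    throw new Error(`unknown identifier kind: ${kind}`);
  }
  let url = THINGISBN_URL + PREFIXES[kind] + id;
  // The examples append "&allids=1" directly to the ISBN, so this does too.
  // LCCN and OCLC lookups return all identifiers without it.
  if (kind === "isbn" && allIds) {
    url += "&allids=1";
  }
  return url;
}

console.log(thingIsbnUrl("0590353403"));
console.log(thingIsbnUrl("0590353403", { allIds: true }));
console.log(thingIsbnUrl("97039059", { kind: "lccn" }));
console.log(thingIsbnUrl("37975719", { kind: "oclc" }));
```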

What's next?
  • I haven't added LCCNs and OCLC numbers to the ThingISBN feed, yet.
  • Although there are some details to be worked out, this advance looks forward to adding support for LCCNs and OCLC numbers to LibraryThing for Libraries.
Tell us what's going on. I know that ThingISBN gets a lot of use, some of it even in accordance with its Terms of Use. If you're using ThingISBN, I'd love to hear how on a new wiki page I've created, Projects Currently Using ThingISBN.

Caveat. ThingISBN is free for non-commercial use. Commercial use requires our say-so. Read more here.

In the news! Coincidentally, LCCNs are in the news this week. Yesterday, the Library of Congress announced an "LCCN Permalink," a smart bid to convert a vital but underused set of permanent, unique IDs, the LCCN (Library of Congress Control Number), into the regnant permanent, unique ID, the URL. See Catalogablog for the announcement.

Thursday, January 24, 2008

ISBN check API

A smart young programmer from a book-related company and I were talking. It turns out that, to validate ISBNs and get back both 10- and 13-digit versions, he was submitting ISBNs to Amazon Web Services. That's like calling NORAD to find out if it's raining.* Nor did he seem likely to hunt around for an ISBN library for Ruby. After all, what he was doing worked.

So I made a quick, very stupid API, i.e., http://www.librarything.com/isbncheck.php?isbn=0765344629
  • Give it any old ISBN and it does the math to return the ISBN10 and ISBN13 forms, if both exist.
  • It removes dashes and other junk.
  • It transparently fixes missing initial zeroes. This is a common problem with data from Excel files, which turn 0765344629 into 765344629.
  • If the ISBN isn't valid and can't be easily fixed, it returns an error.
Don't hit it more than 10 times/second. Otherwise, there are no usage restrictions.
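
For anyone who would rather do the math locally, here is a minimal sketch of the standard ISBN check-digit arithmetic. It is not the code behind isbncheck.php; the function names are invented, and the zero-padding rule (treat a 7- to 9-character input as an ISBN10 that lost its leading zeroes) is an assumption.

```javascript
// Strip dashes and other junk, keeping digits and a possible X check digit.
function normalize(raw) {
  let s = String(raw).toUpperCase().replace(/[^0-9X]/g, "");
  // Assumption: short input is an ISBN10 whose leading zeroes were dropped.
  if (s.length >= 7 && s.length < 10) s = s.padStart(10, "0");
  return s;
}

// ISBN10 check digit: weights 10 down to 2 over the first nine digits, mod 11.
function isbn10CheckDigit(first9) {
  let sum = 0;
  for (let i = 0; i < 9; i++) sum += Number(first9[i]) * (10 - i);
  const c = (11 - (sum % 11)) % 11;
  return c === 10 ? "X" : String(c);
}

// ISBN13 check digit: alternating weights 1 and 3 over twelve digits, mod 10.
function isbn13CheckDigit(first12) {
  let sum = 0;
  for (let i = 0; i < 12; i++) sum += Number(first12[i]) * (i % 2 === 0 ? 1 : 3);
  return String((10 - (sum % 10)) % 10);
}

function isValidIsbn10(s) {
  return /^\d{9}[\dX]$/.test(s) && isbn10CheckDigit(s.slice(0, 9)) === s[9];
}

function isValidIsbn13(s) {
  return /^\d{13}$/.test(s) && isbn13CheckDigit(s.slice(0, 12)) === s[12];
}

// Return both forms where they exist, or an error for unfixable input.
function checkIsbn(raw) {
  const s = normalize(raw);
  if (isValidIsbn10(s)) {
    const body = "978" + s.slice(0, 9);
    return { isbn10: s, isbn13: body + isbn13CheckDigit(body) };
  }
  if (isValidIsbn13(s)) {
    // Only 978-prefixed ISBN13s have an ISBN10 equivalent.
    const core = s.slice(3, 12);
    const isbn10 = s.startsWith("978") ? core + isbn10CheckDigit(core) : null;
    return { isbn10, isbn13: s };
  }
  return { error: "invalid ISBN" };
}

console.log(checkIsbn("765344629"));
console.log(checkIsbn("978-0-7653-4462-5"));
```

Padding is safe to try because a wrongly padded number will almost always fail the check digit and come back as an error.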

*Amazon take note—I got your back, buddy!

Tuesday, January 08, 2008

While you were sleeping, ThingISBN got better.

LibraryThing does a lot of cool things nobody else does. And, as we grow, we do them better and better.

I've got a very good example for today: the ThingISBN service. It was good when it was launched more than a year ago, becoming LibraryThing's first API, and it's been getting better ever since. (And where its competitor became a paid service, ThingISBN is still free for non-commercial use.)

The ThingISBN service provides something called "edition disambiguation." Give it an ISBN and it will shoot back a list of "related" ISBNs—other editions, other media, and translations. Edition disambiguation is valuable stuff. Retailers use it to aggregate reviews and other data across editions, and to sell you something when the book you searched for is no longer available. Libraries use it to make sure a patron leaves with a copy of a book, even if the edition the patron searched for is checked out.

You can get ThingISBN in two ways:
  • As a REST-based API. Just change the ISBN in this URL as needed.
  • As a complete feed (thingISBN.xml.gz in /feeds). We ask that people not hit the API more than 1,000 times per day. Instead, pick up the full feed.
What's cool here? LibraryThing isn't the only supplier of this data. The other supplier, OCLC, the Dublin, Ohio-based library data organization, compiles its data through clever automated analysis of OCLC's billion-plus records. Their data and algorithms do a great job. Unfortunately, they charge for the service, called xISBN.

LibraryThing does it differently, relying instead on members, who add, combine and separate editions by the thousands every day. For doing this, LibraryThing members get better connections with other users. That is, you gain connections and enhanced recommendations by connecting your edition with others. The result is a detailed set of correspondences between editions, assembled by thousands and improving every day.

You've got to admit it's getting better. If you improve every day, you can get pretty good, and that's what's happened to ThingISBN. OCLC still beats LibraryThing in quantity, but LibraryThing is closer, and, it seems to me, has a clear advantage for paperbacks.

I want to revisit some of the examples I gave when ThingISBN debuted:
  • OCLC's canonical example is Frank Herbert's Dune. I don't have the exact counts, but LibraryThing originally trailed OCLC. (I know because I used it as an example in a number of talks.) As of now, however, LibraryThing has passed OCLC, with 89 ISBNs to OCLC's 80.
  • Peter Green, Alexander of Macedon. When ThingISBN started, both LibraryThing and OCLC knew the recent hardback, and one other edition. That is, LibraryThing knew the paperback and OCLC knew the 1974 first edition. Since then, LibraryThing has discovered the first edition, giving it three ISBNs; OCLC still doesn't know about the paperback.
  • Lee Strobel, The Case for a Creator. OCLC knew of two editions, LibraryThing eight. OCLC now knows three, LibraryThing eleven. It's about paperbacks, obviously.
  • Emily Bronte, Wuthering Heights. Originally LibraryThing had 92 ISBNs, OCLC a commanding 326 ISBNs. OCLC is still in the lead, with 424 ISBNs, but LibraryThing has more than tripled its count, to 285.
Now, I'm quite sure that, overall, OCLC's xISBN service still beats LibraryThing in coverage. LibraryThing only covers 2.7 million ISBNs. OCLC must cover more.

But LibraryThing is gaining. It's getting better faster.

And while OCLC continues to sink resources, including staff, into the project (now a paid service for all but minimal use, run by its Peace-is-War-ish Openly division), I can tell you honestly that I haven't touched ThingISBN in six months. I haven't made it better, even a little. Members made it better.

Now as then, that's pretty revolutionary stuff.

See you next January, OCLC.

Thursday, March 15, 2007

thingISBN data in one file

thingISBN is a simple API for discovering related editions. Give it an ISBN and it returns a list of other ISBNs—different formats, translations, etc. We offer the API free for non-commercial use. Today we're releasing thingISBN in one giant feed, under the same conditions.*

thingISBN is based on LibraryThing's first-of-its-kind "work" system, by which regular people—LibraryThing members, mostly—combine and separate editions. Members run over 2,000 work-combination actions per day. Although some do it for pure altruism, combining editions helps LibraryThing users by improving the quality of their connections.

LibraryThing's results compare very favorably with its competition, OCLC's xISBN service (also free for non-commercial use). xISBN's coverage is better, but where LibraryThing is built on the collective judgment of humans, xISBN is just a computer algorithm. As the fella says, xISBN is "based on a world which is built on rules and because of that, [it] will never be as strong or as fast as [thingISBN] can be."**

APIs, while nifty, can be a pain. Both thingISBN and xISBN have a 1,000-per-day limit. So, starting today, thingISBN is also available in feed format—one giant XML file with all the data from over two million unique ISBNs.

Here's a sample file with just 1000 ISBNs:
http://www.librarything.com/feeds/thingISBN_small.xml

As you can see, the format is not ISBN-to-ISBNs. This would involve too much repetition—the full XML file is already 96MB! Instead, it goes work by work, listing the ISBNs inside them:
<work workcode="183">
<isbn>0802150845</isbn>
<isbn>0802143008</isbn>
<isbn>2020006014</isbn>
<isbn>0745300359</isbn>
<isbn>0394179900</isbn>
<isbn>9867574397</isbn>
<isbn uncertain="true">999107371X</isbn>
</work>
This format should go into a database well, e.g.,
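-- One row per (work, ISBN) pair from the feed
CREATE TABLE isbn_to_work (
  itw_workcode mediumint(8) unsigned NOT NULL,    -- workcode attribute of <work>
  itw_isbn char(13) NOT NULL,                     -- ISBN as listed in the feed
  itw_uncertain tinyint(4) NOT NULL default '0',  -- 1 when uncertain="true"
  PRIMARY KEY (itw_workcode,itw_isbn)
);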
As you can see, some ISBNs are listed as "uncertain." This happens when an ISBN crosses works. In a perfect world, these works would be combined, but LibraryThing doesn't do it automatically. There are a couple of ways that can go wrong. For example, "great books" sets often sport a single ISBN across volumes. It wouldn't do to combine "Pride and Prejudice" with "Moby Dick" just because their publisher wouldn't pony up for two ISBNs.

So, you can use the "uncertains" if you are willing to accept more errors. Otherwise, ignore them.
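
As a rough sketch of how the work-by-work format maps onto rows like those in the table above, the following scans the sample work and skips "uncertains" unless asked to keep them. The function name parseWorks is invented, and the regex scan is an assumption that only suits this flat format; the full 96MB file would call for a proper streaming XML parser.

```javascript
// Turn <work>/<isbn> markup into rows of { workcode, isbn, uncertain }.
// Assumption: the feed is as flat as the sample, so a regex scan suffices.
function parseWorks(xml, { includeUncertain = false } = {}) {
  const rows = [];
  const workRe = /<work\s+workcode="(\d+)"\s*>([\s\S]*?)<\/work>/g;
  for (const [, workcode, body] of xml.matchAll(workRe)) {
    const isbnRe = /<isbn(\s+uncertain="true")?\s*>([0-9Xx]+)<\/isbn>/g;
    for (const [, flag, isbn] of body.matchAll(isbnRe)) {
      const uncertain = flag !== undefined;
      // Skip ISBNs that cross works unless the caller accepts more errors.
      if (uncertain && !includeUncertain) continue;
      rows.push({ workcode: Number(workcode), isbn, uncertain });
    }
  }
  return rows;
}

// The sample work from the post.
const sampleFeed = `
<work workcode="183">
<isbn>0802150845</isbn>
<isbn>0802143008</isbn>
<isbn>2020006014</isbn>
<isbn>0745300359</isbn>
<isbn>0394179900</isbn>
<isbn>9867574397</isbn>
<isbn uncertain="true">999107371X</isbn>
</work>`;

console.log(parseWorks(sampleFeed).length);
console.log(parseWorks(sampleFeed, { includeUncertain: true }).length);
```

Each row corresponds to one insert into a table such as isbn_to_work, with the uncertain flag stored as 0 or 1.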

The feed itself is in http://www.librarything.com/feeds/ and is called "thingISBN.xml.gz". It is 16MB compressed.

We'd love to hear what people are doing with the data.

*Commercial use requires our permission. See http://www.librarything.com/api.php.
**Okay, the comparison is inexact, but OCLC does have a "Matrix" feel to it.

Wednesday, June 14, 2006

Introducing thingISBN

UPDATE: thingISBN is now also available in feed format.

Many of you are familiar with OCLC's xISBN service. Give it an ISBN and it returns a list of "associated" ISBNs from WorldCat. So—xISBN's canonical example goes—give it an ISBN for one edition of Dune, and it will return a list of ISBNs of other editions, in XML format. This is red meat for mashups. (Speaking of which, did you know about Talis' Mashing up the Library competition?)

Today I'm releasing "thingISBN," LibraryThing's "answer" to xISBN. Under the hood, xISBN is a test of FRBR, a highly-developed, well thought-out way for librarians to model bibliographic relationships. By contrast, thingISBN is based on LibraryThing's "everyone a librarian" idea of bibliographic modeling. Users "combine" works as they see fit. If they make a mistake, other users can "separate" them. It's a less nuanced and more chaotic way of doing things, but can yield some useful results.

To use thingISBN, point your browser at a URL like this, replacing the ISBN as appropriate:

To compare xISBN and thingISBN add &compare=1

thingISBN vs. xISBN.
UPDATE: OCLC has disallowed comparison.
I've done some preliminary comparisons between the two services. The results are pretty interesting. For starters, OCLC has much broader ISBN coverage. The dataset is orders of magnitude larger, and "regular people" just don't own certain books. Where the data sets overlap, however, LibraryThing can contribute a lot, particularly when it comes to paperbacks and non-US editions.

Examples:
  • 031228884 (Elizabeth Cook, Achilles). Recently-published novel. OCLC and LibraryThing know about two ISBNs. LibraryThing adds two others, a UK hardback and a UK paperback.
  • 0553212583 (Wuthering Heights). OCLC and LibraryThing share 60 editions. OCLC alone knows 266. LibraryThing alone knows 32.
  • 0520071654 (Peter Green, Alexander of Macedon...). OCLC and LibraryThing both know this hardcover ISBN. LibraryThing knows the paperback, but OCLC includes the 1974 first edition.*
  • 0310241448 (Lee Strobel, The Case for a Creator). OCLC and LibraryThing know of one hardcover edition. OCLC knows of no other editions. LibraryThing knows of seven others. Wow.**
  • 0393049841 (Jason Epstein, The Book Business). OCLC and LibraryThing share two ISBNs. OCLC knows one by itself. LibraryThing also knows one by itself, but it's to Simple Pineapple Crochet. Yes, you read that right. I'm not sure where the error is from, but it's either a pitfall of the "everyone is a librarian" system, or of LibraryThing's occasionally ratty data.
Mashups? I brought out thingISBN in part to provide more grist for Talis' Mashing up the Library competition. I was careful to make thingISBN's output follow the conventions of xISBN, so that existing xISBN code could be reused. I'm looking forward to seeing whether anyone does anything with it. (One obvious application would be as an addition to LibX, an open-source Firefox extension that leverages xISBN to help you find things in your library. Here's an excellent screencast of it at work.)

As usual, comments, criticisms, bug reports and feature requests are asked for and gratefully received.

The fine print. By using thingISBN you agree to the following terms and conditions:
  • thingISBN is available for non-commercial use only.
  • You cannot hit thingISBN more than once per second.
  • If you're going to hit thingISBN more than 1,000 times/day, you must notify LibraryThing (we'd love to hear what you're doing). This is the current policy. If thingISBN turns out to be a success I'll optimize the code more, put it on my second server and allow it to be hit as hard as people want to hit it.
  • ThingISBN is provided "as is," without any promises or guarantees. LibraryThing is not responsible for any errors in the data, damages resulting from its use, your teenager's attitude or the state of the world generally.
  • We reserve the right to change these terms and generally make things up as we go.
*Scratch that. LibraryThing knows it now too. A user had it, but it wasn't combined; I went ahead and combined it. Actually, Green changed a lot between editions, but they still qualify as one "work." (This edition, with another ISBN, may also be the same work, but I'm not sure, so I left it.)
**I started looking around to see if this disparity was true in general of religious books. I think it isn't, or at least the effect isn't as striking.
