Monday, May 25, 2009

Better statistics, other improvements

I spent the weekend cooking up code, not sausages:

1. Series statistics. By popular demand, the member Series Statistics page can now show the books you have from a series in the context of the complete series. (See talk post.)

2. Awards, characters and places. I've added similar statistics pages for three other "Common Knowledge" categories—Awards, Characters, Places. (See talk post.)

I also added series, awards, characters and places stats in your profile* and the "Your Zeitgeist" box on your home page (see talk post.)

3. More Green Checkmarks. Green check-marks, the marks that show when you have a work, have spread further. They now appear on work-page recommendations, recommendation pages and in other members' catalogs. (See talk post.)

4. Power Edit gets better. Previously, you could only Power Edit a page at a time (i.e., no more than 100 books at a time). I added a feature to allow you to power-edit all the books in a given result set. So, you can do all your books, all the books that match a particular search, etc.



See the talk post.

5. Message Flagging. I've improved message-flagging in Talk, so that members can reverse their flagging, as well as counter-flag a message, if they think it was wrongly flagged. (See talk post.)

I also proposed making the Wikipedia policy "Assume good faith?" an official LibraryThing policy, triggering a lively debate about community norms, just what spam is and so forth. See the talk post.


*Originally high, but I moved it down when members hollered.


Monday, February 09, 2009

One million facts

"Now, what I want is, Facts. Teach these boys and girls nothing but Facts. Facts alone are wanted in life. Plant nothing else, and root out everything else. You can only form the minds of reasoning animals upon Facts: nothing else will ever be of any service to them. This is the principle on which I bring up my own children, and this is the principle on which I bring up these children. Stick to Facts, sir!" — first paragraph of Dickens' Hard Times
Three cheers for LibraryThing's diligent members. Our Common Knowledge system has hit 1,000,000 member contributions.

Common Knowledge is an innovative "fielded wiki" for book information—collaborative, piecemeal "cataloging" of information about books and authors. We created it back in October 2007—Chris did most of the coding—and it has exceeded our expectations.

The focus is on things not found anywhere else, things not cataloged by librarians or publishers. The system's biggest strength is probably its series coverage, 26,890 and counting. More comprehensive than paid series data, it is also often of higher quality. There is surely no library in the world that accounts for the Star Wars series (plural) better than what LibraryThing members have assembled! Common Knowledge also tracks some 8,860 awards, from the Wolfson History Prize to the Nestlé Smarties Book Prize.

Fun, if not quite as full, are lists of 78 books with Lincoln in them, and 23 with Emma Goldman and Puck. Almost 1,700 books take place in New York, 90 on Mars and 49 in Hell. Some 626 authors went to Harvard, three were gas station attendants and four were buried in Uppsala Cathedral. No doubt, there are more of all, but the data is starting to really pile up, a confirmation that Social Cataloging is no joke.

Wherever Common Knowledge goes, it will not be locked up. All Common Knowledge data is free for reuse outside the site, with a handy API as well.

Picking up. The one-millionth entry came early. Edits picked up dramatically when, ten days ago, I introduced a Dead or Alive? page for every member, allowing you to find out how your authors break down on the living/dead scale. They went through the roof when I introduced a similar Male or Female? page. CK also attracted some interest from the initial release of distinct authors—a method for distinguishing between distinct, homonymous authors. (It was a busy weekend.)

The one-millionth Common Knowledge entry was added at 6:47pm (EST) by ladybug1983, who assigned the contemporary romance Taking the Heat as the third book in the series O'Neil Family.

Hey LadyBug, want a t-shirt?


Wednesday, November 19, 2008

Common Knowledge: Names, Relationships and Events

Chris and I have introduced four new Common Knowledge fields, for authors and works.

Author Names. LibraryThing's author system is personally libertarian and globally democratic. You can change your own author names to your heart's delight. On the global level author names are combined and separated by members, with the most common name ending up on top.

That system has two main problems. First, LibraryThing has no good method for separating out homonymous authors. (It's a big problem; it's on our list.) And most-common logic has its limitations, particularly in picking the best name for an author and in laying out what the many variants mean.

To improve things we've added a number of optional name fields. "Canonical name" was already there, as a foolproof way to set the "most common" form. To this we've added "Legal name" and "Other names."

"Legal Name" is provided for users who want to record the most accurate, most fiddly form of a name, e.g., "George Gordon Byron, 6th Baron Byron." It can hold multiple names, to capture given names, and so forth.* "Other names" is for pen names, aliases, stage names, etc.

Two examples should illustrate the differences nicely:

Canonical Name: Twain, Mark
Legal Name: Clemens, Samuel Langhorne
Other Names: Snodgrass, Quintus Curtius

Canonical Name: Rice, Anne
Legal Name: Rice, Howard Allen Frances O'Brien
            O'Brien, Howard Allen (given)
Other Names: Rampling, Anne
             Roquelaure, A. N.
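
For the curious, the naming logic described above (the most common name ends up on top, unless a "Canonical name" has been set) might be sketched roughly like this. It is only an illustration, not LibraryThing's actual code; the function name `display_name` and the sample name list are made up for the example.

```python
from collections import Counter


def display_name(member_names, canonical_name=None):
    """Pick the name shown at the top of an author page (hypothetical helper).

    member_names: every form of the name as entered in members' catalogs.
    canonical_name: optional Common Knowledge override; wins when set.
    """
    if canonical_name:
        return canonical_name
    if not member_names:
        return None
    # most_common(1) returns the single most frequent entry;
    # ties go to whichever form was seen first.
    return Counter(member_names).most_common(1)[0][0]


# Made-up sample data: three members' versions of the same author.
names = ["Twain, Mark", "Clemens, Samuel", "Twain, Mark"]
print(display_name(names))
print(display_name(names, canonical_name="Clemens, Samuel Langhorne"))
```

The override is checked first, which is what makes a canonical name "foolproof": it does not matter how the votes fall.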

Relationships. We've also added a "Relationships" field, intended to capture when an author's spouse, son or other relative is also an author (e.g., Martin Amis). So far at least, it's only intended to capture author-to-author relations, creating author-page links. LibraryThing can't be an all-out genealogy site!**

The result can be rather fun. Starting from Isabel Fonseca, author of Attachment, you can now go to well-known British novelist Martin Amis, to his well-known father Kingsley Amis, to his second wife, the British novelist Elizabeth Jane Howard, to her first husband Peter Scott, a popular naturalist whose father was Robert Falcon Scott (Scott of the Antarctic) and godfather Peter Pan author J. M. Barrie, great grandfather of Kevin Bacon (not true).

Events. We've also added an "Important Events" field to works. "Important Events" now follows "Persons" and "Important Places." It was designed for events like the Great Fire of London, World War II or the 2000 Election.

As with Important Places, it is useful to agree on terms. CK's autocomplete function helps there. When in doubt, however, I'd go with the Wikipedia form for both fields.


*Porn names not allowed.
**I'm not so sure about "friend" relationships, although that's currently allowed. I found it difficult enough to reach an end from Isabel Fonseca. With friends, I don't think I could have ever stopped.


Monday, August 11, 2008

Series, Awards, Characters, Places

Some time ago we added pages for series. We've now added pages for three other Common Knowledge fields: Awards, Important Places and People/Characters.

All four page types, together with the author pages, now also sport extensive cross-linking, so you can get from Stephen King to the Bram Stoker Awards to Hannibal Lecter to the Marquis de Sade to Cornwall to Guenevere. (Bonus points if you can get back!)

Here are some observations on the various page types:

Awards. Awards are important to a lot of readers. Personally I have no use for them, but they're fun to browse through. And there are so many! Sure, we've all heard of the British Book Awards or the Hugo. But how about the Compton Crook Award, Macavity Award or Printz Award?

Places. Some of the most interesting places are the small ones. Paris is already too much, and even Philadelphia. But Antarctica is small enough to take in, and large enough to be interesting. So too Martha's Vineyard and Petra, Jordan (one part Left Behind, one part Indiana Jones and another academic).

But we need more for Faerie, Hell and particularly Moldova. As for Nuevo Rico, where are the Nuevo Ricans!

Speaking of odd, The Playboy Mansion is currently occupied by Shel Silverstein. What?

Series. Series pages aren't new. But I might as well mention that series are the most complete, best Common Knowledge data. It's not just Harry Potter, Star Wars or His Dark Materials, but also New American Nation, Time-Life: Mysteries of the Unknown and Hellenistic Culture and Society.

People/Characters. A lot of fun can be had here, particularly with characters that cross between fiction and non-fiction, like Lincoln and Alexander the Great and Pope Alexander VI. You will, of course, find familiar faces like Jack Aubrey, Gandalf and Sherlock Holmes.

Fun can be had with minor characters. Take Reepicheep from the Chronicles of Narnia. Can you remember which books he appears in? (It's Prince Caspian, The Voyage of the Dawn Treader and The Last Battle; if you found that easy, how about Jill Pole?)

The "related" boxes can show up scarce data. For example, right now God is showing up related to 69 individuals. Jesus is number one, but he's followed by Bernice Summerfield, apparently a character in Doctor Who. (Incidentally, Jesus is somewhat split between Jesus of Nazareth, Jesus Christ, etc.)

Post here or discuss on Talk.

Tim is gone! Incidentally, I am now on an official "code holiday." I have at least three days without any obligations whatsoever, and I intend to stay in, order pizza, stop answering the door, stop answering the phone, stop writing on Talk, and even—gasp!—stop answering email. I may even put one of those "vacation auto-reply" messages up. After three days, I hope I have something.


First and last words

"Some years ago there was in the city of York a society of magicians."
Recognize that sentence? It is, of course, from Jonathan Strange and Mr. Norrell by Susanna Clarke. How about this one?
"Now, what I want is, Facts."
That's from Dickens, Hard Times.

We just introduced new work-based Common Knowledge fields for "First words" and "Last words." In the medium-to-long term, I'd love to work the data into a game—pick the sentence that goes with the work. If you're not comparing computer manuals to novels, it can be hard.

Find out more here.


Friday, August 01, 2008

Free Web Services API to Common Knowledge

Introducing the LibraryThing Web Services API.

The API will eventually do many things.

For starters it includes all of the data in LibraryThing's Common Knowledge project, our groundbreaking "fielded wiki" for interesting book information (see original blog post). It includes fields like series, important characters, important places, author dates, author burial places, agents, edits, etc. If you're interested in building or enhancing book-data applications, this should be very interesting.

Common Knowledge is always in progress, but the results so far have been quite impressive. Members have made over 500,000 edits, and certain data types have become exceedingly useful and comprehensive. I'm particularly proud of our Series coverage (e.g., Star Wars), better, we think, than any commercial series data.

Oh, and it's free! The data is made available under the highly permissive Creative Commons Attribution Share Alike license.

Architecturally, the Web Services API is a straightforward REST XML-based API. The back-end is modular, allowing us to easily expand the available methods in the future. Its request and response styles were modeled closely on Flickr's API (Chris is a big fan), so it should be easier to find similar sample code. The documentation resembles theirs too.
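
To give a feel for what consuming a Flickr-style REST XML response looks like, here is a minimal sketch in Python. Flickr's REST responses wrap the payload in an `rsp` element whose `stat` attribute is "ok" or "fail"; everything inside the envelope below (the `work`, `field` and `fact` elements, the field names, the error code) is invented for illustration and is not the actual LibraryThing schema. Check the API documentation for the real element names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample responses; only the <rsp stat="..."> envelope
# follows the Flickr convention, the payload is made up.
SAMPLE_OK = """<rsp stat="ok">
  <work id="1">
    <field name="series"><fact>Example Series</fact></field>
    <field name="places"><fact>York</fact><fact>London</fact></field>
  </work>
</rsp>"""

SAMPLE_FAIL = '<rsp stat="fail"><err code="100" msg="Invalid API key"/></rsp>'


def parse_response(xml_text):
    """Return {field name: [facts]} or raise ValueError on a failed call."""
    root = ET.fromstring(xml_text)
    if root.get("stat") != "ok":
        err = root.find("err")
        msg = err.get("msg") if err is not None else "unknown error"
        raise ValueError(msg)
    return {
        field.get("name"): [fact.text for fact in field.findall("fact")]
        for field in root.iter("field")
    }


print(parse_response(SAMPLE_OK))
```

Checking the `stat` attribute before touching the payload keeps error handling in one place, which is the main convenience of this envelope style.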

Kudos to Chris for his work on this and let us know what you think (here).

Update: The other big announcement—another data release—won't be happening today. Too much to do!


Wednesday, April 02, 2008

Common Knowledge in your library

What just happened. Yesterday saw two huge announcements I'm loath to "push down."
(What it didn't see was an April Fools message, although some took the 160% increase in sources for one! Does this mean we get to fool people later on this year?)

Common Knowledge in your library.



Today we've introduced our "Common Knowledge" feature directly into your catalog—allowing members to look at and edit series information, important places and the rest directly in their catalog.

To look at it, go to your catalog and choose the "edit" link to the right of the A, B, C, D, E styles. You'll see a number of CK fields as options. To edit CK fields, just double-click in the cell. A CK editing "lightbox" will pop up (see right).

Some thoughts. On one level, this is a minor feature. The data was always a click away. But I suspect it will substantially change members' relationship to Common Knowledge, and make it grow all the faster. Together with my introduction of pages for members' series, CK now "does" something.

Caveats. Right now you can't sort by CK fields, and you can't search by them. Sorting is doable, although it will take some work. Searching is going to be harder, frankly. But it's not out of the question. Lastly, we still haven't solved CK language issues, so you may get series information in a language you don't understand.

Discuss it here.




Thursday, January 17, 2008

New feature: "Series"

Chris and I have added "series" to our Common Knowledge feature, creating a way to deal with book series like the Chronicles of Narnia, The Sisterhood of the Travelling Pants, Will and Ariel Durant's The Story of Civilization or the Bluffer's Guides.

We've started off simple:
  • A page for every series, with covers and titles.
  • A simple method of ordering works within a series.
  • A series-level tag cloud.
  • A mechanism for showing series overlap, as between the Chronicles of Narnia in publication and chronological order.
There's a lot more we could potentially do. But this is just the sort of feature that should develop over time, with lots of input from users. Each series page has a short section on some of the important issues, and I've set up a Talk post for discussion.
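
As a rough illustration of the "series overlap" idea from the list above, here is a small sketch that compares two orderings of the same books. The function name `overlap` and the data layout (a series as an ordered list of titles) are assumptions made for the example, not how LibraryThing stores series.

```python
def overlap(series_a, series_b):
    """Return (title, position in a, position in b) for works in both series."""
    pos_b = {title: i for i, title in enumerate(series_b, start=1)}
    return [
        (title, i, pos_b[title])
        for i, title in enumerate(series_a, start=1)
        if title in pos_b
    ]


PUBLICATION = [
    "The Lion, the Witch and the Wardrobe",
    "Prince Caspian",
    "The Voyage of the Dawn Treader",
    "The Silver Chair",
    "The Horse and His Boy",
    "The Magician's Nephew",
    "The Last Battle",
]

CHRONOLOGICAL = [
    "The Magician's Nephew",
    "The Lion, the Witch and the Wardrobe",
    "The Horse and His Boy",
    "Prince Caspian",
    "The Voyage of the Dawn Treader",
    "The Silver Chair",
    "The Last Battle",
]

for title, pub, chron in overlap(PUBLICATION, CHRONOLOGICAL):
    print(f"{title}: publication #{pub}, chronological #{chron}")
```

The same function works for two series that only share a few books, since titles missing from the second list are simply skipped.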

I've also added fields for a work's "Canonical Title" and "Canonical Author." As of now, the values of these fields do not affect work or author titles. They will soon.


Friday, October 12, 2007

Common Knowledge explodes

It's been 48 hours since we introduced Common Knowledge, our "social cataloging" initiative, and it's been a HUGE success.*

Six hundred and fifty members have contributed an edit, making 17,437 edits total (adding multiple characters, for example, counted as a single edit). Check out the changelog and watch it happen.

It's our job to support what you're doing. Apart from obsessively adding facts ourselves--Chris and I both made the top 20 contributors!--Chris has been working on UI improvements, and we've both been very active discussing it, bugs, new fields, the gender issue and other topics. There's a lot to do.

More statistics. The top contributor was shortride with an astonishing 1,383 edits. English got the lion's share of edits, with second-place German coming in at 441 edits. (We're still working on how to show information from other languages.)

Top contributors
  Shortride          1383
  MikeBriggs          614
  fleela              458
  realSandy           383
  PhoenixTerran       350
  tardis              336
  sabreuse            311
  VictoriaPL          301
  tripleblessings     291
  AnnaClaire          277
  Rtrace              275
  andyl               247
  rorrison            242
  timspalding         238
  SqueakyChu          234
  conceptDawg         228

Top fields
  Awards and honors       4412
  Character name          3398
  Gender                  2297
  Important places        2255
  Places of residence     1587
  Birthdate               1197
  Education                869
  Date of death            552
  Organizations            430
  Description              200
  Disambiguation notice    116
  Publisher's editor        62
  Agent                     60


*We're pretty impressed by all the activity, especially considering it hasn't been blogged as much as some past features.** But I gave it a good push talking yesterday at the Ohio Library Council. (Come see me talk again today.) And something like this can only grow. APIs will be key.
**Tip of the hat, however, to Superpatron, Joshua M. Neff and Wicked Librarian.


Wednesday, October 10, 2007

Common Knowledge: Social cataloging arrives

Chris has just released Common Knowledge, the innovative, open-data and insanely addictive "fielded wiki" we've been talking about for a month.

Common Knowledge adds fields to every author and work, like:
  • Author: Places of residence, Awards and honors, Agent
  • Work: Important places, Character names, Publisher's editor, Description
All told there are fourteen fields. But Common Knowledge is less a set of fields than a structure for adding fields to LibraryThing. Adding more fields is almost trivial, and they can be added to anything existing or planned, from tags and subjects to bookstores and publishers. They can even be added to other Common Knowledge fields, so that, for example, agents and editors can, in the future, sport photos and contact information.* This can lead to, as Chris puts it, "nearly infinite cross-linking of data."

Common Knowledge works like a wiki. Any member can add information, and any member can edit or revert edits. All fields are global, not personal. Common Knowledge diverges from a standard wiki insofar as each field works like its own independent wiki page, with a separate edit history.
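
One way to picture the "each field is its own wiki page" idea is the sketch below, where every field carries a separate edit history and a revert is just another edit. This is an illustration under assumed names (`Field`, `Record`, the member names), not the real Common Knowledge implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Field:
    """One field with its own independent edit history (hypothetical model)."""
    history: list = field(default_factory=list)  # (member, value), oldest first

    @property
    def value(self):
        # The current value is whatever the latest edit set.
        return self.history[-1][1] if self.history else None

    def edit(self, member, value):
        self.history.append((member, value))

    def revert(self, member):
        """Restore the previous value by recording it as a new edit."""
        if len(self.history) < 2:
            raise ValueError("nothing to revert to")
        self.edit(member, self.history[-2][1])


@dataclass
class Record:
    """An author or work: a bag of named fields, created on first use."""
    fields: dict = field(default_factory=dict)

    def get(self, name):
        return self.fields.setdefault(name, Field())


work = Record()
work.get("important places").edit("alice", ["York"])
work.get("important places").edit("bob", ["York", "Londno"])
work.get("character names").edit("alice", ["Mr. Norrell"])
work.get("important places").revert("carol")  # undo bob's typo

print(work.get("important places").value)
print(len(work.get("character names").history))
```

Because histories are per field, reverting "important places" leaves the "character names" history untouched, which is the difference from a standard wiki page with a single history.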

Some examples:
  • Jonathan Strange and Mr. Norrell. I've been conservative with characters and places. (See Longitude, worked on by Chris, for the opposite approach.) But I wish I had her editor!
  • The history page for "important places" in Jonathan Strange and Mr. Norrell, showing improvement over time.
  • David Weinberger. Half-filled. He mentions his agent, but I can't see his major at Bucknell and the honors section is empty.
  • Hugo Award Winners. This is going to get very cool.
  • The global history page. Mesmerizing.

Right now we're basically slapping fields on pages, but this structure is built for reuse. The license is also built for reuse. We're not asking members to help us create a repository of saleable, private data. Whatever you add to Common Knowledge falls under a Creative Commons Attribution license. So long as you include a short notice (e.g., "Powered by the LibraryThing community"), you can do almost anything you want with the data: take it, change it, remix it, give it to others. You can even sell it, if someone will buy it. Regular people, bookstores, libraries--even our competitors--are free to use it. We'll be adding APIs to get it out there all the more. Go crazy, people.**

Common Knowledge isn't the answer to everything. Some data, like web links, requires a more structured approach; some, like our "work" titles, work best when they "bubble up" from user data; and some, like page counts, have yet to be extracted from the MARC and ONIX information we have. But the possibilities are great. Series information? Blurbers? Cover designers? Books about an author? Tag notes? Other classification schemes?*** Bookstore locations? Publicists? Venues? Book fairs? Pets? Pets' vaccination dates?

Anyway, we've done our thinking, but this is the ultimate member-input feature. We're going to have to figure it out together. Fields will need to be added (and removed?). Rules will be debated, formatting discussed. Although the base is solid, the feature set is still skeletal.****

Go ahead and play. Chris, John and I spent the evening playing with it, and we guarantee it's addictive. Or talk about it. Leave a note here. I've also changed the WikiThing group into a Common Knowledge and WikiThing group. I've started a first-reactions topic and another for bug reports.

Why I'm excited. LibraryThing means a lot of things to a lot of people. Some come for the cataloging, some for the social aspect. A lot come for what happens between those two poles. As I see it, Common Knowledge is the perfect LibraryThing feature. I don't mean it's good; I mean it's in tune with what makes LibraryThing work. It's social, sure, but it's based in data. It's not private cataloging and it's not MySpace-like "friending."

LibraryThing is sometimes called a "social cataloging" site. When I used this term at the American Library Association, it became an unintentional laugh line. Social cataloging sounded impossible and funny, like feline water-skiing. This more than anything else got me fired up about doing this. True "social cataloging"; it was an idea that had to be tried!*****

Details, acknowledgements and caveats. Common Knowledge is deeply unstructured. This is going to give some members hives! Names aren't in first-middle-last format, but free text. You can enter places however you want. We've arranged some careful "hint" text, and fields have a terrific "autocomplete" feature, but we're not validating data and returning hostile error messages. We're aiming for accessibility and reach, not perfection. This is Wikipedia, not the Library of Congress. It scares us too, but we're also excited.

Abby, Casey, Chris and I planned this feature during the Week of Code. We worked through the issues together, and Casey, Chris and I all wrote the initial code. When we broke up, the rest of the coding and the interface design all fell to Chris. Although it was a team effort, this is really his feature. I'm very pleased with what he did with it.

We decided to work on this (and on our standard wiki, WikiThing, which grew out of it) because it was an ideal project for the entire group to tackle. This jumped it past collections. I still think this was a good idea, but there has certainly been some grumbling. We heard you. Collections is next on our list, with nothing new in between.


*So far we have only three data types—radio buttons (gender), long fields (book descriptions and author disambiguations) and short fields (everything else).
**Competitors who use it might want to stop asserting copyright over everything posted to their site. This was legally bogus already, but it certainly would conflict with a Creative Commons license... Incidentally, we haven't decided whether to go with CC-Attribution Share-and-Share-Alike or straight CC-Attribution (discussion here), but it's going to be one or the other.
***This particular one may happen very soon.
****And yes, we can discuss the whole radio-buttons-for-gender topic. See here, here. I'm of the opinion that two genders plus maybe "unknown" and "n/a" (for Nyarlathotep?) are the best you can get without consensus-splitting disagreement. You'll note we aren't including other potentially-contentious fields, like sexual orientation or religion.
*****In conception, Common Knowledge most closely resembles the Open Library Project, the Internet Archive's incipient effort to "wikify" the library catalog. Open Library is also a "fielded wiki," based on Aaron Swartz's superior Infogami platform. You'll notice that we've mostly steered clear of the "traditional" cataloging fields that Open Library is starting from. We do cataloging differently, and we don't want to duplicate effort. Anyway, we're hoping they and others mash up the two data sets, and others.
