Harvard metadata now searchable via OverCat!
See the blog post for details, but the short version is that 12.3 million MARC records from Harvard are now included in OverCat searches (bringing total OverCat coverage to more than 50 million records!).
Processing the records took more than a week. It was like eating an elephant!
(I saw the tweet about Harvard records being added and thought here come the vinyls! ;)
Thanks to Harvard for releasing it and to you at LT for importing that data into OverCat. I'm sure it was fun in a programmer/librarian geekish sort of way and I'm also sure it was a lot of work and I know for a dead-certain fact we all just received a substantial benefit.
I'm really geeked at the addition of more academic titles, as alluded to in the blog post.
Curious: what odds does the LT Hive Mind put on Harvard's decision being a spur to other libraries and/or a shift in OCLC policy? I assume there is more to this opening gambit but I'm not in the game.
I think Harvard was pushed by a number of very copyleftish people there, David Weinberger and John Palfrey especially. They also knew that OCLC wasn't going to throw them out of the cooperative or sue them. Harvard leaving would be a serious blow, and could be the nucleus of a real competitor. So OCLC bit its tongue. The question is whether this breaks the dam, and others do it too.
This REALLY makes me wish for a "retarget" mode for my books: a way to take an existing book and switch it from a less desirable data source (Amazon) to a more desirable one (Harvard via OverCat) without having to redo all my collections, tags, reading dates, comments, etc.
We're working on it. Chris Catalfo is now 50% on the adding books project.
Good to hear. Wait... you've been talking about this redo since before I signed up for LT. 50% done ... crap!
(Of course, you probably meant he's spending 50% of his time on it. I'm still leaving my comment up because it made me smile while writing it. It's also good to hear it will be more than just add books but will also include a "retarget" function. That is what I heard, right?)
Yeah, I know.
Yeah, he's spending 50% of his time on it. You'd have liked to be in today's conversation. I said "go! go! go!" Our sysadmin, Brian, said "you're going to kill the box!"
Wooohooo! I'll almost certainly make use of these records next time I'm working on Pushkin (probably this weekend).
Happy to hear the enthusiasm around this.
Systems Librarian, Harvard University
Thanks for being a part of it!
Tim - As part of this add books redo, it'd be nice if we could prioritize certain OverCat sub-sources. For example, when there are several pages of results for a search, I could make the Harvard data come first.
Right now, there's a bit of a gap between "search only one source" and "search OverCat but get all the results jumbled together". And, of course, there's no way to "search only one source" for the Harvard data since it's not a standalone source.
The jumbled nature of the results reduces the value of OverCat a bit in general, I'm afraid. I'd love for the new add books to do a much better job of letting me search once and then pick the one I want, rather than having to do all the clicking and re-searching manually.
Yes, I think that's a decent idea. Right now we prioritize the record that's, basically, had the most edits. Usually that's the best. Not always.
One thing we're going to do is separate add books into a super-simple and an expert search. That'll allow us to go crazy with options, without going crazy.
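The two ranking ideas in this exchange (show the most-edited record first, and optionally let a preferred sub-source float to the top) can be sketched in a few lines of Python. This is only an illustration, not how OverCat actually works: the `Record` fields, the `rank_records` function, the `preferred_source` parameter, and the sample data are all hypothetical names made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    """A hypothetical search result; these fields are assumptions, not OverCat's schema."""
    title: str
    source: str      # assumed sub-source label, e.g. "Harvard"
    edit_count: int  # assumed number of edits the record has received


def rank_records(records: List[Record],
                 preferred_source: Optional[str] = None) -> List[Record]:
    """Order records most-edited first.

    If preferred_source is given, records from that source are listed
    before all others; within each group the most-edited record wins.
    """
    def sort_key(record: Record):
        is_preferred = preferred_source is not None and record.source == preferred_source
        # False sorts before True, so negate the flag to put preferred records first.
        return (not is_preferred, -record.edit_count)

    return sorted(records, key=sort_key)


# Made-up sample data, purely for illustration.
results = [
    Record("Eugene Onegin", "Library A", 7),
    Record("Eugene Onegin", "Harvard", 3),
    Record("Eugene Onegin", "Library B", 5),
]

default_order = [r.source for r in rank_records(results)]
harvard_first = [r.source for r in rank_records(results, preferred_source="Harvard")]
```

With no preference, the sketch falls back to the "most edits" ordering described above; passing a preferred source only changes which group comes first, so an expert search could expose it as an option without affecting the simple search.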
"One thing we're going to do is separate add books into a super-simple and an expert search."
That's fantastic! To be honest I had been a little worried that the redo was going to simplify things for the reading-list userbase at the expense of the cataloging userbase; glad to see that won't be the case.