Preparing for Dutch book data

1timspalding
Edited: Jul 24, 2007, 9:45 pm

LibraryThing is likely to get a large body of in-print Dutch-language book data. This will be our first foray into non-library data outside of Amazon. We hope that it will make LibraryThing.nl more attractive to Dutch-language bibliophiles.

So, assume that we get the data a week or ten days from now. What can we do over the next week to make LibraryThing.nl a more attractive place? What are the most important translation problems or failures to connect with Dutch members?

2thorold
Jul 25, 2007, 5:34 am

Brew a pot of strong Douwe Egberts coffee, buy some nice flowers, hang a birthday calendar on the inside of the toilet door... :-)

If you've got hold of a good source of bibliographic data, I should think you've overcome the main hurdle for users with Dutch books. The other big thing was sorting out the language that titles display in, and I know you've more or less got that fixed now.

Dutch is my fourth language, and I prefer to use the English site, so I don't have much to say on the translation problems.

It must look a bit odd to first-time visitors that the "About" tab takes you to a page in English. Obviously you could also enhance the experience of new visitors by hiding the archived squabbles of the Dutch translation group...

3royalhistorian
Jul 25, 2007, 6:07 am

I guess with a Dutch infosource the main problems are solved/covered. The Dutch will be very happy! And if the source also provides covers, they will be over the moon!

If the source is the source I think it is, it will be BIG. It's hugely popular with the Dutch and does not make mistakes in data as Amazon (wrong author names) does.

I guess you will be holy in the Netherlands ^_^

4yvoseule
Aug 3, 2007, 9:15 am

Hello everybody, I'm new on this site. I started on the English site and it took me some time to understand that there is also a translated Dutch site. It was more or less by accident that I came there. I think to make it more attractive, you need a lot of publicity and information and a good source for books and data. I catalog my books by hand and I am not finished yet. Apart from that, I've read the messages from the translation group. What sarcastic discussions, and what strange language (Belgian?) and translations. I have family in Belgium, but they don't speak like that: polsslag for zeitgeist and alaam for tools, not even the Flemish people. Very strange.
For the rest, I find it an interesting site. Greetings

5timspalding
Aug 3, 2007, 11:00 am

Thanks.

Are odd translations continuing? We put a democratic system in place, so they should be tamed somewhat.

We're going to go live with Dutch data very soon. I can give you it now—just you—so you don't catalog by hand. But the data will be partial. I want to get better data. Bol and Bruna are weak-team when it comes to cataloging.

6timspalding
Aug 3, 2007, 11:48 am

I would love help with this one:

Op de ter beschikking gestelde Titelbeschrijvingen berusten intellectuele eigendomsrechten en databankrechten. De Titelbeschrijvingen mogen enkel in functie van de opdracht van de openbare bibliotheek worden gebruikt. Het gebruik van de Titelbeschrijvingen is onderworpen aan de Vlacc Gebruikersovereenkomst zoals ook raadpleegbaar op

Getting the right answer on this is fairly important to the future of Dutch book data...

7royalhistorian
Aug 3, 2007, 2:23 pm

It says that the bookdata has copyright (original owner-wise) and is protected by the law for databases. The bookdata is to be only used in function with consent of the public library.

I guess copyright doesn't surprise you, but you might be surprised by the law for databases. It isn't something special, the common copyright, but it is added that a user is not allowed to do something that is harmful to the database or its creator.

So, in brief you have to ask permission to use data from the database since LT isn't quite non-profit.

However, most Dutch people are not that difficult, and contacting them about use of data might often give a positive response. Contacting them is just the best way to go.

Maybe someone with better translating skills can get the message across better. Hope this helped though.

8thorold
Edited: Aug 3, 2007, 2:44 pm

I take no responsibility for correctness:

"The Title Descriptions {bibliographic data?} made available are covered by intellectual property rights and database rights. The Title Descriptions may only be used for the purposes of the public library's assigned duty. The use of the Title Descriptions is subject to the VLACC User Agreement, which is also available for inspection on..."

More loosely, I think it means that public libraries can use the data, as long as they limit themselves to activities compatible with the role of a public library.

Edit: sophies_choice beat me to it and I see we came up with different translations of "opdracht". Looking at the VLACC user agreement I think mine might be slightly more plausible, since it clearly isn't talking about a specific library "enkel in functie van de opdracht van de openbare bibliotheek of daarmee gelijkgestelde instelling". But maybe a third opinion wouldn't be a bad idea?

9timspalding
Edited: Aug 3, 2007, 10:31 pm

Thanks. Someone I know is making inquiries for us. I hope we get permission.

The terms are irritating insofar as I think European database copyright is a bad idea, and not in agreement with the moral underpinnings that give copyright its legitimacy. The basic idea is that I can gain copyright over something uncopyrightable if I put it in a database. Under American law, a fact is a fact, and can't be transformed into something copyrightable by aggregating it with other facts.

Further, the US government can't copyright anything; by definition it's a public resource. Maybe LibraryThing should have to pay for it, but I'd be steamed if I were Belgian and couldn't use data I paid to create.

10snellius
Aug 4, 2007, 3:51 am

> 4,5

"Are odd translations continuing? We put a democratic system in place, so they should be tamed somewhat."

Yes, odd translations are back again. Yesterday, after reading this thread, I changed them to the more understandable translations, as they were before, but today the "odd translations" are already back again. The democratic system isn't working when one user isn't following it.

12thorold
Aug 6, 2007, 2:06 am

Excellent news!

I tried a small selection of (oldish) books from my guidebook shelf - ISBN search in BOL/BRUNA gave nothing, but I managed to get them all with title search using KB.

13xtien
Aug 6, 2007, 2:37 am

Tim, this is great stuff!

Guys, if this is all happening, we should definitely go through the translation and do a quality check. It's important that the translation is consistent - consistency is more important than the particular choice of words, imho - and that the site "feels" similar to the English site.

Please post in this group on significant changes, so at least everybody knows why a change is made.

14timspalding
Edited: Aug 6, 2007, 3:07 am

> thorold

Good. I'm guessing you didn't have much luck with ISBN search on KB. It's funny that nobody's made a really decent Dutch catalog. It's not like you guys aren't readers!

(Speaking of which, the Dutch people I've known have all read in multiple languages. I suppose that's not typical, but it would be simply absurd in the US. Does Dutch high multilinguality cut into the market for native literature?)

15royalhistorian
Aug 6, 2007, 3:50 am

#14 not really. Most books are bought in the native language. Only students and people who like the original books better than the Dutch translation buy books in the original language. It doesn't have an impact on the sale of Dutch books. If you walk into a store, you won't find walls with English books. But sometimes books aren't available in the Dutch language, only in English (most of my Princess Diana collection for example. Or some of the computer/webdesign books). And sometimes it takes long to get an English book translated into Dutch, so some people get the English book.

And Dutch people love to be internationally oriented :-)

16thorold
Aug 6, 2007, 4:08 am

I haven't yet found a book published in the Netherlands that isn't in the KB catalogue - not surprising, really, as it's a copyright library. But obviously it doesn't present its data in an LT-friendly way, which is a pity.

The Dutch public library system has a public search tool that seems to work well (see bibliotheek.nl) but I imagine that's another one that doesn't provide an LT-compatible feed, otherwise you'd be using it already.

Speaking as an expat who's been living in the Netherlands for the best part of twenty years, I'm sure I would be a lot less ignorant about Dutch literature if it weren't for Dutch high multilinguality and the easy availability of English and German books here!

17timspalding
Aug 6, 2007, 7:50 am

> bibliotheek.nl

They use a standard called MARC XML, which is a close cousin to the main standard on LibraryThing, MARC. But they have a scolding paragraph about use, so we're asking first.
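
For readers unfamiliar with the format, here is a minimal sketch of reading two fields from a MARC XML record with Python's standard library. The sample record is written by hand for illustration (the title is a placeholder), and nothing here is claimed to match what bibliotheek.nl actually serves or how LibraryThing imports it. It assumes the standard MARC 21 tags: 100 $a for the main author and 245 $a for the title.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Library of Congress MARCXML schema.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical record, written by hand for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Heijden, A.F.Th. van der</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Een voorbeeldtitel</subfield>
  </datafield>
</record>"""


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, NS)
    return node.text if node is not None else None


record = ET.fromstring(SAMPLE)
author = subfield(record, "100", "a")  # main entry, personal name
title = subfield(record, "245", "a")   # title statement
print(author, "/", title)
```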

18thorold
Aug 6, 2007, 12:21 pm

...after trying a few more with KB search (I agree, it's as fast as slow-drying paint):

- I don't see any "expand" triangles on the search results
- the publication dates are there in the search result display but don't get imported
- the "Nothing found" bar always appears at the bottom of the search results

19xtien
Aug 6, 2007, 12:31 pm

>14 timspalding:
Dutch high multilinguality is caused by the small size of the country. You can't drive more than 200 miles without having to speak either German or French, or even Danish. Not so long ago, high school students would all take classes in English, French and German. Unfortunately, today they have to learn only English.

It doesn't cut into the market for Dutch literature, because reading novels in a foreign language requires better skills in that language than what you need for speaking it. Most people (> 90%) prefer to read translations.

The size of the Netherlands does cut into the market, it's harder to make a living as a writer. A lot of books sell only 5000 or so copies. There's fewer writers, and books are more expensive.

20timspalding
Aug 6, 2007, 3:12 pm

>I don't see any "expand" triangles on the search results

That's intentional. We're trying to get away from them. Most users don't understand them.

The rest I'll fix as soon as I can.

T

21edwinbcn
Aug 6, 2007, 6:42 pm

> #18
"it's as fast as slow-drying paint"

I noticed that it is faster and gives better results when you put your search term (book title) between inverted commas.

Without the inverted commas it searches the whole catalogue for each word separately and displays very many results. Putting "the book title" between inverted commas searches for the exact string and returns better results faster.

adding to Thorold's findings:
- the name particles "van der" are displayed but not imported, so

search result "Heijden, A.F.Th van der"

is imported as:

"Heijden, A.F.Th."

22timspalding
Aug 6, 2007, 6:47 pm

Interesting stuff. Do you think it should add the quotes automatically? If that's faster, that would be a good thing.

I'll check out the van der problem. What's the deal with Dutch alphabetization anyway--is van der part of the last name for alphabetization purposes, or not?

23xtien
Aug 6, 2007, 6:58 pm

The "van der" is a separate part. It precedes the last name. The last name is "van der Heijden", but in the phone directory you'll find it under H. In a US phonebook, you'll find Vanderheijden instead of "van der Heijden". Same for "de Vries", "op 't Veld", etc.
German has the same thing: "von Neumann", "van Beethoven". French: "le Grand", "de Gaulle".

All CRM systems have trouble with it. I enter the "van der" as mid initials, or as the last part of the first name.

Is it possible to add a "prefix" field? Like US systems sometimes have a suffix field? Or, you could add "mid initials" where we could put both mid initials and prefix. If we do that, we do want to make sure that the mid initials always appear when the last name appears.

24xtien
Aug 6, 2007, 6:59 pm

If you add quotes automatically, LT would always search for the whole string. I think currently it searches for individual words, which is different.
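
A minimal sketch of the difference, assuming a toy matcher rather than the real KB search: `as_phrase_query` and `matches` are hypothetical helpers, and the "any word matches" behaviour for unquoted queries is only a guess at what was described above.

```python
def is_quoted(query: str) -> bool:
    """True if the query is wrapped in double quotes."""
    q = query.strip()
    return len(q) >= 2 and q[0] == '"' and q[-1] == '"'


def as_phrase_query(query: str) -> str:
    """Wrap a multi-word query in double quotes unless it is already quoted."""
    q = query.strip()
    if is_quoted(q) or " " not in q:
        return q  # already a phrase, or a single word where quoting adds nothing
    return '"' + q.replace('"', "") + '"'


def matches(query: str, title: str) -> bool:
    """Toy matcher: quoted queries need the exact phrase, others any one word."""
    t = title.lower()
    q = query.strip().lower()
    if is_quoted(q):
        return q[1:-1] in t
    return any(word in t for word in q.split())


titles = ["De avonden", "De ontdekking van de hemel"]
loose = [t for t in titles if matches("de hemel", t)]
exact = [t for t in titles if matches(as_phrase_query("de hemel"), t)]
print(loose)  # word search: every title containing "de" or "hemel"
print(exact)  # phrase search: only titles containing "de hemel"
```

Adding the quotes automatically would trade recall for speed: a user who misremembers one word of a title gets nothing back from the phrase search.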

25timspalding
Aug 6, 2007, 7:45 pm

The problem is that much of the data we get doesn't have any semantics. Amazon, for example, is just a string. So we guess at how the words break between first and last, and try not to get into death dates or whether the individual was the second duke of Somethinshire.

Here's the KB data: here. Some browsers will show it well; some won't. It has both a semantically rich and a semantically poor version of the name. Clearly we need to use the richer one.

Adding this to my list.

26xtien
Aug 7, 2007, 6:38 pm

This is very hard to solve. Been there, done that. Any software you write will eventually get it 95% right, or even 99% right. But the remaining 1% is a lot of authors. Would it help if you had an almost complete list of all prefixes for Dutch names? I'm sure that in a joint effort we can quickly create such a list.
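
A rough sketch of how such a prefix list could be used, assuming the surname arrives as a plain string like "van der Heijden". The list below is hypothetical and deliberately incomplete, and the function names are made up for illustration. It does nothing for the ambiguous remainder mentioned above.

```python
from typing import Tuple

# Hypothetical, deliberately incomplete list of Dutch name prefixes.
PREFIXES = ["van der", "van den", "van de", "op 't", "van", "de", "den", "ter", "te"]


def split_prefix(surname: str) -> Tuple[str, str]:
    """Split 'van der Heijden' into ('van der', 'Heijden')."""
    lowered = surname.lower()
    # Try the longest prefixes first so "van der" wins over "van".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if lowered.startswith(prefix + " "):
            return surname[: len(prefix)], surname[len(prefix) + 1 :]
    return "", surname


def sort_key(surname: str) -> str:
    """Sort on the main part of the name, as the phone directory does."""
    _, main = split_prefix(surname)
    return main.lower()


def display_inverted(initials: str, surname: str) -> str:
    """Format as 'Heijden, A.F.Th. van der', keeping the prefix."""
    prefix, main = split_prefix(surname)
    return f"{main}, {initials} {prefix}".strip()


names = ["de Vries", "Mulisch", "van der Heijden"]
ordered = sorted(names, key=sort_key)
print(ordered)
print(display_inverted("A.F.Th.", "van der Heijden"))
```

Storing the prefix separately, as the "prefix" field suggested earlier would, avoids having to re-guess it from a flat string every time the name is sorted or displayed.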

27westher
Aug 15, 2007, 8:13 am

>17 timspalding:

The Dutch expansion of LibraryThing raised a lot of interest in the Netherlands, so I imagine the VOB will be interested in cooperation. Also, their data are searchable through a Medialab Aquabrowser application, so maybe that would be a second route to approach them. (I did read LT is cooperating with Medialab, didn't I?)

28timspalding
Aug 15, 2007, 10:58 am

Yeah, no word from them yet. I'll ask for a contact over there, unless someone has one.

T