New tag-based recommendations algorithm

New features

1timspalding
Aug 25, 2011, 1:35pm Top

Blog post here:
http://www.librarything.com/blogs/librarything/2011/08/new-tag-based-recommendat...

Any thoughts? It's an interesting theoretical problem. I'm interested to hear what you think.

2_Zoe_
Aug 25, 2011, 1:58pm Top

The ones I've checked look good. It's interesting that you considered "creating some separation between adult and youth titles" to be a priority; I don't much care when it comes to fiction, though I was a bit surprised to see the children's version of Three Cups of Tea recommended for an adult book.

I don't have any algorithm-related suggestions at the moment, but I will say that I'd spend a lot more time browsing the recommendations if there were a(n optional) more detailed display style. At the very least, I'd like to see cover, average rating, and number of members, as in the site search results; it gets tedious very quickly clicking randomly on individual titles.

32wonderY
Aug 25, 2011, 2:04pm Top

Hey! Earth Abides is not boring! It's certainly superior to On The Beach.

I don't see a problem with the algorithm providing titles with opposing viewpoints to the subject work. In fact, I would see that as a plus. I like for my reading to balance in those ways.

4timspalding
Aug 25, 2011, 2:25pm Top

>2 _Zoe_:

Right. The real problem is when adult and true children's get mixed. The "Three Cups of Tea" is, I think, a case where you might want it. After all, maybe you want to teach your children about this wonderful story. Or maybe about lying...

Our next plan with recommendations is to change very little about the data, but sex it up a bit with knobs and with graphics. So I think we're on the same page generally.

>3 2wonderY:

You are wrong! Okay, in fairness I haven't read On the Beach since seventh grade.

5lilithcat
Aug 25, 2011, 2:27pm Top

It was interesting to compare the recommendations for the various volumes of Kenneth Gregory's collections of letters to The Times.

Putting aside the recommendations for the other volumes in the series, there was a fair bit of overlap between the recommendations for The Second Cuckoo and The Last Cuckoo, but not between those two and The First Cuckoo. (Unfortunately, there aren't enough copies of The Third Cuckoo or The Next to Last Cuckoo to include those in the comparison.)

6_Zoe_
Aug 25, 2011, 2:42pm Top

>4 timspalding: Oh, with Three Cups of Tea, I meant that the children's version was recommended for a different adult book (Little Princes). Not really a problem, but it seemed worth noting.

Our next plan with recommendations is to change very little about the data, but sex it up a bit with knobs and with graphics. So I think we're on the same page generally.

Sounds good to me!

I didn't finish On the Beach. Who knew the end of the world could be so boring?

7eromsted
Aug 25, 2011, 4:07pm Top

I've just started looking so I don't have a general response yet. But I noticed that the tag recs list doesn't have the collections check marks.

8timspalding
Aug 25, 2011, 4:09pm Top

Ah. Good. They have them on refresh, but not when generating. I'll fix soon.

9artturnerjr
Aug 25, 2011, 4:24pm Top

Yay, Tim and Jeremy! I love this stuff.

"We think LibraryThing recommendations are as good as any out there, and are eager to prove it."

Only AS good? Nay, the best I say! :D

10thorold
Aug 25, 2011, 4:29pm Top

I see it doesn't solve the "Pigs is pigs" problem (an unfortunately ambiguous tag used twice on a work with few copies). But you can't expect tags to work reliably for small samples.

Cold comfort farm -> Lark rise to Candleford doesn't sound quite right to me, but A passage to India gives a very nice list, and so does Culture and Imperialism. I think you're onto something.

11TLCrawford
Edited: Aug 25, 2011, 4:33pm Top

Earth Abides is a wonderful book. Stewart's other fiction, Fire (http://www.librarything.com/work/503833) and Storm are also must reads, unless you find old technology disturbing.

I checked some of my underground railroad books and the more popular ones come up with great and relevant recommendations. One YA book came up with mostly related YA material. The Underground Railroad in the Adirondack Region brought back mostly good recommendations, but it started going to travel books by the end of the list.

12eromsted
Edited: Aug 25, 2011, 7:37pm Top

>8 timspalding:
So it is. I can hit reload and make them appear when they didn't come in the first time.

I've been checking my favorite books and overall the recommendations seem very apt. The least so I've seen thus far is Paul Robeson -> Bing Crosby. Though they did both sing The House I Live In. (ETA - No, wait. That was Frank Sinatra)

I was a bit surprised by the lack of books on criminology/prisons for Foucault's Discipline & Punish. By my quick count it was Other books by/about Foucault, 20; Other postmodernism, 26; Prisons/criminology, 5.

Nonfiction recs are definitely more interesting than those for fiction. Most of the fiction recs go to other books by the author or authors of the same nationality. A couple exceptions: Gaddis's The Recognitions and Morrison's Song of Solomon have a bit more variety.

13_Zoe_
Aug 25, 2011, 4:58pm Top

I think it needs the same-series compression that the combined recommendations have on the main page (but mysteriously not on the subpage).

14brightcopy
Edited: Aug 25, 2011, 5:01pm Top

Bug reported here:
http://www.librarything.com/topic/122679

One thing I like most about this is simply the "NEW" tag on the section. Sometimes it's hard to notice new stuff if you don't follow Talk. Maybe it could have something like that next to "Tags" for a while?

I notice it still falls prey to sequelitis. For example:
http://www.librarything.com/work/27814/recommendations#tags

Not all works are as bad as this, though. And to some degree, it's a philosophical question since those books are books you probably want to read after reading the other books in the series. But then, the recommendation is low value since you already knew that (especially given how LT makes Series obvious).

ETA: Zoe beat me to the Series point while I was filling out the bug report. ;)

15jjwilson61
Aug 25, 2011, 5:10pm Top

I'm noticing how often other works by the same author appear at the top. Some of that may be that the same author often writes similar stuff but some may also be that many people tag books by their authors.

I suggest seeing what happens if you exclude from the algorithm any tags that match one of the authors of a work. I think you'd need to exclude all the variations of the name that have been combined, as well as just the last name.
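A minimal sketch of that exclusion, for anyone who wants to picture it. Everything here is an assumption made for illustration: the function name, the tag counts, and the idea that each work carries a list of combined author-name variants. It is not LibraryThing's code.

```python
def strip_author_tags(tag_counts, author_names):
    """Drop tags that match any author-name variant or the bare last name."""
    blocked = set()
    for name in author_names:
        normalized = name.strip().lower()
        if not normalized:
            continue
        blocked.add(normalized)
        # "Pinker, Steven" keeps the last name before the comma;
        # "Steven Pinker" keeps it as the final word.
        if "," in normalized:
            last = normalized.split(",")[0].strip()
        else:
            last = normalized.split()[-1]
        blocked.add(last)
    return {tag: n for tag, n in tag_counts.items()
            if tag.strip().lower() not in blocked}


# Made-up tag counts for a single work.
tags = {"Steven Pinker": 12, "pinker": 30, "linguistics": 85, "cognitive science": 40}
remaining = strip_author_tags(tags, ["Steven Pinker", "Pinker, Steven"])
print(remaining)
```

A plain string match like this would also drop tags on books about the author, so it is only a starting point for the experiment.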

16brightcopy
Aug 25, 2011, 5:12pm Top

#15 by jjwilson61> Yeah, same-author recommendations in general probably also fall in the low-value recommendation category, don't they? I mean, isn't that what people always do when they decide they like a book by a given author? I think it makes the recommendation algorithm look a little dingy when these are the top ones it comes up with.

17timspalding
Aug 25, 2011, 5:49pm Top

I see it doesn't solve the "Pigs is pigs" problem (an unfortunately ambiguous tag used twice on a work with few copies). But you can't expect tags to work reliably for small samples.

Yes. Good point. It shouldn't have tried it, I think. I'll take a look.

Cold comfort farm -> Lark rise to Candleford doesn't sound quite right to me

Agreed. FWIW, part of the assessment is:

countryside - 100.20712517725
country life - 89.113254898526
rural life - 84.808499984402
rural - 61.233877573976

early 20th century - 21.779731930948
1930s - 20.1348362219
farming - 14.140842643317
20th century fiction - 10.827915997019
pastoral - 10.755680039168
england - 9.316651519836
british fiction - 8.8276254429947
british - 8.565166286806
british literature - 8.5392453524446
modern fiction - 7.1589853458553
english literature - 7.038210516014

The top four are getting far too much weight.
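To put a number on that, here is a quick sketch that adds up the weights quoted above and works out what share the four rural-themed tags carry. The list is only "part of the assessment", so the share is relative to the tags shown, not to the whole score.

```python
# Tag weights copied from the list above (Cold Comfort Farm -> Lark Rise to Candleford).
weights = {
    "countryside": 100.20712517725,
    "country life": 89.113254898526,
    "rural life": 84.808499984402,
    "rural": 61.233877573976,
    "early 20th century": 21.779731930948,
    "1930s": 20.1348362219,
    "farming": 14.140842643317,
    "20th century fiction": 10.827915997019,
    "pastoral": 10.755680039168,
    "england": 9.316651519836,
    "british fiction": 8.8276254429947,
    "british": 8.565166286806,
    "british literature": 8.5392453524446,
    "modern fiction": 7.1589853458553,
    "english literature": 7.038210516014,
}

total = sum(weights.values())
top_four = sorted(weights.values(), reverse=True)[:4]
share = sum(top_four) / total
print(f"top four tags carry {share:.1%} of the listed weight")
```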

I was a bit surprised by the lack of books on criminology/prisons for Foucault's Discipline & Punish

A lot of the Foucault ones are getting boosts for "punishment," but, yes, not many others.

I think it needs the same-series compression that the combined recommendations have on the main page (but mysteriously not on the subpage).

I notice it still falls prey to sequelitis.

I'm noticing how often other works by the same author appear at the top.

Right. So, the theory is that it should have the best data it knows. It should record all the connections, whether or not you want to see them on a given page. Chances are, you want to see them elsewhere, and anyway the fact of the connection is used elsewhere--for example, we look at recommendations-of-recommendations.

Whatever the raw data has, however, we should remove same-authors or same-series at the display level. Right now, the only one that really does that is the "Combo recommendations" area on the home page for a work, which rolls up series and excludes authors. The "recommendations" page only has the "raw" data.

My feeling is that all of them need the same display algorithm, one that allows you to pack and unpack authors and series.
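As a rough sketch of what a shared display step could look like: the raw list stays untouched, and a separate function hides same-author works and rolls each series up under its best-ranked entry. The record fields and function name are hypothetical, and the titles are placeholders.

```python
def pack_recommendations(recs, source_author):
    """Hide same-author works and roll each series up into one entry.

    recs is assumed to be sorted best-first; each rec is a dict with
    "title", "author" and an optional "series" key.
    """
    groups = {}
    for rec in recs:
        if rec["author"] == source_author:
            continue  # hidden at display time only; the raw data keeps it
        if rec.get("series"):
            key = ("series", rec["series"])
        else:
            key = ("work", rec["title"])
        groups.setdefault(key, []).append(rec)
    # dicts preserve insertion order, so groups stay in best-first order
    return [{"top": g[0], "others": g[1:]} for g in groups.values()]


raw = [
    {"title": "Sequel One", "author": "Author B", "series": "Saga"},
    {"title": "Another Book", "author": "Author A"},
    {"title": "Sequel Two", "author": "Author B", "series": "Saga"},
    {"title": "Standalone", "author": "Author C"},
]
packed = pack_recommendations(raw, source_author="Author A")
for entry in packed:
    print(entry["top"]["title"], "+", len(entry["others"]), "more")
```

"Unpacking" would then just mean showing the `others` list for an entry, or skipping the author filter.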

18jjwilson61
Aug 25, 2011, 5:57pm Top

But should same author works be in the raw recommendations at all if the major reason they're there is because of sharing an author tag? You said that you're doing all sorts of behind the scenes manipulations to figure out which tags are important and which are trivial. I'm just suggesting that author tags are trivial or they should be.

19timspalding
Aug 25, 2011, 5:59pm Top

I'm not so sure about that. Give me an example? Besides, maybe author tags mean the author has a certain thing to them.

20jjwilson61
Aug 25, 2011, 6:05pm Top

How about http://www.librarything.com/work/46108/recommendations#tags/252915?

Steven Pinker wrote 4 of the top 6 recommendations yet they aren't on exactly the same subject.

21jjwilson61
Aug 25, 2011, 6:10pm Top

Or http://www.librarything.com/work/3031/recommendations#tags/405032.

Tom Clancy's not the only technothriller author. Are his works really so similar to each other that he gets the top four slots? Can you do an experiment to see what the list would look like if you excluded the author tag?

22jjwilson61
Aug 25, 2011, 6:14pm Top

But here's one that manages to overcome the effect,

http://www.librarything.com/work/51586/recommendations#tags/390430

I guess that Barbara Tuchman's various works are diverse enough that the tags don't overlap much.

23timspalding
Aug 25, 2011, 6:30pm Top

>21 jjwilson61:

I'm really glad Tom Clancy is not involved in Prince of Tides.

Steven Pinker wrote 4 of the top 6 recommendations yet they aren't on exactly the same subject.

So, I think there's a case to be made for me to start doing to authors what I do to awards--identifying tags that are very highly correlated to them--and screening them out. (The old algorithm was very sensitive to tags like "Booker Prize.")
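One way to picture "identifying tags that are very highly correlated" with an author: for each tag, check what fraction of its uses land on works by a single author, and flag the tag when that fraction is high. This is only a sketch under assumed data shapes; the threshold, the minimum-use cutoff, and the sample data are all invented, and the real award screening may work quite differently.

```python
from collections import Counter


def author_correlated_tags(tag_uses, threshold=0.8, min_uses=5):
    """Return {tag: author} for tags whose uses are dominated by one author.

    tag_uses maps each tag to a list of author names, one per tagged copy.
    """
    flagged = {}
    for tag, authors in tag_uses.items():
        if len(authors) < min_uses:
            continue  # too few uses to say anything reliable
        author, count = Counter(authors).most_common(1)[0]
        if count / len(authors) >= threshold:
            flagged[tag] = author
    return flagged


# Invented sample data.
tag_uses = {
    "pinker": ["Steven Pinker"] * 9 + ["Other Author"],
    "linguistics": ["Steven Pinker"] * 3 + ["Noam Chomsky"] * 4 + ["Other Author"] * 3,
    "rare": ["Steven Pinker"] * 2,
}
flagged = author_correlated_tags(tag_uses)
print(flagged)
```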

That said, Pinker is Pinker. Unlike most authors, he writes on a bunch of subjects, but he's always Pinker. I mean, I've read a number of his books, choosing them for him, not only for their content. Doesn't that matter?

24brightcopy
Aug 25, 2011, 6:36pm Top

#18 by jjwilson61> I'm just suggesting that author tags are trivial or they should be.

#19 by timspalding> I'm not so sure about that.

(Assuming I got the context right.)

I think the reason they are trivial is that the data from the tags is secondary to the data LT already has. It already knows what other books are by the same author. Why would it matter that someone has tagged them as such? That data is probably not even as good as what LT already knows about the other books the author wrote. As such, if a person wanted to know what other books were by the same author, they'd be much better off just clicking the author name than looking at the recommendations based on the (possibly incomplete) tagged author names.

25_Zoe_
Aug 25, 2011, 6:42pm Top

part of the assessment is

Ooh, can we have access to this data? It would be fun if we could look at a recommendation and see why it was there.

Whatever the raw data has, however, we should remove same-authors or same-series at the display level.

Yes.

>24 brightcopy: Agreed.

26brightcopy
Aug 25, 2011, 6:56pm Top

I guess another way of stating it is that this isn't so much a "recommendations" algorithm as it is a "relatedness" algorithm. Those things are very similar and overlapping. Sometimes things that are related are things that should be recommended. But for authors and series, tags aren't as good in terms of relatedness as the LT author and Series data. Those are far better and should be used anytime you're looking to display a relation of that sort.

27_Zoe_
Aug 25, 2011, 7:03pm Top

I guess another way of stating it is that this isn't so much a "recommendations" algorithm as it is a "relatedness" algorithm. Those things are very similar and overlapping.

This.

28jjwilson61
Edited: Aug 25, 2011, 7:28pm Top

That said, Pinker is Pinker. Unlike most authors, he writes on a bunch of subjects, but he's always Pinker. I mean, I've read a number of his books, choosing them for him not only for their content. Doesn't that matter?

But if that's what you wanted to do you wouldn't need the fancy LT recommendation algorithm, you'd just go to the author page.

ETA: Just like if you wanted to find more books that had won the Booker prize you'd go to the Award page.

29eromsted
Aug 25, 2011, 7:32pm Top

One problem with screening out books by author tag is that you will screen out books about the author along with books by the author.

---

And sorry, but I'm still a bit stuck on Discipline & Punish, perhaps because it's the only book I've looked at that really seemed off.

Here's a little analysis of the most common tags:
Foucault/Pomo
484 philosophy
152 theory
95 Foucault
59 critical theory
40 poststructuralism
33 cultural studies
32 postmoderism
895

Prisons/Criminology
118 prison
72 punishment
47 criminology
30 crime
28 discipline(?) - mostly books on parenting and education
24 law
20 surveillance
339

So it's about 2.5 to 1 - Foucault/pomo to prisons/criminology. And prison is the third biggest tag. But the recommendations are going 9 to 1 in favor of Foucault/pomo. My only guess is that there is significantly more synergistic cross-tagging in that group.
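The arithmetic behind those ratios, as a short check. The tag counts are the ones listed above (tag spellings left as they appear), and the recommendation counts (20, 26 and 5) are from the quick count in message 12.

```python
foucault_pomo = {
    "philosophy": 484, "theory": 152, "Foucault": 95, "critical theory": 59,
    "poststructuralism": 40, "cultural studies": 33, "postmoderism": 32,
}
prisons_criminology = {
    "prison": 118, "punishment": 72, "criminology": 47, "crime": 30,
    "discipline": 28, "law": 24, "surveillance": 20,
}

pomo_total = sum(foucault_pomo.values())
prison_total = sum(prisons_criminology.values())
tag_ratio = pomo_total / prison_total

# Recommendation counts from message 12: Foucault 20, other postmodernism 26, prisons 5.
rec_ratio = (20 + 26) / 5

print(f"tags: {pomo_total} vs {prison_total}, ratio {tag_ratio:.2f} to 1")
print(f"recommendations: ratio {rec_ratio:.1f} to 1")
```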

30timspalding
Aug 25, 2011, 7:42pm Top

Why would it matter that someone has tagged them as such?

Well, my theory is that people use author tags disproportionately when the author has a "body" of work that's coherent or a personality that's somehow coming through specially.

I thought of a counter-example--Chomsky. Chomsky's work is divided between linguistics and politics in a way that it wouldn't make sense to cross. But I investigated and it looks like his linguistics books don't suggest his politics books. Although they have "chomsky" they have virtually nothing else in common.

Ooh, can we have access to this data? It would be fun if we could look at a recommendation and see why it was there.

Yeah, I thought about it. Honestly, I'm not sure I want to expose so much. I mean, our recommendations really are better than our competitors', and nobody else--to my knowledge in the history of mankind--has done serious multi-tag recommendations. I think it's because our data is good and we have taste, but also they don't know what the hell they're doing. So...

But for authors and series, tags aren't as good in terms of relatedness as the LT author and Series data. Those are far better and should be used anytime you're looking to display a relation of that sort.

I hear you. Do you see my point, though? I see some value. I'm not sure that air-dropping bonus points for things in the same series would be as valuable.

I guess another way of stating it is that this isn't so much a "recommendations" algorithm as it is a "relatedness" algorithm

True enough, but "people who have X have Y" is also just relatedness. Recommendations are a guess from the relatedness patterns. (Member recommendations aren't used, except that they are "entered" in the running. Basically, the algorithm gets a list of some 100,000-200,000 books to analyze by tag patterns, the list it gets being based on tags to begin with. In some rare cases that may miss a decent book, so those books are entered as contenders, but not given any special boost.)
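A sketch of the "entered as contenders, no special boost" idea: member recommendations are simply merged into the candidate pool and then scored by the same function as everything else. The scoring used here (summing the smaller of the two counts for each shared tag) is a stand-in chosen for illustration; the thread does not say how the real scoring works, and all names and data are hypothetical.

```python
def rank_candidates(source_tags, tag_candidates, member_recs, tags_by_work):
    """Rank works by tag overlap with the source, best first."""
    # Member recommendations join the pool but get no extra credit.
    pool = set(tag_candidates) | set(member_recs)
    scored = []
    for work in pool:
        work_tags = tags_by_work.get(work, {})
        shared = set(source_tags) & set(work_tags)
        # Stand-in score: overlap in tag counts, not the real formula.
        score = sum(min(source_tags[t], work_tags[t]) for t in shared)
        scored.append((score, work))
    # Highest score first; ties broken by work id for a stable order.
    return [work for score, work in sorted(scored, key=lambda p: (-p[0], p[1]))]


source_tags = {"farming": 10, "rural": 8}
tags_by_work = {
    "A": {"rural": 5, "farming": 2},
    "B": {"rural": 1},
    "C": {"farming": 9, "rural": 8},  # missed by the tag-based candidate list
}
ranking = rank_candidates(source_tags, ["A", "B"], ["C"], tags_by_work)
print(ranking)
```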

But if that's what you wanted to do you wouldn't need the fancy LT recommendation algorithm, you'd just go to the author page.

Okay will SOMEONE understand what I'm trying to say here? Argh.

ETA: Just like if you wanted to find more books that had won the Booker prize you'd go to the Award page.

The trick there is that some awards do carry semantics. Two books that don't share the right tags but do share the "McFinster Award for Books about Gay Mariners of Canada"... well, you see my point. But it excludes them now.

31_Zoe_
Aug 25, 2011, 8:15pm Top

Honestly, I'm not sure I want to expose so much.

Fair enough.

Recommendations are a guess from the relatedness patterns.

And I've said many times that this often fails in obvious ways. This is why you "correct" the overall recommendations by grouping books in the same series. Likewise, there should be an option to "correct" the for-user recommendations by excluding authors already in our catalogues or books that are generally thought to be terrible.

I do see your point about author tags, etc. I think it's fine to leave the algorithm to do its thing, as long as you do make some basic adjustments afterwards.

32brightcopy
Aug 25, 2011, 8:55pm Top

I do see what you are saying here, Tim. I think it falls into two categories:

1) Books by an author that are tagged with that author's name are better recommendations than simply looking for other books by the same author (even if you rank it by # of copies or something).

2) Books tagged as author A written by author B are good recommendations for a book by author A.

Seem like a good summary or am I missing anything?

I think #1 is an interesting hypothesis but without seeing any data to support it I have only my gut instinct, which just doesn't agree.

On the other hand, you may have something with #2. In which case, you should filter out other books by author A so you are left with only items tagged as author A but written by other authors.

Though for something by an extremely widely written-about author like Dickens, you may drown in titles ABOUT Dickens.
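The filter for #2 is simple to sketch: keep works that carry the author tag but drop the ones actually written by that author. Field names, the tag spelling, and the sample records are made up for illustration.

```python
def tagged_but_not_by(candidates, author, author_tag):
    """Keep works tagged with author_tag that were written by someone else."""
    tag = author_tag.lower()
    return [
        c for c in candidates
        if tag in {t.lower() for t in c["tags"]} and c["author"] != author
    ]


# Placeholder records, not real catalog data.
candidates = [
    {"title": "A Novel by the Author", "author": "Charles Dickens", "tags": ["Dickens", "fiction"]},
    {"title": "A Biography of the Author", "author": "Some Biographer", "tags": ["dickens", "biography"]},
    {"title": "Unrelated Book", "author": "Someone Else", "tags": ["history"]},
]
about = tagged_but_not_by(candidates, "Charles Dickens", "dickens")
print([c["title"] for c in about])
```

For a heavily written-about author, the result would still need ranking or a cap, which is the "drowning in titles" problem mentioned above.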

33timspalding
Aug 25, 2011, 9:07pm Top

>32 brightcopy:

Right. That's a second question. Do I exclude Dickens generally? I think not. So probably I just note that Dickens and "dickens" are correlated. This could get tricky, as probably many authors are well correlated with a single author. Of course, I could just look at the tag and the name, but that probably fails somewhere too.

34brightcopy
Aug 25, 2011, 9:08pm Top

Tim, just one more thing:

True enough, but "people who have X have Y" is also just relatedness. Recommendations are a guess from the relatedness patterns.

Right, I said recommendations and relatedness overlap. But just because I have two books and both are in hardcover, there is relatedness but no basis of recommendation on them. So all the relations have different kinds of values. My point was that the same-author relationship has little value, as the user would already have thought on their own to look for other books by the same author. This is, of course, more to the point of #1 from message 32.

35timspalding
Aug 25, 2011, 9:10pm Top

But just because I have two books and both are in hardcover, there is relatedness but no basis of recommendation on them.

If we did it by edition, it might be...

My point was that the same-author relationship has little value, as the user would already have thought on their own to look for other books by the same author.

No, I hear you.

36jjwilson61
Aug 25, 2011, 10:06pm Top

Well, my theory is that people use author tags disproportionately when the author has a "body" of work that's coherent or a personality that's somehow coming through specially.

Can you get any data to support that theory? My limited understanding of why people would tag a book with the author of that same book is so they can do tag searches within their catalog that include the author. I can't think of a reason why someone would tag all the books authored by Dickens in their library with Dickens but not do the same for other authors in their library. Can you?

37Mareofthesea
Aug 26, 2011, 2:01am Top

Everyone uses tags differently. While I personally don't see the need to tag a book by the author, someone else who uses tags more than I do may. Or they may wish to use tags instead of the search function.

One thing that pops up in my mind why someone may use an author's name as a tag is when they are considered to be a founder of a certain style of writing. But even then I would tag it as "name style" or "nameesque"

38hdcclassic
Aug 26, 2011, 4:06am Top

Looking at things, it would indeed be nice to see the books by the same author, or especially the same series, compressed in some way.
Nationality does indeed seem to play a strong role: War with the Newts generates a bunch of Czech authors, The Third Policeman Irish ones (though Beckett and Joyce are good fits, not so sure about Kundera for Capek).

The list for The Strange Case of Dr Jekyll and Mr Hyde starts with four collected editions which include that book; that is something which needs to be gotten rid of.

39AnnieMod
Aug 26, 2011, 4:17am Top

>38 hdcclassic:

None of them have a work-to-work relationship added, so how is the site supposed to guess that they are collected editions containing that one? I just added them, so it will probably take some time to get caught up (or so I hope).

Tim, does the new feature take the work-to-work relationship into consideration (especially the contains one)?

40anglemark
Aug 26, 2011, 5:12am Top

As far as I can see, it seems to fail quite often on non-English books, because they are so often prominently tagged "Swedish", "Swedish literature" (mutatis mutandis, of course, for other countries). Not sure what to do about that, though.

41thorold
Aug 26, 2011, 5:27am Top

>38 hdcclassic:,40
Yes, nationality/language tags do seem to drown everything else: I served the king of England gives you 30 books with Czech authors and one with a Czech setting, but no picaresque novels about waiters or the hotel trade.

42hdcclassic
Aug 26, 2011, 7:32am Top

39> Yeah, if the algorithm automatically removes the "contains" books, then there is no problem when those connections are made (and when I have looked at a bunch of other books, I didn't come across any others, so I guess it works that way).

Nationality/language can indeed be a problem...I looked at the non-English nonfiction books I have read with 20+ raters and came across some where the tag recommendations work poorly:
Pollomuhku ja Posityyhtynen (really about translating books, specifically Harry Potter series), Dictionary of Accepted ideas (really a dictionary of cliches), In Praise of Shadows (really about architecture and aesthetics)...

Several working connections too though.

A common problem tag seems to be "XXXX literature" where XXXX is a language or nationality, because it has been applied both to fiction books and to non-fiction books about the literature of said language (and is usually one of the bigger or biggest tags for those books). Which can be a problem when e.g. a literature essay collection by an Italian author gives a bunch of Italian fiction.
And it is also used for non-fiction books which are only marginally about literature or have some different take on the subject (like being about translating Harry Potter series).

43jjwilson61
Aug 26, 2011, 9:45am Top

As far as I know Tim hasn't used the work-to-work relations for anything but display. In fact, aren't they still in Beta?

44jbd1
Aug 26, 2011, 9:46am Top

>43 jjwilson61: - no, they're not in beta.

45eromsted
Aug 26, 2011, 10:04am Top

I agree that the language and nationality matches are not that interesting. To some degree this is simply saying that people's tagging practices are boring. Dozens of people have tagged I served the king of England with language and nationality tags but only a handful have used waiters, hotels, picaresque, etc.

But I suspect there's another problem more related to the matching algorithm. There are often a group of similar tags indicating language and nationality. Here's the set for I served the king of England:
53 Czech
40 Czech literature
26 Czechoslovakia
13 Czech fiction
10 Czech Republic
2 czechoslovakian literature
2 Czechoslovakian
1 Czech stuff
1 Czech/E Europe
1 Literature Translated Czech Eastern Europe
1 Au. Nat. Czech

Tags from this group are very likely to appear on books together. And because multiple tags match, the relatedness factor goes way up. But this is a false positive because all of the tags in the set are indicating more or less the same concept. Not sure anything could be done about this though.
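One possible direction, sketched under a big assumption: that someone supplies a mapping from near-synonymous tags to a single concept (the mapping below is hand-made from the list above and purely illustrative). With that, each shared concept counts once no matter how many of its tags match.

```python
# Hand-made, illustrative grouping of near-synonymous tags.
CONCEPT_OF = {
    "czech": "czech",
    "czech literature": "czech",
    "czechoslovakia": "czech",
    "czech fiction": "czech",
    "czech republic": "czech",
}


def naive_score(tags_a, tags_b):
    """Count every shared tag separately."""
    return len({t.lower() for t in tags_a} & {t.lower() for t in tags_b})


def concept_score(tags_a, tags_b, concept_of=CONCEPT_OF):
    """Count each shared concept once, however many of its tags match."""
    def concepts(tags):
        return {concept_of.get(t.lower(), t.lower()) for t in tags}
    return len(concepts(tags_a) & concepts(tags_b))


book = ["Czech", "Czech literature", "Czech fiction", "waiters"]
other_czech = ["Czech", "Czech literature", "Czech fiction", "poetry"]
waiter_novel = ["waiters", "hotels", "Czech"]

print(naive_score(book, other_czech), concept_score(book, other_czech))
print(naive_score(book, waiter_novel), concept_score(book, waiter_novel))
```

With the naive count the unrelated Czech book beats the waiter novel; with concepts collapsed, the waiter novel comes out ahead. The hard part, of course, is where the mapping comes from.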

46lorax
Aug 26, 2011, 10:13am Top

And to some degree, it's a philosophical question since those books are books you probably want to read after reading the other books in the series. But then, the recommendation is low value since you already knew that (especially given how LT makes Series obvious).

Well, I'd say it's low value, but people seem to love that sort of thing -- look at how many thumbs-up a manual recommendation of that sort will get. For a lot of people, obvious and low-risk recommendations beat less obvious but riskier suggestions.

47lorax
Aug 26, 2011, 10:17am Top

That said, Pinker is Pinker. Unlike most authors, he writes on a bunch of subjects, but he's always Pinker. I mean, I've read a number of his books, choosing them for him not only for their content. Doesn't that matter?

I think it does matter. Another author who does this, to a much larger degree, is John McPhee. The right recommendation for someone who liked his Oranges isn't another book about citrus fruit; it's another book by McPhee. (So in this case, the tag-based recommendation doesn't do a very good job. That's okay, though, it isn't a case where it would be expected to work.)

48brightcopy
Aug 26, 2011, 11:00am Top

#46 by lorax> Well, I'd say it's low value, but people seem to love that sort of thing -- look at how many thumbs-up a manual recommendation of that sort will get. For a lot of people, obvious and low-risk recommendations beat less obvious but riskier suggestions.

Good point. But even so, I think LT would be much better off using its own author and series information for those sorts of recommen-DUH-tions than the tag data which can be more scattershot.

49_Zoe_
Aug 26, 2011, 11:30am Top

Hehe, I love "recommend-DUH-tions".

The members recs don't seem as bad as I'd expected; Twilight doesn't have another one in the series until #20, and another non-series book by the same author is way down at the bottom.

50casvelyn
Aug 26, 2011, 12:33pm Top

>49 _Zoe_: Could this be in part because there's so much recent vampire fiction out there and vampires are "in" right now? I know I've seen some books where almost all the member recommendations are the rest of the series; unfortunately I can't remember which books I was looking at when I noticed this. They were substantially less popular than Twilight, though.

51staffordcastle
Aug 26, 2011, 1:32pm Top

Interesting; I looked at the recommendations for Women in England 1500-1760 by Anne Laurence, and noticed that the tag-based recommendations pulled up Family, Sex and Marriage in England 1500-1800 (Abridged, no footnotes) by Lawrence Stone, while the Special Sauce and People With This Book Have recommendations pull up the unabridged version, Family, Sex and Marriage in England 1500-1800. I guess they are tagged sufficiently differently?

52jlelliott
Aug 26, 2011, 2:07pm Top

I looked at a number of fiction and non-fiction works and the tag recommendations looked interesting and relevant. It seemed especially good at pulling up similar works with memoirs, like Final Exam. I really like how it pulls in multiple genres; several of the fiction works had lists including relevant non-fiction, such as books on consumption for The Air We Breathe. I love that, but looking up the thread it seems that others might think that is a false-match. Similarly, I don't think that matches based on nationality are uninteresting. There is no denying that there are national or regional trends in literature that are a perfectly sound basis for generating recommendations. In summary, I think these recommendation lists are very interesting, nice work!

53hdcclassic
Edited: Aug 26, 2011, 5:22pm Top

Nationality suggestions are not always bad, as said those Joyce and Beckett are good hits for O'Brien. But recommending Giuseppe di Lampedusa for Primo Levi or Milan Kundera for Karel Capek or the recommendation list of, say, Sudenmorsian (ridiculously generic selection of Finnish literature across the years and topics) are in my opinion no more justifiable than, say, F. Scott Fitzgerald and Siri Hustvedt for Philip K. Dick.

(actually that Sudenmorsian has a similar problem to the one described in #45: it has "Finnish literature" and a bunch of tags in Finnish, which means it has more connections with other Finnish books, while such big tags as "werewolves", "fantasy" and "feminism" are ignored)

54lilithcat
Aug 26, 2011, 5:20pm Top

> 23

I mean, I've read a number of his books, choosing them for him not only for their content. Doesn't that matter?

Yeah, I'm like that with Garry Wills and Colm Tóibín. I'll read just about anything either of them writes, on whatever subject, and they both range pretty far afield.

55timspalding
Aug 26, 2011, 11:21pm Top

Container/contained

To confirm, it uses container/contained to remove all such books. The problem is that some series exist in omnibus editions, and some don't.
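In sketch form, that removal might be no more than a filter over the work-to-work "contains" pairs. The data shape (a set of container/contained id pairs) and the ids are assumptions for illustration.

```python
def drop_contained(source_work, candidates, contains):
    """Remove candidates that contain, or are contained in, the source work.

    contains is a set of (container_id, contained_id) pairs.
    """
    return [
        c for c in candidates
        if (c, source_work) not in contains and (source_work, c) not in contains
    ]


# Hypothetical ids: two omnibus editions contain the source work.
contains = {("omnibus-1", "jekyll"), ("omnibus-2", "jekyll"), ("omnibus-1", "other-story")}
candidates = ["omnibus-1", "omnibus-2", "other-story", "unrelated"]
kept = drop_contained("jekyll", candidates, contains)
print(kept)
```

This only works once the relationships have been entered, which is why the collected editions in message 38 still showed up.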

But this is a false positive because all of the tags in the set are indicating more or less the same concept.

Right. That's a very challenging problem. I can calculate the degree of relatedness between tags (e.g., "czech fiction" often occurs with "czech literature"), but it's unclear what to do with it. All sorts of semantically different tags also cluster together.

Anyone have any ideas?
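For reference, one common way to measure that kind of relatedness is the Jaccard index over the works each tag appears on: works with both tags, divided by works with either. This is a named stand-in, not necessarily the calculation meant above, and the sample works are invented.

```python
from collections import Counter
from itertools import combinations


def tag_relatedness(works):
    """Return {(tag_a, tag_b): Jaccard index} for every co-occurring tag pair.

    works is an iterable of tag collections, one per work.
    """
    tag_count = Counter()
    pair_count = Counter()
    for tags in works:
        tags = set(tags)
        tag_count.update(tags)
        pair_count.update(combinations(sorted(tags), 2))
    return {
        (a, b): both / (tag_count[a] + tag_count[b] - both)
        for (a, b), both in pair_count.items()
    }


works = [
    {"czech fiction", "czech literature"},
    {"czech fiction", "czech literature", "satire"},
    {"satire"},
]
related = tag_relatedness(works)
print(related[("czech fiction", "czech literature")])
print(related[("czech fiction", "satire")])
```

A high score says two tags travel together, not that they mean the same thing, so this alone does not separate synonyms from tags that merely cluster.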

I agree that the language and nationality matches are not that interesting.

I can correlate original language with some tags. That might help. I'll do that, I think.

So in this case, the tag-based recommendation doesn't do a very good job. That's okay, though, it isn't a case where it would be expected to work.

Right. Because this is generally true, tag recommendations from fiction books have less effect on combo recommendations, both absolutely and because their percentage-matches tend to be lower.

56anglemark
Aug 27, 2011, 5:45am Top

Anyone have any ideas?

Some sort of user-contributed combinations of tags, that are only used for recommendations, if that's technically possible?

57Crypto-Willobie
Aug 27, 2011, 8:54pm Top

> 36, 37
I use "author tags" for several dozen authors in my catalog. My primary reason for doing this is so that the tag will allow me to bring up a list that combines books by the author (or containing some of that author's work) with books about the author, or that include significant material on his or her work.

58geitebukkeskjegg
Aug 28, 2011, 4:40am Top

55> I can correlate original language with some tags. That might help. I'll do that, I think.

But original language is notoriously unreliable, being calculated data. 37% of my works have no original language set, and for those that have the value is often wrong.

If "original language" was made a CK field (and why should it not be?), that would help.

59TLCrawford
Aug 28, 2011, 8:35am Top

#57 That is exactly how I use an author's name as a tag. Of the 21 books I have tagged Sinclair Lewis, a few are collections of his short stories, but there are none of his novels. Biographies, collections of his letters, books by collaborators and ex-wives, even books about events that inspired him, but none of his novels. Well, no, there is one YA novel he wrote using a pen name.

60artturnerjr
Aug 28, 2011, 2:11pm Top

I just got a skazillion new recommendations on my automatic recs page (okay, actually 38). Anyone know if this has anything to do with the new tag-based recs?

61AnnaClaire
Aug 28, 2011, 5:53pm Top

>60 artturnerjr:
I was wondering the same thing. I'm up to 90 of them, and that's after I got rid of a few of them.

62timspalding
Aug 28, 2011, 10:25pm Top

Yes, certainly so.

63artturnerjr
Aug 28, 2011, 10:36pm Top

>62 timspalding:

Thanks, Tim.
