Give higher weight to exact title matches for Touchstones

This is a continuation of the topic Give higher weight to exact title matches for Touchstones.

1MarthaJeanne
Feb 8, 2017, 1:28 am

In the past two years the discussion has gotten up over the 200 mark, which is traditionally the time to start a new topic.

In the meantime, we still have the problem that touchstones often give ridiculous results, in some cases the work wanted doesn't show up at all or is so buried in a long list that it is difficult to find. Many users don't know how to change the result, and even more don't know how to force it if an easy change doesn't work.

Touchstones are an important part of how discussions of books happen on LT, so having them be so unreliable for so long is a real issue.

2klarusu
Feb 8, 2017, 2:30 am

>1 MarthaJeanne: Well said & thanks for the new thread!

3gilroy
Feb 8, 2017, 7:26 am

Oh, I'll add an example to the new thread, just cause:

The Forgotten Girls by Alexa Steele comes up with Pride and Prejudice by Jane Austen.

4lesmel
Edited: Feb 8, 2017, 9:29 am

Company Town by Madeline Ashby gets you The Adventures of Tom Sawyer -- b/c hello, someone's review is in the title and another person put the publisher details in the title. (psssst: that's BAD DATA)

5lorax
Feb 8, 2017, 9:31 am

Here, for the record, is the original text of my post #1 in the original thread, dated from March 2015:


If I recall correctly, Touchstones use the work search feature under the hood, and provide the results sorted by overall popularity of the work. This frequently produces mind-boggling results where the default result, which many users don't know how to change, doesn't actually match any of the words in the title, but connects because each of the words appears in some edition or another.

(Recent case in point: the first Touchstone suggestion for my new ER win, The New Wild, is "The Picture of Dorian Gray". Which does admittedly contain the word "The"; one edition has "New" in the title, and a couple people have misspelled the author's name without the "e". That's all it takes to rocket this book with 22,000 copies to the top of the list.)

Touchstone searches, though, are different from title searches where people may be uncertain of the title or not have a particular title in mind; the user has a particular title in mind. A minimal change would be to do a phrase search rather than a words-search, so the Touchstone logic searches for, in this case, "The New Wild" - providing four results including the correct one. (The correct one doesn't show up at all in the existing version.)
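To make the words-search versus phrase-search difference concrete, here is a minimal Python sketch. It is not LibraryThing's actual code: the `Work` class, the `edition_texts` field, both search functions, and the 40-copy figure are all assumptions made up for illustration. It only models the behaviour described above, where a work matches a words search if each query word turns up in some edition or another.

```python
import re
from dataclasses import dataclass


@dataclass
class Work:
    title: str
    copies: int
    edition_texts: list  # hypothetical: title/author strings from all editions


def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def words_search(works, query):
    """Match when every query word appears in ANY edition text, then sort by popularity."""
    wanted = tokens(query)
    hits = []
    for work in works:
        bag = {tok for text in work.edition_texts for tok in tokens(text)}
        if all(tok in bag for tok in wanted):
            hits.append(work)
    return sorted(hits, key=lambda w: w.copies, reverse=True)


def phrase_search(works, query):
    """Match only when the query words appear consecutively within ONE edition text."""
    wanted = tokens(query)
    n = len(wanted)
    hits = []
    for work in works:
        for text in work.edition_texts:
            toks = tokens(text)
            if any(toks[i:i + n] == wanted for i in range(len(toks) - n + 1)):
                hits.append(work)
                break
    return sorted(hits, key=lambda w: w.copies, reverse=True)


# Made-up sample data mirroring the example in the post.
WORKS = [
    Work("The Picture of Dorian Gray", 22000, [
        "The Picture of Dorian Gray by Oscar Wilde",
        "The Picture of Dorian Gray (New Edition)",
        "Picture of Dorian Gray by Oscar Wild",  # misspelled author
    ]),
    Work("The New Wild", 40, ["The New Wild"]),
]

print([w.title for w in words_search(WORKS, "The New Wild")])
# ['The Picture of Dorian Gray', 'The New Wild']
print([w.title for w in phrase_search(WORKS, "The New Wild")])
# ['The New Wild']
```

In this toy data the words search puts the popular work first because "the", "new", and "wild" are scattered across its editions, while the phrase search returns only the work that actually contains the phrase.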


The New Wild now gives "Wuthering Heights" as the first touchstone, with Dorian Gray as the second and the correct work in the 23rd position.

6Petroglyph
Feb 8, 2017, 2:22 pm

It's been two years, and it's been getting progressively worse. Please fix this! Please!

A recent example for me: Paris stories by Mavis Gallant comes up as Dickens' A tale of two cities. Some of the other titles suggested by the algorithm are: Madame Bovary; Candide; A moveable feast; The travels of Marco Polo; In our time; For your eyes only; Green hills of Africa; The mystery of the 99 steps; César Birotteau; Kobbe's complete opera book; ...

7jnwelch
Feb 11, 2017, 11:04 pm

Good to see a new thread for this. I wonder what the status is, as I understood the problem was being worked on.

8lorax
Feb 13, 2017, 9:28 am

>7 jnwelch:

Really? Nothing in the other thread ever gave me any indication or hope that the problem was being worked on. Loranne said she would put it on her list to mention to the developers, but that's far from meaning there's actual code being written or even looked at.

9jnwelch
Feb 13, 2017, 5:07 pm

>8 lorax: It probably was wishful thinking. I thought someone said they hoped to have some changes in January. (Ha!) Maybe I dreamed it.

10laytonwoman3rd
Edited: Feb 25, 2017, 6:58 pm

>9 jnwelch: @lorannen did say she'd tried to bring it up to the development peeps in mid-January, Joe. http://www.librarything.com/topic/189572#5802687

We must keep agitating.

11jnwelch
Feb 25, 2017, 10:15 pm

>10 laytonwoman3rd: Ah, excellent, thanks, Linda. I'm really good at annoying agitating people.

12lorax
Feb 27, 2017, 9:30 am

>10 laytonwoman3rd:

Yeah, I saw that too, but that's a long way from "being worked on".

13wester
Feb 27, 2017, 10:21 am

Just a few recent examples then, to remind everybody how bad it is.

On Looking yields as touchstone The Girl With The Dragon Tattoo.
How we die yields The Very Hungry Caterpillar.
Cheap yields The Hound of the Baskervilles.

14laytonwoman3rd
Feb 27, 2017, 10:57 am

*washing machine agitation sounds*

15jnwelch
Feb 27, 2017, 1:25 pm

16charl08
Edited: Feb 27, 2017, 6:10 pm

My favourite one recently: looking for Julia by Otto de Kat.
Brings up: Fahrenheit 451.
Then...

Lord of the Flies
Hamlet
Romeo and Juliet
Les Misérables
Mansfield Park
Charlie and the Chocolate Factory

17EBT1002
Mar 18, 2017, 1:07 am

:-|

18Storeetllr
Mar 21, 2017, 6:19 pm

My latest "favorite": The Children. A loooong list of off-the-wall titles without the correct one ever showing up! Here's the image of the first 22. There are at least 250 titles listed. (Here's a link to the actual book: http://www.librarything.com/work/13779.)

19jnwelch
Mar 22, 2017, 3:27 pm

>18 Storeetllr: Wow. *shaking my head*

20lorax
Mar 22, 2017, 5:01 pm

That may be a new record - my previous "best" was one where the correct title eventually showed up at spot 218.

21laytonwoman3rd
Apr 30, 2017, 12:50 pm

*bump* Not enough grumbling going on around here lately!

22Petroglyph
May 1, 2017, 4:58 pm

It's been two years and a month since the original thread got started, and things have gotten progressively worse. If any staff members read this: please do something about it.

23timspalding
Edited: May 1, 2017, 5:50 pm

The simple truth is that this is a hard problem. The search structures that power site search generally have a hard time with works--and all their editions. Setting up a whole new search system, which is what we feel is necessary, and which we did for catalog search, is resource constrained--we don't have enough servers for it (and for the second, parallel system you always need). I have been playing with the issue within the constraints of what we have, and will continue to do so, but I can't say a solution is on the horizon.

24gilroy
May 1, 2017, 8:43 pm

>23 timspalding: Is there any way to tweak the existing algorithm so it isn't so ... bloody far off?

25lorax
Edited: May 2, 2017, 10:57 am

>23 timspalding:

First, thank you very much for weighing in.

Several years ago, I suggested that doing a phrase search rather than a words search would get most of what we need, and that's something that already exists for works search - for one of the examples I gave in my initial post so long ago, The New Wild, the correct title comes up second out of six (and has the most copies) on a sitewide works search with the phrase, versus fifteenth for a sitewide works search with the words (i.e. with vs. without quotes); it's currently 21 on the Touchstone search. Would this particular tweak really require a new search system, or is this a case where the best is the enemy of the good and you want the perfect solution, which we'll never get, rather than a decent one which could actually be implemented?

26jjwilson61
May 2, 2017, 12:23 pm

>25 lorax: I think a phrase search is exactly what Tim was talking about that would require the new search system. But, Tim, could you, after the site search do a local search of just the search results for exact title matches and move those to the top?
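The "local search of just the search results" idea can be sketched as a re-ranking step that runs after the existing search. This is only an illustration under assumptions: `promote_exact_matches`, `normalize`, and the `(title, copies)` result tuples are hypothetical names and shapes, not anything from LibraryThing's code, and the copy counts are invented.

```python
import re


def normalize(title):
    """Lowercase and strip punctuation so 'Green!' and 'green' compare equal."""
    return " ".join(re.findall(r"[a-z0-9']+", title.lower()))


def promote_exact_matches(results, query):
    """Move results whose title equals the query to the top.

    `results` is a list of (title, copies) tuples in the order the
    existing search returned them. Python's sort is stable, so the
    original order is kept within the exact and non-exact groups.
    """
    wanted = normalize(query)
    return sorted(results, key=lambda r: normalize(r[0]) != wanted)


# Invented result list, already sorted by popularity.
results = [
    ("The Hobbit", 90000),
    ("Anne of Green Gables", 30000),
    ("Green Eggs and Ham", 20000),
    ("Green", 500),
]

print(promote_exact_matches(results, "Green")[0])
# ('Green', 500)
```

Note the limit of this approach: it can only promote a work that the underlying search returned in the first place, so it would not help in the cases where the correct work never shows up in the list at all.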

27lorax
May 2, 2017, 1:52 pm

>26 jjwilson61:

It already exists as part of the work search, though, which we know Touchstones uses (or at least it did, at one point. They could have changed that, I suppose.)

28jnwelch
Edited: May 2, 2017, 4:50 pm

Good to hear from Tim on this. As gilroy says in >24 gilroy: and lorax says in >25 lorax:, just somewhat better would be a help - not so "bloody far off". If the phrase search is doable and improves things, I'm all for it.

29librisissimo
May 4, 2017, 11:39 pm

>23 timspalding:

Glad to hear you are looking at it.
In the meantime, after laughing at some of the results, maybe someone would start a game of guessing what Touchstones will come up with for a given title, and giving a prize to the closest or funniest answer?

30lesmel
May 5, 2017, 8:58 am

>29 librisissimo: I think that was already done.

31Petroglyph
May 5, 2017, 9:45 am

>23 timspalding:
Thank you for responding at least.

32librisissimo
May 7, 2017, 12:18 am

>30 lesmel: Fun! Do you have a link?

34librisissimo
May 7, 2017, 8:24 pm

>33 lesmel: Thanks. It didn't seem to catch on, though. No prizes??

35gilroy
May 23, 2017, 3:30 pm

Okay, so here is where we have a really serious problem with the Touchstones.

Green by Ted Dekker comes up as The Hobbit. Clicking on Other ... Not a single volume has the proper author as an option. In fact, not a single option that comes up is the SINGLE WORD title for any work. The closest it comes is Anne of Green Gables at number 8.

36lorax
May 30, 2017, 11:01 am

Bump. Is there any chance that, even if this specific request cannot be implemented, the new Works search can be leveraged to make Touchstones somewhat less terrible?

37timspalding
Jun 5, 2017, 10:40 am

Okay, this is fixed. See https://www.librarything.com/topic/258659

>36 lorax:

"Okay, Green by Ted Dekker comes up as The Hobbit"

They now work, except that it chooses "Green" by REM. I'm going to continue to tweak the algorithm, but the logic is that "Green" by Dekker isn't always called "Green" at all, but very often something with the series, book number and so forth. This makes it a weaker match for a mere "Green" than REM. (See https://www.librarything.com/work/8430269/editions )
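One way to picture the "weaker match" logic is to score each work by the share of its editions whose title is exactly the query. The sketch below is a guess at the general idea only, not the algorithm LibraryThing actually uses; `exact_title_share`, `normalize`, and all of the edition titles listed are made up for illustration.

```python
import re


def normalize(title):
    """Lowercase and strip punctuation before comparing titles."""
    return " ".join(re.findall(r"[a-z0-9']+", title.lower()))


def exact_title_share(edition_titles, query):
    """Fraction of a work's editions whose title is exactly the query."""
    if not edition_titles:
        return 0.0
    wanted = normalize(query)
    exact = sum(1 for t in edition_titles if normalize(t) == wanted)
    return exact / len(edition_titles)


# Hypothetical edition titles: one work is often catalogued with
# series information, the other always under the bare title.
dekker_editions = [
    "Green",
    "Green (series name, book number)",
    "Green: a subtitle",
    "Green, with series info",
]
rem_editions = ["Green", "Green"]

print(exact_title_share(dekker_editions, "Green"))  # 0.25
print(exact_title_share(rem_editions, "Green"))     # 1.0
```

Under a score like this, a work whose editions usually carry extra series text ranks below one that is always entered under the bare title, which matches the behaviour described above.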

38gilroy
Jun 5, 2017, 11:50 am

>37 timspalding: Yes, but Green by REM is closer than the Hobbit. It shows up in the list of selections (I think 8th now) so much better than before.

39lorax
Jun 5, 2017, 11:57 am

>37 timspalding:

Matching the wrong "Green" is fine with me.

Tests:

The Sound Book

The New Wild

Those both come up with the right result as the first option. Thank you SO MUCH.

40laytonwoman3rd
Jun 5, 2017, 12:11 pm

Jane Eyre no longer gives me Pride and Prejudice!

This is wonderful.

41Storeetllr
Jun 5, 2017, 1:16 pm

This is excellent! A big thank you to you and your team, Tim!

42jnwelch
Jun 5, 2017, 1:54 pm

The Underground Railroad works. Jane Steele works. Hmm. This seems like really good news. Thanks!