
Search in Russian does not work?

Bug Collectors



1ponzu
Sep 8, 2010, 2:13pm Top

This may be true of using any Unicode characters in search and may have been already reported and addressed, but I have not found any relevant topics.

I enter Russian books into my library. If the book is in Russian, I prefer to keep the title in Russian and link it as an edition to the corresponding work, whose "canonical title" may or may not be in English, depending on whether or not it has been translated. Even though LT does not officially "support" Russian language, all information entered in Russian displays fine. The problem is with search.

When I search my own library for a Russian title or author's name, what I get back is ALL books in Russian. Weird, right? It is as if LT "knows" that my search is in Russian (or maybe Unicode) and that some of my titles are in Russian, and that's all it cares about. It's almost like it does not care to filter the results by actual textual match.

Also, while searching sources when adding a new book, I sometimes get strange results, but I can usually get it to work for my needs, so that's not the main concern of this post.

Thanks in advance for clarifying this behavior and maybe doing something about it.

2andejons
Sep 8, 2010, 3:18pm Top

Try putting the search terms inside quotes - at least that weeds out a lot of false positives when I'm searching for words containing Swedish characters.

3ponzu
Edited: Sep 8, 2010, 5:45pm Top

Thanks, but that just causes 0 results to be returned.

After trying a number of searches I came up with a theory.

Entering any combination of Russian letters returns every book that has ALL of those letters anywhere in the title or author's name. I have tried any number of searches and the theory holds up. Even if some of the results at first appear random, the letters from the search term can eventually be found in hidden fields, such as Edition.

Is there a reason or explanation for this, and can it somehow be rectified?

4jjwilson61
Sep 8, 2010, 5:48pm Top

It sounds like it's treating each Russian character as a separate word.
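A minimal sketch of what that hypothesis would look like in practice. Nothing here is LibraryThing's actual code: `TOKEN_PATTERN` and `tokenize` are hypothetical names, and the rule (ASCII letters and digits group into words, every other non-space character stands alone) is only an assumption that happens to match the behavior described above.

```python
import re

# Hypothetical tokenizer, assumed for illustration only:
# ASCII letters and digits group into words, while every other
# non-space character becomes a token of its own.
TOKEN_PATTERN = re.compile(r"[A-Za-z0-9]+|[^\sA-Za-z0-9]")


def tokenize(text):
    """Split text the way the hypothesis in this thread suggests."""
    return TOKEN_PATTERN.findall(text)


print(tokenize("Death of Achilles"))  # three whole words
print(tokenize("Смерть Ахиллеса"))    # fourteen single-letter tokens
```

If every token must then match somewhere in a record, a Cyrillic query degrades into "contains each of these letters", which is the pattern reported in message 3.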

5ponzu
Sep 8, 2010, 6:06pm Top

Something like that.

It is also case sensitive. (Which is probably not the case with Latin character-based searches, and shouldn't be.)
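One possible cause of the case sensitivity, offered only as a guess: a lowercasing step that works on bytes and therefore folds ASCII letters but leaves UTF-8 encoded Cyrillic untouched. The helper name `ascii_only_lower` is made up for this sketch; Python's `bytes.lower()` is used because it is documented to change ASCII characters only.

```python
def ascii_only_lower(text):
    """Lowercase only ASCII letters, as a byte-oriented routine would."""
    # bytes.lower() changes only the ASCII bytes A-Z; the multi-byte
    # UTF-8 sequences that encode Cyrillic letters pass through untouched.
    return text.encode("utf-8").lower().decode("utf-8")


# Latin text compares equal after folding, Cyrillic text does not.
print(ascii_only_lower("Death") == ascii_only_lower("death"))    # True
print(ascii_only_lower("Смерть") == ascii_only_lower("смерть"))  # False

# A Unicode-aware fold treats both scripts the same way.
print("Смерть".casefold() == "смерть".casefold())                # True
```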

6johninvienna
Sep 8, 2010, 11:11pm Top

But I just tried Chinese and it worked just fine.
Searched for: 倚鶴排 集 : 吉林日报书画院书画作品选
Have about 90 books in Chinese (titles in Chinese characters) and the search just pulled that one book. Will try other languages tomorrow.

Search does not work for sources though!

greetings, John

7ponzu
Edited: Sep 9, 2010, 5:49pm Top

Search in Russian does not work with sources, either, if we're talking sources like Amazon (my default). It just returns 32 million books with bestsellers (the Millennium Trilogy, etc.) towards the top. However, it works against Russian sources, of which I use two.

Still hoping for an explanation from the developers and some hope of a fix.

8ponzu
Edited: Sep 9, 2010, 6:01pm Top

johninvienna, if you have one and only one book that has all these characters, then I am not sure you disproved my theory. I randomly rearranged the characters in your search and removed some from the middle of the word - it still returns the same book.

If you click on my library http://www.librarything.com/catalog/ponzu

and search for Акунин, you find all books by Акунин (Akunin) as well as a variety of books that are not related to him, but have letters А-к-у-н-и in any of the fields. It is possible to come up with a search that will eventually return just one book, but the letters don't have to be arranged in the same order as they are in the words that make up the title or the author's name. They just have to be there.

Try searching my library for Впоискахгрубэж

It's a meaningless set of letters, only the first few make up a word. It finds one book and one book only that has all these letters in the title. You can add spaces arbitrarily anywhere between these letters: still the same result. As you start removing letters, more books are returned.
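The observations in this message can be modeled with a few lines of Python. This is a sketch of the theory only, not of LibraryThing's search: `letters` and `letter_bag_search` are hypothetical names, and the records are made up for illustration rather than taken from any real catalog.

```python
def letters(text):
    """Return the set of non-whitespace characters in text (case kept)."""
    return {ch for ch in text if not ch.isspace()}


def letter_bag_search(query, books):
    """Return titles of books whose combined fields contain every query character."""
    wanted = letters(query)
    hits = []
    for book in books:
        available = letters(" ".join(book.values()))
        if wanted <= available:  # every query letter occurs somewhere
            hits.append(book["title"])
    return hits


# Made-up records, assumed for illustration only.
books = [
    {"title": "Смерть Ахиллеса", "author": "Борис Акунин"},
    {"title": "Азазель", "author": "Борис Акунин"},
    {"title": "Алгебра и наука", "author": "Иван Петров"},
    {"title": "Война и мир", "author": "Лев Толстой"},
]

print(letter_bag_search("Акунин", books))        # includes the unrelated algebra book
print(letter_bag_search("нинукА", books))        # scrambled letters, same result
print(letter_bag_search("А к у н и н", books))   # extra spaces, same result
print(letter_bag_search("АкунинСмерть", books))  # more letters, fewer books
```

Under this model letter order and spacing cannot matter, and removing letters from the query can only widen the result set, which matches what the searches above show.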

9ponzu
Sep 22, 2010, 2:14pm Top

This message has been deleted by its author.

10ponzu
Sep 28, 2010, 12:37pm Top

Any new insight from power users or interest from developers? Thanks!

11BarkingMatt
Sep 28, 2010, 12:47pm Top

#s 3-5 make me think there is a problem with interpreting Unicode - that's case sensitive for instance. Other than that, not a clue. Definitely a nasty bug.

12timspalding
Oct 5, 2010, 3:45pm Top

Deferring this. Sorry, but it's something we've spent enormous time on. The solution we have works to a large degree--enough that we aren't going to visit it in the near future. We'd like to do better, but that's the situation.

13ponzu
Oct 6, 2010, 8:25pm Top

Tim, thanks for acknowledging the issue.

I assume there are no workarounds or tricks for obtaining more precise search results that you can offer coming from the position of inside knowledge?

Other than typing out pretty much the entire author and title while observing capitalization.

14gangleri
Edited: Oct 15, 2010, 5:40pm Top

Hi! First I was thinking it might be related to http://www.librarything.com/topic/99640 "add meta ... Content-Type content="text/html; charset=utf-8 to site"

Then I remembered some issues related to the "clever search"
search_author.php: Gandi returns "Gandi", "Gandy", etc.
search_author.php: Steinbacher returns "Steinbach", "Steinbache", "Steinbacher", etc.
also search_author.php: Bamberger returns "bamberg", "bamberge", "Bamberger", etc.

What might happen is that this piece of code sometimes conflicts with the database search. These are speculative thoughts ...

I do not have any explanation why some searches related to ponzu's http://www.librarything.com/work/details/43365595 ( Смерть Ахиллеса ) work and others do not:
¹: (see P.S. at the end of the message)
works search_works: Смерть
fails: search_works: Ахиллеса
fails: search_works: Смерть Ахиллеса

Maybe
1) a checkbox should be added to search to activate / deactivate the "clever search" if such a procedure is implemented
2) the activation / deactivation should be possible via a parameter (and values 1 | 0) for url syntax

Regards Reinhardt

reference: ponzu's books in Russian sorted by author

P.S.: The Touchstone tagged ¹ (Смерть Ахиллеса) is visible during editing but not after posting the message. However, it is linked to http://www.librarything.com/work/5427982 5427982::Смерть Ахиллеса
P.P.S. The attempt to create a second Touchstone in the P.S. failed. It is also visible during editing but not after posting the message.
----
added irrelevant "search_author.php: Bamberger"

15ponzu
Edited: Oct 15, 2010, 1:49pm Top

Thanks, Reinhardt. It seems to me from Tim's reply that he knows what's causing this behavior, but does not have a solution in mind.

And so long as we are posting links, here's what a search through MY books looks like for the same book:

deepsearch: смерть ахиллеса

It returns the needed book -- and nine others, seemingly at random. But my theory as to how these books make the result set still stands. What baffles me is why it works that way. Search in Russian works reasonably well when searching for works on LT (as you illustrated, though with some irregularities), or searching the sources. What makes deepsearch work letter by letter, I don't know.

16gangleri
Edited: Oct 15, 2010, 6:01pm Top

>15 ponzu: I could not see any result looking at your link; just "No books cataloged."
I never understand when deepsearch provides the requested results and when it does not. Also, looking at others' catalogs sometimes shows only books with an ISBN and skips the ones without (however, the counters are inconsistent). But these are other issues; they might be solved now.
Have you tried the link also while being logged out?

P.S. added "No books cataloged."

17ponzu
Edited: Oct 19, 2010, 8:07pm Top

>16 gangleri: Me neither! But I could the day I posted it.

This version works (with capitalization):

deepsearch: Смерть Ахиллеса

It does not work as intended, as it returns extra books, but it works as expected given our understanding that the deepsearch algorithm breaks Cyrillic search terms down into letters and searches for each letter separately.

All this is very strange.
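For contrast, here is a sketch of the behavior one would expect from a search that treats Cyrillic words as whole words and ignores case. It is not a description of how LibraryThing works or plans to work; `words` and `word_search` are hypothetical names, and the titles are sample data.

```python
import re


def words(text):
    """Split text into whole words, Unicode-aware, and fold case."""
    # In Python 3, \w in a str pattern matches letters of any script.
    return {w.casefold() for w in re.findall(r"\w+", text)}


def word_search(query, titles):
    """Return titles that contain every query word as a whole word."""
    wanted = words(query)
    return [t for t in titles if wanted <= words(t)]


# Sample titles, assumed for illustration only.
titles = ["Смерть Ахиллеса", "Азазель", "Статский советник"]

print(word_search("смерть ахиллеса", titles))  # lowercase query still matches
print(word_search("ахСмерть", titles))         # scrambled letters match nothing
```

With word-level matching, capitalization stops mattering and a random jumble of letters no longer matches anything.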

Group: Bug Collectors

731 members

73,612 messages

Bug Tracking

This topic is part of LibraryThing's in-talk bug tracking.


Bug

ID: 98281

Category: Non-English LibraryThing

Assigned to all

Reported by DaynaRT

Status: Deferred

Oct 5, 2010, 3:45pm

4 years since last change

By timspalding

Status log

Reported. DaynaRT (Sep 28, 2010, 12:37am)

Deferred. timspalding (Oct 5, 2010, 3:45pm)
