
Search test, anyone?

Recommend Site Improvements


1timspalding
Edited: May 9, 2007, 1:35am Top

If anyone is minded, you can try out a new very-beta search capability. I know it isn't perfect, but would like some early feedback.

Useful feedback: What works? What doesn't? If you have an accent-rich library, let me know. I am particularly certain that non-Latin character searching will not work, but examples of it not working (i.e., particular books) would help. Your browser and OS don't matter.

How: Go to the Search tab and look down for "Beta search." Update: No longer at the bottom.

Features: These should work.

alexander
alexander -great
"alexander the great"
alex*
*lexander

You can also field it, so:

tag:Greek title:Alexander
isbn: 0671206583
source: Amazon all: Greek

By default, it uses "all."

The fields are: all, title, author, tag, isbn, date, lccn, dewey, source, subject. It doesn't normalize around isbn, lccn or dewey.

Notes: (1) it only works from the search tab. If you use "all fields" it will do it the old way; (2) it should handle at least some accents, by ignoring them.
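
As a rough illustration of the syntax listed above (plain terms, "-" exclusion, quoted phrases, leading and trailing "*", and field prefixes), here is a minimal Python sketch. It is not LibraryThing's code: the function names, the sample record, and the word-splitting rule are all assumptions made for the example.

```python
import re

# Field names taken from the list in the post; "all" searches every field.
FIELDS = {"all", "title", "author", "tag", "isbn", "date",
          "lccn", "dewey", "source", "subject"}

# One token: optional "-", optional "field:", then a quoted phrase or a bare word.
TOKEN = re.compile(
    r'(?P<neg>-)?'
    r'(?:(?P<field>[a-z]+):\s*)?'
    r'(?:"(?P<phrase>[^"]*)"|(?P<word>[^\s"]+))'
)

def parse(query):
    """Return a list of (negated, field, text, is_phrase) tuples."""
    terms = []
    for m in TOKEN.finditer(query.lower()):
        is_phrase = m.group("phrase") is not None
        text = m.group("phrase") if is_phrase else m.group("word")
        field = m.group("field")
        if field is not None and field not in FIELDS:
            # Unknown prefix such as "foo:bar": treat it as literal text.
            text, field = f"{field}:{text}", None
        terms.append((bool(m.group("neg")), field or "all", text, is_phrase))
    return terms

def term_matches(haystack, text, is_phrase):
    haystack = haystack.lower()
    if is_phrase:
        return text in haystack
    # "*" is the only wildcard; everything else is matched literally.
    pattern = re.compile(re.escape(text).replace(r"\*", ".*"))
    return any(pattern.fullmatch(w) for w in re.findall(r"[\w-]+", haystack))

def matches(record, query):
    """True if every plain term is found and no excluded term is found."""
    for negated, field, text, is_phrase in parse(query):
        if field == "all":
            haystack = " ".join(record.values())
        else:
            haystack = record.get(field, "")
        if term_matches(haystack, text, is_phrase) == negated:
            return False
    return True

# Hypothetical record, invented for the example.
record = {
    "title": "Alexander the Great",
    "author": "Example Author",
    "tag": "Greek history",
    "isbn": "0671206583",
    "source": "Amazon",
}
```

With this sketch, `matches(record, "alex*")` and `matches(record, "tag:Greek title:Alexander")` are true, while `matches(record, "alexander -great")` is false because the excluded word appears in the title.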

2lilithcat
May 9, 2007, 12:13am Top

Hmm, just tried it with some accents and it works, sort of. It finds the accents, but the search is not restricted to the word with an accent. For example, I searched on the word "décoration" (to find Dorure et décoration des reliures). The search also returned books with the word "decoration" (sans accent) in a field.

This, however, is not necessarily a bad thing. Particularly with author names on books entered other than manually, it can be a crap shoot as to whether accents are included.

3timspalding
May 9, 2007, 12:17am Top

Yeah, that's the intended functionality. Try that on Google, and you'll get the same result. And yes, accent-inclusion is a crap shoot, particularly on Amazon.
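
For readers wondering how "ignoring accents" can work, one common approach is to decompose each character and drop the combining marks before comparing. The sketch below uses Python's standard `unicodedata` module; it is an assumption about the general technique, not a description of the script the site actually runs, and `fold_accents` is a name invented here.

```python
import unicodedata

def fold_accents(text):
    """Lowercase the text and remove combining marks such as accents."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()

def same_word(a, b):
    """Compare two words while ignoring case and accents."""
    return fold_accents(a) == fold_accents(b)
```

Under this scheme `same_word("décoration", "decoration")` is true, which is the behavior described above: the accented and unaccented spellings find each other.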

4GreyHead
May 9, 2007, 1:02am Top

Tried it with "+greek -history" and it's gone into an infinite loop - page loading but nothing happens.

No, I tell a lie, several minutes later it has produced a result that looks OK.

5timspalding
Edited: May 9, 2007, 1:07am Top

Well, you have 6,700 books. However, I'm betting it (1) takes longer with the other method*, (2) goes much faster the second time around. But clearly it needs some visual feedback about what's happening.
*Actually, don't try. The problem with the other method is that it relies on MySQL's fulltext engine, which only runs on MyISAM tables. MyISAM tables "lock," so that while it may not be terrible to do one search, the searches can't run concurrently and can back up terribly. Then people start re-searching, etc. and it's the end of the world.

6SilentInAWay
Edited: May 9, 2007, 1:36am Top

nevermind...I see -- you exchanged the old and new search fields. It's still working.

7timspalding
May 9, 2007, 1:34am Top

Hi. No. I think you were confused when I moved it up the page. The beta is now where the old search was. The old search (now labelled "old") is down at the bottom. I saw so many people using the old search who could be using what should be better.

8timspalding
May 9, 2007, 1:37am Top

Sped it up.

9timspalding
May 9, 2007, 1:55am Top

Now the only option. It's stable enough compared to the alternative. Feedback very much encouraged. (For example, it was misunderstanding "greco-persian" as "greco -persian." Fixed now.)

10boekerij
Edited: May 9, 2007, 1:57am Top

While trying to search "Ierland" (no brackets), the system returned this message :

SELECT bs_booksid FROM booksearch WHERE bf_id = 364416-wait error 1

Search URL was:
http://www.librarything.nl/catalogsearch.php?search=Ierland&searchall=1&...

11timspalding
Edited: May 9, 2007, 2:06am Top

Strange. The database must have gone offline for a moment. Tell me if you get it again?

12SilentInAWay
May 9, 2007, 1:58am Top

I'm getting a different result using the old and new searches (I've got the old search page still open on a tab in IE7). I'll let you know what I figure out. I just wanted you to know why you might see a few more old search calls.

13kathrynnd
May 9, 2007, 1:59am Top

This is great--you can search subjects.
subject:domestic fiction {for example}

Will the comment field be added soon? I can search for the term ISSN in the old system and bring up the books I've noted as having an ISSN but this didn't work in the new system, then noticed it wasn't on your list.

14timspalding
May 9, 2007, 2:03am Top

You can search subjects. I think you'll have trouble with the > (on other systems --). We store it as a tab, I think. I'll look into it.

Other fields. Yes. I can add reviews, comments.

I'd like some combined field types besides "all." (Subject is effectively two, since it's both the global and the local subjects. Author is also LF and FL.) I'm thinking "titleauthor" is another good one?

15timspalding
Edited: May 9, 2007, 2:07am Top

That was me (testing B's catalog). Fixed in a sec.

Oh, btw: you'll note you can't do (dog|cat), etc. I could do simple alternates, but complex nesting (x|(z|a(1|2))|x), not.

16SilentInAWay
Edited: May 9, 2007, 2:11am Top

Using the old and new searches, I received different results when I searched for the following:

literature -20th century -19th century

Both the original and new searches seemed to handle the double-subtraction, but the new search found about 25 more books than the old. I lost my original search page before I could investigate the differences in the two sets.

Also, it would be nice if the search string could contain more characters (so that one could, say, search for all literature written prior to the 17th century by subtracting several tags).

17SilentInAWay
May 9, 2007, 2:09am Top

> 15

a simple left-to-right union (OR) would be great

18timspalding
May 9, 2007, 2:11am Top

Both should be considering "literature -20th century -19th century" as "books with literature, but not 20th, but with century, but not with 19th but with century again." That's unless you were using quotes around things.
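
To make that reading concrete, here is a small Python sketch of how a query might be split into required and excluded terms. It assumes the rules described in this thread: a "-" only excludes when it starts a term, it binds to a single word unless the word is quoted, and a hyphen inside a word (as in "greco-persian") is left alone. The function name and the regular expression are invented for the example.

```python
import re

# Optional leading "-", then either a quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')

def split_terms(query):
    """Return (required, excluded) lists of lowercase terms."""
    required, excluded = [], []
    for neg, phrase, word in TOKEN.findall(query.lower()):
        text = phrase or word
        (excluded if neg else required).append(text)
    return required, excluded
```

Here `split_terms("literature -20th century -19th century")` gives `(["literature", "century", "century"], ["20th", "19th"])`, so "century" is required twice and only the two ordinals are excluded. Quoting the phrases, as in `literature -"20th century" -"19th century"`, moves the whole phrases to the excluded list.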

From 45 characters to 65 in a sec.

19timspalding
May 9, 2007, 2:15am Top

>17 SilentInAWay: without nesting? I'm sure people would complain.

I could do a simple |, provided it was understood as a simple alternative, so that

dog|cat care

matches both cat care and dog care, but also

chocolate cakes|vanilla cookies

isn't a choice between chocolate cakes and vanilla cookies, but matches chocolate cookies that are also either cakes or vanilla.
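
A minimal Python sketch of that "simple |" rule follows, assuming the bar only joins the terms it touches and every space-separated group must match. Quoted phrases are allowed as alternatives. The names are invented for the example, and matching is by plain substring, which is cruder than a real word search.

```python
import re

# One alternative: a quoted phrase or a bare word without spaces, bars, or quotes.
ALT = r'(?:"[^"]*"|[^\s|"]+)'
GROUP = re.compile(rf'{ALT}(?:\|{ALT})*')
PIECE = re.compile(r'"([^"]*)"|([^\s|"]+)')

def parse_alternatives(query):
    """Return a list of groups; each group is a list of alternatives."""
    groups = []
    for g in GROUP.finditer(query.lower()):
        groups.append([phrase or word for phrase, word in PIECE.findall(g.group())])
    return groups

def matches(text, query):
    """Every group must match; within a group, any one alternative is enough."""
    text = text.lower()
    return all(any(alt in text for alt in group)
               for group in parse_alternatives(query))
```

With these rules `parse_alternatives("chocolate cakes|vanilla cookies")` yields `[["chocolate"], ["cakes", "vanilla"], ["cookies"]]`, which is the reading described above rather than a choice between two two-word phrases.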

There ought to be a string library that did pattern searches in a "fulltexty" way.

20boekerij
Edited: May 9, 2007, 2:31am Top

>11 timspalding:

Alas.

Search string now was : "Éire ár sinsear" (without the brackets). This string is part of an existing book title in my library.

Message returned (on top of full blank window--i.e. : containing no headers) read :

SELECT bs_booksid FROM booksearch WHERE bf_id = 369356-wait error 1

Search URL was :
http://www.librarything.nl/catalogsearch.php?search=%C3%89ire+%C3%A1r+sinsear&am...

Odd.

Addendum

The book is found and correctly returned when I use the "Search books" Search box.

21SilentInAWay
Edited: May 9, 2007, 2:26am Top

18>

oh, duh -- I had stopped using quotes around multi-word terms with the old search because it didn't always work the way I expected. With the proper quotes, this works nicely.

literature -"20th century" -"19th century"

Whereas, the results for

literature -20th century -19th century

as you pointed out, returns the same results as

literature century -20th -19th

However, using the old search the results were different. It was probably a problem with the old search, but I have no way of checking this.

BTW: the first search listed above returns nearly 2000 hits in about 5 seconds -- sweet

22timspalding
May 9, 2007, 2:25am Top

Can you try problems again (after a few seconds waiting). I'm trying to figure out if this is something twitchy, or something with a "reason" behind it. I'm not reproducing it.

23timspalding
May 9, 2007, 2:27am Top

>BTW: the first search listed above returns nearly 2000 hits in about 5 seconds -- sweet

Thanks. If I can get this to work, I think it solves a bunch of problems. Time and (most) charset issues.

24SilentInAWay
May 9, 2007, 2:32am Top

>19 timspalding:

You're probably right about the complaints. On the other hand, with nesting, it suddenly requires special knowledge to search for a string containing parentheses.

Also,

"chocolate cakes"|"vanilla cookies" would work, no?

25timspalding
May 9, 2007, 2:36am Top

>"chocolate cakes"|"vanilla cookies" would work, no?

Yes.

In general, people's desire for cool search features outruns the actual use.

26SilentInAWay
Edited: May 9, 2007, 3:36am Top

> 23 If I can get this to work, I think it solves a bunch of problems. Time and (most) charset issues.

I agree, if you can implement the same basic algorithm on all relevant search fields throughout the program.

> 25 In general, people's desire for cool search features outruns the actual use.

No joke. I don't imagine there is a real demand for nested boolean searches, even though the second that you implement both AND and OR, a dozen members will immediately point it out. Of course, if you take the time to write an all-out boolean search, there will be heated comments that you are wasting time (as if it's theirs) on non-essential options. You can't win -- so follow your conscience.

My two cents worth -- implement cat|dog, don't bother with nesting for now, weather the knee-jerk reactions, and keep an eye on future requests to see if there is a real need for nested boolean searching.

Also, it looks as if you increased the allowed search string length by 15 or 20 characters. Thanks. Now that terms can be prefixed, however, you may want to increase it even more.

Everything I've tried so far is working nicely. Thanks, Tim.

27timspalding
May 9, 2007, 3:08am Top

Yeah, this won't work across the site. It works by taking the search OUT of the database. Nobody has 100,000 books, so you can pass everything from the db to PHP and process it there. Because of MySQL's limitations—fulltext on MyISAM, only use one index, etc.—it works for this.
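
The idea of taking the search out of the database can be sketched roughly as follows: fetch all of one member's rows with a simple indexed query, then do the text matching in application code. The example uses Python and an in-memory SQLite database purely as stand-ins for PHP and MySQL, and the table and column names are hypothetical.

```python
import sqlite3

COLUMNS = ("id", "title", "author", "tags")

def load_books(conn, user_id):
    """Pull every book for one user; the only SQL filter is on user_id."""
    rows = conn.execute(
        "SELECT id, title, author, tags FROM books WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    return [dict(zip(COLUMNS, row)) for row in rows]

def search(books, word):
    """Filter in application code instead of using a fulltext index."""
    word = word.lower()
    return [b["id"] for b in books
            if any(word in (b[f] or "").lower() for f in ("title", "author", "tags"))]

# Hypothetical schema and data, invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, user_id INTEGER, "
             "title TEXT, author TEXT, tags TEXT)")
conn.executemany(
    "INSERT INTO books (user_id, title, author, tags) VALUES (?, ?, ?, ?)",
    [
        (1, "Alexander the Great", "Example Author", "greek history"),
        (1, "Dorure et décoration des reliures", "Example Author", "bookbinding"),
        (2, "Greek Cooking", "Someone Else", "food"),
    ],
)
hits = search(load_books(conn, 1), "greek")
```

Because the matching happens on rows already in memory, the database only ever runs a cheap per-user lookup, which is why this approach depends on no single library being very large.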

Works, by contrast, have to run through four million variant titles. There's a decent fulltext title search on works, but for various reasons it doesn't handle charset issues as it ought.

28SilentInAWay
May 9, 2007, 3:31am Top

27>

Yeah, I had a feeling that was the case. I've used foreign characters (mostly accents) in my tags, so I love the new search. But there have been a fair number of recent complaints about the lack of support for foreign characters in the site search (as I'm sure you know, it's been cited as one deterrent to widespread European adoption of LT).

29prezzey
May 9, 2007, 7:57am Top

Hungarian and Scandinavian special characters seem to work, but the actual search string is stripped of accents so eg. "u" and "ű" give the same results. I was hard-pressed to find an example where this'd be a problem, but finally I managed to: if I want to search both my Norwegian and my Swedish books, "på" (which means 'in') is a very useful string because they are tagged "på_norsk" and "på_svenska", respectively. However, "på" at present searches for every book containing "pa", which is much more.

But I think this is not a pressing concern at the moment, non-Latin searching would be more important overall. I have a couple of books in Hebrew, but I haven't added them yet because the issues seemed quite daunting. So I can't check if they work.

I hope this helps somewhat! I definitely have an accent-rich library so if you need to test specific accents, ask away and I'll try them. (The couple I tried seemed to work as above, even the usually-problematic ű.)

30mujahid7ia
May 9, 2007, 9:12am Top

The new search works with Arabic letters that I input, FWIW. I assume the top field "Search books (titles/authors/ISBNs)" is now the new search? Because Arabic does not work with the one titled "Beta: Search all fields".

31timspalding
May 9, 2007, 9:18am Top

No, the reverse. I didn't change the first one. It *does* appear to work well with utf8 data (much, but not all, of our data now). But it's restricted to title/author/isbn.

I'll play with your titles in the new search—figure out why it's messing up. It may not be fixable.

32timspalding
May 9, 2007, 9:18am Top

Oh, how did you get your data? Manual input, right?

33MMcM
May 9, 2007, 9:39am Top

Apparently non-Latin non-alphabetic scripts (Chinese) match every book and non-Latin alphabetic scripts (Arabic, Hebrew, Cyrillic) match no books.

A possibility for accented Latin would be to make it asymmetrical: if key has accent, it is required in data; but key without accent matches any accent. (You'd probably have to build a non-accent index and then filter the results more carefully; maybe that's too expensive.)
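
The asymmetric rule proposed here can be sketched in a few lines of Python: a search key that contains an accent must match exactly, while an unaccented key matches any accented variant. This is only an illustration of the suggestion, not something the site is known to implement, and the function names are invented.

```python
import unicodedata

def fold(text):
    """Remove combining marks (accents) from the text."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def has_accent(text):
    """True if the text contains at least one combining mark once decomposed."""
    return any(unicodedata.combining(ch)
               for ch in unicodedata.normalize("NFD", text))

def asymmetric_match(key, word):
    key = unicodedata.normalize("NFC", key).lower()
    word = unicodedata.normalize("NFC", word).lower()
    if has_accent(key):
        # Accented key: the accent is required in the data.
        return key == word
    # Plain key: match the data with its accents ignored.
    return key == fold(word)
```

Applied to the "på" example earlier in the thread, `asymmetric_match("på", "pa")` is false while `asymmetric_match("pa", "på")` is true.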

34DromJohn
May 9, 2007, 9:40am Top

Cool.

What I liked best was the field and the multi field searching, so that:
title:long subject:civil
got a hit.

I liked the truncation before and after words.

I'd like a placeholder.
radn*ti
did not work, while
radn*
and
*ti
did work.
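
A wildcard in the middle of a word can be handled by translating the term into a regular expression, with each "*" standing for any run of characters. The Python sketch below shows one way to do that; the function names are invented, and accent folding is left out to keep it short.

```python
import re

def wildcard_to_regex(term):
    """Turn a term such as 'radn*ti' into a compiled regular expression."""
    parts = [re.escape(part) for part in term.lower().split("*")]
    return re.compile(".*".join(parts))

def word_matches(term, word):
    """True if the whole word matches the wildcard term."""
    return wildcard_to_regex(term).fullmatch(word.lower()) is not None
```

With this translation, `radn*ti`, `radn*`, and `*ti` all match the word "radnoti", since leading, trailing, and embedded wildcards go through the same code path.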

35DaynaRT
May 9, 2007, 9:55am Top

Neat. I can find Nový Zákon in my catalog by searching on either Nový Zákon or Novy Zakon.

36mujahid7ia
May 9, 2007, 10:15am Top

#32 yeah I inputted manually.

37DromJohn
May 9, 2007, 10:23am Top

Actually,
radnoti
works, finding Radnóti

But
doinas
doesn't get Doinaş

So, the search improvements work and are well appreciated. Now the grunt work of expanding to less common diacritics.

38timspalding
May 9, 2007, 12:10pm Top

>Now the grunt work of expanding to less common diacritics.

No, my "de-accenting" script turns ş into s? as in "s and something crazy!" So I can deal with it. Doinaş now works.

I've added the fields "review" and "comment." By default it now searches the "most" field. "Most" omits subjects, reviews and comments. You can use "all" to get truly all. The goal is to have one search box, which can be parameterized.
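
One way to picture "most" versus "all" is as named groups of fields that are expanded before searching. The Python sketch below assumes the field names given in this thread; the function names and the sample record are invented for illustration.

```python
# Field names as listed in the thread, plus the two new ones.
ALL_FIELDS = ["title", "author", "tag", "isbn", "date", "lccn",
              "dewey", "source", "subject", "review", "comment"]

# "most" leaves these out, per the post above.
MOST_EXCLUDES = {"subject", "review", "comment"}

def fields_for(name="most"):
    """Expand a field name or group name into a list of concrete fields."""
    if name == "all":
        return list(ALL_FIELDS)
    if name == "most":
        return [f for f in ALL_FIELDS if f not in MOST_EXCLUDES]
    if name in ALL_FIELDS:
        return [name]
    raise ValueError(f"unknown field: {name}")

def searchable_text(record, field="most"):
    """Join the text of the selected fields into one lowercase string."""
    return " ".join(record.get(f, "") for f in fields_for(field)).lower()

# Hypothetical record with a note stored only in the comment field.
record = {"title": "An Example Journal", "comment": "Has an ISSN"}
```

Here `"issn" in searchable_text(record)` is false under the default, while `"issn" in searchable_text(record, "all")` is true, which matches the behavior of typing `all:` in front of a term.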

39SilentInAWay
May 9, 2007, 1:11pm Top

> 38 I've added the fields "review" and "comment."

Wow. With the new implementation, changes to comments are immediately picked up by searches. No more waiting for an index to update!

I like the new default ("most").

I was pleasantly surprised to see that the new search treats the "double" characters ß and Æ/æ as interchangeable with "ss" and "ae". I don't know if your intent is to eventually support all character combinations or just those that are most commonly used. If the former, then you can add the combinations Œ/œ and Ǽ/ǽ to your to-do list. I haven't seen these used in LT yet, but they will inevitably be needed by someone the moment that they are dismissed as non-essential (I think that's a law of programming, or something).
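
For what it's worth, ligatures like these usually need an explicit table, because Unicode decomposition alone does not split ß, æ, or œ into two letters. The Python sketch below shows the idea; the table is limited to the characters mentioned in this post and the function name is invented.

```python
import unicodedata

# Two-letter expansions for characters that normalization does not split.
EXPANSIONS = {"ß": "ss", "æ": "ae", "œ": "oe"}

def fold(text):
    """Lowercase, strip accents, then expand the listed ligatures."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(EXPANSIONS.get(ch, ch) for ch in stripped)
```

Stripping accents first means ǽ reduces to æ and then to "ae", so `fold("Straße") == "strasse"` and `fold("Œuvres") == "oeuvres"` both hold without listing the accented forms separately.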

Also, I notice that the new search is now used by the "Search your library" operation on the Your library page. Was this announced?

40DromJohn
Edited: May 9, 2007, 1:47pm Top

Thanks.

The search of
doinas
now finds
Doinaş
or at least my screen's version
Doina�s, �Stefan Augustin

FYI, my cataloging was an import from LoC, using the search
75-328778

41jjwilson61
May 9, 2007, 1:56pm Top

Thanks Tim. I'm glad to see that you're working on improving some of the more basic features of LT.

42kantelier
May 9, 2007, 2:31pm Top

> 29
But in my catalog klöppel finds something and kloppel not.
Somehow your ű differs from my ü, what is the matter here?

43peterbrown
May 9, 2007, 2:57pm Top

The primary value of LibraryThing to me is being able to do an "All Fields" search and recall the mass of data I've inputted, primarily relating to Contents pages - so that I can find additional short stories/essays etc and subjects. "All fields" is not doing it the old way. I've tested it on several authors and subjects and the response is now limited to "Book" & "Tag" fields - seemingly no data from the "comments" can now be recalled - this is a disaster for me.

Although in theory I could input the subject data into the Tags field it would be cumbersome and I don't want to do the work again. The author field is very restricted and takes only up to half a dozen authors or so and cannot cope with anthologies, essays/short stories with multiple authors.

Please bring back the ability for me to search 'Comments' field because without this facility Librarything.com becomes extremely limiting for me!

44boekerij
Edited: May 9, 2007, 3:08pm Top

>11 timspalding:

The problem seems to exist (and maintain) with the "international" versions of LT only.

Take e.g. : Beta Search all fields for "rome" (no brackets)

When using www.librarything.com, there is no problem :

http://www.librarything.com/catalog.php?view=boekerij&deepsearch=rome

is returning 8 hits (and right so).

On the other hand, when trying the same within e.g. LT.nl, I am getting :

SELECT bs_booksid FROM booksearch WHERE bf_id = 434547-wait error 1

Search URL was :

http://www.librarything.nl/catalogsearch.php?search=rome&searchall=1&Sea...

The same kind of error message was returned when trying to search within LT.de :

SELECT bs_booksid FROM booksearch WHERE bf_id = 434555-wait error 1

Search URL was :

http://www.librarything.de/catalogsearch.php?search=rome&searchall=1&Sea...

Or with LT.fr :

SELECT bs_booksid FROM booksearch WHERE bf_id = 434969-wait error 1

Search URL was :

http://www.librarything.fr/catalogsearch.php?search=rome&searchall=1&Sea...

I didn't try out all different versions, but those examples might give a clue.

-----

Something different :
While doing some search tests, I discovered that Tag search (NOT Search all fields) is accepting search strings in i.a. Cyrillic and Greek script, too. Thus, Tag searches (within my own library) for e.g. : "Солжени́цын", "Пушкин" or "Ολλανδικά" all returned correct results. I think this is quite nice. It makes for search possibilities for original titles/author names in non-latin scripts when having added those as tags (within tag length limits). Nice.

45timspalding
May 9, 2007, 3:39pm Top

(from blog comments)

Peter: So, by default it searches everything except subjects, reviews and comments. This is partially about speed—most of the time most users don't need that other data, and getting it slows it down. And it's partially about use. I can certainly see some users being annoyed that books were showing up because their review contained a reference to a book; sometimes you want to search the bibliographic data, not your reviews as well.

So, by default, it's "most." If you use "all" it searches all, including the comments. Four extra letters! :)

I think that's a good compromise.

---

Boekerij. Okay, that's something to go on. Odd.

46timspalding
May 9, 2007, 3:47pm Top

boekerij: I'm flummoxed. It works for me when I go in as you in NL and DE. And, looking at the logic, well, it's odd.

Can you try the following:

1. Go to http://www.librarything.nl/test_cookies.php
2. Click the delete cookies link
3. Sign in again and try the search

If anyone is having the same problem, let me know?

T

47timspalding
May 9, 2007, 3:50pm Top

Wait. I replicated it. Okay. Until I say so, your account is going to be spewing out debug info. Sorry, but I have to do that.

48timspalding
May 9, 2007, 4:04pm Top

Okay. Should be fixed. Thanks for the feedback.

49timspalding
May 9, 2007, 4:42pm Top

I hacked together some support for non-Latin searching. It works on mujahid7ia's library, at least for some Arabic titles (e.g., قصص النبيين). This is hard for me to test, so your help is appreciated. Specific examples, with pasted text, would be helpful.

50boekerij
Edited: May 9, 2007, 4:47pm Top

>48 timspalding:

Confirmed. It's fixed. No more error messages and it works as a snap now. Thanks for fixing this.

--
FYI :
1. - It wasn't the cookies, was it ?
2. - Dead link - known bug
Your test_cookies link (46.2) is dead. It tries to link at : >2 lilithcat:" rel="nofollow" target="_top">http://www.librarything.nl/test_cookies.php<br>2 lilithcat: (sic).
This is because of the (AFAIK) known bug that links in Talk absorb everything that follows until the next space or dot character (and maybe some others), thus in this case absorbing the line break ("<br>") and the "2".
Possible (temporary) workaround (until the bug is fixed) :
After a link, remember to always add a space character. This way, the line break ("<br>") will be separated from the link.

51AnnaClaire
Edited: May 9, 2007, 4:49pm Top

I tried two versions of the same author's name -- mostly to test accents, since someone brought them up earlier. The search was a bit slow when I entered Régine Pernoud and still didn't find anything. But it was fine (and speedy) with the less "correct" Regine Pernoud. By the way, I got three results, the first two of which -- Regine Pernoud; Translator Peter Wiles and Marie-Veronique Clin Regine Pernoud -- were kinda weird. The second one, I'm fairly certain, is Marie-Véronique Clin.

Edit: Joan of Arc: Her Story lists Pernoud's name with the accent in place.

52timspalding
Edited: May 9, 2007, 6:43pm Top

Hi. The whole site was off for a while. John is writing up a mea culpa. From my perspective, it looked like a problem with search—it wasn't "recording" the books that were found. This may be behind a number of problems, and I made some major changes, so I'm going to end this thread and start a new one.

Here's the continued thread.




Please don't post more on this thread.

53sunny
Edited: May 10, 2007, 1:50am Top

Here's the continued thread.

54peterbrown
May 10, 2007, 1:18pm Top

Tim,

Take me through this please - where do I type "all" in order to get it to search all fields including Books, Tags & Comments?

Whenever I type "Poster" - I only get 3 books by him - however I also have him as a contributor to at least one other volume. However I do the search, I cannot get all four hits to come up.

I am using the search tab in the Betafield. Do I write: "Poster" or "Poster" + "All" - nothing seems to work.

Help - frustrated!

Peter

55sunny
May 10, 2007, 2:26pm Top

try all:poster

and: the thread has moved to this thread - if you have more questions, it's better to post them there.





56toddje
May 12, 2007, 8:01pm Top

Hi, I posted this as a comment on the original blog post:

I'd like to be able to search on the book using the CueCat to swipe the ISBN, i.e. I've got the book in hand, and want to edit something.

57ellen.w
May 13, 2007, 11:15am Top

Someone has already mentioned this problem in a general sense, but I'll add my specific case: I searched my library for a Japanese word, and the result was all of my books.

58timspalding
May 13, 2007, 6:17pm Top

Give me the Japanese word? I'll cut and paste it.

T

59timspalding
May 13, 2007, 7:43pm Top

Okay, I think I fixed it. Anyway, I made a big improvement.

I made the change and did some searches on your library, including:

薄紅天女
荻原
規子

I don't know Japanese, but it seemed to work.

Tim

60AnnaClaire
May 13, 2007, 9:25pm Top

I don't know Japanese either, but what the bleep are ???, ??, and ??, anyway?

61sunny
May 14, 2007, 1:39am Top

Those are Japanese characters, which your computer doesn't display because it doesn't have a font for them.

62AnnaClaire
May 14, 2007, 11:50am Top

...And which show up on my office computer. Go figure.

63kantelier
May 19, 2007, 3:49pm Top

Your Windows installation CD has the fonts, but they usually aren't installed by default. If you want them, here is the start of the procedure on XP:

From the control panel select "regional settings and languages"
select the tab "languages"
check "script for complex ... left to right ..." and/or the other check box.

Sorry I don't remember the exact English phrasing, nor the rest of the procedure, I'm too lazy to look for the CD.
