Sorting changed, but the same

New features


1timspalding
Edited: Nov 25, 2007, 2:33am Top

I've changed how book titles are sorted. The proximate reason was to allow members to correct for special cases, when simple rules about not alphabetizing "the" don't work well (see this post and the example A is for Ox). Anyway, once I know the system is working, we can start allowing that. I think the best method will be to allow members to insert some sign within the title, showing where the sorting should start, if not the default. I'm thinking it should be two vertical lines, e.g., "||A is for Ox." Your suggestions for user-interface would be appreciated.

The system is now in place. Nothing should have changed. If it has, I'll notice it after a day or two of running, or when someone tells me about bad sorting behavior.

Anyway, although this will make a few users very happy, and underscore LT's commitment to quality cataloging, the real reason is different. By not storing two versions of the title (one for real and one to sort by) but just storing the title and a character number, we'll save a few gigabytes on the database. Considering our relentless growth, jumping back a month or two on file size is going to be a welcome relief for the monkeys.
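To illustrate the "title plus a character number" idea, here is a minimal Python sketch. It is not LibraryThing's actual code: the `Book` class, the `sort_offset` field, and the English-only article list are all assumptions made for the example. The point is only that a single integer per book is enough to derive the sort key on demand.

```python
from dataclasses import dataclass

# Assumed English-only default rule; the real rule set is not given in the thread.
LEADING_ARTICLES = ("the ", "a ", "an ")


def default_offset(title: str) -> int:
    """Return the index where sorting starts under the simple article rule."""
    lowered = title.casefold()
    for article in LEADING_ARTICLES:
        if lowered.startswith(article):
            return len(article)
    return 0


@dataclass
class Book:
    title: str
    sort_offset: int = 0  # hypothetical field: character index where sorting begins

    def sort_key(self) -> str:
        # Derive the key from the one stored title instead of storing a second copy.
        return self.title[self.sort_offset:].casefold()


books = [
    Book("The Magic Mountain", default_offset("The Magic Mountain")),
    Book("A is for Ox", 0),  # special case: this "A" is a letter, not an article
    Book("Pnin", 0),
]
sorted_titles = [b.title for b in sorted(books, key=Book.sort_key)]
print(sorted_titles)
```

Note that `default_offset("A is for Ox")` would return 2, which is exactly the kind of special case a per-book override is meant to correct.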

2thorold
Nov 25, 2007, 4:41am Top

3timspalding
Nov 25, 2007, 8:51am Top

Ah, good. I got it. It had to do with an extra space at the end, which shouldn't have been counted.

I'm fixing any that suffered from this problem. Should be good in a half-hour.

4rebeccanyc
Edited: Nov 25, 2007, 9:50am Top

I am getting books starting with the word "I" (e.g., I Married a Communist by Philip Roth) sorting before titles beginning with the number "1". And for some reason, the guidebook Montana (Compass American Guides) is sorting between two titles starting with the word "I", and The Magic Mountain is sorting between The Age of Napoleon and, in this order, The Baked Apple, Tales of Mendele the Book Peddler, and The Alhambra. These are just some of the errors on just the first two pages of my catalog.

Reverse order is just as bizarre, with Mythology and The Mystery of Numbers appearing between Zen and the Art of Motorcycle Maintenance and Youth and the Bright Medusa.

Edited to add: Interestingly, this is not reproducible, although there are still numerous errors. When I resorted on title, I got a guidebook on Rome, Rome and Environs (Blue Guide), 3d ed. and The Rubber Band sorting between two titles beginning with the word "I".

5AngelaB86
Nov 25, 2007, 11:42am Top

I think this new system may be responsible for why my ...And Now Miguel is listed first in my catalog, when it was originally listed in the A's. Aaaack!

6AnnaClaire
Nov 25, 2007, 12:52pm Top

Wait, how do I fix my alphabetization? I tried changing the title of Les Misérables to "Les ||Misérables" per my understanding of the first message, but that just added a double pipe to the title and didn't move it.

7khms
Edited: Nov 25, 2007, 1:27pm Top

I think Tim meant that this part isn't implemented yet. (That's why "Nothing should have changed" and "I'm thinking it should be ... Your suggestions for user-interface would be appreciated".)

ETA He already has two suggestions from me in the other thread he mentioned. And one from himself that he's apparently abandoned after I didn't like it (to put it mildly).

8timspalding
Nov 26, 2007, 3:24am Top

Hey. I'm working on this. Most problems should be solved, but clearly there are some odd wrinkles. For example, rebeccanyc's titles had extra spaces between words, which I didn't consider but am now.

9rebeccanyc
Nov 26, 2007, 8:00am Top

Tim, it looks fine now. Not sure what you meant by extra spaces between words, but I will certainly try to look for these in the future. (By the way, titles added through the LOC usually have a space between the title and the colon that separates the title from the subtitle; e.g., "This is the title : and this is the subtitle." When I remember, I take this out during editing.)

10timspalding
Nov 27, 2007, 12:27am Top

>9 rebeccanyc:

No, don't worry about it. Fundamentally, the forms allow you to put two spaces between a word. HTML always removes it visually, but I wasn't accounting for the possibility.
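The two whitespace problems mentioned so far (a trailing space, and doubled spaces between words) can be handled by normalizing the title before any offset is counted. A minimal Python sketch, with `normalize_spaces` as a made-up helper name:

```python
def normalize_spaces(title: str) -> str:
    """Trim the ends and collapse any run of whitespace to a single space."""
    # str.split() with no arguments splits on runs of whitespace and
    # discards leading/trailing whitespace, so re-joining normalizes both.
    return " ".join(title.split())


# Both stored forms look identical in HTML, and now compare identically too.
assert normalize_spaces("The  Magic Mountain ") == normalize_spaces("The Magic Mountain")
```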

The main problem now is that, fundamentally, the titles in each of these pairs should sort the same. Right now they don't always:

10,000 Maniacs
10000 Maniacs

Pnin : A Novel
Pnin. A Novel

T
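One way to make such pairs compare equal is to build the comparison key from a punctuation-insensitive form of the title. The Python sketch below is only an illustration under assumptions: the function name `comparison_key` is invented, and treating a comma between digits as a thousands separator is an English-language convention that would be wrong for locales where the comma is a decimal mark.

```python
import re


def comparison_key(title: str) -> str:
    """Build a key that ignores thousands separators and other punctuation."""
    # Drop a comma only when it sits between two digits (assumed thousands separator).
    key = re.sub(r"(?<=\d),(?=\d)", "", title)
    # Turn every remaining punctuation character into a space.
    key = re.sub(r"[^\w\s]", " ", key)
    # Collapse the resulting whitespace runs and ignore case.
    return " ".join(key.split()).casefold()


assert comparison_key("10,000 Maniacs") == comparison_key("10000 Maniacs")
assert comparison_key("Pnin : A Novel") == comparison_key("Pnin. A Novel")
```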

11henkl
Nov 27, 2007, 3:36am Top

>9 rebeccanyc:

When there is no space between the title and the colon, I add one.

12GreyHead
Nov 27, 2007, 4:41am Top

Just wondering if this has also fixed the feature where LT treated 'Pnin' and 'pnin' as different works?

13khms
Nov 27, 2007, 2:14pm Top

Hmm. Is there a reason Pnin wasn't touchstoned?

14timspalding
Nov 27, 2007, 2:54pm Top

Actually, yes. I prefer only to use touchstones when it enriches the experience. In this case, you're not really motivated to check out a title because I use it as an example of alphabetization. Worse, it mucks up the "conversations" part of the work page. When you are on that page you want to know what people are saying about some interesting work, not what discussions about alphabetization touch on the title of the work.

15khms
Edited: Nov 27, 2007, 3:05pm Top

Well, in this case, *I* wanted to know what this book was everyone mentioned, because I had never heard of it before. That is, getting from the discussion to the work. Turned out to be something completely unexpected, too.

I looked at "discussions about your books", but as the result seems to be about half of all talk threads (probably less, but it feels like that), I judged that one to be pretty useless for me.

I followed the "discussions about this work" link a few times, but found that usually those discussions I looked at didn't say anything interesting about the work, so this also failed to register under "interesting features" for me.

Which leaves the one in the first paragraph, which *is* something that (usually) looks useful to me.

ETA: I've just started to register those books, and I'm freaking 426 of over 300,000 members? That doesn't feel right!

16AngelaB86
Nov 27, 2007, 3:14pm Top

Could someone please explain to me (in idiot-proof "a 5 year old could do this" words) how to make ...And Now Miguel get off the top of my catalog (before the # titles) and back in the As where it belongs? I don't know how to do that tall lines thing...

17readafew
Nov 27, 2007, 3:22pm Top

On most US keyboards it's on the key above the Enter key: Shift + '\'.

18reading_fox
Nov 28, 2007, 8:59am Top

So where should 20,000 leagues under the sea be filed?

Currently it sorts as 20
rather than 20000
or the probable ideal option of twe - this latter might be a bit difficult to code though?

19lorax
Nov 28, 2007, 2:02pm Top

18

"currently it sorts as 20"

So strip the comma from the title of your copy.

"this latter might be a bit difficult to code"

Not sure whether it's really ideal or not, but it looks as if it's been done, sort of (the Perl module Math::BigInt::Named). It's not perfect (you need to fiddle with your code a bit to make it work, it only supports English and German, and "twelve" is spelled wrong) but at least someone trying to do this wouldn't need to reinvent the wheel.
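For a sense of how much code the "twe" option involves, here is a small self-contained Python sketch. It does not use or imitate the Perl module mentioned above; the names `number_to_words` and `spelled_sort_key` are invented, it is English-only, and it only handles leading numbers below one million.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out 0 <= n < 1,000,000 in English."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999,999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + (" " + number_to_words(rest) if rest else "")


def spelled_sort_key(title: str) -> str:
    """Replace a leading number (commas allowed) with its English spelling."""
    match = re.match(r"\d[\d,]*", title)
    if not match:
        return title.casefold()
    value = int(match.group().replace(",", ""))
    if value >= 1_000_000:
        return title.casefold()  # out of range for this sketch: leave as-is
    return (number_to_words(value) + title[match.end():]).casefold()


print(spelled_sort_key("20,000 leagues under the sea"))
```

As the thread notes later, spelling numbers out is language-specific, so a real implementation would need to know the language of each title.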

Unless of course they're coding in something other than God's Chosen Language. :)

20timspalding
Nov 28, 2007, 2:24pm Top

Could someone please explain to me (in idiot-proof "a 5 year old could do this" words) how to make ...And Now Miguel get off the top of my catalog (before the # titles) and back in the As where it belongs? I don't know how to do that tall lines thing...

Don't do anything for today, okay? I'm going to play. It should work for that title, but it's not. I need to figure out why.

Unless of course they're coding in something other than God's Chosen Language. :)

If Perl is God's chosen language, Nietzsche was right.

Sorry, you walked into that one ;)

21Lumilyhty
Dec 1, 2007, 6:37am Top

Tim, I'm glad you always have an ear ready to listen to LT users' suggestions.

The solutions put forward in this thread and others seem awfully cumbersome to me. To me, a perfectly simple, useful and easy-to-use way is what the music software iTunes has: just an extra field for sorting the titles. So, taking an example from the field of music, if I have an aria called 'Martern aller Arten' from the opera 'Die Entführung aus dem Serail' by Mozart, I can make it sort itself as 'Die Entführung...', 'Entführung...' (scrapping the German definite article) or 'Martern...' (just going with the more specific part in the work), depending on what I've grown to regard as natural. Users also use this to eliminate the confusing factor of different scripts from their music library. Thus, I have all music by Руслана (Ruslana) conveniently listed between 'Ro' and 'S'.

I refuse to believe this is hard to do programming-wise. Thinking of LT, aside from the already mentioned problems with non-English definite articles and titles starting with non-letter characters, this would also sort out the following problems:

*The above-mentioned different scripts
*Problems with accented latin characters (right now LT does not distinguish between, for instance, e and é, which messes up the order)
*Titles beginning with numerals (I suspect somebody's bound to want them listed as 'Twenty-thousand leagues', not 20,000 leagues - and need I add this is language-specific)

So tell me what you think of these ideas, people...

22timspalding
Dec 1, 2007, 11:55am Top

>21 Lumilyhty:

The problem with your idea, as I see it, is that it puts the information in two places. 99% are totally uninterested in this issue. I don't think they want to fuss with it, or even be confronted with it. And if they change the title for some non-sorting related issue, they're going to be very weirded out if the old position still applies. I suppose we could have a second field and have it be empty by default, and when it was empty it would sort by the actual title.
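The "second field, empty by default" variant can be sketched in a few lines of Python. The function name `effective_sort_title` and the tuple layout are assumptions for illustration; the example titles come from message 21.

```python
from typing import Optional


def effective_sort_title(title: str, sort_title: Optional[str] = None) -> str:
    """Use the optional sort field when it is filled in, else the real title."""
    # An empty string and None both count as "not set", so editing only the
    # title of a book with no override always moves it to its new position.
    return (sort_title or title).casefold()


entries = [
    ("Руслана", "Ruslana"),  # override lets a Cyrillic name file under R
    ("Pnin", None),          # no override: sorts by its own title
    ("Die Entführung aus dem Serail", "Entführung aus dem Serail"),
]
ordered = [title for title, _ in sorted(entries, key=lambda e: effective_sort_title(*e))]
print(ordered)
```

The storage trade-off is visible here: each override is a second string, whereas the offset approach stores only one integer per book.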

That might work and I'll consider it. From a db point of view, I prefer storing the title and a sorting number. Storing the title twice doubles the storage. LT is already storing more bibliographic data than all but the two or three largest libraries in the world, and with far less technical infrastructure. So where I can save space, I will. The amounts are non-trivial. If I had to raise the membership fee for everyone by $1 to cover this, would it be worth it?

The non-English sorting is very, very difficult. LT does treat e and é as the same letter from a sorting standpoint. That's how English does it. But letter order and alphabetization differ in all sorts of interesting ways between languages. In theory it might be possible for users to set one standard for their library, but I don't think you could have all the standards at the same time. That is, the system needs to decide if ç is a separate letter from c (Turkish) or not (French). What's possible is not, however, going to happen. Doing different language sort is not easy, and would move lots of the "work" from the database into PHP, which is not desirable here.
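Treating e and é as the same letter, the English convention described above, can be approximated by stripping combining accents before comparing. This Python sketch uses only the standard `unicodedata` module; `fold_accents` is an invented helper name, and the approach deliberately makes the one-size-fits-all choice (ç sorts as c), so it would be wrong for Turkish.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Remove combining marks so that 'é' compares like 'e'."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


titles = ["Zen", "Émile", "Emma"]
ordered = sorted(titles, key=lambda t: fold_accents(t).casefold())
print(ordered)
```

Letters that have no decomposition, such as ß, pass through unchanged, so this is a rough approximation rather than a full collation scheme.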

23vpfluke
Dec 1, 2007, 2:41pm Top

Tim:
Are the three largest libraries in the U.S. the Library of Congress, Harvard University, and New York Public? How does the national British Library rank?

On ordering letters, I think treating e and é as the same is quite OK with me. I don't remember much beyond French.

Sometimes, it is hard to do joint cataloging (which we all do in LT) when big libraries have a master cataloger and that person is a final arbiter of the way the information is presented.

24timspalding
Dec 1, 2007, 2:59pm Top

Yes, that's right. Actually, I've never found a good list of world libraries by size. Lists of that sort are always half-bogus. You can count them in a dozen different ways.

25DouglasAtEik
Dec 10, 2007, 4:47pm Top

>1 timspalding:
Has this change been implemented (i.e. book title "Der ||Something" to sort by "Something" rather than "Der") ?
I understand your post to mean that it is implemented, whereas a quick test appears to demonstrate the contrary ...
?

26timspalding
Dec 10, 2007, 10:49pm Top

No, just the intellectual structure for it. Let me look at it right now.

27AnnaClaire
Edited: Dec 11, 2007, 10:10am Top

Hang on, DouglasAtEiK did the same thing I did in message 6. I'm starting to think there was something confusing in how it was presented, the most likely culprit being:
The system is now in place. (#1)


The sentence after probably didn't help, either (if the system is in place, why would nothing have changed?).

Please, remember that we speak English, not whatever jargonese-sub-variant-dialect you were using in that first message!

28timspalding
Dec 12, 2007, 1:07am Top

I know. Apologies.

29timspalding
Dec 12, 2007, 1:44am Top

Okay, it's in there. You can do:

||A is for Ox

Or:

Ta ||Indika

The trick is that the || is going to show in various places. I think it SHOULD show in the detail screen, at least if it's your book. But not everywhere. I've nuked it from the "your library" view.
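A rough Python sketch of how a "||" marker could be turned into a display title plus a sort offset. This is an assumption about one possible approach, not a description of the site's code; `split_marker` is an invented name, and returning `None` to mean "no marker, use the default rule" is a design choice made for the example.

```python
from typing import Optional, Tuple

MARKER = "||"


def split_marker(raw_title: str) -> Tuple[str, Optional[int]]:
    """Return (display_title, sort_offset); offset is None when no marker is present."""
    before, found, after = raw_title.partition(MARKER)
    if not found:
        return raw_title, None
    # The display title drops the marker; the offset is where it used to be.
    return before + after, len(before)


for raw in ("||A is for Ox", "Ta ||Indika", "Pnin"):
    print(raw, "->", split_marker(raw))
```

Keeping the marker out of the display title is what allows views such as "your library" to show a clean title while the edit form can still reconstruct the marker from the stored offset.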

30koffieyahoo
Dec 12, 2007, 3:25am Top

Before I start adding these to the books in my library: Since this effectively changes the title of a book and since changing a title may uncombine a book from a work. What happens in this case?

31koffieyahoo
Edited: Dec 12, 2007, 3:46am Top

Right, definitely not working correctly yet:

* My copy of De bonkige baarden seems to have become uncombined (at least my library gives 0 shared, while before the edit I used to share it with snellius). Actually this is really weird: it doesn't seem to have become uncombined, but the member count is now off by 1.

* Adding the double lines to my copy of Kafka's Das Schloß didn't seem to work at first. Now it works, but no double lines show up when I edit the work (either in place in my library or on the details page).

32AnnaClaire
Dec 12, 2007, 10:56am Top

OK, so I've double-piped Le Morte d'Arthur to Le ||Morte d'Arthur, Les Misérables (Signet Classics) to Les ||Misérables (Signet Classics), Les Précieuses Ridicules (Petits Classiques Larousse) to Les ||Précieuses Ridicules (Petits Classiques Larousse).

They're not showing up in the L's anymore, but they're not showing up in the M's and P's, either. Les Misérables is showing up at the end of the list. I can't find the other two.

I'll wait and look again. In the meantime, ?????

33thorold
Edited: Dec 12, 2007, 6:01pm Top

I've managed to get ||A bord de l'Etoile-Matutine and De ||Aanslag up to the top of the list, but Die ||Harzer Schmalspurbahnen. is appearing in the B's ahead of Die ||Blechtrommel. Is the re-sort still running?

ETA: ...and Le ||Chant de l'équipage is now up there in front of all the a's and the numbers.

34AnnaClaire
Edited: Dec 12, 2007, 6:41pm Top

Well, the three books I double-piped this morning are all in the right places now. So, look again a bit later -- it might fix itself.

35koffieyahoo
Edited: Dec 13, 2007, 3:00am Top

31>

Mmm, the problem with "De bonkige baarden" may already have been there now I think about it and the sorting of Das Schloss seems to have recovered. However, changing the title of a work sometimes seems to mess up the member count...

What is definitely a problem is that when I do in-library editing of the title of a book, i.e. double click on a title in my library to change it, the double vertical lines don't show up.

Edit: The double verticals are showing up in the "random books from my library"-section on my profile page. I don't think I want to see them there.

36thorold
Dec 13, 2007, 2:28am Top

Looks fine for me now, except that Die ||Blödsinnigen / The Idiots appears before Die ||Blechtrommel. Since the rules for sorting accented characters are inconsistent from one language to another, there's probably no solution to that type of problem that won't upset another group of users somewhere else...

37khms
Dec 13, 2007, 2:29pm Top

There's no general solution that works for everybody (only picking language-sensitive versions would possibly do that), but there's a fairly good compromise solution described ... just let me get the other window ... here: http://www.unicode.org/reports/tr10/ (or you can go the extra mile and implement the tailoring explained there and let every user specify a locale for sorting - but I'd call that overkill at the present time, unless you can find a library that already does all that).
