Cataloging improvement III: Better "Sort character" support

New features

1timspalding
Edited: Nov 20, 2014, 10:34pm Top

See the blog post.
http://blog.librarything.com/main/2014/11/cataloging-improvement-iii-better-sort...

In short, the long-hidden "sort character" field is now fully accessible and editable, for your sorting pleasure.

2LucindaLibri
Nov 20, 2014, 11:34pm Top

Maybe it's just late and my brain is tired, but it took me a while to figure out that "sort character" literally refers to the # of the character on which I wish the entry to sort. Initially I thought the 1, 2, 3, 4, 5 in the table on the blog post represented different types/categories of sorting that I should know . . .
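To make the idea concrete: the sort character is a 1-based position in the title, and sorting starts from that character. Below is a minimal Python sketch, assuming the field behaves as described in this thread. The helper `sort_key` and the sample titles are hypothetical illustrations, not LibraryThing's code.

```python
def sort_key(title: str, sort_char: int = 1) -> str:
    """Return the part of the title that is used for sorting.

    sort_char is 1-based: 1 means "sort from the start",
    5 skips a leading "The " (three letters plus a space).
    """
    if sort_char < 1:
        sort_char = 1  # assumption: treat blank or zero as "sort from start"
    return title[sort_char - 1:].casefold()


titles = [("The Great Gatsby", 5), ("A Tale of Two Cities", 3), ("Hamlet", 1)]
ordered = sorted(titles, key=lambda pair: sort_key(*pair))
print([title for title, _ in ordered])
```

With these values the titles file under G, H and T rather than under their leading articles.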

So, thanks for bringing this into the light, but more explanation might be required, somewhere . . .

Also wondering why this is included in export (i.e., are there apps that could use it from the export file?)

3timspalding
Edited: Nov 20, 2014, 11:38pm Top

Yeah, I should be clearer.

Well, there aren't really ANY apps that can use LT data. But if they could, it would be a useful piece of data. Library (MARC) records include it, so we might as well.

4yoyogod
Nov 20, 2014, 11:46pm Top

Is it a bug that two books I added earlier today had blank sort characters causing them to be listed first in my library?

5timspalding
Nov 21, 2014, 12:00am Top

No, I don't think so. Your catalog sorts correctly—sort by title—and they both have sort characters of 5.

6MarthaJeanne
Nov 21, 2014, 3:04am Top

I don't quite understand the sorting for umlaut characters. They seem to have a default of 2 and sort to the end of the alphabet. If I set it at 1 (or 5 after an article) they end up between A and B. Which is fine for Ä, but not a reasonable place for Ö.

7Louve_de_mer
Edited: Nov 21, 2014, 5:15am Top

Sorry, thread mistake.

8lemontwist
Nov 21, 2014, 5:50am Top

Finally! I can have my Italian books sort correctly without their articles lumping them all together! Thanks so much for this!

9Louve_de_mer
Nov 21, 2014, 6:19am Top

The first three books are not in the right place: http://www.librarything.fr/catalog/Louve_de_mer
But great improvement! Thanks!

10bnielsen
Nov 21, 2014, 6:41am Top

>9 Louve_de_mer: Please include a bit more information. I'm not sure I see the same as you, when I follow that link. (Viewing style and sort order.) So could you list the first say 5 books and tell us why they "are not in the right place"?

11Louve_de_mer
Edited: Nov 21, 2014, 6:46am Top

Now only the first one is not in the right place, after some "refresh" and changes.
This book: http://www.librarything.fr/work/4582255/book/24736945 is first. Link for my books: http://www.librarything.fr/catalog/Louve_de_mer
Viewing style B and sort by title.
(Maybe I made a mistake in counting characters?)

12MarthaJeanne
Edited: Nov 21, 2014, 6:49am Top

>9 Louve_de_mer: I see N° 6 Revue Planète with sort character 12 as the first book. By my count it ought to be with the Ps. Have you tried what happens with 11 or 13?

After that I see books with numbers, which sort digit by digit, so not the way one might expect, but not wrong.
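"Digit by digit" is what a plain string comparison does. The sketch below contrasts it with a "natural" ordering that compares whole numbers. The titles are made up for illustration, and nothing here is claimed to be how LibraryThing sorts.

```python
import re

titles = ["10 Short Stories", "2 States", "100 Poems", "1 Dead in Attic"]

# Plain string comparison looks at one character at a time, so "10" < "2".
print(sorted(titles))


def natural_key(title: str):
    """Split into digit and non-digit runs so numbers compare by value."""
    return [int(part) if part.isdigit() else part.casefold()
            for part in re.split(r"(\d+)", title)]


print(sorted(titles, key=natural_key))
```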

13Louve_de_mer
Nov 21, 2014, 6:53am Top

>12 MarthaJeanne: The books with numbers are OK for me. I'm going to try the revue Planète with 11 and 13.

14Louve_de_mer
Edited: Nov 21, 2014, 7:02am Top

>12 MarthaJeanne: The book is no longer in first place, but... I didn't change anything about the sort character.

15MarthaJeanne
Edited: Nov 21, 2014, 7:41am Top

The sort character is back at 1.

16jjwilson61
Nov 21, 2014, 9:01am Top

>12 MarthaJeanne: I wonder if that degree character takes more than one byte?

17timspalding
Nov 21, 2014, 9:04am Top

>16 jjwilson61:

Yes, that's my guess.
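The guess is easy to check for the title in question. This Python sketch assumes the title is stored as UTF-8 and that the offset is applied to bytes rather than characters; both are assumptions drawn from this thread, not confirmed behaviour.

```python
title = "N° 6 Revue Planète"

char_pos = title.index("P") + 1                   # 1-based character position
byte_pos = title.encode("utf-8").index(b"P") + 1  # 1-based UTF-8 byte position

print(char_pos, byte_pos)  # 12 13
```

Counting characters gives 12, but the degree sign occupies two bytes in UTF-8, so a byte-based offset needs 13 to reach the "P".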

18Louve_de_mer
Edited: Nov 21, 2014, 11:28am Top

>17 timspalding: Now the book's sort character is 1.
The "2" of the "12" I chose disappeared. Maybe you don't allow two digits for the sort character? (I didn't remove the "2" of the "12" myself.)

19eromsted
Edited: Nov 21, 2014, 9:18pm Top

Some problematic examples:
The "whatever... works. The auto sort character is 6.
"The whatever... does not. The auto sort character is 2, when it should be 6 as well, no?
e.g. "The Good War": An Oral History of World War Two
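For what it's worth, here is a sketch of an auto-guess that treats a quote before the article the same way as a quote after it. It is a hypothetical helper for English titles only, not LibraryThing's actual rule; `ARTICLES` and `guess_sort_char` are names invented for this example.

```python
ARTICLES = ("the", "a", "an")  # English only; an assumption for this sketch


def guess_sort_char(title: str) -> int:
    """Guess the 1-based sort character for an English title."""
    i = 0
    n = len(title)
    # Skip leading punctuation such as quotes or inverted exclamation marks.
    while i < n and not title[i].isalnum():
        i += 1
    # Skip one leading article if it is followed by a space.
    lowered = title[i:].casefold()
    for article in ARTICLES:
        if lowered.startswith(article + " "):
            i += len(article) + 1
            break
    # Skip punctuation that follows the article, as in: The "Good War".
    while i < n and not title[i].isalnum():
        i += 1
    return i + 1 if i < n else 1


print(guess_sort_char('"The Good War": An Oral History of World War Two'))
print(guess_sort_char('The "Good War": An Oral History of World War Two'))
```

Both forms come out as 6 here, and a title with only leading punctuation, such as ¡Raza Sí!, comes out as 2.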

The book ¡Raza Sí!, ¡Guerra No! : Chicano protest and patriotism during the Viet Nam war era is sorting to the bottom instead of under 'R' even though the auto sort character is 2.

Several works with irregular characters had an auto sort character of 2 for reasons that were unclear. Note that I changed these to 1 before thinking of reporting here.
Çatal Hüyük: a neolithic town in Anatolia
Антология Чартистской Литературы
Ōsugi Sakae, anarchist in Taishō Japan : the creativity of the ego
毛主席论党的建设

Also note: Ōsugi Sakae continues to sort to the bottom even after the sort character was changed to 1. I would think it should go with the 'O's, no?
And, I'm not sure where Çatal Hüyük went after I changed the sort character to 1, but it's not with the 'C's so far as I can tell.

20jjwilson61
Nov 21, 2014, 11:31am Top

It's still an ascii sort, so characters with things hanging off of them are going to sort elsewhere. The best way to handle this, I believe, would be with a separate Title for Sorting field as someone mentioned elsewhere.
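If the comparison really is by raw character code (an assumption; the thread does not settle it), the effect looks like the first line of output below. The second sort folds accents away before comparing, which is one possible "title for sorting" transformation. `fold_accents` is a hypothetical helper built on Python's standard `unicodedata` module.

```python
import unicodedata

titles = ["Zebra", "Çatal Hüyük", "Osaka", "Ōsugi Sakae", "Cat"]

# Default string comparison uses code points, so accented capitals land after "Z".
print(sorted(titles))


def fold_accents(text: str) -> str:
    """Decompose characters and drop combining marks (Ç -> C, Ō -> O)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(sorted(titles, key=lambda t: fold_accents(t).casefold()))
```

Letters such as Ø have no decomposition, so this folding leaves them where they are.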

21MarthaJeanne
Edited: Nov 21, 2014, 12:06pm Top

>19 eromsted: That fits with what I said in >6 MarthaJeanne:. Except that both Ä and Ö sorted between a and b if I used 1 as the sort digit.

22bnielsen
Nov 21, 2014, 1:34pm Top

>20 jjwilson61: I believe that was me. I think some would like "Twelve monkeys" to sort as "12 monkeys" and you can't do that just by chopping off the first x characters of the title.
BTW I don't think it is an ascii sort, but no sort order will ever satisfy all. Some think that Ø is just a sort of O and want it to sort like all the rest of the O's. Most Danes will object to that, since we believe it should sort to one of the last positions in the alphabet.
Examples like that are legion.

On the other hand the sorting character fix will fix a lot of weirdnesses, so let us use that until Tim has implemented the third way of fixing. (1. The ||Great Gatsby 2. The Great Gatsby (5) 3. The Great Gatsby {sort as Great Gatsby}). It'll probably take about two weeks.
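The three ways can be sketched side by side in Python. The `||` marker and the `{sort as ...}` notation are taken from the shorthand in this post; the function names are hypothetical and none of this is LibraryThing's implementation.

```python
import re


def key_from_marker(title: str) -> str:
    """Way 1: sort from whatever follows a '||' marker in the title."""
    _, sep, rest = title.partition("||")
    return (rest if sep else title).casefold()


def key_from_offset(title: str, sort_char: int) -> str:
    """Way 2: sort from a 1-based character position."""
    return title[sort_char - 1:].casefold()


def key_from_override(title: str) -> str:
    """Way 3: use an explicit '{sort as ...}' override if present."""
    match = re.search(r"\{sort as (.+?)\}", title)
    return (match.group(1) if match else title).casefold()


print(key_from_marker("The ||Great Gatsby"))
print(key_from_offset("The Great Gatsby", 5))
print(key_from_override("The Great Gatsby {sort as Great Gatsby}"))
print(key_from_override("Twelve monkeys {sort as 12 monkeys}"))
```

Only the third way can express the "Twelve monkeys" case, because the sort text there is not a suffix of the title.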

23timspalding
Nov 21, 2014, 1:44pm Top

Okay, the page now has an Excel export, which produces UTF-16E with an .xls extension. If you've been having problems, try that?

24eromsted
Nov 21, 2014, 1:49pm Top

25Louve_de_mer
Nov 22, 2014, 2:56am Top

>12 MarthaJeanne: Ok with "13". Thanks.

26JerryMmm
Nov 22, 2014, 5:10am Top

>25 Louve_de_mer: that's still a workaround. It's not intuitive to double count weird characters.

27prosfilaes
Nov 22, 2014, 9:42am Top

>26 JerryMmm: It depends on how weird; you could have to triple count them (“ ” quotes, for example), or quadruple count them (😀, for example).
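The counts are easy to verify, and the same idea gives a conversion from "the character I counted" to "the byte a byte-based field would need". This is a sketch assuming UTF-8 storage, which the thread suspects but does not confirm; both helper names are hypothetical.

```python
def utf8_width(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def byte_sort_position(title: str, char_pos: int) -> int:
    """Convert a 1-based character position into a 1-based UTF-8 byte position."""
    prefix = title[:char_pos - 1]
    return len(prefix.encode("utf-8")) + 1


for ch in ["A", "°", "“", "😀"]:
    print(repr(ch), utf8_width(ch))

print(byte_sort_position("“Hello”", 2))
```

A title that opens with a curly quote would need 4, not 2, if the offset counts bytes.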

28Cynfelyn
Nov 24, 2014, 10:26am Top

(i) Perhaps I'm a bit slow on the uptake, but when someone makes a 'Sort character' change, are they changing it for their work only, or for CK?

(ii) Does the blog article suggest ("Mostly, the system gets it right in the first place ...") that there is an automatic stop list for, e.g. definite and indefinite articles? If so, is it language site-specific, and if so is it possible to edit so that it applies only within the specific language site?

Users are more likely to use non-filing characters for their mother-tongue titles than for foreign titles. Using the example of, say, Marion Eames's Y rhandir mwyn, users of LT's Welsh site might want to sort the title under Rh, but non-Welsh users can't be expected to know that 'Y' is a definite article. Even with 'larger' and more widely known languages, non-German or French users are likely to look for Die Fledermaus under D rather than F, and Les Miserables under L rather than M.
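A language-specific stop list might look like the following sketch. The article lists are illustrative and incomplete (elided forms such as French "l'" are not handled), and `ARTICLES_BY_LANGUAGE` and `sort_char_for` are hypothetical names, not anything the blog post or the site documents.

```python
ARTICLES_BY_LANGUAGE = {
    "en": ("the", "a", "an"),
    "de": ("der", "die", "das"),
    "fr": ("le", "la", "les"),
    "cy": ("y", "yr"),
}


def sort_char_for(title: str, language: str) -> int:
    """Return a 1-based sort character, skipping one article of the given language."""
    words = title.split(" ", 1)
    articles = ARTICLES_BY_LANGUAGE.get(language, ())
    if len(words) == 2 and words[0].casefold() in articles:
        return len(words[0]) + 2  # skip the article and one space, then 1-based
    return 1


print(sort_char_for("Die Fledermaus", "de"))  # files under F
print(sort_char_for("Die Fledermaus", "en"))  # files under D
print(sort_char_for("Y rhandir mwyn", "cy"))  # files under Rh
```

The same title gets a different sort character depending on which site's list is applied, which is the behaviour the question is asking about.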

29lorax
Nov 24, 2014, 10:32am Top

>28 Cynfelyn:

Their book only. There's no way this could happen at the work level, which is what CK is, since it's language-dependent.

30anglemark
Nov 24, 2014, 10:39am Top

Die Fledermaus, die!

31jjwilson61
Edited: Nov 24, 2014, 11:17am Top

>29 lorax: Yet CK is language-aware so this could be done.

ETA: Yet I'm not sure what good it would do. I suppose the site search could use it by the language of the site being used. Perhaps when a new book is added and LT auto-combines it, it could copy the sort offset from the work that corresponds to the site being used to the book?

32prosfilaes
Nov 24, 2014, 6:39pm Top

>29 lorax: Not just language-dependent; book dependent. "The Nabokov Russian Translation of Lewis Carroll's Alice in Wonderland" should sort as N, no matter what the other Russian translations sort as.

33timspalding
Nov 24, 2014, 8:17pm Top

FWIW, some process is still allowing 0s in the sort character. I'm working on it.

34lorax
Nov 24, 2014, 9:43pm Top

>32 prosfilaes:

Is that really the title? I'm sure there are other examples where different translations into the same language differ in sort character, though, so it's not terribly important.

35andyl
Nov 25, 2014, 4:59am Top

>27 prosfilaes:

Could something be done with background colour (something subtle) - so we see which characters are being ignored. I think that would make it easy for people to get right without counting and without trial and error if there are weird UTF8 chars.

36prosfilaes
Nov 25, 2014, 9:50am Top

>34 lorax: That's the title in my library; I believe it came from Amazon data. For another example, Hercule Poirot's Christmas was also published as "A Holiday for Murder".

37Noisy
Dec 2, 2014, 3:09am Top

I like this improvement - nice solution.

38lemontwist
Dec 3, 2014, 10:52am Top

I notice now that every time I add a new book, the default is no starting position so they all float to the top of my list. Could it default to the first character unless otherwise set? So annoying to have to do this manually for every book.

39timspalding
Dec 3, 2014, 12:45pm Top

Going to get some testing on this.

40Stevil2001
Dec 25, 2014, 12:14pm Top

I like the idea here, but don't like how this feature currently works.

I totally missed the rollout of this feature (and I would consider myself a power user!), so I have been frustrated that my new books with a leading article have been sorting incorrectly, and I filed a bug report-- someone directed me to this thread. Can't LT make books with "The" default to 5 so I don't have to manually change it every time? If I am missing it, I am sure many users are.

Also why does editing the title in catalog view and saving it without many changes make it sort correctly?

41timspalding
Dec 29, 2014, 12:21am Top

Can you give an example of a book which starts with "The" and which doesn't sort to five automatically? It's possible--we'd need to bug it. But I think you're confusing the ability to change the sort character with the strong tendency for it to get it right at first anyway.

42r.orrison
Dec 29, 2014, 3:12am Top

I recently added three books whose title started with "The ". None of them had anything automatically entered in the sort digit box on the Edit Book page. Of course, I have now changed all of them to 5 manually, so this probably doesn't help.

https://www.librarything.com/work/11041062/book/114937492
https://www.librarything.com/work/13928479/book/114928432
https://www.librarything.com/work/909080/book/114845418

43CDVicarage
Dec 29, 2014, 3:46am Top

>41 timspalding: >42 r.orrison: I had similar problems when I entered some new books recently and I now remember to make a manual entry in the sort box. It would be helpful if I could rely on 'The', 'A' and 'An' to work automatically.

44jjwilson61
Edited: Dec 29, 2014, 12:08pm Top

None of the last 5 books I've entered, which have been since 11/13, have their sort from field set. The last one that does was entered on 11/12.

ETA: Actually the field not being set is what I see on the edit page. When I added the field to a view style on the your books page, they all show up as 1.

45Foretopman
Dec 29, 2014, 12:03pm Top

My two most recently entered books have a sort character of 1 instead of the correct 5.

46timspalding
Dec 29, 2014, 2:21pm Top

Okay, asking Kristilabrie to do a test across different ways of adding.

47jjmcgaffey
Edited: Dec 29, 2014, 4:28pm Top

Hmmm - looks pretty random, to me. I imported a bunch of books on the 27th (tagged _import141227); 16 books, added by import without ISBN. Two correctly have sort characters set to 4 and 5, all the rest are set to 1 including two books starting with The. I did some editing on all the books, but I don't think I changed the title on any of them.

Everything else I can see, back to October, is correctly set; some entered via import file, some in Add Books, some manually. I don't import with ISBNs, so haven't tested that.

48Stevil2001
Jan 5, 2015, 2:23am Top

I have been away from LT over the holidays, but here is a belated set: https://www.librarything.com/catalog.php?view=&tag=dalek+empire&view=Ste...

49Collectorator
Edited: Mar 29, 2015, 2:19pm Top

I have a book that won't sort on the 1.
https://www.librarything.com/work/8524349/book/54843794
It originally began with a single apostrophe. The sort was set to 2 but it wasn't sorting on 2. I removed the apostrophe (and the inexplicable set of quotes that was on the summary) and set it to 1. Still won't sort right.

I should add that the problem appears when I sort on Summary. It should be down with the Hs, but it appears as the first book.

50JerryMmm
Mar 29, 2015, 7:19pm Top

the work title still has the double quotes, perhaps it's still using that?

51Collectorator
Mar 29, 2015, 9:21pm Top

The work title should have double quotes because that is the way the book is titled. I would prefer that mine had the double quotes, too, but I want it to sort on the second character and it won't.

52TimSharrock
Mar 30, 2015, 5:55am Top

>51 Collectorator: Could it possibly be a "smart quotes" issue - using characters that take more than one byte? (there are some examples of this higher in the thread). If so then putting the double quotes in, but trying 3 or 4 as the sort character might work

53timspalding
Mar 30, 2015, 7:23am Top

Right. That's my guess. Smart quotes are multibyte, I think.

54Collectorator
Mar 30, 2015, 8:20am Top

Just tried 3,4,5,6,7 and 8. Not working.

55MarthaJeanne
Edited: Mar 30, 2015, 8:37am Top

But does the summary field use the sort number?

Yes, looks like it does.

56timspalding
Mar 30, 2015, 10:38am Top

Okay, what's the book number? I'll look at it.

57anglemark
Mar 30, 2015, 10:52am Top

It says 54843794 above.

58timspalding
Mar 30, 2015, 11:23am Top

So, it works perfectly at 2. The title sorts correctly. The summary sort is not affected by changes to the title sort offset, so it always sorts under ".

59jjwilson61
Mar 30, 2015, 11:35am Top

I wonder why >55 MarthaJeanne: concluded that the summary field does use the sort number then?

60lorax
Edited: Mar 30, 2015, 11:43am Top

>59 jjwilson61:

Yeah, me too. Tim's conclusion in #58 was my first guess as well, until I saw #55.

Edit: Actually in my testing "Summary" sort appears to be pretty badly broken. Reverse-order seems to work, but regular-order just uses the subsort, regardless of whether I use the column headers or the sort arrows.

61MarthaJeanne
Mar 30, 2015, 12:04pm Top

Yes, it is just sorting on the subsort, so of course it looked the same as the previous sort on title.

Sorry.

62lorax
Mar 30, 2015, 12:57pm Top

>61 MarthaJeanne:

It's hardly your fault that you assumed the sort actually worked!

63Collectorator
Mar 30, 2015, 1:22pm Top

>58 timspalding: So there's nothing I can do. It will sort by title in a normal fashion, but it will never sort as an H in Summary. ok. Thanks for looking into it.

64timspalding
Mar 30, 2015, 2:39pm Top

Right. Sorry.

65Kuiperdolin
Apr 17, 2015, 8:19am Top

Bit of a delayed reaction but :

Any chance we could get a similar system for author names ? It would be a more elegant and objectively better solution to the problem of names with particles than putting the particle with the first names.

66AnnaClaire
Apr 17, 2015, 8:44am Top

>65 Kuiperdolin:
Could it be implemented in such a way that would sort Christine de Pisan and Chretien de Troyes under C where they belong?

67Kuiperdolin
Apr 17, 2015, 8:57am Top

>66 AnnaClaire:
Well I'm an end-user, not a dev. But it does not sound impossible.

68AndreasJ
Apr 17, 2015, 9:12am Top

If "Christine de Pisan" is entered as just that, rather than "de Pisan, Christine" or the like, it should sort under C already.

69Kuiperdolin
Apr 17, 2015, 9:19am Top

Yes.

70sunny
May 4, 2015, 2:36pm Top

Ignoring non-English articles works fine.

Now I have left

Über das Glück
Über die Matten gehn ...

which should sort under ue *, but sort after Z if I leave the default (2) and sort under B if I change the default to 'Sort from start'

* (ä = ae, ö = oe)

any way to fix this, too?

That would be great.

-> also within words ä should sort as ae, ö as oe and ü as ue - so Bündner should not come before Badning
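The ordering asked for here (ä as ae, ö as oe, ü as ue) can be sketched as a sort key in Python. This is only an illustration of the request, with a hypothetical helper name and sample titles; it is not something LibraryThing offers.

```python
GERMAN_EXPANSIONS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def german_phonebook_key(title: str) -> str:
    """Expand umlauts before comparing (ä -> ae, ö -> oe, ü -> ue)."""
    return title.lower().translate(GERMAN_EXPANSIONS)


titles = ["Bündner", "Badning", "Über das Glück", "Uhu", "Zebra"]

print(sorted(titles))                            # Über lands after Zebra
print(sorted(titles, key=german_phonebook_key))  # Über files under "ue"
```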

71MarthaJeanne
Edited: May 4, 2015, 4:14pm Top

This has been discussed many times before. The problem is that different languages alphabetize these letters differently. Even in German Ä is sometimes alphabetized as Ae, and sometimes after Az. The Scandinavians prefer to put all the special characters after Z. No way to please everyone.

If you want Ü to alphabetize as Ue it is easy enough to enter your title that way.

72sunny
May 5, 2015, 1:44am Top

> Even in German Ä is sometimes alphabetized ... after Az

That would be wrong, though.

73prosfilaes
May 5, 2015, 3:37am Top

I would be very surprised if LT ever sorted Ä as Ae. In Swedish, Finnish and Estonian, it appears at the end of the alphabet; in English it's ignorable in sorting; in French it's, I guess, just another accent?; in Slovak it appears after A; in Turkmen it appears after E. The sources I'm seeing say that German dictionaries ignore the umlaut on Ä and sort it with A. The right thing here is very complex, and sorting it as Ae works for German only and won't strike anyone else as expected.
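To show how the same three titles come out differently under two of these conventions, here is a rough Python sketch. The alphabets are simplified and the helper names are hypothetical; a real implementation would use a collation library rather than hand-written keys.

```python
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
GERMAN_FOLD = str.maketrans("äöü", "aou")


def swedish_key(title: str):
    """Rank each letter by its place in the Swedish alphabet (å, ä, ö last)."""
    return [SWEDISH_ALPHABET.find(ch) for ch in title.lower() if ch.isalpha()]


def german_dictionary_key(title: str) -> str:
    """Treat ä, ö, ü as their base vowels, as German dictionaries reportedly do."""
    return title.lower().translate(GERMAN_FOLD)


titles = ["Zorn", "Ära", "Arm"]
print(sorted(titles, key=swedish_key))            # Ä after Z
print(sorted(titles, key=german_dictionary_key))  # Ä with A
```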

74bnielsen
May 5, 2015, 4:13am Top

And if anyone still thinks sorting is easy, go read this :-)

http://www.unicode.org/reports/tr10/

75prosfilaes
May 5, 2015, 4:30am Top

>74 bnielsen: LibraryThing has no excuse for not implementing that in full. I don't know what languages LT uses, but I'm pretty sure they're all boring old-school ones that have libraries that can do TR10-compliant string sorting for you, or at least new-school ones that can use JavaScript or Java libraries.

76Foretopman
May 5, 2015, 4:52pm Top

>75 prosfilaes: Sunny's catalog has books in Danish, English, French, German, and Swiss German. Can TR-10 correctly sort in five languages simultaneously?

77jjwilson61
May 5, 2015, 5:20pm Top

But Sunny is in Switzerland, so supposedly they would use whatever sort order they commonly use in Switzerland.

78prosfilaes
May 5, 2015, 9:30pm Top

>76 Foretopman: No, but it can provide a default Unicode sort that would be easier to justify than whatever they have now. And it would provide an easy stepping stone to user-chosen sorts.
