1timspalding
Chris Catalfo and I made a number of infrastructure changes to how tags are handled. The result is that tag pages are now much faster. Here, for example, is a chart of how many seconds the key database query on the tag pages took, against how many times that happened, before the change:
(0) => 134568
(1) => 4379
(2) => 715
(3) => 416
(4) => 263
(5) => 155
(6) => 135
(7) => 97
(8) => 85
(9) => 81
(10) => 64
(11) => 69
(12) => 43
(13) => 47
...
(162) => 1
(163) => 1
(181) => 1
That is, although most of the time it's fast, some users hit it and it takes 10, 20, 60, even 181 seconds!
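For anyone who wants to build the same kind of chart from their own timings, here is a minimal sketch. It assumes each query duration is truncated to a whole second before counting (the post does not say whether the numbers were truncated or rounded), and the sample timings are made up, not LibraryThing data.

```python
from collections import Counter


def seconds_histogram(durations):
    """Count how many queries fell into each whole-second bucket."""
    # Assumption: int() truncation, so 0.4 s lands in bucket 0 and 181.5 s in bucket 181.
    return Counter(int(d) for d in durations)


def format_histogram(hist):
    """Render buckets in the '(seconds) => count' style used in the chart."""
    return [f"({sec}) => {count}" for sec, count in sorted(hist.items())]


# Hypothetical sample timings in seconds.
sample = [0.02, 0.4, 0.9, 1.3, 1.7, 13.2, 181.5]
hist = seconds_histogram(sample)
lines = format_histogram(hist)
print("\n".join(lines))
```

A long tail of buckets with a count of 1, like the one at 181 seconds, is the pattern to look for: rare, but very painful for whoever hits it.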
This schizophrenic speed was the result of an extra caching layer. The first person to hit "fiction" or "mystery" after the cache was cleared got totally screwed. After that, it was fast.
Anyway, here are the new results with a totally clean cache.
(0) => 1758
(1) => 87
(2) => 4
(3) => 3
(5) => 1
It's never taken longer than five seconds. "Fiction" with more than a million works on its list took less than a second.
Anyway, that's why it's faster, if you notice. There are some other places we can do something similar, so with any luck we'll see similar gains elsewhere.
2infiniteletters
Yay speed.
3_Zoe_
Speed is good :).
I don't suppose anything similar could make the Tag Mirror calculations more manageable?
5sqdancer
>4 rodneyvc:
I'm guessing that Tim had originally used brackets around the numbers in his first post (which created touchstones) and then he changed the brackets to parentheses.
6235711
Um... I suppose I should have said this earlier, but I haven't been able to see a tag page in days. In Opera, that is. I was working in IE a bit for other reasons, so I got round it that way and didn't think about posting a bug. I suppose this explains it.
Unless it's an unrelated problem at my end. Anyone else having a problem loading tag pages in Opera?
7TomVeal
I use Opera (the world would be a better place if more people did) and am not having any trouble loading tag pages.
9235711
My problem is solved. Apologies for hijacking the thread.
Back on topic: improved speed is excellent.
10timspalding
I don't suppose anything similar could make the Tag Mirror calculations more manageable?
Actually, you know, I think this might be the thing. I'll take a look.
Tim
11brightcopy
And dare I ask if this could somehow miraculously revive tag alphabetization? *crosses fingers*
13Nicole_VanK
Is this bug (http://www.librarything.com/topic/111377) related?
14lemontwist
Or this bug either? http://www.librarything.com/topic/108173
15timspalding
No, that's something else. But yes, important to fix.
18jjwilson61
16> See posts 3 and 10.
19DaynaRT
>18 jjwilson61:
Tag watch is not tag mirror.
20timspalding
>16 DaynaRT:
No, tag watch is a very different beast, but tag mirror's chances are transformed here. Stay tuned.
21DaynaRT
>20 timspalding:
Ah, ok. It was worth a shot. :)
22jjwilson61
19> Agh. Never mind.
26_Zoe_
:D
It would also be nice if the tag name at the top of the pop-up were a link to the tag page.
28staffordcastle
Actually, when I was fooling around with it just now, most of the -0.01% prevalence books were perfectly valid. It just means that not many people tagged them that way.
29_Zoe_
Oops, I just realized this is no longer quite the right thread for talking about Tag Mirror. But since the conversation is already here....
The ones below 1% are often decent, but 0.01% really seems to be just noise. I have the following titles under "vampires":
Pride and Prejudice
Lord of the Flies
Wicked
Northanger Abbey
Coraline
Perfume
The Screwtape Letters
War of the Worlds
I, Robot
Passage to India
Wuthering Heights
Uglies
Dr. Zhivago
The Amulet of Samarkand
Howl's Moving Castle
Dragonsinger
84, Charing Cross Road
Black Powder War
Feed
Those are all single-use tags, going up to 0.04%. I admittedly haven't read all the books, but as far as I can tell that data is just crap.
30staffordcastle
On the other hand, I would definitely support any titles giving a 0.00% return being left off the list - that's what Pride & Prejudice got in my results.
31timspalding
It would also be nice if the tag name at the top of the pop-up were a link to the tag page.
Oh, I see. Okay, will do.
33jjwilson61
30> Pride & Prejudice has one vampire tag out of 37,560 copies which rounds to 0.00%
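The arithmetic behind that 0.00% can be checked in a couple of lines. This is only a sketch of the rounding, using the two figures quoted above; the function name is made up for illustration.

```python
def prevalence_percent(tagged, copies):
    """Share of copies carrying the tag, as a percentage."""
    return 100.0 * tagged / copies


pct = prevalence_percent(1, 37560)
print(f"{pct:.4f}%")  # four decimals: 0.0027%
print(f"{pct:.2f}%")  # two decimals, as displayed: 0.00%
```

So the book is not at zero, it is just below what two decimal places can show.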
34staffordcastle
>33 jjwilson61:
Yeah, I know. Don't you think that might mean you could leave it off the list?
35jjwilson61
From what I've seen, anything 0.05% or less could be left off the list.
36timspalding
Okay, everyone look at your list and give me your cut-off.
37staffordcastle
I've just looked at several tags (since I think the precision is going to vary from tag to tag), and I think 0.05% would be a reasonable cut-off. Some items with less are still okay, but I suspect this will be good enough.
39staffordcastle
By the way, what is the list of books sorting on? It's not the percentage.
40_Zoe_
Yeah, 0.05% seems good for total exclusion. I still like the 1% notification with any impact that has on the size of the tags etc. That is, I want additional exclusion at 0.05%, but I don't want anything between 0.05% and 1% to be treated more leniently than it is now.
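The two-threshold idea could be sketched roughly like this. The names are hypothetical, and whether a book at exactly 0.05% is excluded or kept is an assumption here (this sketch keeps it); the thread does not settle that.

```python
EXCLUDE_BELOW = 0.05  # percent; proposed cut-off for dropping a book entirely
FLAG_BELOW = 1.0      # percent; existing low-prevalence notification level


def classify(prevalence_pct):
    """Return 'exclude', 'flag' or 'show' for a prevalence given in percent."""
    if prevalence_pct < EXCLUDE_BELOW:
        return "exclude"  # treated as noise, left off the list
    if prevalence_pct < FLAG_BELOW:
        return "flag"     # still listed, still marked as low prevalence
    return "show"


# Hypothetical prevalence values, in percent.
for pct in (0.01, 0.05, 0.2, 3.0):
    print(pct, classify(pct))
```

The point of keeping both thresholds is that adding the 0.05% exclusion does not loosen how anything between 0.05% and 1% is treated.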
41lorax
I think 0.1% prevalence index would be a good cutoff. Below that is the realm of The Satanic Verses showing up for "baseball" (0.02%), or The Annotated Alice for "historical fiction" (0.05%). There are still some idiocies above that cutoff but there's virtually nothing useful below it.
42timspalding
It's not the percentage.
Yeah, it's balancing the percentage and the number of copies. We don't want a book with one copy tagged one time jumping to the top because it's 100%.
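One common way to get that kind of balance is to sort by the lower bound of a Wilson score interval instead of the raw percentage. To be clear, this is not the formula the site uses (the post does not give it); it is a swapped-in, well-known technique, and the book names and counts below are made up.

```python
import math


def wilson_lower_bound(tagged, copies, z=1.96):
    """Lower bound of the Wilson score interval for the proportion tagged/copies.

    Small samples get pulled down hard, so 1 tag on 1 copy no longer scores as 100%.
    """
    if copies == 0:
        return 0.0
    p = tagged / copies
    z2 = z * z
    centre = p + z2 / (2 * copies)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * copies)) / copies)
    return (centre - margin) / (1 + z2 / copies)


# Hypothetical books: (title, copies carrying the tag, total copies).
books = [
    ("One-copy obscurity", 1, 1),
    ("Popular vampire novel", 500, 1000),
]
ranked = sorted(books, key=lambda b: wilson_lower_bound(b[1], b[2]), reverse=True)
for title, tagged, copies in ranked:
    print(f"{title}: {wilson_lower_bound(tagged, copies):.3f}")
```

With these numbers the widely held book sorts first even though its raw percentage is only 50%, which is the behaviour described above.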
43staffordcastle
Thanks!
44SylviaC
I'm finding the irrelevant ones start popping up most frequently around 0.07 and 0.06%. So a cutoff anywhere between 0.1 and 0.05 would probably be fine.
(Of course, there was Official Rules of Card Games coming up at 0.20% for the tag "Christian Fiction"...)
45CarmelKilmacud
How do I put my tag list in alphabetical order?
46bnielsen
>45 CarmelKilmacud: By hand, I'm afraid. (I have a script that looks at the TSV export file and warns me if I have any tags out of order, but that is the best you can do, afaik).
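A checker of that sort might look roughly like the sketch below. This is not bnielsen's script. It assumes the export is tab-separated with a header row, that the columns are called "Title" and "Tags", and that tags within a cell are comma-separated; check those against your own export before relying on it.

```python
import csv
import io


def find_unsorted_tags(tsv_file, tag_column="Tags", title_column="Title"):
    """Yield (title, tags) for each row whose tags are not in alphabetical order."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        raw = row.get(tag_column) or ""
        tags = [t.strip() for t in raw.split(",") if t.strip()]
        # Case-insensitive comparison, so "Fiction" and "fiction" sort together.
        if tags != sorted(tags, key=str.casefold):
            yield row.get(title_column, ""), tags


# Hypothetical two-row export, held in memory instead of a real file.
sample = io.StringIO(
    "Title\tTags\n"
    "Dracula\tfiction, horror, vampires\n"
    "Coraline\tfantasy, children, horror\n"
)
for title, tags in find_unsorted_tags(sample):
    print(f"{title}: tags out of order: {tags}")
```

For a real file, pass `open("export.tsv", newline="", encoding="utf-8")` in place of the in-memory sample (the filename is a placeholder). The script only warns; the reordering itself still has to be done by hand.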
47timspalding
>45 CarmelKilmacud: Where do you want them in alphabetical order--in a display list, or book by book?

