
Will need some getting used to, but looks good. Thanks.
Nice! One question, though - are the entries in any particular order, or is it random? They don't seem to be either alphabetical or in order of how many books are in the overlap.
It's by a ranking, which is mostly the overlap, but conditioned by how HIGH in the list a book appears, and how many books are on the list. Anyway, it's trying to give you the strongest, most interesting ones.
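For anyone trying to picture how a ranking like that could behave, here is a minimal sketch. It is only a guess at the shape of such a score, not LT's actual formula: the function name `overlap_score`, the 1/rank weighting and the normalization are all assumptions made for illustration.

```python
def overlap_score(my_books, tagmash_list):
    """Hypothetical score in [0, 1]: rank-weighted overlap, scaled by list length.

    my_books: iterable of book ids in your library.
    tagmash_list: book ids on the tagmash, strongest first.
    """
    if not tagmash_list:
        return 0.0
    mine = set(my_books)
    # Each shared book contributes 1/rank, so books near the top count more.
    weighted = sum(
        1.0 / rank
        for rank, book in enumerate(tagmash_list, start=1)
        if book in mine
    )
    # Best possible weighted sum for a list of this length; dividing by it
    # keeps long lists from winning on size alone.
    best = sum(1.0 / rank for rank in range(1, len(tagmash_list) + 1))
    return weighted / best
```

With a score like this, owning the top book on a list counts for more than owning the last one, which would explain why the order is neither alphabetical nor a plain overlap count.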
Thanks - I'll have to study the list for a while, to wrap my head around that. It did list one of the ones I made myself as the first!
I'm glad to see that tagmash will be reintroduced. I think it's one of the gems of LT.
Is there an explanation for this somewhere? I'm not sure I understand the function and use of it.
I'm sure we won't reach unanimity on this, but should the order be left to right, not top to bottom? Far more useful info is above the fold that way.
Message edited by its author, Sep 7, 2009, 7:46am.
Ah, with green checkmarks added, as with single tags. And the stats page is very interesting. Thank you.
Checkmark comparison is very nice.
Very interesting - if you had asked me how many of my books others would have tagged as "children" and "christian" and "series" I would have thought - oh, maybe a 1/2 dozen. Nope - 67! Who knew?
This is really interesting. First I knew of tag mashes.
And now I'm trying to figure out the tag 'Fred.'
I want a switch to hide this.
Okay, no, not really. But it shows a depressing number of Children's, juvenile, young adult, etc. A lot of it I have to admit is fair, but it seems like any adventure work more than a hundred years old is automatically juvenile: Jules Verne, The Three Musketeers, Robinson Crusoe, H. G. Wells, Frankenstein, etc. In a hundred years will Jurassic Park be dismissed as juvenile?
(Edit: It seems that it is in fact private. Which seems a little odd, since comparable things, like the tag mirror and recommendations, are public.)
Message edited by its author, Sep 7, 2009, 6:02pm.
15: You're assuming Jurassic Park survives. :)
Yeah. I saw this thread earlier, clicked the link, and got lost in my tagmash overlap. An hour later, I have eight more wishlisted books and waaaay too many new tag ideas.
I love this.
Amazing!
Because of the tagmashes, and looking to see what books certain ones covered, I discovered at least five books that I've owned for years but never entered into LT. That is, I expected to see certain books I own show up under the mashes, which they did, but they didn't have the green checkmark.
I'm going to consider tagmashes a fantastic feature, if only because I now know to double check my books against LT before I box them up when I move, since I seem to have skipped over one or two books from each shelf when I was entering everything in last year. :P (strangely, each of the skipped-over books has been read many times, and a few might could use replacing)
>18
That's because of the / in the name of the tag...
>13 Small numbers of oddball tags cause problems for several of the tag based social functions (tag mirror, tag based recommendations). But the fiction/nonfiction distinction on this function would seem to me to be especially useful and especially screwy when it's wrong.
To use Anneli's examples, Crime and Punishment is tagged fiction 2298 times and non-fiction 8 times. The Hitchhiker’s Guide to the Galaxy is tagged 1529/2, fiction/non-fiction.
Would it be possible for LT to check the ratio of fiction/non-fiction tags and ignore the smaller one when computing the tagmash? Since I can think of some good reasons a book could be tagged with both, perhaps only when the ratio is running more than 10/1 in one direction.
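The ratio rule proposed here could be sketched roughly as follows. The function name `dominant_tag` and the `ratio` parameter are made up for illustration; the 10-to-1 threshold comes from the suggestion above, and nothing here reflects how LT actually computes anything.

```python
def dominant_tag(fiction_count, nonfiction_count, ratio=10):
    """Return the tag to keep, or None when neither side clearly dominates.

    Hypothetical helper: the smaller tag is ignored only when the larger
    one outnumbers it by more than `ratio` to 1.
    """
    if fiction_count > ratio * nonfiction_count:
        return "fiction"
    if nonfiction_count > ratio * fiction_count:
        return "non-fiction"
    # Close call (or no tags at all): keep both tags as they are.
    return None


# Tag counts quoted in the post above.
print(dominant_tag(2298, 8))   # Crime and Punishment
print(dominant_tag(1529, 2))   # The Hitchhiker's Guide to the Galaxy
```

A book tagged, say, 5 times fiction and 3 times non-fiction would return None, so both tags would still count for the tagmash.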
> 9 I'm sure we won't reach unanimity on this, but should the order be left to right, not top to bottom? Far more useful info is above the fold that way.
Agreed!
>21
Actually, we store a single value elsewhere for this—fiction, nonfiction or undecided/undetermined—based on the ratio. But tagmash is what tagmash does. There are lots of similar tags people will disagree on, and no way to police the issue. If you say --fiction, well, you've got rid of everything with a single fiction tag—live with it! Instead, use -fiction to "demote" the fictions. And you're done! :)
>22
No. Because if you do it that way, you can't keep everything lined up without making a little self-contained box for each line. Since some lines are two lines long, it looks terrible. I'm not sure I can explain this well without a visual aid.
>23
It does appear that subject tag, non-fiction, -fiction (or vice-versa) gives the desired result.
By the way, is the separately stored fiction/non-fiction/undecided data used for anything at the moment?
>24
Minimally, for recommendations.
> 23 Sad. :(
Stupid formatting issues.
How about if you paginated the results and we could control the page size so it fits on our screens? That way we could see all the strongest matches on one page without scrolling.
I'm sorry, but I don't understand what this means.
It's probably too much trouble for Tim to do it, but he could calculate how many tag-mashes would fit on one page without scrolling and just show that many. Then the next group could be shown when a "next" button is pushed.
>It's probably too much trouble for Tim to do it, but he could calculate how many tag-mashes would fit on one page without scrolling and just show that many.
It will depend on your browser (and what kind of bars you have on it), the font that you use, the resolution and so on. Doing it individually every time? I do not think this would be easily done. It might be easier to do something like the catalog size (everyone specifies at the top of the page or something how many to see...)
Handed to Chris. Sorry. I meant another thread.
Message edited by its author, Sep 8, 2009, 5:15pm.
28: Which part?
30> That's what I meant by "we could control the page size."
> 28
This part: How your books overlap with LibraryThing tagmashes.
I know what a tagmash is. But what does it mean for a book to "overlap" with a tagmash?
34: That it shows up on the list for the tagmash.
The tagmash "overlap" is more about your library than individual books. It shows which tagmashes have a lot of your books on them, like the tag mirror but for multiple tags at a time.
I'm assuming that these aren't all the possible tagmashes that are used, but those that people have actually searched using?
I too would assume the caveat from any specific tagmash page applies here as well, "Tagmashes do not exist until someone enters them."
This text could be added to the top of the tagmash overlap page for clarity.
>20 I know. The quick solution would be to choose the form of a tag (with multiple spellings) to be one without problematic characters; several bugs would be fixed if it used 19th century literature instead of literature/19th century. The better solution, and one that would fix a lot of bugs that have been annoying me for a long time, is to consistently hash tag names in a way that avoids problems with special characters; perhaps even convert all of them to a number just like works are, and never send a tag name by URL.
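As a rough illustration of the special-character problem, here is one way a tag name could be made safe for a URL path using Python's standard library. This is a sketch only: the helper names are invented, and LT's own code is not known to work this way.

```python
from urllib.parse import quote, unquote


def tag_to_url_segment(tag):
    """Percent-encode a tag so "/" and "&" cannot be mistaken for URL syntax."""
    # safe="" makes quote() encode "/" as well, which it leaves alone by default.
    return quote(tag, safe="")


def url_segment_to_tag(segment):
    """Reverse of tag_to_url_segment."""
    return unquote(segment)


for tag in ["literature/19th century", "D&D"]:
    segment = tag_to_url_segment(tag)
    print(tag, "->", segment, "->", url_segment_to_tag(segment))
```

Encoding alone can still go wrong if something between the browser and the application decodes the segment early, which is one argument for the numeric-ID approach suggested above.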
20/38: And series names. Twould be lovely.
It's funny members think every combination of X million tags—X million to the fifth power?—will be pre-generated. There are more potential tagmashes than atoms in the universe, people!
>40.
If there are x tags, there should be (x x) + (x x-1) + ... + (x 1) different tagmashes, of course (for x > 1),
where (y z) represents a combination, such that:
(y z) = y! / (z! * (y - z)!)
I'd put it into summation notation, but it's scary enough as it is.
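For the curious, the sum can be checked numerically. Reading every term as a combination (so the term for all x tags at once is (x x) = 1), the total over subset sizes 1 through x comes to 2^x - 1. The sketch below assumes plain unordered tag sets, ignoring exclusions like -tag and --tag and any cap on tags per mash; `count_tagmashes` is a made-up name.

```python
from math import comb


def count_tagmashes(x):
    """Number of non-empty unordered sets of tags drawn from x distinct tags."""
    return sum(comb(x, k) for k in range(1, x + 1))


# The sum of combinations collapses to 2**x - 1 for every x tried here.
for x in range(1, 11):
    assert count_tagmashes(x) == 2 ** x - 1

print(count_tagmashes(10))
```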
I don't think the system allows a single mash to have one thousand tags, though :)
It would be possible if you wanted to do it. Dumping all the tags with too little usage would probably leave you with say 100,000 tags, which can be compared pairwise fairly easily. Then three-tag tagmashes, each built from the three pairs among its three tags, can be assembled pretty easily, and if we want four-tag tagmashes, they can be assembled from the four three-tag tagmashes they contain pretty easily. It wouldn't be cheap, it wouldn't be worth it, but it's not silly to think it could be done.
But 100,000 four ways would be 100,000 to the fourth power. That's 100,000,000,000,000,000,000 possibilities. That's one hundred quintillion. According to somewhere online, the number of atoms in the universe has around 80 zeroes. Still, one hundred quintillion is a lot.
#44 - So, is the issue processor power or disc space?
But if you follow the algorithm I gave, you never do 100,000 four ways.
If you're looking at these tagmash overlaps, if a, b, c, and d are tags, you should only look at creating an a,b,c,d tagmash if tagmashes (a, b, c), (a, b, d), (a, c, d) and (b, c, d) were all interesting. You never evaluate any larger tagmash containing Esperanto literature, French literature because that tagmash had one element, and you can probably ignore any tagmash containing medieval literature, science fiction (18 results). The number of two-tag tagmashes that have reasonable overlaps is not anywhere near 100,000^2.
I suspect you could create every non-empty tagmash, not reasonably, but certainly within a "Tim Spalding has the brain-fever and is willing to run LT into the ground computing this function" budget. A lot of the stuff you do couldn't be done with naive functions.
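The level-by-level idea described above can be sketched like this. It is a toy version under stated assumptions: "interesting" is stood in for by a minimum book count (`min_books`), the function name and data layout are invented, and the candidate step is a naive pairwise loop that a real system would need to replace with something smarter.

```python
from itertools import combinations


def grow_tagmashes(book_tags, min_books=2, max_size=4):
    """Build tagmashes level by level, pruning anything with a weak sub-mash.

    book_tags: dict mapping book id -> set of tags.
    Returns a dict mapping frozenset of tags -> number of books with all of them.
    """
    # Level 1: single tags used on at least min_books books.
    counts = {}
    for tags in book_tags.values():
        for tag in tags:
            key = frozenset([tag])
            counts[key] = counts.get(key, 0) + 1
    level = {mash for mash, n in counts.items() if n >= min_books}
    result = {mash: counts[mash] for mash in level}

    size = 1
    while level and size < max_size:
        size += 1
        # Candidates are unions of two kept mashes that add exactly one tag.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Keep a candidate only if every (size - 1)-tag subset was itself kept.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, size - 1))
        }
        # Count books only for the surviving candidates.
        counts = {
            c: sum(1 for tags in book_tags.values() if c <= tags)
            for c in candidates
        }
        level = {c for c, n in counts.items() if n >= min_books}
        result.update({c: counts[c] for c in level})
    return result


books = {
    "b1": {"a", "b", "c"},
    "b2": {"a", "b", "c"},
    "b3": {"a", "b"},
    "b4": {"d"},
}
for mash, n in sorted(grow_tagmashes(books).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(", ".join(sorted(mash)), f"({n})")
```

The point of the pruning is that a four-tag mash is never even counted unless all four of its three-tag sub-mashes survived, so the work done tracks the number of mashes that actually have books, not 100,000 to the fourth.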
>46
The problem would be "read," "unread," "fiction," "nonfiction," etc.
47: Then exclude those 4. :)
If he also excludes etc. he is done :-)
Except for the complaints about lack of content. :)
It's remotely more feasible to create every c(x, 2) tagmash, since, given 100,000 tags, that'd be like 4,999,950,000 combinations.
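A quick check of the figures quoted in this exchange, using Python's `math.comb`. The 100,000-tag figure is the thread's own hypothetical; the variable names are just for illustration.

```python
from math import comb

tags = 100_000

pairs = comb(tags, 2)            # unordered two-tag mashes
ordered_four = tags ** 4         # "100,000 to the fourth power"
unordered_four = comb(tags, 4)   # four-tag mashes when order doesn't matter

assert pairs == 4_999_950_000
assert ordered_four == 10 ** 20
assert unordered_four < ordered_four

print(f"{pairs:,}")
print(f"{ordered_four:,}")
```

Since a tagmash doesn't care about tag order, the four-tag count is smaller than the fourth power (roughly by a factor of 4! = 24), though still far too large to pre-generate blindly.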
>47
I'm just trying to work out why on earth someone created the tagmash reread, unread, and what if anything to deduce from the fact that it's fairly high on my list of overlaps?
I also wonder about some of the redundant tagmashes that people have created: like "glbt, lgbt, queer" and "England, fiction, sex" - obviously the difference between AND and OR isn't universally understood.
It's nice to see that British authors hold three of the top ten places in the tagmash "German, satire", anyway...
52>
Oh, "reread, unread" sounds very interesting, actually -- books that some people adore, and others just can't seem to get to?
I can see how "glbt, lgbt, queer" is likely to be fairly redundant, though "lgbt, --glbt" and vice versa might be interesting, but how is "England, fiction, sex" redundant? Is this a "No sex please, we're British"-style swipe that I'm missing?
> England, fiction, sex
Yeah, the England, good cooking tagmash too
Nonsense, there's lots of English *fiction* about sex. Consider Fanny Hill, Tess of the D'Urbervilles...
How about sex, --friction?
Message edited by its author, Sep 10, 2009, 2:08pm.
Did you mean to put that r in there? It does bring up interesting images...
Talk about someone being rubbed the wrong way...or the right way...
Wooooaaaah. Innocently wander into the thread, and *wide-eyed*.
Oops - I seem to have started something...
One oddity I noticed: between all the tagmashes on my Tagmash Overlap page, there is one single tag: "english fiction" (no comma). Is the system listing it as though it were a tagmash because it happens to contain two words in alphabetical order?
> 60
No, because my tagmash page has one singleton tag as well: comedy
(edit) found two more: action & fantasy fiction
There are some odd tagmashes that I did not expect:
dwarves, non-fiction (I would have expected this to be a very small set)
Message edited by its author, Sep 11, 2009, 9:37am.
Yes, I'm not sure how the singletons are getting there. May be an error left over from long ago.
>61.
> a very small set
Pun unintended?
This may be due to some on-going tag regeneration, but I'm getting quite a few empty tagmashes (there's a number in parentheses, but no tags, http://www.librarything.com/profile/romu...).
For example here's my first row:
epic fantasy, high fantasy, magic (61)
awesome, science fiction (19)
(8)
(24)
I would've expected tags in front of (8) and (24)
Oooh, nice catch. I'll look into it.
And I'm still getting singleton tags. For example, science fiction, and another is british authors. Hm, could it be because they are two word tags?
ETA: I also spotted this pair:
YA, good vs. evil (27)
good vs. evil, ya (27)
Is this because the mashes are still being generated?
Message edited by its author, Sep 14, 2009, 4:54pm.
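The "YA, good vs. evil" / "good vs. evil, ya" pair looks like the same mash stored under two spellings. One way to collapse such pairs is to reduce every mash to a canonical key, sketched below. The function name is invented, and it assumes tags are case-insensitive and that order doesn't matter; it does not handle -tag or --tag exclusions.

```python
def tagmash_key(mash):
    """Canonical form of a comma-separated tagmash: lowercase, deduplicated, sorted."""
    tags = [t.strip().lower() for t in mash.split(",")]
    # Drop empty pieces so stray commas or blank input don't create phantom tags.
    return ", ".join(sorted({t for t in tags if t}))


a = tagmash_key("YA, good vs. evil")
b = tagmash_key("good vs. evil, ya")
print(a)
print(a == b)
```

A key that comes out empty would also flag the blank rows reported earlier in the thread, where a count appears with no tags in front of it.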
> Is this because the mashes are still being generated?
Yeah. Starting with the a and going to the z... :)
I'm seeing #67 pretty heavily. As for singleton tags, I'm getting mystery and fantasy, so it's not just two word tags.
I'm also seeing D&D, RPG, fantasy (33), and I'm wondering how anyone ever generated that, since using any alias for D&D gets translated to D&D and then brings up the tagmash for D.