Keeping track of on/offs without index-only tables?

May 4, 2011, 9:50pm

I just posted a question on Stack Overflow.

Help LibraryThing on Stack Overflow: Keeping track of on/offs without index-only tables?

The case is whether or not a book has been indexed. The indexed byte is currently stored in the main book table. The UPDATEs are slow, so indexing across a group is likely to send us into replication arrears.

We are helped by overall timestamps for the last time a user was indexed and the last time any change was made. But if the changed timestamp is later than the indexed timestamp, the indexer must descend to that user's books and then update them.

Sep 11, 2011, 10:23pm

Not enough information, so I am just guessing, but maybe you can use a Bloom filter? It is a probabilistic structure. Basically, you use multiple hash functions on an item to create N different indices into the Bloom filter bitmap. If at least one of the N bits is clear, the item is definitely *not* a member of the set. If all N are set, this could be a false positive, and you must look up the definitive version (which is usually slower to get to). So in your case you can still keep the indexed byte in the main book table, but you only need to hit it when the filter has a positive hit. Check out the Wikipedia article.
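
To make the Bloom filter idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration, not LibraryThing's code: the `BloomFilter` class, the `needs_indexing` helper, the choice of salted SHA-256 as the N hash functions, the bitmap size, and the plain dict standing in for the indexed byte in the main book table. It also assumes the filter holds the ids of books that still need indexing.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a fixed bitmap plus N salted hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive N bit indices by salting one hash function with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely not in the set"; True means "maybe".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def needs_indexing(book_id, unindexed_filter, indexed_flags):
    """Consult the slow, definitive flag only on a positive filter hit.

    indexed_flags is a hypothetical stand-in for the indexed byte
    in the main book table (book_id -> True if already indexed).
    """
    if not unindexed_filter.might_contain(book_id):
        return False  # definitely not waiting to be indexed
    return not indexed_flags[book_id]  # possible false positive: verify
```

One caveat with this sketch: a plain Bloom filter cannot remove items, so as books get indexed the filter would return more and more stale positives until it is rebuilt.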

Sep 12, 2011, 1:15am

You index all the books belonging to a particular user at the same time, right? Store a number for each book, and one for the user: when the book's number is greater than the user's, the book needs indexing. After indexing, you only need one update on the user table to flag all of that user's books as indexed.
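
A rough sketch of the number-per-book idea, in Python with in-memory objects standing in for the tables. The names (`UserRow`, `record_change`, `index_user`) are hypothetical, and the way the numbers get assigned (a per-user counter bumped on every change) is one assumed way to do it, not something stated in the thread.

```python
from dataclasses import dataclass, field


@dataclass
class UserRow:
    change_counter: int = 0    # bumped on every change to any of the user's books
    indexed_through: int = 0   # the user's number: changes up to here are indexed
    book_versions: dict = field(default_factory=dict)  # book_id -> book's number


def record_change(user, book_id):
    """Stamp the changed book with the next number for this user."""
    user.change_counter += 1
    user.book_versions[book_id] = user.change_counter


def books_needing_index(user):
    """A book needs indexing when its number exceeds the user's."""
    return sorted(
        book_id
        for book_id, version in user.book_versions.items()
        if version > user.indexed_through
    )


def index_user(user):
    """Index the pending books, then flag them all with one user-row write."""
    through = user.change_counter      # snapshot before indexing starts
    pending = books_needing_index(user)
    # ... the actual indexing of the pending books would happen here ...
    user.indexed_through = through     # the single update on the user table
    return pending
```

The point of the design is that finishing an indexing pass writes one user row instead of one row per book; the book rows are only written when a book actually changes.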