An immodest proposal, or the first target

Build the Open Shelves Classification

Edited: Jul 16, 2008, 3:28pm

Correct me if I am wrong, but we are basically trying to map out all of human knowledge with the OSC so we can arrange the books about that knowledge in an organized system. We want a system where a patron can go to a specific location in the library and find all the books that primarily deal with the subject of interest clustered together. If a patron gives a librarian a call sheet with a half dozen books on approximately the same subject, we want the librarian not to have to travel the entire stacks to gather the books. Finding books that only deal with the subject as a tangent will require using ‘the card catalog’, something only those of us in the geriatric set might recognize. Searching the card catalog is not a concern of the OSC, only the book order on the physical shelves.

When I go to a library to browse, I locate the 940s and start examining the titles. It works fine for me in all the small county libraries I am most familiar with, but I imagine it would be a fruitless strategy in the library at the D-Day Museum in New Orleans, where every book they have would fall into that category. Obviously some libraries need to drill deeper into classifications than others.

Somewhere in another thread it was mentioned that only 8 characters fit across the spine of most books, and Tim mentioned that he wants groups to be all numerals or all letters, not a mixture. So, we want to keep any group to 8 characters or fewer, either all letters or all numerals.

If we use all 26 letters and a maximum group length of 8, we come out with a maximum of 208,827,064,576 (26^8) categories in an 8-letter group. The first position represents the top-level categories; the second position, the 26 ‘children’ of each top-level category. Each further position would be a generation of 26 sub-categories of the category to its left. For example, (ABCDEFGH) would give us 26 top categories, the A position, each divided into 26 sub-categories, the B position, and so on until we have zeroed in on our topic. I am either explaining badly or too much.
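For anyone who wants to check the arithmetic, here is a rough Python sketch of the idea above. The function names are made up for illustration and are not part of any OSC proposal; it only assumes one letter per level and a maximum of 8 levels.

```python
import string

ALPHABET = string.ascii_uppercase  # 26 symbols available at each position
MAX_LEN = 8                        # characters that fit across a spine label


def capacity(symbols: int, length: int) -> int:
    """Number of distinct codes of exactly `length` positions."""
    return symbols ** length


def ancestors(code: str) -> list[str]:
    """List every level of the hierarchy, from top category down to the code itself."""
    return [code[:i] for i in range(1, len(code) + 1)]


print(capacity(len(ALPHABET), MAX_LEN))  # 208827064576
print(ancestors("ABCD"))                 # ['A', 'AB', 'ABC', 'ABCD']
```

Each prefix of a code is itself a category, which is what lets a small library stop at two or three letters while a large one keeps going.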

For expandability we can do several things. We could start with fewer than 26 top categories. We could decide to use one letter, say X, as a placeholder, and then AXCDEFGH would have 25 new paths that could be assigned under each top category. If it is possible to use upper and lower case, then the total divisions per group would be 52^8, and every lower case letter could be reserved for expansion.

A library would not need to use all of the possible divisions. For example, in fiction, the only category I feel at all qualified to talk about, AXCD.whatever.whatever could be deciphered as A=Arts, X=open for expansion, C=written works, D=novels. Libraries could drill down farther if they wanted: 5th position = genre, 6th position = sub genre, 7th = sub-sub genre, 8th = sub-sub-sub genre.

Also, a specialty library could drop the top levels from their labels. If it took 4 levels to get to ‘World War II’, the D-Day Museum could drop those and print out only the EFGH positions that drill down to whatever detail they want.
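Dropping the implied levels is just trimming a known prefix. A tiny Python sketch, with a hypothetical function name and the placeholder code ABCDEFGH from above standing in for a real class:

```python
def local_label(full_code: str, implied_prefix: str) -> str:
    """Return the part of a class code a specialty library would actually print."""
    if not full_code.startswith(implied_prefix):
        raise ValueError(f"{full_code!r} is outside the library's specialty {implied_prefix!r}")
    return full_code[len(implied_prefix):]


print(local_label("ABCDEFGH", "ABCD"))  # EFGH
```

Raising an error for books outside the specialty is one choice among several; a library might instead print the full code for those.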

To assign a title to the class I would propose author name and year of original publication. I modified this from a suggestion made in the ‘A proposed outline for fiction’ thread and posted it there first. It may only be workable for fiction.




on the spine.

The (AXCD) is just the classification from above. The second group is the first 6 letters of the main author’s last name followed by their first and second initials. The third group is the year the book was first published, with two zeros at the end to accommodate ambitious writers like Walter Mosley who publish multiple titles in the same year. It is simple, it helps to keep all the ‘Smiths’ separate, and it puts the writer’s books in order of publication.
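Here is a minimal Python sketch of how the three groups described above could be assembled. The function names, the "-" filler for short surnames and missing initials, and the sample year are all my own assumptions for illustration, not part of the proposal.

```python
FILLER = "-"  # assumed padding character; the proposal does not name one


def author_group(last: str, first: str, middle: str = "") -> str:
    """Six letters of the surname (padded if shorter) plus first and middle initial."""
    letters = "".join(ch for ch in last.upper() if ch.isalpha())
    surname = letters[:6].ljust(6, FILLER)
    first_initial = first[:1].upper() or FILLER
    middle_initial = middle[:1].upper() or FILLER
    return surname + first_initial + middle_initial


def year_group(year: int, sequence: int = 0) -> str:
    """Year of first publication plus a two-digit counter for same-year titles."""
    if not 0 <= sequence <= 99:
        raise ValueError("sequence must fit in two digits")
    return f"{year:04d}{sequence:02d}"


def call_number(class_code: str, last: str, first: str, middle: str,
                year: int, sequence: int = 0) -> list[str]:
    """Return the three label lines: class, author, year+sequence."""
    return [class_code, author_group(last, first, middle), year_group(year, sequence)]


# Hypothetical book: a novel by Sara Anne Smith first published in 2008.
print(call_number("AXCD", "Smith", "Sara", "Anne", 2008))
# ['AXCD', 'SMITH-SA', '200800']
```

Every group stays within 8 characters, and a second title by the same author in the same year would simply get sequence 1 ("200801").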

Let loose the wolves. I have no expectations that this will survive long but it is a concrete proposal whose shortcomings can be examined and revised. The real work is in designing the “trees of knowledge”, the categories and their divisions to be inclusive and expandable.

DISCLAIMER: I don’t even know for sure that the D-Day Museum has a library. It is an assumption and the examples are just examples.

Jul 17, 2008, 4:11am

Overall, yes.

For non-fiction, I would generally want the year before the author's name instead of after. That's just a modularity issue.

Your author notation could put Barbara Kingsolver before Karen Kingsbury, depending on how names are truncated. This should be approached with caution. If we end up having to examine the book covers to establish shelving location, which is the issue being addressed, then we are just adding complexity without gain.

Jul 17, 2008, 12:13pm

I agree with pwaak that if you want to include author's names in the classification code, then you need to do a little more work there to keep them alphabetical.

The bigger problem I see is that 52 subcategories is going to be insufficient in many cases. For a good example, 52 categories barely includes every state in the US and then what are we doing about DC, Puerto Rico, Guam, and the other territories? I think we need at least three characters (letters or numbers and probably more like four or five for numbers) per sub-category and we'd be much safer to not have any limits on the number of characters per category. We could cap each category at 8 characters for spine label convenience, but keep in mind that is also an artificial construct. I don't know that there is anything particularly ideal about eight characters of width. It's not that tough to create some new labels, if necessary.

Also, the I,i,l similarities would probably disallow the use of those characters for clarity's sake, even in a strictly alphabetical system.

Jul 17, 2008, 3:30pm

>3 SatansParakeet:

The problem with creating new labels is that the labels are the size that they currently are because they have to fit on the spine of the book. Already there are some things that can't fit the label on the spine (children's books are a prime example) and it causes headaches for librarians when they shelve and retrieve things.

So the 8 characters per line limit is just a physical reality that we'll have to deal with for this system.

Jul 17, 2008, 10:37pm

>4 Tricoteuse:

That's part of what I'm arguing against, though. Eight characters may be a good guideline, but one character would be better for children's books and in most libraries the first three or four characters are sufficient to get you to the general area and then you just look at the cover where the rest of the label is if it's a thin book. There's nothing magical about eight characters, it just happens to work fairly well.

Jul 17, 2008, 11:04pm

I see now, I thought you were arguing the opposite, that they should be longer with bigger labels.

You're right, there's no reason to use 8 in particular, and most call numbers now don't have a full 8 characters on any given line, because they're split at the spaces/punctuation for better readability.

Jul 18, 2008, 9:46am

Putting the year before the author’s name in non-fiction would keep the newest information to the right, which would make it easier to find the newest information on the subject. I am not even sure we would need to use the author’s name with non-fiction, but I could not think of what else to use. It might be possible for the individual institutions to decide whether the name group or the year group comes first and is the primary sort for the individual titles.

Anytime you have to truncate a name you can get into problems. My thought was to use six letters from the last name (with fillers to pad shorter names out to six), then the first initial and the second initial. With the 8 letter limit per line it is the best sorting we can do unless we wrap to the next line in mid name. It would keep KINGSBury before KINGSOlver, but it will fail to correctly alphabetize some names. I don’t think we can achieve perfection with the physical limits the real world puts on us. Using 7 letters of the last name is less workable. Short names like Smith abound, and we would lose the middle initial that helps keep Sara Smith from being mixed in with Sara Anne Smith. If we dump the fillers that keep the first initial in the same position, we get Albert Smithson coming before Thomas Smith: SMITHSOa and SMITHt.
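The effect of the fillers is easy to see by sorting a few keys. This is only a Python sketch under my own assumptions: the filler is "-" (chosen because it sorts before the letters in a plain character sort), keys are upper case throughout, and the Roberts/Robertson pair is invented to show a case the scheme gets wrong.

```python
FILLER = "-"  # assumed filler; sorts before A-Z in a plain character comparison


def padded_key(last: str, first: str, middle: str = "") -> str:
    """Six surname letters padded with fillers, then first and middle initial."""
    surname = last.upper()[:6].ljust(6, FILLER)
    return surname + (first[:1].upper() or FILLER) + (middle[:1].upper() or FILLER)


def unpadded_key(last: str, first: str) -> str:
    """Up to seven surname letters with no filler, then the first initial."""
    return last.upper()[:7] + first[:1].upper()


smiths = [("Smithson", "Albert"), ("Smith", "Thomas")]
print(sorted(padded_key(l, f) for l, f in smiths))    # ['SMITH-T-', 'SMITHSA-']
print(sorted(unpadded_key(l, f) for l, f in smiths))  # ['SMITHSOA', 'SMITHT']

# Kingsbury still lands before Kingsolver with six letters.
print(padded_key("Kingsbury", "Karen") < padded_key("Kingsolver", "Barbara"))  # True

# An invented pair the padded scheme misorders: both surnames truncate to
# ROBERT, so the first initial decides and Robertson lands before Roberts.
print(padded_key("Robertson", "Amy") < padded_key("Roberts", "Zed"))  # True
```

With the filler, Smith files ahead of Smithson as it should; without it, the initial slides into the surname columns and the order flips.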

If we drop the small i and l from our category scheme, we still have over 39 trillion categories we could eventually assign. You presented the solution to the problem of having only 50 sub categories for each value at each level in your question: use the next two levels. If XXX = United States, then XXXAA = Alabama, XXXAB = Alaska, and so on to XXXB? for Guam.
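A quick Python check of those numbers, plus a sketch of spending two positions on one division. The symbol ordering (capitals first, then lower case) and the helper name are assumptions of mine, and XXX is only the stand-in prefix used above.

```python
import string
from itertools import product

# 26 capitals plus 24 lower case letters (small i and l dropped) = 50 symbols.
SYMBOLS = string.ascii_uppercase + "".join(
    c for c in string.ascii_lowercase if c not in "il"
)

print(len(SYMBOLS))       # 50
print(len(SYMBOLS) ** 8)  # 39062500000000, a little over 39 trillion
print(len(SYMBOLS) ** 2)  # 2500 slots when two positions serve one division


def two_level_suffixes():
    """Yield AA, AB, AC, ... in symbol order."""
    for a, b in product(SYMBOLS, repeat=2):
        yield a + b


states = ["Alabama", "Alaska", "Arizona"]  # first few, in alphabetical order
codes = {name: "XXX" + suffix for name, suffix in zip(states, two_level_suffixes())}
print(codes)  # {'Alabama': 'XXXAA', 'Alaska': 'XXXAB', 'Arizona': 'XXXAC'}
```

Two positions give 2,500 slots under one parent, far more than the states and territories need, at the cost of two of the eight available levels.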

I am not sure that the best way to represent the states would be as sub categories of the United States. Perhaps ‘Governments of the North of the New Hemisphere’ would be a category that would allow Texas to be presented as an independent nation that came from Mexico and then joined the US, and would cover any status changes in the future. The Cherokee Nation and the French Government of Louisiana could all have representation. I am in no way qualified to design the hierarchy. But I think 26 well-designed top categories would allow most libraries to get by with using only half of the available levels of division. Specialty libraries could drop the top levels that would be implied by their specialty.

Jul 18, 2008, 10:06am

Is a completely identified sort order a requirement of this system? If you could get all the J Smiths together, it shouldn't be that hard to find the exact title that you're looking for even if the Jane Smiths are intermingled with the Joe Smiths. If it needs to be perfect, then it seems that Cutterizing is the only method that will work within the constraints of the label.

Jul 18, 2008, 10:55am

Not at all. In non-fiction I think most of the sorting would be done in the classifications; in fiction, where there is a large number of titles, I expect they will be broken out and shelved by genre, which will help eliminate the mixing of names. If the system were ever adopted, I expect that publishers would push authors to take a middle initial that would set them off from the rest.

Jul 18, 2008, 12:04pm

Theoretically, yes you could just get things into the correct general area and let people figure it out from there, but I suspect that it would be very frustrating to both librarians and patrons.

The advantage to having a completely identified sort order is that it makes retrieving known items much easier - the few seconds saved by being able to go directly to the desired item without having to look through a group of items might not seem like much individually, but cumulatively it saves a lot of time. If it were strictly a browsing collection then it'd be less of an issue, but the call number has to serve as a location device as well.

Jul 18, 2008, 2:07pm

Now I am starting to get confused.

Yes, a completely identified sort order is a requirement of this system. It is the entire idea.

Having the authors in exact alphabetical order is not critical in my opinion. The only way we could guarantee that would be to include the full last, first and middle name in every call number. Not practical and not really possible. Six letters of the last name with first and middle initial is as good as I can get it given the physical constraints (width of the average book spine). I think most libraries could get by with a shorter name field and still have the collection, in each category, mostly alphabetical. If I were better with Excel and had more time to spend uninterrupted in front of the computer, I would try to run some tests on the books and writers in my library.

Jul 18, 2008, 4:46pm

If I understand you correctly, you just contradicted yourself. You want a completely identified sort order, which to me means that every unique book would always be put next to the same books (given that they aren't checked out). Then you say that not having the authors in exact alphabetical order is not crucial.

Or do you mean that they will always be in the same order, just not necessarily alphabetical by author? But how do you mean to keep authors whose names truncate to the same thing separate? I believe you were adding the year of publication and a two digit number in case the author wrote more than one book in a year. In that case, it looks like you will need to increment that number when different authors truncate to the same name. Will it be important that this number be the same across libraries or not? If so, then you will have to have a central authority assigning them.

Jul 18, 2008, 6:34pm

Why do fiction spine labels need to have the author on them? Books can be easily ordered by the information on the book. There's no reason, other than laziness of shelvers, for Jane Smiths and Joe Smiths to be intermingled (barring of course the occasional mis-shelvings that happen due to normal usage).

Note: I'm ignoring for the moment, book order within an author, series and genres.

Jul 18, 2008, 8:19pm

The covers of fiction books, and especially the spines, can be very creative. It is sometimes difficult to read the author's artistically rendered name. Sometimes the author's name is not on the spine at all. Remember that book covers are not designed for libraries. They are made for book stores and, ideally, face-out display. Covers are designed to coax people into spending extra time looking at the book. Only made-for-series books give reliably careful attention to spine information.

Not to mention books that have a person's name as the title. Which name wrote the book? When we shelve 200 books an hour, we don't have time to analyze each book the way a shopper can. So yes, spine labels on fiction are for the shelvers. But it is not laziness. How much are you personally willing to raise your taxes to pay people to spend more time looking at book covers?

Jul 18, 2008, 9:00pm

>13 trollsdotter: Intuitively, that should be so but one of the main reasons for all these seemingly fiddly practices in library land is the variation in where author names and titles appear in a book in addition to how they actually appear from one edition to another. In some cases, the names of editors or annotators may appear more prominent than the name of the author of the work. I suppose that it goes against operational efficiency to leave it up to shelvers, who are often students or volunteers, to spend time scanning the material and making decisions about sorting rules while shelving. I've worked in a large public library system as a circ clerk for about three years and while I'm one who would like to expect the best from everyone, there were many times when I had to temper this faith with the fact that some volunteer shelvers just cannot arrange a series of books alphabetically and that's at the plain, straight, a-b-c, 1-2-3 sequence. Now, of course, we can argue that libraries should devote part of their budget to better selection and training of shelvers...

But setting operational issues aside, author names have figured more prominently in fiction call numbers than in non-fiction call numbers (especially in LCC), perhaps because the cataloging community has difficulty assigning content-based classification numbers to fiction on the same level as non-fiction (imaginative content vs. factual content?). So, more formal attributes of a work such as author name and title tend to become the collocating focus for works of fiction. In LC, for example, authors whose works date from after 1960 have their Cuttered names appear as if they were part of the main classification. Authors of American Literature, for example, have specific classification numbers in the PS3551-3576 range. The class number within that range follows an alphabetical sorting of the author's last name, so that PS3551 is for author surnames starting with A, PS3552 is for those starting with B, and so on. Stephen King and Barbara Kingsolver are both classed in PS3561. The first cutter number after this class number is derived by dropping the first letter of the author's last name, so that I is the first character in the first cutter in the call number for King's From a Buick 8 : a novel (PS3561.I483 F76 2002) and for Kingsolver's The poisonwood Bible : a novel (PS3561.I496 P65 1998). These works are differentiated in the second cutter, where the title is used, and in the pub year. For the same author with more than one work in the same pub year, the cutter number for the title serves to distinguish them. For example, King's From a Buick 8 : a novel (PS3561.I483 F76 2002) and Everything's eventual : 14 dark tales (PS3561.I483 E85 2002).
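The letter-to-class step described above is at least easy to mechanize. Here is a Python sketch that covers only that step and the leading letter of the first cutter; the function names are mine, and the numeric part of the cutter (483, 496) comes from Cutter tables that this does not attempt to reproduce.

```python
def ps_class(surname: str) -> str:
    """Map a surname's first letter onto PS3551 (A) through PS3576 (Z)."""
    first = surname.strip().upper()[:1]
    if not ("A" <= first <= "Z"):
        raise ValueError(f"surname must start with a letter A-Z: {surname!r}")
    return f"PS{3551 + ord(first) - ord('A')}"


def first_cutter_letter(surname: str) -> str:
    """The first cutter starts from the surname's second letter."""
    name = surname.strip().upper()
    if len(name) < 2:
        raise ValueError("surname too short to cutter this way")
    return name[1]


for name in ("King", "Kingsolver"):
    print(name, ps_class(name), first_cutter_letter(name))
# King PS3561 I
# Kingsolver PS3561 I
```

Both authors land on PS3561 with a cutter starting in I, matching the call numbers quoted above; everything after that letter still needs the special and standard Cuttering rules.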

I'd go aaargh! if I were a computer trying to generate call numbers based on these special and standard Cuttering rules...

Jul 18, 2008, 9:42pm

Always be in the same order, just not necessarily alphabetical by author. Thank you, I would have gone through four more paragraphs trying to explain that. As I understand it every library assigns call numbers independently. The extra two digits could be the answer. Is it likely that S. no middle initial Smith will publish over 100 books in one year? They would all have to be in one subject category before they were not unique call numbers. Some libraries may force a lower case 'middle initial' or, for names like Smith that are shorter than 6 letters they might insert a unique sixth letter to identify the specific author. There are ways to work around the duplicate author ID. Perhaps someone can come up with a better, simpler, system.

Why indeed. For the librarian re-shelving a lot of books it is important that the information be in a standard location, in a standard format and in a standard typeface. Adjusting to the different type, colors and orientation that publishers use would really slow the work down. Some of the ex-library books I have just use the Dewey classification number and the first two letters of the author's last name. That is enough to get a general alphabetical order. People are used to using names for sorting. It is something familiar and easy for people to work with. Using the year of publication is just because I am compulsive.

Mar 28, 2010, 8:12pm

IMHO we need to keep in mind that we are attempting to develop a system to classify the content of the publications contained in our libraries. Some of these things have very broad content: Ray Bradbury's A Chapbook for Burnt-Out Priests, Rabbis, and Ministers contains poetry, essays (non-fiction on various topics), and fiction (stories, both science fiction and horror). Presumably many categories would be assigned, or what?

Mar 29, 2010, 1:26pm

>17 WPL4312:
Actually, no. The purpose was to develop a shelving system geared to public libraries. It was intended to be a replacement for Dewey.

Not that it really matters, as the project seems to have died on the vine.

Mar 29, 2010, 1:31pm

>18 sqdancer: It does seem that way, doesn't it? The message about how "the first test round has been closed" has been on the work pages for how long?


