author combining: dormant author urls, shadow author urls

Talk: Combiners!


1gangleri
Edited: Sep 30, 2010, 10:28 pm

>9 gangleri: I try to correct the content from Message 0:

Hi!

For a few months I have been puzzled about combining author urls for authors whose names contain letters such as à, á, â, ä, å, æ, ç, è, é, ë, í, ï, ð, ñ, ó, ö, ø, ú, ü, þ or some other characters.

Example:
leacutevyvalensieacu is such an author url; as of today it is not attached to any work.

I found it by searching for "Valensi" (17 matching authors found).

The problem I experienced with "dormant author urls" / "shadow author urls" is their coexistence: if users edit their works, these urls become active again. Combining works may activate them as well. In other words: we have to deal with unpredictable behaviour here.

What could be done? Maybe one idea is to add a link to an active author url in the "Relationships" field together with a unique word such as "topic99615". From time to time members of this group can verify whether there is a chance to combine the involved author urls, using a suitable Common Knowledge search link:

"topic99615" and / or "workaround"

This method is equivalent to "tagging" author urls.

Please note that for many of the listed characters the "dormant author urls" / "shadow author urls" might be at different locations:

examples:
fooübar may be related to:
foobar, foouumlbar, fooubar, foouebar

fooábar may be related to:
foobar, fooaacutebar, fooabar
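
As a rough illustration of the variants listed above, here is a minimal Python sketch that enumerates candidate url forms for a name. It is only a guess at the patterns seen in this thread, not LibraryThing's actual url generation: the function names `char_variants` and `candidate_slugs` are hypothetical, and the four substitution rules (drop the letter, HTML entity name, base letter, German "ae/oe/ue" spelling) are assumptions read off the examples.

```python
import unicodedata
from html.entities import codepoint2name  # HTML 4 entity names, e.g. 0xFC -> "uuml"
from itertools import product

# Assumed German transliterations (seen above as "foouebar").
GERMAN = {
    "\u00e4": "ae",  # ä
    "\u00f6": "oe",  # ö
    "\u00fc": "ue",  # ü
}

def char_variants(ch):
    """Return the assumed ways one character may appear in an author url."""
    if ch.isascii():
        # Plain letters are kept; spaces, commas etc. are dropped.
        return [ch] if ch.isalpha() else [""]
    variants = [""]  # the letter is omitted entirely
    entity = codepoint2name.get(ord(ch))
    if entity:
        variants.append(entity.lower())  # e.g. "uuml", "aacute", "scaron"
    # Base letter: decompose and strip the combining marks.
    base = "".join(c for c in unicodedata.normalize("NFD", ch)
                   if not unicodedata.combining(c))
    if base.isascii() and base.isalpha():
        variants.append(base)
    if ch in GERMAN:
        variants.append(GERMAN[ch])
    return variants

def candidate_slugs(name):
    """Enumerate candidate url forms for a name, without duplicates."""
    name = unicodedata.normalize("NFC", name.lower())
    options = [char_variants(ch) for ch in name]
    slugs = ("".join(combo) for combo in product(*options))
    return list(dict.fromkeys(slugs))  # keeps first-seen order

print(candidate_slugs("fooübar"))  # ['foobar', 'foouumlbar', 'fooubar', 'foouebar']
print(candidate_slugs("fooábar"))  # ['foobar', 'fooaacutebar', 'fooabar']
```

The number of candidates multiplies with every accented letter in the name, so for real names the list is best used as a set of search terms rather than checked by hand.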

What is your opinion? Are there better ideas?

Best regards Reinhardt

P.S. changed / fixed the topic99615 link
P.P.S. It should be fixed now

2Nicole_VanK
Sep 30, 2010, 1:21 pm

My suspicion is that these variants simply occur because some outside source gives the name that way. We can't really solve it, but we can combine them as much as possible.

In this case: your leacutevyvalensieacu equals amadoleacutevyvalens (http://www.librarything.com/author/amadoleacutevyvalens) + a couple of other possibilities already combined into that one.

3gangleri
Sep 30, 2010, 1:23 pm

Hm ... topic99615 (inserted in the "Disambiguation notice" field at leacutevyvalensieacu) does not show up immediately. Is this a cache issue? Will anything ever show up?

4Nicole_VanK
Sep 30, 2010, 1:27 pm

Must have been a caching issue. I see that disambiguation notice now.

5gangleri
Sep 30, 2010, 1:58 pm

Thanks for the feedback! I do not blame anyone and am looking only for some workarounds.

My question is / was whether the idea is nonsense (shmontses) or whether it is useful for something.

If at some point in time the author urls are combined, one can delete the content of either the "Relationships" or the "Disambiguation notice" field, or both.

I have seen dozens of such author urls so far. If there are no objections I would like to add a suitable url in the "Relationships" field and a link to this topic ( topic99615 ) in the "Disambiguation notice" field.

----

One addition relating to characters such as: ā, ă, ĉ, č, ı, ij, ł, ń, ō, ő, ř, ś, ş, ș, š, ţ, ț, ż, ž:

Normally one sees omissions and substitutions:

examples:
foołbar may be related to:
foobar, foolbar

fooijbar may be related to:
foobar, fooijbar ( i+j )

fooőbar may be related to:
foobar, fooobar, fooöbar
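
The omission / substitution pattern for these letters can be sketched with a small lookup table. This is only an illustration in Python: the table `SUBSTITUTIONS` and the function `related_spellings` are hypothetical, and the table holds just the three letters from the examples above. One possible reason these letters behave differently from the first set is that they have no HTML 4 entity name, but that is an assumption, not something confirmed by the site.

```python
# Hand-maintained table: letter -> assumed substitutes (besides dropping it).
SUBSTITUTIONS = {
    "\u0142": ["l"],            # ł -> l
    "\u0133": ["ij"],           # ij ligature -> i + j
    "\u0151": ["o", "\u00f6"],  # ő -> o or ö
}

def related_spellings(name):
    """List spellings where each tabled letter is dropped or substituted."""
    results = [""]
    for ch in name:
        if ch in SUBSTITUTIONS:
            options = ["", *SUBSTITUTIONS[ch]]  # "" means the letter is omitted
        else:
            options = [ch]
        results = [prefix + opt for prefix in results for opt in options]
    return list(dict.fromkeys(results))  # drop duplicates, keep order

print(related_spellings("foo\u0142bar"))  # ['foobar', 'foolbar']
print(related_spellings("foo\u0151bar"))  # ['foobar', 'fooobar', 'fooöbar']
```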

6r.orrison
Sep 30, 2010, 1:58 pm

The solution is just to combine them, but that doesn't work (you get the "This author is not in the system" page). I've reported this bug: http://www.librarything.com/topic/99617

7gangleri
Edited: Sep 30, 2010, 5:07 pm

> 6: Yes. I have been getting "This author is not in the system" for months.
---
I changed / simplified the tagging. "workaround" should be a helpful link as well.

8gangleri
Sep 30, 2010, 6:04 pm

>1 gangleri: For some reason unknown to me the originally posted characters are not displayed properly:
à, á, â, ä, å, æ, ç, è, é, ë, í, ï, ð, ñ, ó, ö, ø, ú, ü, þ

Many other lines are broken as well. :-(

9gangleri
Sep 30, 2010, 6:28 pm

> FYI: http://www.librarything.com/topic/99640

Topic: add meta ... Content-Type content="text/html; charset=utf-8 to site

10gangleri
Edited: Dec 2, 2010, 9:52 am

>0 "topic99615" and / or "workaround" should be useful links now

P.S. 2010.12.02 expanded "topic99615" and "workaround" link with additional parameters "&startNum=0&stepNum=599" - now internationalized / internationalised i.e. without hardcoded "www.librarything.com"

11gangleri
Oct 2, 2010, 11:46 am

note: dormant author urls may also be urls generated from an unusual author name format (given_name surname versus surname, given_name), from misspellings, etc.

example:
georgegilbertgoodwin versus goodwingeorgegilbert
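
To show how the two name formats lead to two different urls, here is a minimal Python sketch. It assumes the url is simply the lowercased name with everything except the letters a-z removed, which matches the example above but is not confirmed site behaviour; `slug` and `order_variants` are hypothetical helper names.

```python
import re

def slug(name):
    """Assumed url form: lowercase, letters a-z only."""
    return re.sub(r"[^a-z]", "", name.lower())

def order_variants(surname, given_names):
    """Return the url forms for both common name orders."""
    return {
        "given_first": slug(f"{given_names} {surname}"),
        "surname_first": slug(f"{surname}, {given_names}"),
    }

print(order_variants("Goodwin", "George Gilbert"))
# {'given_first': 'georgegilbertgoodwin', 'surname_first': 'goodwingeorgegilbert'}
```

Searching for both forms is a cheap way to find a dormant url that differs from the active one only in name order.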

12gangleri
Oct 5, 2010, 9:48 pm

While searching for suitable author urls to be combined with Szymon Laks I searched also for "Lax" .

With this search one may find many misspelled urls which should point to Halldór Laxness .

I think that it does not make much sense to "tag" all dormant misspelled urls with "workaround", especially because many of Halldór's works have a lot of copies.

To understand what I did you may look at the editions of Íslandsklukkan: http://www.librarything.com/work/224413/editions

You will find there also a copy
Islands kloke / alldor Kiljan Laxness (ISBN 8252501508)
which I did find at
http://www.librarything.com/author/alldorkiljanlaxness&norefer=1 "alldor Kiljan Laxness" one of the misspelled author urls from the first search link.

I combined the works directly with undisclosed url syntax. (In other words I did not combine the author urls in a first step and the works in a second step).

I hope that no unpredictable edits will permanently reassign the works to the misspelled author url, and I think that the indication "" should suggest to users that they correct the entry in the author field.

Conclusion: Sometimes (if there is a higher number of works at the main author url) it might be more efficient to combine at work level. However there are also a lot of disadvantages (related to author cloud etc).

13gangleri
Oct 5, 2010, 10:22 pm

However http://www.librarything.com/author/halldoacuterlaxnesss&norefer=1
"Island Schriftsteller Halldór Laxness" is not misspelled. It is the author variant from data source NEBIS. It should be "tagged".

Any new work imported from that source will remain at that author url until it is combined.

14gangleri
Edited: Oct 18, 2010, 12:52 pm

>1 gangleri: š (S with caron) is part of the first set of letters:

examples:
foošbar may be related to:
foobar, fooscaronbar, foosbar
----

re: http://www.librarything.com/topic/82494 search can not identify names which are not Unicode normalized - a forgotten topic
a) there are some caching issues related to search_author (they might be discussed at http://www.librarything.com/topic/100713 ) : I remember that I could neither identify newly added author urls nor could I combine them the same day
b) besides the caching effect I remember that, many days later, I was still not able to find urls where diacritical characters were part of the LT author name; this bug might still be present (see Unicode Characters in the Combining Diacritical Marks Block for such characters; LoC is using them a lot)
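
For readers unfamiliar with the normalization issue behind note b), here is a small Python sketch using the standard `unicodedata` module. It only demonstrates why two visually identical names can fail to match in a plain string comparison; it says nothing about how the site's search is actually implemented, and `same_name` and `has_combining_marks` are hypothetical helper names.

```python
import unicodedata

def same_name(a, b):
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

def has_combining_marks(text):
    """True if text uses characters from the Combining Diacritical Marks block."""
    return any(0x0300 <= ord(ch) <= 0x036F for ch in text)

composed = "B\u00f6ll, Heinrich"     # ö as one precomposed character
decomposed = "Bo\u0308ll, Heinrich"  # o followed by a combining diaeresis

print(composed == decomposed)           # False: the raw strings differ
print(same_name(composed, decomposed))  # True: equal once normalized
print(has_combining_marks(decomposed))  # True
```

A search that compares raw strings would treat these two spellings as different authors, which is consistent with the behaviour described above.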
----
Because of note b) a few lines above, my conclusion was to use the "Canonical name" at all risky occurrences of such author urls. It happened many times that I copied and pasted these names as tags (both authors and others mentioned in a work / book), and as a subsequent bug my tag cloud was broken / showing different spellings for the same target.
However, side effects were involved:
c) helpful side effect: I was able to identify urls for the same author with CK search in "Canonical name" fields (&f=22): Böll, Heinrich
d) hindering / "unwanted" side effect:
http://www.librarything.com/topic/83617 (at Group: Board for Extreme Thing Advances) "spelling author variant not shown at top level spelling"

related topics:
http://www.librarything.com/topic/83298 "« search_author.php » shows incomplete list of author url variants"
http://www.librarything.com/topic/81690 "missing Unicode normalization blocks combining authors" (a ? duplicate of ...)

fixed html