Tag translation : Untranslated and per-member translations


Jun 16, 2012, 12:35am

I've added a number of new ways to look at tag translations.

1. I've added an "untranslated" section, so you can find out what popular tags haven't been translated. It shows everything not translated with more than 1,000 uses, in descending order by count.
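The rule in item 1 (untranslated, more than 1,000 uses, descending by count) can be illustrated with a short sketch. This is not LibraryThing's actual code; the function name, the data shapes and the sample counts are assumptions made up for the example.

```python
from typing import Dict, List, Set, Tuple


def untranslated_tags(
    tag_counts: Dict[str, int],
    translated: Set[str],
    min_uses: int = 1000,
) -> List[Tuple[str, int]]:
    """Return (tag, count) pairs for tags that have no translation and
    more than `min_uses` uses, ordered from most used to least used."""
    rows = [
        (tag, count)
        for tag, count in tag_counts.items()
        if count > min_uses and tag not in translated
    ]
    # Descending order by use count, as described in item 1.
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical sample data, for illustration only.
counts = {"fiction": 5000, "history": 3200, "poetry": 900, "science": 1500}
print(untranslated_tags(counts, translated={"fiction"}))
# prints [('history', 3200), ('science', 1500)]
```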


2. All member names are now clickable, so you can see what else the translator has done.


3. Translations now include an "All languages" list

Jun 16, 2012, 3:51am

Could you add a list of all tags that are translated, or is there a list somewhere of the tags that were automatically translated from Wikipedia?

It would make it easier to find the ones that are wrong and need new translations.

Jun 16, 2012, 10:29am

This list is helpful but some of the tags that show up as translated are actually combined with tags that have a translation. For example on the Spanish list ~africa comes up as a tag, which is combined with Africa, which already has translations in Spanish.

Jun 16, 2012, 11:42am

I can only see one page on the by-member. Clicking on subsequent pages takes me to that page for all users.

Jun 16, 2012, 12:15pm

Also, it would be nice to be able to set a default language. I don't know French but I do know Spanish, yet I have to scroll from the default French option every time.

Jun 16, 2012, 2:01pm

This list is helpful but some of the tags that show up as translated

Do you mean some that show up as untranslated? I'm losing you, maybe.

Also, it would be nice to be able to set a default language. I don't know French but I do know Spanish, yet I have to scroll from the default French option every time.

If you used the Spanish site, that would be the default. On English it defaults to the largest. (Translate more Spanish and that can be the default! ;) ) While I get your need, I think that's the right solution, not a new feature.

Jun 16, 2012, 2:38pm

>6 timspalding: Look at the list of untranslated Spanish tags. One of them is ~africa which is combined and autodirects to Africa, which already has a Spanish translation.

Jun 16, 2012, 3:13pm

6> On English it defaults to the largest.

Largest what? Number of members? That's great at first -- until that language is fairly thoroughly translated. At that point, would we not end up with the language first suggested as needing translation being that which already has the most translations?

May I suggest a weighted ratio between the "size" of the language and the percentage of tags translated..."big" languages would still tend to be listed first, but the order would shift based on how many tags have already been translated.

This is not really all that important, I suppose -- but if you are trying to really push tag translation (which is the feeling that I've been getting from these recent changes) in areas where it would be most useful (toward...what?... helping LTFL fulfill its global destiny?) and most needed, I would think that a weighted ratio would fit the bill...
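One possible reading of the weighted ratio suggested above, as a sketch only: multiply a language's "size" by the share of tags still untranslated. The formula, the function names and the sample numbers are all assumptions; the post does not specify how the weighting should work.

```python
from typing import Dict, List, Tuple


def priority(size: int, translated_fraction: float) -> float:
    """Assumed weighting: language size times the share of tags not yet translated."""
    return size * (1.0 - translated_fraction)


def order_languages(stats: Dict[str, Tuple[int, float]]) -> List[str]:
    """Order language names so that the highest priority comes first."""
    return sorted(stats, key=lambda name: priority(*stats[name]), reverse=True)


# Hypothetical figures: (size, fraction of tags already translated).
stats = {
    "French": (1000, 0.9),   # big, but almost done
    "Spanish": (800, 0.2),   # big, and mostly untranslated
    "Irish": (50, 0.1),      # small, and mostly untranslated
}
print(order_languages(stats))
# prints ['Spanish', 'French', 'Irish']
```

With this weighting a big language still tends to be listed first, but it drops down the list as its translations fill in.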

Jun 16, 2012, 3:17pm

Largest what? Number of members?

Nope, read the parenthetical.

(Translate more Spanish and that can be the default! ;)

He means the largest number of translations.

Edited: Jun 16, 2012, 3:46pm

So the language first suggested as needing translation is that which already has the most translations? That doesn't make sense.

Jun 16, 2012, 4:14pm

#10: The language as suggested as the most likely language to be translated into is the language that is most frequently translated into. In the long run, it may be a poor choice, but in the short run, it seems the best way to make the selector match what people want.

Jun 17, 2012, 12:09pm

Why is it that I'm always the one that makes the first mistake?!

If I made a mistaken entry, I take it that I cannot delete my own entry. I just have to wait until it is voted down. Is that right? :(

Edited: Jun 17, 2012, 12:13pm

I'm sure you're not the first one to make mistakes. I've made some pretty bad ones myself. But I think you're right, no way to retract a proposed translation - just wait till it gets voted down.

Edited: Jun 17, 2012, 12:26pm

How embarrassing! ;)

This is quite fun, by the way!

Jun 17, 2012, 4:31pm

#12: In my experience with mistaken entries--I translated Israel as Israels, and created a new creature upfantoj ('upfantoj?) to translate werewolves instead of lupfantoj--I've just created the correct entry, and downvoted the old one, and the new one won.

Jun 17, 2012, 11:55pm

Is this a bug? I chose some words in Hebrew and they are listed here as in "Spanish". :(

Edited: Jun 18, 2012, 12:00am

I've a small handful of Spanish translations, which are described here as being in French. (Scroll down past the boxes and read-in tags to see one, and please disregard the typo!)

Jun 18, 2012, 1:54am

>16 SqueakyChu:,17

Yeah, the "all languages" seems to be showing every translation as the same language. Mine are all in Irish apparently. Including the German one I accidentally marked as being French :-)

Edited: Oct 16, 2012, 12:32pm

Hi! Sorry for starting some new comments. Recently I have been involved in tag translations, focusing mainly on Russian, Hebrew, Esperanto, Greek and Latin.

re: /helpers_tagtranslations.php?member=327540 and /helpers_tagtranslations.php?member=296963

a) it would be nice to be able to sort the translations by the "Tag" column
b) it would be nice to correct, with some effort, the "Language" column; it shows one language only (this is /topic/143466)
c) it would be nice to indicate if the translations are combined already and /or if voting about tag combinations started
d) source of error indications; to my understanding COMMA, square brackets, QUOTE, the "&"-character and maybe some other characters should not be used in tag translations:
/tag/Anita+Blake now shows
Polish Anita Blake, zabójca wampirów 3 1
Polish Anita Blake (zabójca wampirów) 2 0 gangleri
Polish Anita Blake [zabójca wampirów] 1 1 gangleri
Personally I think that "foo (bar)" is the best format if one would like to relate to Wikipedia article names.

e) "flagging" to facilitate voting
examples: At /tag/Harry+Potter and at /tag/Tigran+Petrosian some translations are using a middle name. Please note that there are two Armenian chess players called "Tigran Petrosian" (see links at /author/petrosiantigran i.e. Petrosian, Tigran). This is why I personally think that the tag translations should be "literal" and not add additional constraints / specifications.
The example in Hebrew for /tag/Isaac+Leib+Peretz is using the abbreviated form
Hebrew י"ל פרץ (edited because of BiDirectional constraints in /talk page) 2 0 gangleri
The Hebrew character """ indicates the abbreviation; one could probably also use י"ל" פרץ

If tag translations could filter "flagged" translations voting could be done easier / quicker.

f) last but not least /tag/Mendele+Moykher+Sforim relates to a pseudonym. Personally I think that one should consistently use either the pseudonym or the author name.
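Item d) above lists characters that, to the poster's understanding, should not appear in tag translations. A minimal sketch of a check for them follows; it assumes nothing about how LibraryThing validates input, and the names `SUSPECT_CHARS` and `suspect_characters` are hypothetical.

```python
from typing import List

# Characters named in item d): comma, square brackets, quote, ampersand.
SUSPECT_CHARS = {
    ",": "COMMA",
    "[": "LEFT SQUARE BRACKET",
    "]": "RIGHT SQUARE BRACKET",
    '"': "QUOTE",
    "&": "AMPERSAND",
}


def suspect_characters(translation: str) -> List[str]:
    """Return the names of the suspect characters found in a proposed translation."""
    return [name for char, name in SUSPECT_CHARS.items() if char in translation]


print(suspect_characters("Anita Blake, zabójca wampirów"))
# prints ['COMMA']
print(suspect_characters("Anita Blake (zabójca wampirów)"))
# prints []
```

Note that a blanket ban on the quote character would also flag Hebrew abbreviations such as the one in item e), so a real check would probably need per-language exceptions.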

future todos:
Besides Wikipedia one can use both WorldCat and VIAF pages to identify translations. However these pages might use "combining diacritical marks" (CDMs) to represent glyphs / characters; Wikipedia is using some "Unicode character normalization" so that neither the page titles nor the body of the articles contain CDMs. Besides this the Wikipedia code is using precomposed character normalization. Both normalizations facilitate character by character search (if homoglyphs are coded differently, character by character search would only find results where either one or the other coding is used).
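The difference between a combining-mark spelling and a precomposed spelling can be shown with Python's standard `unicodedata` module. This is only an illustration of Unicode normalization form NFC; whether Wikipedia or LibraryThing apply exactly this form is an assumption, and the helper name `to_precomposed` is made up.

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Normalize text to NFC, which prefers precomposed characters where they exist."""
    return unicodedata.normalize("NFC", text)


# "Erdős" written with 'o' followed by U+030B COMBINING DOUBLE ACUTE ACCENT ...
decomposed = "Erdo\u030bs"
# ... and with the single precomposed character U+0151.
precomposed = "Erd\u0151s"

print(decomposed == precomposed)                  # prints False
print(len(decomposed), len(precomposed))          # prints 6 5
print(to_precomposed(decomposed) == precomposed)  # prints True
```

The two strings look identical on screen, but a plain character by character comparison only matches after both sides are normalized the same way.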

Note: This documentation is made using "[" for "[" and "]" for "]".

Oct 16, 2012, 12:59pm

>19 gangleri: continued

g) Please see /helpers_tagtranslations.php?member=530652. Many of the translations relate to translations starting with lower case letters. Only voting / support for the translations is possible.
"Downvoting" for the actual "winner" is not available at the moment.
In order to "downvote" an actual "winner" it is necessary to visit each page as /tag/book.

Oct 16, 2012, 1:04pm

>20 gangleri: continued
g) Please see /helpers_tagtranslations.php?member=number. At such pages it makes sense to link (once) to the profile of the corresponding user.

Oct 16, 2012, 5:09pm

Caution about "copy and paste" of Wikipedia titles (especially for Indic scripts but also for Arabic, Farsi etc):
I noticed it some years ago; the problem may be fixed for certain browsers.

If you copy a "Left To Right" (LTR) text, make sure that the mouse / cursor is after the last character and drag the mouse one cm past the last character before you copy (on Windows, before you use CTRL+C).
For both LTR and "Right To Left" (RTL) scripts it is wise to check the clipboard content for (supposed) Wikipedia titles by testing it in a new tab, in order to ensure that the last punctuation character(s) are in your clipboard.

Oct 17, 2012, 8:54am

Note about lower case translations as for Greek, Cyrillic etc. scripts.
At /tag/schizophrenia you may find a Russian and Greek translation starting with lowercase letters.
How to proceed:
a) start from a known language and identify the article at Wikipedia
b) go to the corresponding article in the language of your choice
c) copy the article name (it starts with an upper case letter)
d) search the page for other occurrences
e) copy an occurrence starting with a lower case letter
f) insert the translation to tag translations
Good luck!
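Steps c) to e) above can be sketched in code: lower-case the first letter of the article title and check whether that form occurs in the article text. The function name and the sample sentence are made up for illustration, and a plain substring test like this can also match inside a longer word.

```python
from typing import Optional


def lowercase_form(title: str, article_text: str) -> Optional[str]:
    """Return the title with a lower-case first letter if that form occurs
    in the article text; otherwise return None."""
    candidate = title[:1].lower() + title[1:]
    if candidate == title:
        # The first character has no separate lower-case form (or is already lower case).
        return None
    return candidate if candidate in article_text else None


# Made-up sample text standing in for the body of a Wikipedia article.
text = "Шизофрения (термин шизофрения) ..."
print(lowercase_form("Шизофрения", text))
# prints шизофрения
```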

Edited: Oct 18, 2012, 1:37am

Yes, I suppose you could do that. But apparently you have more faith in Wikipedia than I do. Personally if I don't know the language I won't make any proposal - and even within those limits I made some mistakes (getting genders wrong and that sort of stuff).

ETA: For example I noticed you proposed translating "Porajmos" to "Zigeunervervolging" in Dutch. I can understand, since Wikipedia uses it that way (and the word "Porajmos" would indeed not be recognized by Dutch speakers - I had to google it). But Zigeunervervolging really means "Persecution of gypsies" generally, not specifically the Nazi campaign, so the translation isn't very good.

Oct 17, 2012, 7:28pm

You're way more comfortable with polysemy than I am. In the common case of a tag having multiple possible meanings, you can't blindly use Wikipedia to translate, even if the tag in practice only has one. Moreover, even in languages you know, like Esperanto, you're freer with translating polysemy; you translated English, even though there's no way to preserve the "of England" / "English language" polysemy in Esperanto.

Edited: Oct 19, 2012, 2:31pm

>24 Nicole_VanK: "But apparently you have more faith in Wikipedia than I do." Maybe, maybe not.
I used to work there on minority language projects and on BiDirectional (BiDi) issues, mainly related to bug reporting on pages where scripts written Right To Left (RTL) and / or Left To Right (LTR) were used.

Taking a look at the tag translations from /tag/Mahatma+Gandhi one can see that it is not simply "copy and paste" of page titles but also usage of "snippets" from the corresponding articles as at
Hebrew: מהאטמה גנדי
and / or proper use of LASTNAME, FIRSTNAME/GIVENNAME information from http://orlabs.oclc.org/identities/lccn-n79-41626 as in
Yiddish: מאהאטמא גאנדי
In Vietnamese one should not use the page title but "Mahātmā Gāndhī". To my knowledge Vietnamese is the language with the most html entities.
In Latin "Magnanimus Gandhi" http://la.wikipedia.org/wiki/Mahatma_Gandhi might be used.
In Turkish "Gandı" (using the character "ı") can not be found at http://tr.wikipedia.org/wiki/Mohandas_Karam%C3%A7and_Gandi

At /tag/John+Forbes+Nash only some translations are taken from Wikipedia titles.

re: mistakes
Personally I think that the translations are an improvement and a support for small LT language communities. Mistakes and errors can be corrected at any time.
This is why "Porajmos", today "Zigeunervervolging" in Dutch, can be voted down / rejected and (a) better translation(s) / transcription(s) can be proposed.
>25 prosfilaes: "In the common case of a tag having multiple possible meanings, you can't blindly use Wikipedia to translate, even if the tag in practice only has one."
To some extent I try to avoid adding patronymic "determinators", mainly used in Russian articles, as for "Tigran+Petrosian".

For me tag translation is a "compilation" (?) from many sources. Example:
http://ratings.fide.com/card.phtml?event=4100638 uses "Sveshnikov, Evgeny".
http://de.wikipedia.org/?curid=701141 uses "Jevgēņijs Svešņikovs!". On the one hand it shows respect for the nationality of each author, and also respect for his language, culture and name.
I remember that some months ago I tried for hours to identify Evgeny in my FF bookmarks and in LT.
http://de.wikipedia.org/?curid=6905850 contains some ideas about a multiproject syntax to identify persons.

Wikipedians agreed to use "passport" names. However this does not apply to Kopernikus, Komenius, ancient Greeks, Romans etc.

Last but not least: In Esperanto one was using for years the capitalization of family names. In Wikipedia this capitalization in page titles has been abolished for better interwiki linking but is still preserved in article texts.

Edited: Oct 18, 2012, 5:38am

Personally I think that the translations are an improvement and a support for small LT language communities. Mistakes and errors can be corrected at any time.

Right. I didn't mean to suggest it's a big deal. No worries.

Oct 18, 2012, 5:37am

>19 gangleri: (continued)

g1) http://il.librarything.com/helpers_tagtranslations.php?member=296963
I wish to see there what tags are translated to Hebrew already and where I need to add a Hebrew translation.
g2) It would be very helpful to disclose the url parameter to select a particular language only.
Example: I would like to add Hebrew, Greek, Latin etc. translations only at tags where I added a Russian translation previously.

Oct 18, 2012, 11:01am

LT has both /tag/Baha'i Faith and /tag/Baha'i. In English Wikipedia the second redirects to the first.
Until now tag translation proposals can be found only at the first one. Half of them are more or less literal translations of either variant.

Edited: Oct 19, 2012, 2:46pm

>24 Nicole_VanK: "But apparently you have more faith in Wikipedia than I do."

I think that sometimes one should consider, besides Wikipedia titles, also published books.
For /tag/%60Abdu%27l-Bah%C3%A1 i.e. Tag: `Abdu'l-Bahá the Esperanto translation is using "Abdul Baha" because of
i.e. Parizaj paroladoj de Abdul Baha

If possible one should use only the (equivalent) characters for glyphs from the extended Latin blocks.
Example: For /tag/Paul+Erd%C5%91s i.e. Tag: Paul Erdős the Turkish translation is Paul Erdös 5 0.
Neither the Hungarian character "ő" nor the "ö" used in Germanic languages is available in Esperanto, Hebrew, Yiddish, etc.

P.S. Is there a tag translator group?

If one is not sure whether forms starting with lower case letters are used, one can use Wikipedia search as in
http://la.wikipedia.org/w/index.php?title=special%3ASearch&fulltext=Search&a... for the Latin translation for /tag/autism.

Oct 19, 2012, 5:28pm

/tag/plague covers both http://en.wikipedia.org/wiki/Plague in German Plage and http://en.wikipedia.org/wiki/Plague_%28disease%29 in German Pest.
What should happen here?

Edited: Oct 19, 2012, 5:33pm

In my opinion you shouldn't try to make the translation more specific than the original.

ETA: That's just opinion though.

Oct 19, 2012, 5:44pm

Fair enough.

Edited: Oct 20, 2012, 8:35am

I noticed that it would be useful to list case by case related tags. Examples:
/tag/New+York and /tag/New+York+City
/tag/Balcans and ???
/tag/plague and ???

P.S. Such case by case related tags could invite users to translate /tag/mother, /tag/mothers (plural / singular forms), /tag/father (antonyms), /tag/parents, /tag/family (hyperonyms and hyponyms) etc.

P.P.S. A good start for translations can be found at http://en.wiktionary.org/wiki/Template:Swadesh_lists see http://en.wikipedia.org/wiki/Swadesh_list .
Maybe Tim / staff should add an item at /helpers_tagtranslations.php.
FYI: http://www.omegawiki.org/Category:Swadesh_lists and http://www.omegawiki.org/Talk:Swadesh_lists

P.P.P.S. Finally there are also Google translations and many pages such as http://browse.dict.cc/greek-english/%CE%B3%CE%BF%CE%BD%CE%B5%CE%AF%CF%82.html as
γονείς {οι}
parents {noun}

Edited: Oct 20, 2012, 9:33am

re: >34 gangleri: Beside /tag/parents there is also /tag/parent which could be translated with Icelandic "foreldri", German "Elternteil", Yiddish "טאטע אדער מאמע" etc.

P.S. but /tag/grandparent&norefer=1 is part of /tag/grandparents.
This is why the translation proposals might be ambiguous and some seem to mean "grandfather".

Oct 20, 2012, 11:22am

>30 gangleri: "P.S. Is there a tag translator group?"

Such a group would be good. One could clarify for instance /tag/Verrat versus /tag/treason and /tag/treachery.

Edited: Oct 21, 2012, 8:30am

A new topic: It would be nice to have tags translated also at pages as
http://ru.librarything.com/tagcloud/gangleri-ro .

P.S. I was looking at a "chess library" at http://ru.librarything.com/profile/ThomasKleinert "Метки" means "Tags". It would be nice to see which tags are already translated and which are not.

The post above would help to identify (missing) translations at http://ru.librarything.com/tagcloud/ThomasKleinert .
Note: This may require to present / generate / have two versions of the page:
a) all links go to the catalogue pages
b) all links go to the tag pages

Oct 22, 2012, 11:50am

In the discussion about capitalization ( /topic/143357 ) there was an example with /tag/smith used both as (family) name and profession.

The translations for /tag/Adler (in German "eagle") might be simpler. Anyhow in German both the bird and the family name are written, like all nouns, with an initial upper case letter.

Oct 24, 2012, 4:12pm

notes about http://il.librarything.com/tag/Eduard+Buchner and http://lat.librarything.com/tag/Eduard+Buchner

Depending on efforts and resources it might be a good idea to present / list translated tags first. This might avoid multiple wrap arounds between Left To Right (LTR) and Right To Left (RTL) scripts.
Maybe the color indication for translations would help to identify what needs to be translated / fixed / voted on and what not.
http://ru.librarything.com/tag/Eduard+Buchner requires many fixes because of the existing COMMA characters in the actual proposals.

Edited: Oct 30, 2012, 2:58pm

Hi! From /tag/Betty+Williams I reached /tag/Williams. Depending where this second tag is used Russian translations / transcriptions might be "Вильямс" or "Уильямс ".

This demonstrates that a "basic model" where only "one to one" relations are used has a lot of constraints.

The same would apply for the Esperanto translation of /tag/Romania where basically two variants can be used /tag/Rumanio and /tag/Rumanujo.

A solution would be to allow multiple "translations and / or transcriptions" per language. It is basically a design issue and later a utilization issue.
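The "one to many" idea can be sketched as a small data structure in which every (tag, language) pair keeps several voted variants instead of a single winner. The class, its methods and the vote threshold are hypothetical and say nothing about LibraryThing's actual schema.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple


class TagTranslations:
    """Store several translation variants per (tag, language), each with a vote score."""

    def __init__(self) -> None:
        self._variants: DefaultDict[Tuple[str, str], Dict[str, int]] = defaultdict(dict)

    def propose(self, tag: str, language: str, text: str) -> None:
        # A new proposal starts with a score of 0; an existing one is left unchanged.
        self._variants[(tag, language)].setdefault(text, 0)

    def vote(self, tag: str, language: str, text: str, delta: int) -> None:
        self.propose(tag, language, text)
        self._variants[(tag, language)][text] += delta

    def accepted(self, tag: str, language: str, min_score: int = 1) -> List[str]:
        """Return every variant whose score reaches min_score, in alphabetical order."""
        variants = self._variants.get((tag, language), {})
        return sorted(text for text, score in variants.items() if score >= min_score)


store = TagTranslations()
store.vote("Romania", "Esperanto", "Rumanio", +2)
store.vote("Romania", "Esperanto", "Rumanujo", +2)
print(store.accepted("Romania", "Esperanto"))
# prints ['Rumanio', 'Rumanujo']
```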

P.S. In Esperanto it is "Ruman...".

Edited: Oct 30, 2012, 3:07pm

Yes, I get it. Similarly "Queens" may refer to female royals, or to some part of New York - no telling what's actually meant. How does one translate that? I agree, the less the context of translation the harder translating.

But as is, tag translation means translating the tag as best you can. I strongly feel we shouldn't even try to improve on it, since we can't possibly know every other possible meaning / usage worldwide.