add meta ... Content-Type content="text/html; charset=utf-8 to site

1gangleri
Edited: Sep 30, 2010, 6:26 pm

Hi! Sorry for this!

It does not make much sense to display random / unpredictable content.

I noticed that my posts at http://www.librarything.com/topic/99615 are displayed in a random fashion.

Message 1 does not display what was originally posted. Message 8 looks fine now but how will it look tomorrow ...

I believe that adding some meta tags to the site
meta http-equiv="Content-Type" content="text/html; charset=utf-8"
could save a lot of time and avoid a lot of trouble.

Best regards Reinhardt

Here is a test:

test:_à ,_á,_â,_ä,_å,_æ,_ç,_è,_é,-ë,_í,_ï,_ð,_ñ,_ó,_ö,_ø,_ú,_ü,_þ

Test:_%C3%A0,_%C3%A1,_%C3%A2,_%C3%A4,_%C3%A5,_%C3%A6,_%C3%A7,_%C3%A8,_%C3%A9,-%C3%AB,_%C3%AD,_%C3%AF,_%C3%B0,_%C3%B1,_%C3%B3,_%C3%B6,_%C3%B8,_%C3%BA,_%C3%BC,_%C3%BE

I used the first line as a title in Wikipedia. Then I copied and edited the resulting URL using Firefox.
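
As an illustration of how the second test line relates to the first, here is a minimal Python sketch. It assumes the escapes are plain percent-encoded UTF-8 bytes, which is what the standard `urllib.parse.quote` function produces by default. The helper name `percent_encode` is made up for this example.

```python
from urllib.parse import quote, unquote


def percent_encode(text: str) -> str:
    """Percent-encode text as UTF-8, leaving commas untouched.

    Hypothetical helper for illustration only. quote() already leaves
    letters, digits, "_", "-", "." and "~" unescaped.
    """
    return quote(text, safe=",")


# A few characters taken from the test line above.
for ch in ["à", "é", "ñ", "þ"]:
    encoded = percent_encode(ch)
    # Each of these characters is two bytes in UTF-8, hence two %XX escapes.
    print(ch, ch.encode("utf-8"), encoded)
    # Decoding the escapes as UTF-8 gives the original character back.
    assert unquote(encoded) == ch
```

If the page that receives such a URL decodes the bytes with a different charset, the round trip no longer gives the original characters back.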

2gangleri
Sep 30, 2010, 6:27 pm

>1 gangleri: The bug might relate to the editing feature:

again:

test:_à,_á,_â,_ä,_å,_æ,_ç,_è,_é,-ë,_í,_ï,_ð,_ñ,_ó,_ö,_ø,_ú,_ü,_þ

Test:_%C3%A0,_%C3%A1,_%C3%A2,_%C3%A4,_%C3%A5,_%C3%A6,_%C3%A7,_%C3%A8,_%C3%A9,-%C3%AB,_%C3%AD,_%C3%AF,_%C3%B0,_%C3%B1,_%C3%B3,_%C3%B6,_%C3%B8,_%C3%BA,_%C3%BC,_%C3%BE

3gangleri
Oct 7, 2010, 11:57 am

Note: this might be a subsequent error or something different.

The two Talk searches differ only in the leading space of the first one:
+leading+space
leading+space

The first shows Unicode Character 'REPLACEMENT CHARACTER's:
gangleri in Bug Collectors : normalization: ignore �� leading / trailing spaces �� �� spaces / whitespace �� in �� Common Knowle (Jul 15, 3:04pm)

The second shows:
gangleri in Bug Collectors : normalization: ignore « leading / trailing spaces » « spaces / whitespace » in « Common Knowle (Jul 15, 3:04pm)

I do not know how the highlighting might be involved because the first search starts with a leading space.

4brightcopy
Edited: Oct 7, 2010, 12:48 pm

Your original two posts were about a bug that Tim fixed a few days ago (I can dig it up if you'd like).

For the new post, I tried your first example and got:

Warning: strpos(): Empty delimiter. in /var/www/html/search_talk.php on line 355

As the first line. That's definitely a bug. You should post this to Bug Collectors as a new post if you haven't already. Make sure you mention that error.

I don't get quite the results you do, but I think that error may be screwing some things up.

5gangleri
Oct 7, 2010, 1:06 pm

>4 brightcopy: Thanks for your time! http://www.librarything.com/topic/100109

I will make some tests to confirm if the original bug is fixed / "works for me".

6gangleri
Edited: Oct 7, 2010, 1:44 pm

>1 gangleri:, >2 gangleri:, >4 brightcopy: :

The characters from the test:_à,_á,_â,_ä,_å,_æ,_ç,_è,_é,-ë,_í,_ï,_ð,_ñ,_ó,_ö,_ø,_ú,_ü,_þ are preserved now when the message is edited again. Thanks Tim for fixing! Thanks brightcopy for your time. It would be nice to have the Bug Tracking number; however there is no hurry.

This editing is fixed / "works for me".

----

However I believe that the request in the title, adding meta ... Content-Type content="text/html; charset=utf-8" to the site, would help avoid a lot of subsequent bugs:

http://www.librarything.com/topic/95972 might relate to such "browser misunderstandings".

Many sites are missing this: at http://www.nndb.com/people/811/000091538/ I can see
Mother: Marija Javer�ek
Son: Mi�o Broz (by Hertha)

7brightcopy
Oct 7, 2010, 2:00 pm

The character fix was part of the html entity fix bug:

http://www.librarything.com/topic/99459

8gangleri
Edited: Oct 7, 2010, 3:00 pm

If I look at the source code of this page I can see only

<meta name="description" content="LibraryThing catalogs your books online, easily, quickly and for free.">
<meta name="keywords" content="librarything, library, thing, catalog your books, catalogue your books, book cataloging, library, free book catalog, catalogue">

On any page from en.Wikipedia I can see:
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" dir="ltr">
....
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta name="generator" content="MediaWiki 1.16wmf4" />
<meta name="robots" content="noindex,follow" />

This is a clear indication. Not everything will / may apply here ...

Oops there are five meta lines at Wikipedia
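
The comparison above can be repeated for any page source with a short script. This is only a sketch using Python's standard `html.parser` module. The names `CharsetFinder` and `declared_charset` are made up for the example, and it looks only at meta tags, not at the HTTP response headers that a server may also use to declare the charset.

```python
from html.parser import HTMLParser
from typing import Optional


class CharsetFinder(HTMLParser):
    """Remembers the first charset declared in a <meta> tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("charset"):
            # Short form: <meta charset="utf-8">
            self.charset = attr["charset"].strip().lower()
        elif attr.get("http-equiv", "").lower() == "content-type":
            # Long form: content="text/html; charset=UTF-8"
            for part in attr.get("content", "").split(";"):
                key, _, value = part.strip().partition("=")
                if key.strip().lower() == "charset" and value.strip():
                    self.charset = value.strip().lower()


def declared_charset(html: str) -> Optional[str]:
    """Return the charset named in the page's meta tags, or None."""
    finder = CharsetFinder()
    finder.feed(html)
    return finder.charset


wikipedia_like = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
librarything_like = '<meta name="keywords" content="librarything, library, thing">'

print(declared_charset(wikipedia_like))
print(declared_charset(librarything_like))
```

A result of `None` only means that no meta tag names a charset. The browser then falls back to the HTTP header or to its own guess.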

9brightcopy
Oct 7, 2010, 2:50 pm

Sorry, I'm talking about the characters that weren't properly escaped on editing. I'm not talking about the charset issue.

10gangleri
Oct 7, 2010, 3:02 pm

>7 brightcopy:, >9 brightcopy: I received a phone call. I noticed later. Thanks

11PaulFoley
Oct 7, 2010, 8:21 pm

http://validator.w3.org finds lots of errors in LT pages...

12gangleri
Oct 8, 2010, 7:52 am

content="text/html; charset=utf-8" would, I assume, also fix a lot of utf-8 issues:
http://www.librarything.com/topic/82757 old bug
http://www.librarything.com/topic/100160 (Bug Tracking)

13PaulFoley
Oct 8, 2010, 10:47 pm

Putting the charset on the content header, while a good idea in itself, wouldn't help at all here. You're already seeing UTF-8 -- that's the problem: the character on the page is not UTF-8. It would need to send "charset=iso-8859-1" if you wanted that to display correctly (but then other characters on the page that are UTF-8 would be screwed). LT seems to have bad data in its language names.
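
The mismatch described here can be reproduced in a few lines of Python. This is a sketch under the assumption that the stored character is a single ISO-8859-1 byte. The character "é" and the helper name `decode_as` are chosen only for illustration and are not taken from the LT data.

```python
def decode_as(raw: bytes, charset: str) -> str:
    """Decode bytes the way a browser would for the given charset,
    substituting U+FFFD for anything that is not valid."""
    return raw.decode(charset, errors="replace")


latin1_bytes = "é".encode("iso-8859-1")  # one byte: 0xE9
utf8_bytes = "é".encode("utf-8")         # two bytes: 0xC3 0xA9

# ISO-8859-1 data read as UTF-8: the lone byte is invalid, so the
# replacement character is shown instead.
print(decode_as(latin1_bytes, "utf-8"))

# The same byte read as ISO-8859-1 displays correctly ...
print(decode_as(latin1_bytes, "iso-8859-1"))

# ... but genuine UTF-8 data read as ISO-8859-1 turns into two wrong characters.
print(decode_as(utf8_bytes, "iso-8859-1"))
```

Whichever charset the page declares, one of the two kinds of data is displayed wrongly, so the stored data itself has to be converted to a single encoding.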

14gangleri
Oct 8, 2010, 11:01 pm

Probably there are more bugs involved. One the table the other ...

16gangleri
Nov 15, 2010, 6:02 am

17gangleri
Edited: Mar 27, 2012, 9:44 am

Originally posted: Nov 22, 2010, 7:21 am

I just found http://www.librarything.com/profile/Perlovka . The talk is illegible ...

2012-03-27: P.S. re: "CharSetUTF-8"

Today, one and a half years later, using another computer and "Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko/20100101 Firefox/11.0", I can see the Russian text properly.