search starting with a leading space may generate Unicode Character 'REPLACEMENT CHARACTER's
Talk Bug Collectors
1gangleri
Hi! This follows up on http://www.librarything.com/topic/99640#2235384 and brightcopy's subsequent message 4.
hints:
a) Highlighting might be involved.
b) The examples relate to characters other than those from the Unicode Basic Latin block.
Regards Reinhardt
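To illustrate hint b), here is a minimal sketch of one way doubled replacement characters can appear. It is only a guess at the mechanism, not a confirmed diagnosis of the LibraryThing search code. The sample string and the choice of « and » as the original characters are assumptions, and decode_bytewise is a hypothetical helper that mimics a byte-oriented step (such as highlighting) handling UTF-8 text one byte at a time.

```python
# Assumption: the original post used two-byte UTF-8 characters such as « and ».
original = "ignore « leading / trailing spaces »"
raw = original.encode("utf-8")


def decode_bytewise(data: bytes) -> str:
    """Decode every byte on its own, as a non-character-aware tool might.

    ASCII bytes survive; each half of a multi-byte sequence is invalid
    by itself and becomes U+FFFD (REPLACEMENT CHARACTER).
    """
    return "".join(bytes([b]).decode("utf-8", errors="replace") for b in data)


garbled = decode_bytewise(raw)
print(garbled)  # each two-byte character shows up as two U+FFFD characters
```

Decoding the whole byte string at once with raw.decode("utf-8") returns the original text, which is why such damage only shows up for characters outside the Basic Latin block.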
2timspalding
Okay, where are you talk-searching from?
3gangleri
>2 timspalding: Start at http://www.librarything.com/search
Go to the "Talk messages" field
Insert a SPACE followed by the string "leading space"
The fourth result I see shows this link:
gangleri in Bug Collectors : normalization: ignore �� leading / trailing spaces �� �� spaces / whitespace �� in �� Common Knowle (Jul 15, 3:04pm)
4gangleri
additional note: The original post relates to the timestamp "(Jul 15, 3:04pm)" linked to http://www.librarything.com/topic/94962#2084845
Dear Tim; I noticed two issues:
a) The example provided at 3> relates to a post where non-ASCII (UTF-8 encoded) characters are used. It could be that not all software components are using the same encoding. Please verify.
b) I noticed it yesterday but did not bookmark that link. Anyhow, it can be reproduced.
b1) Run the search as described at 3>; the text shown for that search result is:
Hi! today looking at may events I realised that the event �� Aktion Brandt �� (see de.Wikipedia - ger) is listed twice: a: at 5297264::Nebel im August b: at 10095304::Psychiatrie im Nationalsozialismus I realised that one has one trailing space and the other two. In the past I ...
This is not the text from http://www.librarything.com/topic/94962#2084845 which renders:
...
a: at Nebel im August
b: at Psychiatrie im Nationalsozialismus
...
I want to point out that the search result shows pieces cut from the raw posted text, regardless of whether they are markup syntax or not.
I do not want to dispute whether this is the way it should be, or whether it is an easy implementation or not. I only ask myself whether users and newbies will understand that they will see something different if they go to the link location.
On the way to this internet café I was thinking that adding a rendering step could improve this.
So the steps would be:
1) You leave the search as it is today;
2) You render a meaningful part of it (this can get complicated because you need to do a syntax analysis)
3) Now you use the "rendered result" as link title (and not what the search has found)
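The three steps above could be sketched roughly as follows. This is only an illustration, not how the LibraryThing search works: render_snippet and WORK_ID_PREFIX are hypothetical names, and the assumption is that the only markup to handle is a numeric work ID followed by "::", as seen in the raw snippet above. Real markup would need a proper parser.

```python
import re

# Hypothetical pattern: a numeric work ID followed by "::" (e.g. "5297264::").
WORK_ID_PREFIX = re.compile(r"\b\d+::\s*")


def render_snippet(raw: str, max_len: int = 80) -> str:
    """Turn the raw text found by the search into a display title."""
    rendered = WORK_ID_PREFIX.sub("", raw)   # step 2: drop the markup residue
    rendered = " ".join(rendered.split())    # collapse runs of whitespace
    if len(rendered) > max_len:              # keep the title short
        rendered = rendered[:max_len].rstrip() + " ..."
    return rendered                          # step 3: use this as the link title


raw_hit = ("a: at 5297264::Nebel im August "
           "b: at 10095304::Psychiatrie im Nationalsozialismus")
title = render_snippet(raw_hit)
print(title)
```

The search itself (step 1) stays untouched; only the text shown as the link title changes.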
----
A simpler example can be found at
search_talk: Diary of a Young Girl
Please look at the result with timestamp "(Sep 29, 2:28am)"
The title directing to http://www.librarything.com/topic/96843#2220229 is
3032251:: Anne FRANK (1929 - 1945 ) The Diary of A Young Girl NEXT: G
The text at this link is:
Anne FRANK
(1929 - 1945 )
The Diary of A Young Girl
NEXT: G
One could generate the following title:
"Anne FRANK¶¶(1929 - 1945 )¶The Diary of A Young Girl¶¶NEXT: G" or something similar.
This change is not a high priority for me. But make the decision yourself. Regards Reinhardt
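A title like this could be generated with a few lines. This is a sketch under assumptions: title_with_line_marks is a hypothetical helper, and the positions of the blank lines in the sample post are inferred from the doubled ¶ in the proposed title, not taken from the actual post source.

```python
PILCROW = "\u00b6"  # the ¶ character


def title_with_line_marks(text: str) -> str:
    """Join the lines of a post with pilcrows so line breaks stay visible."""
    lines = [line.strip() for line in text.strip().splitlines()]
    return PILCROW.join(lines)


# Assumed layout of the post, with blank lines where the title shows "¶¶".
post = "Anne FRANK\n\n(1929 - 1945 )\nThe Diary of A Young Girl\n\nNEXT: G"
marked = title_with_line_marks(post)
print(marked)
```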
changed typo and fixed html
5timspalding
Update on why this isn't fixed yet (see http://www.librarything.com/topic/107331).
Probably fixed, because the search system for Talk is completely different.
6timspalding
Character-set issues are a huge pain.

