Binary mode not displaying all characters

Posted: Thu Apr 13, 2017 11:01 am
by MeepoRatbagger
Using TextPad 8.1.2 (64-bit Edition).

I have a file which is... approximately Windows-1252 encoded (the correct encoding isn't important for this issue).
Opening the file in Binary mode to display what's really there, I see:

D0: 69 63 65 22 20 22 22 48 61  76 6C ED 9F 6B E0 76 20  ice" "Havlíkàv 


When opened in Default mode with the 1252 encoding, the same line appears as:

ice" "HavlíŸkàv 

Byte 9F (Latin Capital Letter Y With Diaeresis in Windows-1252) does not appear in the text column of the Binary display, and all subsequent characters are misaligned with their hex codes.
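
For comparison, here is a minimal Python sketch of a strict one-byte-to-one-character rendering of that line. It assumes the bytes are exactly those shown in the hex column and uses Python's `cp1252` codec as a stand-in for Windows-1252. It is not a description of how TextPad renders the text column.

```python
# Bytes copied from the hex column of the dump (assumed to be complete).
raw = bytes.fromhex("69 63 65 22 20 22 22 48 61 76 6C ED 9F 6B E0 76 20")

# Windows-1252 is a single-byte encoding, so each byte decodes to one character.
text = raw.decode("cp1252")

# Print every byte next to the code point it maps to.
for byte, char in zip(raw, text):
    print(f"{byte:02X} -> U+{ord(char):04X} {char!r}")

print(len(raw), "bytes,", len(text), "characters")
```

Under these assumptions the byte count and character count match, and 9F maps to U+0178, so a correct byte-for-byte display would keep the text column aligned with the hex column.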

Posted: Thu Apr 13, 2017 11:36 am
by MudGuard
The problem starts earlier. There are three 22 bytes in the hex column, but only two " characters in the text column ...

Posted: Fri Apr 14, 2017 4:49 pm
by bbadmin
It does actually matter which code page the text is in. “í” is in 1250 and “à” in 1252, so a file containing both of them can only be saved in UTF-8 (or UTF-16). Those characters are then stored in two bytes each.

The binary view displays each byte value and its corresponding character from the default system ANSI code page. Hence the character display doesn’t match the Unicode code points of those characters.
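
A rough Python sketch of what "two bytes each" means, and of how a byte-per-character view would show those bytes. It assumes the system ANSI code page is Windows-1252 (Python's `cp1252` codec); `ansi_view` is a hypothetical helper for illustration, not part of TextPad.

```python
def ansi_view(char: str, ansi_codec: str = "cp1252"):
    """Return the UTF-8 bytes of `char` and their byte-by-byte ANSI rendering."""
    utf8 = char.encode("utf-8")
    # Each byte is looked up on its own in the ANSI code page,
    # which is how a byte-oriented view would display it.
    per_byte = utf8.decode(ansi_codec)
    return utf8, per_byte

for ch in ("í", "à"):
    utf8, shown = ansi_view(ch)
    hex_bytes = " ".join(f"{b:02X}" for b in utf8)
    code_points = " ".join(f"U+{ord(c):04X}" for c in shown)
    print(f"{ch}: UTF-8 bytes {hex_bytes} -> displayed as {code_points}")
```

Each character becomes two bytes in UTF-8, and a per-byte view renders them as two unrelated ANSI characters rather than the original letter.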

I hope this clarifies the situation for you.