Binary mode not displaying all characters

MeepoRatbagger
Posts: 7
Joined: Tue Aug 09, 2016 12:07 pm

Post by MeepoRatbagger »

Using TextPad 8.1.2 (64-bit Edition).

I have a file which is... approximately Windows-1252 encoded (the correct encoding isn't important for this issue).
Opening the file in Binary mode to display what's really there, I see:

Code:

D0: 69 63 65 22 20 22 22 48 61  76 6C ED 9F 6B E0 76 20  ice" "Havlíkàv 


When opened in Default mode, 1252 encoding, the same line appears as

Code:

ice" "HavlíŸkàv 
Byte 9F (Latin Capital Letter Y With Diaeresis in Windows-1252) does not appear in the text column of the Binary display, so all subsequent characters are misaligned with their hex codes.
"To err is human; to purr, feline."
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

The problem starts earlier. There are three bytes 22, but only two " ...
bbadmin
Site Admin
Posts: 854
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

It does actually matter which code page the text is in. “í” is in 1250 and “à” in 1252, so a file containing both of them can only be saved in UTF-8 (or UTF-16). Those characters are then stored in two bytes each.
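
As a quick check of the byte counts, here is a minimal Python sketch (not part of TextPad; the name `table` is only for illustration) that shows how “í” and “à” are stored in UTF-8 versus Windows-1252, using Python's built-in codecs:

```python
# Compare how the two characters are stored in UTF-8 and in Windows-1252.
# "table" is an illustrative name, not anything from TextPad.
table = {}
for ch in ("\u00ed", "\u00e0"):  # í and à
    table[ch] = {
        enc: " ".join(f"{b:02X}" for b in ch.encode(enc))
        for enc in ("utf-8", "cp1252")
    }

for ch, encodings in table.items():
    for enc, hex_bytes in encodings.items():
        print(f"{ch}  {enc:7}  {hex_bytes}")
```

In UTF-8 each of the two characters takes two bytes (C3 AD and C3 A0), while Python's cp1252 codec stores each as a single byte (ED and E0), so the hex column of a dump shows which of the two a file actually uses.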

The binary view displays each byte value and its corresponding character from the default system ANSI code page. Hence the character display doesn’t match the Unicode code points of those characters.
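
The per-byte display described above can be approximated as follows. This is only a sketch in Python under stated assumptions: `text_column` is a hypothetical helper and not TextPad's actual code, the code page names are Python codec names, and showing unmapped or control bytes as '.' is an assumption about how a hex viewer keeps its columns aligned.

```python
# Bytes copied from the hex dump in the first post (offset D0).
raw = bytes.fromhex("69 63 65 22 20 22 22 48 61 76 6C ED 9F 6B E0 76 20")


def text_column(data: bytes, codepage: str) -> str:
    """Decode one byte at a time, the way a hex viewer's text column might.

    Hypothetical helper: bytes with no mapping in the code page, and
    non-printable characters, become '.' so every byte fills one column.
    """
    out = []
    for b in data:
        try:
            ch = bytes([b]).decode(codepage)
        except UnicodeDecodeError:
            ch = "."
        out.append(ch if ch.isprintable() else ".")
    return "".join(out)


cp1252_view = text_column(raw, "cp1252")
cp1250_view = text_column(raw, "cp1250")
print(cp1252_view)
print(cp1250_view)
```

With this approach the text column always has exactly one character per byte. Byte 9F comes out as Ÿ under cp1252 and as ź under cp1250, and byte E0 as à under cp1252 and ŕ under cp1250, so the result depends on which ANSI code page is the system default.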

I hope this clarifies the situation for you.