Hi,
I've got a UTF-8 encoded file.
When I open it, the editor (TextPad 5.0.3) says:
WARNING: <file> contains characters that do not exist in code page 1252 (ANSI - Lateinisch I). They will be converted to the system default character, if you click OK.
But I don't want them to be converted!
If I let it convert them, the BOM disappears.
I read everywhere that "TextPad automatically detects... UTF-8", so why doesn't this apply to my installation?
Opening the file (with conversion) and then using Save As with UTF-8 does not add the BOM.
Creating a new file (UTF-8) does not add the BOM either.
The document class I'm using has:
- the tick on "Write Unicode and UTF-8 BOM"
- default encoding on UTF-8
- create new file as PC
Can anyone tell me what I am doing wrong? Or is this yet another feature?
Thanks a lot for your help....
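For anyone who wants to verify the BOM outside the editor, here is a minimal Python sketch. It is not related to TextPad's own code; the helper names `has_utf8_bom` and `write_utf8_with_bom` are made up for illustration. It checks for the three BOM bytes (EF BB BF) and writes a file with a BOM using Python's `utf-8-sig` codec.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 BOM bytes EF BB BF."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def write_utf8_with_bom(path, text):
    """Write text as UTF-8 with a leading BOM (the 'utf-8-sig' codec adds it)."""
    # newline="" keeps the line endings exactly as given (e.g. PC-style \r\n).
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


if __name__ == "__main__":
    # Hypothetical sample file in a temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "sample.txt")
    write_utf8_with_bom(path, "\u010caj\r\n")  # U+010C is not in code page 1252
    print(has_utf8_bom(path))
```

Running the check on a file before and after a save should show whether the editor dropped the BOM.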
V 5.0.3: cannot open UTF-8 file without error
You are not doing anything wrong; you have just run into a limitation in TextPad. TextPad correctly detects UTF-8 files (as the message indicates), but the editor itself is not capable of editing Unicode. So the UTF-8 file must be converted to the ANSI character set when it is opened. TextPad then converts the ANSI text back to UTF-8 when saving.
Often this works fine, but some documents contain characters that are not present in your current ANSI code page (selected in Control Panel, Regional Options). They simply cannot be represented in that code page at all. So TextPad must convert them to what it calls the "system default character", which really means these characters will be lost. Either that, or it cannot display the file at all.
If you often need to edit UTF-8 or UTF-16 files, you will need to find a text editor that fully supports these encodings - there are several free and shareware ones, but the quality of Unicode support varies.
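The loss described above can be reproduced outside TextPad. The following is only a sketch in Python using its built-in `cp1252` codec; it assumes the replacement character is `?` (what Python's `errors="replace"` produces), which may or may not match TextPad's "system default character". The function names are hypothetical.

```python
def unrepresentable(text, codepage="cp1252"):
    """Return the distinct characters in text that the code page cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


def lossy_round_trip(text, codepage="cp1252"):
    """Convert text to the code page and back, replacing what does not fit."""
    ansi_bytes = text.encode(codepage, errors="replace")  # unmappable -> b"?"
    return ansi_bytes.decode(codepage)


if __name__ == "__main__":
    sample = "\u010caj, caf\u00e9"  # C with caron is missing, e with acute exists
    # ascii() keeps the output printable on any console.
    print(ascii(unrepresentable(sample)))
    print(ascii(lossy_round_trip(sample)))
```

Once a character has been replaced this way, saving back to UTF-8 cannot restore it, because the information is already gone.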
Thanks for your answer. I've been looking for which characters trigger this error message and found these:
This is just an excerpt of what is missing, I suppose.
Character                                 Unicode (Hex)  UTF-8 (Hex)
Latin Small Letter T with Caron           00165          C5A5
Latin Capital Letter C with Caron         0010C          C48C
Latin Small Letter C with Caron           0010D          C48D
Latin Small Letter C with Acute           00107          C487
Latin Capital Letter D with Caron         0010E          C48E
Latin Small Letter D with Caron           0010F          C48F
Latin Small Letter L with Acute           0013A          C4BA
Latin Capital Letter L with Caron         0013D          C4BD
Latin Small Letter L with Caron           0013E          C4BE
Latin Small Letter N with Caron           00148          C588
Latin Small Letter O with double Acute    00151          C591
Latin Small Letter U with double Acute    00171          C5B1
Latin Small Letter S with Caron           00161          C5A1
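The list above can be cross-checked with a short Python sketch. It derives the UTF-8 bytes from each code point and tests whether Python's `cp1252` codec can encode the character. The helper names are hypothetical, and Python's codec is assumed to match the Windows code page that TextPad uses. One thing it suggests: U+0161 (small s with caron) does have a slot in code page 1252, so that entry is probably not one of the characters behind the warning.

```python
import unicodedata

# Code points taken from the list above.
CODE_POINTS = [
    0x0165, 0x010C, 0x010D, 0x0107, 0x010E, 0x010F, 0x013A,
    0x013D, 0x013E, 0x0148, 0x0151, 0x0171, 0x0161,
]


def utf8_hex(code_point):
    """Return the UTF-8 encoding of a code point as upper-case hex."""
    return chr(code_point).encode("utf-8").hex().upper()


def in_code_page(code_point, codepage="cp1252"):
    """Return True if the code page has a byte for this code point."""
    try:
        chr(code_point).encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for cp in CODE_POINTS:
        name = unicodedata.name(chr(cp))
        print(f"U+{cp:04X}  {utf8_hex(cp)}  in cp1252: {in_code_page(cp)}  {name}")
```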
Can anyone from TextPad tell me when these characters might be supported?
TextPad has much better features than other editors.
- First, allow reading UTF-8 files.
- Then also allow writing them as UTF-8 files (incl. BOM).
So it would be great if one could really use it for editing UTF-8 encoded files as well.
Thanks for your answer.