When you click Save As and choose the UTF-8 option in the code section, TextPad does encode special characters in UTF-8, BUT it doesn't prefix the UTF-8 stream with the character U+FEFF (ZERO WIDTH NO-BREAK SPACE), also known as the Byte-Order Mark (BOM).
Some programs, such as Flash (and maybe others...), then won't read the file as a UTF-8 stream but as a standard ASCII file.
I read this page as a reference before posting this suggestion:
http://www.cl.cam.ac.uk/~mgk25/unicode.html#ucsutf
It mentions the following, and I quote:
A good encoding converter will also offer options for adding or removing the BOM:
* Unconditionally prefix the output text with U+FEFF.
* Prefix the output text with U+FEFF unless it is already there.
* Remove the first character if it is U+FEFF.
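For illustration, the three converter options quoted above could look roughly like this in Python. This is only a sketch, not how TextPad or any real converter does it; the function names are made up for the example.

```python
# U+FEFF; when encoded as UTF-8 it becomes the three bytes EF BB BF.
BOM = "\ufeff"

def add_bom_always(text: str) -> str:
    """Unconditionally prefix the text with U+FEFF."""
    return BOM + text

def add_bom_if_missing(text: str) -> str:
    """Prefix the text with U+FEFF unless it is already there."""
    return text if text.startswith(BOM) else BOM + text

def strip_bom(text: str) -> str:
    """Remove the first character if it is U+FEFF."""
    return text[1:] if text.startswith(BOM) else text
```

Encoding the result of `add_bom_if_missing(...)` with `.encode("utf-8")` gives a byte stream that starts with EF BB BF, which is what BOM-dependent programs look for.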
I hope this helps, and congratulations on the editor. It is really a good one, my favorite actually... so I hope this bug can be fixed!
Thanks.
UTF-8 encoding
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
But something still seems illogical to me: the document class is based on the file extension, so if I check the use UTF BOM box in the class preferences, all my text files will be prefixed with the BOM, and that's really not how it should be, don't you think? Or I could make a new document class that opens all *.utf files, but that's not an ideal solution. A text file encoded in UTF-8 is still a text file, isn't it?
-
- Posts: 1
- Joined: Mon Jul 18, 2005 9:27 am
This is in fact very annoying; I experience it myself. When I save a file as UTF-8 and open it again, it is opened as ANSI or DOS.
Why can't this be detected? It means I have to use the Open dialog manually each time I need to edit one of these files. I work with a lot of JavaScript, and many of the files are UTF-8 for the moment, but not all, and I think it's a bad idea to use the document class approach to make all files UTF-8 by default.
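On the detection question: a file without a BOM can still be guessed at, because most non-ASCII text in a legacy 8-bit encoding is not valid UTF-8. A rough Python sketch of that heuristic follows. It says nothing about what TextPad does internally; the function name and the cp1252 fallback are assumptions for the example.

```python
import codecs

def guess_encoding(data: bytes) -> str:
    """Guess a text encoding from raw bytes (heuristic, can be wrong)."""
    # A leading EF BB BF is the UTF-8 BOM: the easy case.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumed fallback: treat the file as a Windows "ANSI" code page.
        return "cp1252"
    # Pure ASCII is valid UTF-8 too, so it cannot be told apart.
    return "ascii" if data.isascii() else "utf-8"
```

The weak spot is the last line: a file that only contains ASCII so far gives no hint about what the author intends for later edits, which is one reason the BOM is used as an explicit marker.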
Visual Studio 2005 Solution Files
Just a heads-up: VS2005 won't recognize solution files saved without the BOM. So people who like to edit their solution files with TextPad are going to run into this issue.
jeanflash wrote: because the document class is based on file extension
Wrong. Whether a file belongs to a document class is not determined by the extension.
It belongs to the alphabetically last document class that has a filename pattern matching the filename.
It is no problem to have e.g. bla.ext in a different document class than blubb.ext: just give the full file name as the pattern. Just because most patterns have the form *.ext does not mean that bla.*, bla*.ext, bla.ext ... are not allowed.
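To make the rule concrete, here is a small Python sketch of "alphabetically last class with a matching pattern". It only models the rule as described in this post; the class names, the patterns, and the use of shell-style globbing with case-insensitive comparison are assumptions, not TextPad's actual implementation.

```python
from fnmatch import fnmatchcase

def pick_document_class(filename, classes):
    """Return the alphabetically last class whose patterns match filename."""
    name = filename.lower()
    matching = [
        cls for cls, patterns in classes.items()
        if any(fnmatchcase(name, pattern.lower()) for pattern in patterns)
    ]
    # No class matches: the caller decides on a default.
    if not matching:
        return None
    return max(matching, key=str.lower)

# Hypothetical setup: a general class plus one for specific files.
classes = {
    "Text": ["*.txt"],
    "UTF-8 text": ["notes.txt", "utf8_*.txt"],
}
```

With this setup, notes.txt matches both classes and ends up in "UTF-8 text" because that name sorts last, while readme.txt only matches "Text". So a BOM setting on the second class would not touch every *.txt file.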