Java files set to UTF-8 but opening as ANSI (cp1252)

john1857
Posts: 13
Joined: Wed Oct 09, 2019 6:02 pm

Post by john1857 »

In Configure > Preferences > Document classes > Java, I have specified UTF-8 as the default character encoding. However, when I open a Java source file which includes non-ASCII characters such as the copyright sign © (\u00A9), TextPad shows this as the two-character sequence Â© (the bytes 0xC2 0xA9, which are the two-byte UTF-8 representation of \u00A9, read as cp1252). When I press Alt-Enter to display the document properties, it cheerfully tells me that the encoding is 1252 (ANSI - Latin 1).

Other editors, even the accursed Notepad, recognise the file as UTF-8 and display it correctly. How can I convince TextPad to do the same and use UTF-8 as I already told it to do?
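
For anyone who wants to reproduce the symptom outside the editor, here is a minimal sketch of what happens when UTF-8 bytes are decoded as cp1252. The class and method names (MojibakeDemo, misreadAsCp1252, hex) are made up for illustration, and it assumes the JDK's "windows-1252" charset stands in for what TextPad calls ANSI 1252.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    /** Encodes text as UTF-8, then (wrongly) decodes those bytes as windows-1252. */
    static String misreadAsCp1252(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    /** Formats bytes as space-separated upper-case hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String copyright = "\u00A9"; // the copyright sign
        // The UTF-8 encoding of the copyright sign is two bytes long.
        System.out.println(hex(copyright.getBytes(StandardCharsets.UTF_8)));
        // Read back as windows-1252, each byte becomes its own character.
        String garbled = misreadAsCp1252(copyright);
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Running it prints C2 A9, then U+00C2 and U+00A9, i.e. the characters Â and ©, which matches what the editor displays.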
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

Let me guess: you are not using a BOM, and the copyright character is the only non-ASCII character in the file.

It might help to put a comment like /*äöüß*/ into the file (close to the beginning).
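
To check the guess about the BOM, something like the following sketch could be run against the file's bytes. It is not TextPad's detection logic (which I do not know), only a generic check: does the data start with the UTF-8 BOM (EF BB BF), and is it well-formed UTF-8 at all? The class and method names (Utf8Sniffer, hasUtf8Bom, isValidUtf8) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Sniffer {
    /** True if the data starts with the UTF-8 byte order mark EF BB BF. */
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    /** True if the whole byte array decodes as well-formed UTF-8. */
    static boolean isValidUtf8(byte[] data) {
        try {
            // REPORT makes the decoder throw instead of silently substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A comment containing the copyright sign, saved as UTF-8 without a BOM.
        byte[] utf8 = "/* \u00A9 2019 */".getBytes(StandardCharsets.UTF_8);
        // The same comment saved as cp1252: the copyright sign is the single byte A9.
        byte[] cp1252 = { '/', '*', ' ', (byte) 0xA9, ' ', '2', '0', '1', '9', ' ', '*', '/' };

        System.out.println(hasUtf8Bom(utf8));    // false
        System.out.println(isValidUtf8(utf8));   // true
        System.out.println(isValidUtf8(cp1252)); // false
    }
}
```

A file that has no BOM but passes the strict check is very likely UTF-8, because a lone byte such as A9 from a cp1252 file is not a valid UTF-8 sequence. An editor that relies on sampling rather than a full check may still guess ANSI when non-ASCII bytes are rare.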
john1857
Posts: 13
Joined: Wed Oct 09, 2019 6:02 pm

Post by john1857 »

MudGuard wrote: Let me guess: you are not using a BOM, and the copyright character is the only non-ASCII character in the file.

It might help to put a comment like /*äöüß*/ into the file (close to the beginning).

There are some other characters, but essentially yes.

I don't want to fill my files with meaningless guff just to persuade TextPad to treat the file as I asked it to -- so if that's the only solution, I'll just have to use Notepad++ or some other editor whose Unicode support is less flaky. A pity if that's so.