File encoding detection when opening from Windows Explorer

General questions about using TextPad

jjmc
Posts: 4
Joined: Thu Jan 28, 2016 3:41 pm

File encoding detection when opening from Windows Explorer

Post by jjmc »

I have a file with UTF characters in it (currency symbols). When I open it with TextPad's "File -> Open" dialog the characters display fine, but if I open it by double-clicking the file in Windows File Explorer, they display as different characters. They are not converted to question marks or to those other characters; TextPad just displays them differently. In the Open dialog I left Mode as "Auto" and Encoding as "Default".

Here's a picture: http://n54i.imgup.net/textpad-pie142.png

I'd expect opening a file from Windows Explorer to be equivalent to opening it from the File -> Open dialog. Is there some reason why TextPad would detect the encoding differently in those two scenarios?
Kiwi6469

Encoding is not correct

Post by Kiwi6469 »

That's because TextPad is opening the file with an SBCS (single-byte) encoding, not UTF-8. What you see are the individual UTF-8 encoded bytes, each displayed as its own character. I am having a similar problem: new files use a codepage instead of UTF-8 by default, and I keep having to do a Save As after creating them.
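
To illustrate the effect described above, here is a minimal Python sketch that encodes a few currency symbols as UTF-8 and then decodes the bytes with a single-byte codepage. The choice of cp1252 is an assumption (it is a common Windows default, but the thread does not say which codepage is in use), and the helper name `misread_as_sbcs` is made up for this example. It says nothing about how TextPad itself detects encodings.

```python
# Illustration only: what UTF-8 bytes look like when read as a single-byte codepage.
# "cp1252" is an assumed stand-in for the Windows default SBCS codepage.

def misread_as_sbcs(text: str, codepage: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with a single-byte codepage."""
    return text.encode("utf-8").decode(codepage)


if __name__ == "__main__":
    for symbol in ("€", "£", "¥"):
        raw = symbol.encode("utf-8")
        # Each UTF-8 byte becomes one displayed character in the SBCS view.
        print(symbol, raw.hex(), "->", misread_as_sbcs(symbol))
```

The bytes on disk are the same in both views; only the interpretation differs, which matches the observation that the file is not modified and the characters are just displayed differently.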
bbadmin
Site Admin
Posts: 1020
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

You will need to follow this procedure in TextPad:

From the Configure menu:

1. Choose Preferences.
2. Click the "+" sign next to Document classes.
3. Click on the relevant Document class (not the plus sign next to it).
4. Under Document class options, scroll down to "Default encoding" and "Create new files as", and select the required options.
5. Click Apply / OK.

I hope this helps.
Kiwi6469

Been there, done that.

Post by Kiwi6469 »

@bbadmin,

Yes, I've done that, and I have just now verified that every document class, including the default one, has UTF-8 set as the default encoding. Yet new documents are still created with the Windows default SBCS codepage. I've also had multiple occurrences of existing documents (source files) that are UTF-8 encoded being opened in this same SBCS codepage, so the UTF-8 encoded characters are displayed as nonsensical character strings. In all cases the workaround is to do File, Save As and select UTF-8 encoding. It's rather annoying, to be frank. And I can't use a BOM because the Java compiler does not accept one (at least not in the older versions I have to use, since I have to target some code as far back as Java 5).
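
When checking which files are affected, it can help to confirm outside the editor whether a file really is valid UTF-8 and whether it carries a BOM. The following is a rough Python sketch for that check; the function name `describe_utf8` and its four labels are invented for this example, and it is not a description of TextPad's detection logic.

```python
import codecs


def describe_utf8(data: bytes) -> str:
    """Classify raw bytes as ASCII only, UTF-8 with or without BOM, or not UTF-8."""
    has_bom = data.startswith(codecs.BOM_UTF8)  # the three bytes EF BB BF
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    try:
        body.decode("utf-8")  # strict decoding raises on invalid sequences
    except UnicodeDecodeError:
        return "not valid UTF-8"
    if has_bom:
        return "UTF-8 with BOM"
    if all(b < 0x80 for b in body):
        return "ASCII only"  # identical bytes in UTF-8 and most SBCS codepages
    return "UTF-8 without BOM"


if __name__ == "__main__":
    import sys

    # Usage: python check_encoding.py File1.java File2.java ...
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, "->", describe_utf8(handle.read()))
```

A file reported as "UTF-8 without BOM" carries no explicit marker of its encoding, so any program opening it has to rely on a configured default or on a guess from the content.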