File encoding detection when opening from Windows Explorer

Post by **jjmc** » Fri Feb 19, 2016 5:27 pm

I have a file with UTF characters in it (currency symbols). When I open it with TextPad's "File -> Open" dialog the characters display fine, but if I open it by double-clicking the file in Windows Explorer, they display as different characters. They don't get converted to question marks or replaced in the file itself; TextPad just displays them differently. In the Open dialog I left Mode as "Auto" and Encoding as "Default".

Here's a picture: http://n54i.imgup.net/textpad-pie142.png

I'd expect that opening a file from Windows Explorer would be equivalent to opening it from the File -> Open dialog. Is there some reason why TextPad would detect the encoding differently in those two scenarios?

Post by **Kiwi6469** » Fri May 06, 2016 8:09 pm

That's because TextPad is opening the file using a single-byte (SBCS) encoding, not UTF-8. What you see are the individual bytes of each UTF-8 encoded character, each one displayed as a separate character in the code page. I am having similar problems: new files use a code page instead of UTF-8 by default, and I keep having to do a Save As after creating them.
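To illustrate the effect described above, here is a minimal Python sketch that takes a currency symbol, encodes it as UTF-8, and then decodes those bytes as a single-byte code page. The helper name `as_sbcs` is hypothetical, and Windows-1252 (`cp1252`) is only an assumption about which code page is in use; the thread does not say which one TextPad picked.

```python
def as_sbcs(text: str, codepage: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes as a single-byte code page.

    This mimics an editor that reads a UTF-8 file as if it were SBCS.
    cp1252 is an assumed code page, not one confirmed by the thread.
    """
    return text.encode("utf-8").decode(codepage)


# The euro sign is three bytes in UTF-8 (E2 82 AC), so it shows as three characters.
print(as_sbcs("\u20ac"))

# The pound sign is two bytes in UTF-8 (C2 A3), so it shows as two characters.
print(as_sbcs("\u00a3"))
```

The bytes on disk are the same in both cases; only the interpretation differs, which matches the observation that nothing is converted to question marks.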

Post by **bbadmin** » Mon May 09, 2016 9:45 am

You will need to implement the following procedure in TextPad:

From the Configure menu choose:

1. Preferences
2. Click the "+" sign next to Document classes
3. Click on the relevant Document class (not the plus sign next to it)
4. Under Document class options, scroll down to "Default encoding" and "Create new files as", and select the required options
5. Click Apply / OK.

I hope this helps.

Post by **Kiwi6469** » Mon May 09, 2016 7:19 pm

@bbadmin,

Yes, I've done that, and just now verified that every document class, including the default one, has UTF-8 set as the default encoding. Yet new documents are still created with the Windows default SBCS code page, and I've had multiple occurrences of existing documents (source files) which are UTF-8 encoded opening in this same SBCS code page, causing the UTF-8 encoded characters to be displayed as nonsensical character strings. In all cases the workaround is to do File, Save As and select UTF-8 encoding. It's rather annoying, to be frank. And I can't use a BOM because the Java compiler does not accept them (at least it doesn't in versions old enough that I can't use them, as I have to target some code as far back as Java 5).
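For what it's worth, a file without a BOM can often still be recognised as UTF-8 by checking whether its bytes decode cleanly. The following Python sketch shows one common heuristic; it is not a description of how TextPad detects encodings, and the function name `guess_encoding` and the `cp1252` fallback are assumptions made for illustration.

```python
import codecs


def guess_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Guess a text encoding from raw bytes (heuristic sketch only).

    1. A UTF-8 BOM is taken as decisive.
    2. Otherwise, bytes that decode strictly as UTF-8 are treated as UTF-8
       (pure ASCII also lands here, which is harmless).
    3. Anything else is assumed to be the fallback single-byte code page.
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return fallback
    return "utf-8"


# BOM-less UTF-8 containing a euro sign
print(guess_encoding("price: \u20ac5".encode("utf-8")))

# A lone 0xA3 byte (pound sign in cp1252) is not valid UTF-8
print(guess_encoding(b"price: \xa35"))
```

Running a check like this over the affected source files would at least confirm whether they are valid BOM-less UTF-8 before they are opened in the editor.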
