Advice about this warning message?

terrypin
Posts: 174
Joined: Wed Jul 11, 2007 7:50 am

Post by terrypin »

Occasionally when I open certain kinds of text file I get this sort of message:

Image

That particular file was prefs.js, the main settings file for Firefox. I had to use Notepad instead to perform the editing I wanted, as I was nervous about the changes implied by TextPad.

I'm using TextPad 4.7.3, a very old version of TextPad, which I'm so comfortable with that I've no desire to upgrade.

Is there some way I can configure TextPad to act like Notepad for this type of file please?

EDIT: (It seems I cannot post a reply to my own post.)

Can I take it that this thread

http://stackoverflow.com/questions/8879 ... ll-support

and this

http://forums.textpad.com/viewtopic.php?t=11019

give an accurate (but disappointing) summary of the issue please?

--
Terry, East Grinstead, UK
kengrubb
Posts: 324
Joined: Thu Dec 11, 2003 5:23 pm
Location: Olympia, WA, USA

Post by kengrubb »

I am running TP 7.4 64 bit on Win7 64 bit

Just now installed Firefox, and the prefs.js file is ANSI so I'm not getting the error you see.

Been a TP user for 15 years, since 4.x days

Take a gamble on the latest and greatest. RE is the only major change you'll see.
(2[Bb]|[^2].|.[^Bb])

That is the question.
terrypin
Posts: 174
Joined: Wed Jul 11, 2007 7:50 am

Post by terrypin »

Thanks Ken. Just seen your reply on return from holiday.

I'll give some thought to your recommendation about updating my ancient TP version.

I'm puzzled why your prefs.js is encoded as ANSI. But how exactly can you tell? When I open a recent prefs.js in Notepad++ it opens successfully (while TP gives me that warning) and under the Encoding tab I see this:

Image

I assume that means the file is currently encoded as 'UTF-8 without BOM'.

This is the latest version of FF, 31.0, so I would have expected your file to be the same encoding? Or maybe it's down to some XP v Win7 difference?

Frankly, this encoding and BOM stuff is outside my know-how level. Ideally I'd simply like text files to keep it all behind closed doors!

--
Terry, East Grinstead, UK
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

It is not always possible to determine the encoding just by looking at a file.

If a file only contains characters from the ASCII range (0..127), it could be ASCII. But it could also be UTF-8.

The best that can be done is to exclude some encodings - if a certain byte or a certain combination of bytes is not allowed in a certain encoding, but occurs in the file in question, this encoding can be excluded.

For example, in ISO-8859-1, all 256 possible byte values are allowed. So any file could be ISO-8859-1. You can't exclude ISO-8859-1, as there is no byte that is illegal in ISO-8859-1, and there is also no byte-combination not allowed in ISO-8859-1.

In ASCII (the original 7-bit), a byte with a value > 127 can not occur. Thus if your file contains a byte with value 160 you can be sure the encoding is NOT ASCII. But you still don't know which encoding it is.

Thus, for some files, several encodings are possible ...
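
The exclusion idea above can be sketched in a few lines of Python. This is only an illustration, not how TextPad or Notepad++ actually detect encodings: the function name `possible_encodings` and the short candidate list are assumptions made for the example. It simply tries to decode the bytes with each candidate and drops the ones that fail.

```python
def possible_encodings(data: bytes, candidates=("ascii", "utf-8", "iso-8859-1")):
    """Return the candidate encodings that can decode `data` without error.

    A successful decode does not prove the encoding is the right one;
    a failed decode only proves that the encoding can be excluded.
    """
    possible = []
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # an illegal byte (sequence) was found: exclude this encoding
        possible.append(name)
    return possible


# Pure ASCII content: nothing can be excluded.
print(possible_encodings(b'user_pref("a", 1);'))

# A lone byte 160 (0xA0) rules out ASCII, and is also not valid UTF-8.
print(possible_encodings(b"caf\xa0"))

# A UTF-8 encoded non-ASCII character rules out ASCII only.
print(possible_encodings("caf\u00e9".encode("utf-8")))
```

Note that ISO-8859-1 survives in every case, which is exactly why it can never be excluded this way.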


If the prefs.js does not contain any characters outside the ASCII range, it can be considered ANSI, ISO-8859-x, UTF-8 and several others.
prefs.js is stored as UTF-8 - but as long as there are no characters outside ASCII, you can't discern whether the encoding is UTF-8 or ASCII (or ...)


A BOM (Byte Order Mark) is a certain character in Unicode (code point U+FEFF). It is placed at the beginning of the file. Depending on the encoding (UTF-8, UTF-16 low->high, UTF-16 high->low), it produces a certain byte combination at the start of the file. If one of these combinations is found, it is highly probable that the file contains Unicode characters, and it is also possible to determine how the Unicode is encoded (UTF-8, UTF-16 low->high, UTF-16 high->low ...). If you treat a UTF-8 BOM as ISO-8859-1, it looks like ï»¿ - a highly unlikely character combination - thus it is a quite strong hint that the file is encoded in UTF-8 ...
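
A BOM check like the one described above might look roughly like this in Python. It is a sketch under assumptions: the helper name `sniff_bom` and the returned labels are made up for the example, only the three encodings mentioned in this post are covered, and UTF-32 (whose little-endian BOM begins with the same two bytes as the UTF-16 one) is deliberately ignored. The BOM constants come from Python's standard `codecs` module.

```python
import codecs

# Byte sequences that the BOM character (U+FEFF) produces in each encoding.
BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),                   # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16 low->high"),    # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16 high->low"),    # FE FF
]


def sniff_bom(data: bytes):
    """Return a label for the BOM at the start of `data`, or None if there is none."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    return None  # no BOM: the encoding has to be guessed some other way


print(sniff_bom(codecs.BOM_UTF8 + b"user_pref"))   # file saved as UTF-8 with BOM
print(sniff_bom(b"user_pref"))                      # 'UTF-8 without BOM' or plain ASCII

# The UTF-8 BOM bytes, misread as ISO-8859-1, show up as three odd characters.
print(codecs.BOM_UTF8.decode("iso-8859-1"))
```

A file reported as 'UTF-8 without BOM' gives `None` here, so a tool has to fall back on guessing from the content.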
terrypin
Posts: 174
Joined: Wed Jul 11, 2007 7:50 am

Post by terrypin »

Thanks, Mudguard, very helpful.

Terry, East Grinstead, UK
kengrubb
Posts: 324
Joined: Thu Dec 11, 2003 5:23 pm
Location: Olympia, WA, USA

Post by kengrubb »

In TP, you can do a Save As and alter the encoding when you save.
(2[Bb]|[^2].|.[^Bb])

That is the question.