I'm using the 32 bit 4.7.3 version of TextPad.
I'm baffled why certain spurious characters are appearing in my files. Without truly grasping this topic, I believe they are 'BOM' characters. Having had obscure problems in this area before, I have that setting switched OFF:
Here is the first section of a text file opened in TextPad (created by a freeware tool called Directory Lister, [url]http://download.cnet...4-10397036.html[/url]), also showing how it looks in a hex editor:
I then deleted the first two lines and resaved. As you see, the edited file now has those spurious characters at the beginning, spoiling the first line for further processing:
Can anyone suggest what's causing this please?
--
Terry, East Grinstead, UK
Puzzling behaviour - 'BOM' characters?
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
Yes, and that setting was disabled there too. In fact it seems it was disabled for all document classes.
But meanwhile I tried enabling the BOM setting and the result rather adds to the puzzle:
I now have to try recalling what problem enabling it caused a year or two ago, prompting its disabling! I'd have hoped all this obscure BOM and Unicode/UTF stuff would work without my involvement, because it's largely a black art to me.
--
Terry, East Grinstead, UK
Bizarre!
I have just stepped through these tests again ... and TextPad is now working as expected! IOW, with the BOM option checkmarked I get the odd characters, with it unmarked I don't.
It's as if the setting had somehow got itself reversed.
For those that appreciate the details:
1. I saved the list from Directory Lister (DL).
2. I examined the hex; no spurious characters, ruling out the possibility that DL was the problem.
3. I opened it in Notepad and saved it with a new name. The hex of that was fine.
5. I opened the original in TextPad. Write Unicode and UTF-8 BOM was disabled. I saved it with a new name. The hex of that was fine.
6. I opened the original in TextPad. I enabled Write Unicode and UTF-8 BOM and I saved with a new name. The hex of that showed the spurious characters.
So, until it happens again (and spotting it will be the challenge), my dilemma is resolved: I'll leave it disabled.
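For spotting it next time without opening a hex editor, here is a minimal Python sketch that reports whether a file starts with one of the standard byte order marks. The function names (detect_bom, detect_bom_in_file) are my own for illustration; nothing here comes from TextPad or Directory Lister, and it only looks at the first four bytes.

```python
import codecs

# Standard byte order marks. The UTF-32 ones come first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes):
    """Return the name of the BOM that `data` starts with, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def detect_bom_in_file(path):
    """Read only the first four bytes of the file and check them for a BOM."""
    with open(path, "rb") as f:
        return detect_bom(f.read(4))
```

Calling detect_bom_in_file on the resaved file should return "UTF-8" if those odd leading characters really are the UTF-8 BOM (bytes EF BB BF), and None for a clean file.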
--
Terry, East Grinstead, UK
I had a similar problem previously.
To begin with, I would save (export) a section of the registry. Then I would edit it with TextPad and save it. When I then tried to "import" the edited file back into the registry using the registry editor, the import failed with a complaint that the file was not in the proper format.
It turned out that TextPad was writing the file in Unicode (probably UTF-8) and was including the BOM characters. I don't recall what state the "Write Unicode and UTF-8 BOM" setting was in at that time.
The solution for me was to do a "File-save-as" and change the "Encoding" to "ANSI" (DOS would probably have been OK too) in the "File-save-as" dialog box. I believe it usually defaulted to "UTF-8", but I couldn't say that was always the case.
For me, this is a better solution because "Unicode" files that already contain BOM characters can still be read, edited and written correctly.
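If a file has already been saved with the BOM, the same clean-up can be done outside the editor. This is only a rough Python sketch: it assumes the file is UTF-8 and that "ANSI" means Windows code page 1252 (it actually depends on the system locale), and the function names are made up for illustration.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 BOM (EF BB BF) if present; otherwise return data unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def utf8_to_ansi(data: bytes, ansi_codec: str = "cp1252") -> bytes:
    """Re-encode UTF-8 bytes (with or without a BOM) in an ANSI code page.

    Assumption: cp1252 stands in for "ANSI". Encoding raises
    UnicodeEncodeError if a character has no equivalent in that code page.
    """
    text = data.decode("utf-8-sig")  # "utf-8-sig" skips a leading BOM if there is one
    return text.encode(ansi_codec)
```

Reading the file in binary mode, passing the bytes through utf8_to_ansi and writing the result back in binary mode gives a BOM-free ANSI file. strip_utf8_bom is the gentler option when the file should stay UTF-8 and only the three leading bytes need to go.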