filename extensions in the document classes

General questions about using TextPad


WayneCa
Posts: 59
Joined: Sat Aug 16, 2014 2:13 am

filename extensions in the document classes

Post by WayneCa »

I have a few questions:
  1. Are the extension identifiers (*.ext) case insensitive, e.g. is *.ext the same as *.EXT?
  2. If they are not case sensitive, I have the following issue:

    I have a file extension, *.B09.
    Files with the .B09 extension show as type Custom but files with the extension .b09 show up as type B09.
    1. Why is this?
    2. What can I do to correct it so all files with .B09 or .b09 show up as type B09?
Further info: I have the files saved with Mac line endings and UTF-8 encoding. This is because these are source code files for a BASIC programming language on a retro computer. The line endings on that computer are $0D, the same as the Mac paragraph ending. The source files are text (no strange characters), but there are certain "hi-res" characters that are not dealt with correctly in TextPad. Example:
  • Two characters I'm using are 0xAE and 0xBE. Looking at the file in a hex editor shows that the characters are saved in the file as 0xC2AE and 0xC2BE. I can strip the C2s from the file, but to ensure they do not return I save the files as UTF-8. So, I have the document class definition set to save new files as UTF-8 with Mac line endings.
FWIW, the files with the .B09 extension are still treated as B09 files in TextPad, even though the document properties show them as type Custom.

Update: I just looked at a new .b09 file I saved that showed up as type B09 when I saved it. Closing TextPad and reopening it shows the file as type Custom, so the B09 type is only recognized when saving a new file.
Last edited by AmigoJack on Thu Oct 19, 2023 7:08 pm, edited 1 time in total.
Reason: list formatting; using colons
AmigoJack
Posts: 532
Joined: Sun Oct 30, 2016 4:28 pm
Location: グリーン ヒル ゾーン

Re: filename extensions in the document classes

Post by AmigoJack »

U+00AE and U+00BE are correctly encoded as 0xC2 0xAE and 0xC2 0xBE in UTF-8. Why are you using Unicode at all? Use "ANSI" or "DOS" as encoding and everything is saved as expected, too - U+00AE should be interpreted/displayed as "®" and U+00BE as "¾".

Keep in mind that you need to tell TextPad the encoding when loading the file, as there is virtually no way to tell ANSI and UTF-8 apart. Preferably start TextPad and press CTRL+O - that way you can select an encoding ("ANSI") to load your BASIC source code file properly.
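
For illustration, a minimal Python sketch of the byte sequences discussed above. It assumes that "ANSI" means Windows-1252 (the Western European code page); on a system with a different code page the single-byte values may differ.

```python
# The two characters in question, written as Unicode code points.
text = "\u00ae\u00be"  # "®" and "¾"

# UTF-8 needs two bytes for each of these characters.
utf8_bytes = text.encode("utf-8")

# Assumption: "ANSI" is Windows-1252, which stores each of them in one byte.
ansi_bytes = text.encode("cp1252")

print(utf8_bytes.hex(" "))  # c2 ae c2 be
print(ansi_bytes.hex(" "))  # ae be
```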

Re: filename extensions in the document classes

Post by WayneCa »

I know which characters they are supposed to be. It is what happens to them that I am concerned about. I need them to simply be a single character in the source file (AE or BE) and not have extra bytes added to them just because of Windows. I don't think TextPad should be changing anything in a source file that the user didn't specifically change. I copy the source file from the retro computer system and TextPad adds the extra codes without even letting me know it did it or asking me if I wanted them changed. It already has a character (? in a box) for characters it doesn't understand, so why can't it just use that and leave the character as it was?

Also, I have found a work-around. I use the hex editor XVI32 and I can strip the extra characters out with it. That helps, but it is an extra step I could live without.

Also, I started using UTF-8 encoding because I got tired of TextPad complaining that the characters weren't ANSI. Using UTF-8 encoding stopped that.
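
As a sketch of how the hex editor step could be scripted, assuming the file was saved as UTF-8 and contains only characters up to U+00FF: the hypothetical helper below decodes the UTF-8 bytes, converts line endings to a bare $0D, and writes one byte per character (Latin-1). It is an illustration only, not something TextPad provides.

```python
def utf8_to_single_byte_cr(data: bytes) -> bytes:
    """Convert UTF-8 bytes to one byte per character with CR ($0D) line endings.

    Assumption: every character is U+00FF or lower; anything above that
    raises UnicodeEncodeError instead of being silently altered.
    """
    text = data.decode("utf-8")
    # Normalize Windows (CR LF) and Unix (LF) line endings to a bare CR.
    text = text.replace("\r\n", "\r").replace("\n", "\r")
    # Latin-1 maps U+0000..U+00FF directly to the byte values 0x00..0xFF.
    return text.encode("latin-1")


sample = b"PRINT \xc2\xae\r\nEND\n"
print(utf8_to_single_byte_cr(sample))  # b'PRINT \xae\rEND\r'
```

To apply it to a file, read the contents with `pathlib.Path(...).read_bytes()`, pass them through the helper, and write the result back with `write_bytes()`.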

Re: filename extensions in the document classes

Post by AmigoJack »

This can't be consistent:
  • Either your files are already wrongly encoded in UTF-8 (because 0xAE alone is not valid in UTF-8) or it's not UTF-8 to begin with.
  • TextPad changes nothing unless you save the file. Not looking at the encoding you use when saving the file is your fault - why does your own document class "B09" use a default encoding of UTF-8 instead of ANSI or DOS?
  • WayneCa wrote: Thu Oct 19, 2023 4:52 pm Looking at the file in a hex editor shows that the characters are saved in the file as 0xC2AE and 0xC2BE, I can strip the C2s from the file, but to ensure they do not return I save the files as UTF-8.
    I have no idea how you can remotely achieve what you wrote: if you break UTF-8 encoding (removing one byte per character) then you're back at step 1 (original file/text encoding) and TextPad will, as per UTF-8, again save both characters with 2 bytes each.
Have you even tried opening the file in a running TextPad instance? Why not attach examples of your files to your post so we have a chance to reproduce your issue?

I created this file, containing exactly 0xAE 0xBE. It opens just fine in TextPad 8.4.2, which even recognizes it as ANSI. Pressing F12 to save it with a different filename, and upon inspection it has no UTF-8 encoding either - both files are identical byte-wise. Line breaks are Windows, though, but that shouldn't matter:
  1. before.txt (22 Bytes)
  2. after.png (6.33 KiB)
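
A minimal Python sketch of the consistency argument above: a lone 0xAE byte is not valid UTF-8, while the same bytes survive a single-byte round trip unchanged. The helper name `is_valid_utf8` is hypothetical, and Windows-1252 is assumed for "ANSI".

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


single_byte = b"\xae\xbe"        # the bytes as they come from the retro system
two_byte = b"\xc2\xae\xc2\xbe"   # the same characters after a UTF-8 save

print(is_valid_utf8(single_byte))  # False: 0xAE alone is a stray continuation byte
print(is_valid_utf8(two_byte))     # True

# Loading and saving with the same single-byte encoding changes nothing.
# Assumption: "ANSI" is Windows-1252.
round_trip = single_byte.decode("cp1252").encode("cp1252")
print(round_trip == single_byte)   # True
```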

Re: filename extensions in the document classes

Post by WayneCa »

AmigoJack wrote: Mon Oct 23, 2023 1:06 am This can't be consistent:
  • Either your files are already wrongly encoded in UTF-8 (because 0xAE alone is not valid in UTF-8) or it's not UTF-8 to begin with.
  • TextPad changes nothing unless you save the file. Not looking at the encoding you use when saving the file is your fault - why does your own document class "B09" use a default encoding of UTF-8 instead of ANSI or DOS?
In a previous response I stated: "Also, I started using UTF-8 encoding because I got tired of TextPad complaining that the characters weren't ANSI. Using UTF-8 encoding stopped that." I'm pretty sure the complaint was due to the characters being a single byte and not an integer value, based on what you said about C2 being a valid character in a previous response to me: "U+00AE and U+00BE are correctly encoded as 0xC2 0xAE and 0xC2 0xBE in UTF-8." However, using UTF-8 got rid of the not ANSI message I was getting.
AmigoJack wrote: Mon Oct 23, 2023 1:06 am
  • WayneCa wrote: Thu Oct 19, 2023 4:52 pm Looking at the file in a hex editor shows that the characters are saved in the file as 0xC2AE and 0xC2BE, I can strip the C2s from the file, but to ensure they do not return I save the files as UTF-8.
    I have no idea how you can remotely achieve what you wrote: if you break UTF-8 encoding (removing one byte per character) then you're back at step 1 (original file/text encoding) and TextPad will, as per UTF-8, again save both characters with 2 bytes each.
Have you even tried opening the file in a running TextPad instance? Why not attaching examples of your files to your post so we have a chance to reproduce your issue?

I created this file, having exactly 0xAE 0xBE. Opens just fine in TextPad 8.4.2, even recognizing it as ANSI. Pressing F12 to save it with a different filename, and upon inspection it has no UTF-8 encoding either - both files are identical byte wise. Line breaks are Windows, tho, but that shouldn't matter:
  1. before.txt
  2. after.png
Yes, using ANSI does leave the characters as they were originally. But the editor complains about them not being ANSI and I don't want to keep seeing that message. Is there a way to get that message to stop being displayed?

Also, my other question has not been addressed. It seems the only time the B09 document type is applied to a document is when the file hasn't been saved before. Once it has been saved it always shows up as custom, whether I use the B09 or b09 extension names. Are the two synonymous, and how can I get TextPad to see them as B09 files instead of custom files?

Re: filename extensions in the document classes

Post by WayneCa »

As my previous question concerning document class assignment has not been answered, and since I also now have a new issue concerning document encoding on the same class, I thought I would post images here to provide detail of what I am asking.
  1. What the encoding is set as in the configuration preferences:
    Prefs1.png (43.81 KiB)
  2. What the document preferences shows:
    Prefs2.png (25.48 KiB)
  3. What class the document preferences shows the file as being:
    Prefs3.png (24.19 KiB)
Note that the document preferences will not allow me to change the encoding, even though I re-saved the file (using save as) and ensured the encoding was set to ANSI. Note also that the document preferences show the document class as Custom rather than B09.
Last edited by AmigoJack on Tue Jul 16, 2024 7:35 pm, edited 1 time in total.
Reason: attaching files instead of relying on other hoster; list formatting

Re: filename extensions in the document classes

Post by AmigoJack »

AmigoJack wrote: Mon Oct 23, 2023 1:06 am Why not attaching examples of your files to your post so we have a chance to reproduce your issue?
You didn't address that issue either. Even when making your post you did not use the "Attachments" tab to upload your pictures to this board - so most likely you don't know about it to begin with. If we had your text file(s) as binary safe attachment we could at least try to reproduce what you claim and what none of us can remotely imagine.

The file properties' default encoding being displayed is irrelevant, because you choose the text encoding when saving the file ("Save as..." dialog - please also start using quotation marks for captions and names) - keep an eye on which line ending and text encoding is displayed there and if you even need to adjust it.

Re: filename extensions in the document classes

Post by WayneCa »

  1. I didn't know I was supposed to attach any files. I can provide the file named in the images.
  2. Just because the encoding shown in the document properties is "irrelevant" doesn't mean it should be incorrect. It should reflect whatever the file's encoding is. I consider that an error.
  3. I'm not sure what you are driving at with your last statement. I provided 3 images. If you are referencing the text I placed before each image, I'm not understanding how quotes would make it different, or better.
Anyway, I am attaching the file in question, but I'm not sure how to go about providing you with a copy of the B09 document class.
I had to rename the file to SOKOBAN_B09.txt before I could attach it. It will need to be renamed SOKOBAN.B09 before you can do anything with it.
SOKOBAN_B09.txt (36.77 KiB)
Last edited by AmigoJack on Wed Jul 17, 2024 4:08 pm, edited 2 times in total.
Reason: removing unnecessary full quote; list formatting

Re: filename extensions in the document classes

Post by WayneCa »

I have taken screenshots of "Configure->Preferences->Document Classes->B09". I will also include the "b09.syn" file to make it complete. The other sections are immaterial as I didn't edit them.
Config1.png (51.62 KiB)
Config2.png (43.95 KiB)
Config3.png (36.69 KiB)
As with the previous file, I had to rename the syntax file to b09_syn.txt in order to attach it. It will need to be renamed b09.syn before using it.
b09_syn.txt (1.36 KiB)
Last edited by AmigoJack on Wed Jul 17, 2024 4:10 pm, edited 1 time in total.
Reason: attaching picture files instead of relying on other hoster
bbadmin
Site Admin
Posts: 878
Joined: Mon Feb 17, 2003 8:54 pm

Re: filename extensions in the document classes

Post by bbadmin »

I think you must be using .editorconfig files, as that results in a custom document class. This is what I get without that:

B09.png (42.9 KiB)

They could be overriding your default encoding.

Re: filename extensions in the document classes

Post by WayneCa »

bbadmin wrote: Wed Jul 17, 2024 8:05 am I think you must be using .editorconfig files, as that results in a custom document class.
I don't even know what .editorconfig files are. I've never purposely used one.
This is what mine looks like now. I've edited the file since uploading a copy here, so the numbers are different, but the encoding and class are what matters.
Prefs4.png (24.34 KiB)
And the Preferences tab under the document preferences shows:
Config4.png (25.55 KiB)
If you can tell me how to access .editorconfig files I can check and see if there is one that I wasn't aware of.
Last edited by AmigoJack on Wed Jul 17, 2024 4:20 pm, edited 1 time in total.
Reason: attaching picture files instead of relying on other hoster; shortening full quote to relevant part

Re: filename extensions in the document classes

Post by WayneCa »

OK, it's strange. I closed out the file and reopened it. Now it shows ANSI and class B09 in the Document Preferences Document tab and ANSI encoding in the Document Preferences Preferences tab. I did the same thing yesterday and it didn't change anything.

Re: filename extensions in the document classes

Post by AmigoJack »

WayneCa wrote: Tue Jul 16, 2024 10:04 pm I'm not understanding how quotes would make it different, or better.
It's in phrases like these (colored blue) where quotation marks around them would make it exact, instead of leaving the reader to guess where a name starts and how far it goes:
WayneCa wrote: Tue Jul 16, 2024 6:54 pm I re-saved the file (using save as) ... show the document class as Custom rather than B09.
Also please start attaching your pictures instead of hosting them elsewhere - keep an eye on all the edits I made to your posts (you see that remark at the bottom of each post). Also there's no need to full quote a whole post - just use the "Reply" button at the bottom or reduce quotes to relevant parts (as you did before).

Re: filename extensions in the document classes

Post by WayneCa »

OK, I will start attaching my images instead of hosting them on my website. I understand what you are saying about the quotes, so I will try to remember that.

On the "Custom" class, the class shown in Document Preferences Document tab was "B09" for about 5 minutes. Then it reverted back to "Custom". I looked up "editorconfig" in the help, learned where it was and found it checked (I never checked that box myself), so I unchecked it. Quit the editor and relaunched it, no difference. I'll try closing and opening the file again, but I have little faith that it will make a difference.

It changes to "B09" until I switch to a different document. When I come back to the document in question it has reverted to "Custom".

Re: filename extensions in the document classes

Post by bbadmin »

There are only two ways that a document can have a custom document class:
  1. If its settings are overridden by .editorconfig files.
  2. If you change its settings via its Properties dialog box.
The first one can be disabled by unchecking "Enable .editorconfig" on Configure » Preferences » Editor.
In the second case, those changes only persist if the document is open in a workspace.
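
To check whether a stray .editorconfig file could apply to a document, the following Python sketch lists candidates. It assumes TextPad follows the usual EditorConfig search rule (look in the file's own folder and every parent folder, stopping at a file that declares `root = true`); that behaviour is an assumption, the function names are hypothetical, and the path in the usage line is a placeholder.

```python
from pathlib import Path


def declares_root(config: Path) -> bool:
    """Return True if the file sets 'root = true' before its first [section]."""
    for raw in config.read_text(encoding="utf-8", errors="replace").splitlines():
        line = raw.strip()
        if line.startswith("["):
            break  # the preamble ends at the first section header
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "root" and value.strip().lower() == "true":
            return True
    return False


def find_editorconfig_files(file_path) -> list:
    """List .editorconfig files from the file's folder upward, nearest first."""
    start = Path(file_path).resolve().parent
    found = []
    for directory in [start, *start.parents]:
        candidate = directory / ".editorconfig"
        if candidate.is_file():
            found.append(candidate)
            if declares_root(candidate):
                break  # the search stops at a root file
    return found


if __name__ == "__main__":
    # Placeholder path; replace it with the real location of the document.
    for config in find_editorconfig_files(r"C:\path\to\SOKOBAN.B09"):
        print(config)
```

If the script prints nothing, no .editorconfig file was found on the path from the document's folder up to the drive root.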