Are the extension identifiers (*.ext) case insensitive, e.g. is *.ext the same as *.EXT?
If they are not case sensitive, I have the following issue:
I have a file extension, *.B09.
Files with the .B09 extension show as type Custom but files with the extension .b09 show up as type B09.
Why is this?
What can I do to correct it so all files with .B09 or .b09 show up as type B09?
Further info: I save the files with Mac line endings and UTF-8 encoding, because they are source code files for a BASIC programming language on a retro computer. The line endings on that computer are $0D, the same as the Mac paragraph ending. The source files are text (no strange characters), but there are certain "hi-res" characters that are not dealt with correctly in TextPad. Example:
Two characters I'm using are 0xAE and 0xBE. Looking at the file in a hex editor shows that the characters are saved in the file as 0xC2AE and 0xC2BE, I can strip the C2s from the file, but to ensure they do not return I save the files as UTF-8. So, I have the document class definition set to save new files as UTF-8 with Mac line endings.
FWIW, the files with the .B09 extension are still treated as B09 files in TextPad, even though the document properties show them as type Custom.
Update: I just looked at a new .b09 file I saved that showed up as type B09 when I saved it. Closing TextPad and reopening it shows the file as type Custom, so the B09 type is only recognized when saving a new file.
Last edited by AmigoJack on Thu Oct 19, 2023 7:08 pm, edited 1 time in total.
Reason:list formatting; using colons
U+00AE and U+00BE are correctly encoded as 0xC2 0xAE and 0xC2 0xBE in UTF-8. Why are you using Unicode at all? Use "ANSI" or "DOS" as encoding and everything is saved as expected, too - U+00AE should be interpreted/displayed as "®" and U+00BE as "¾".
Keep in mind that you need to tell TextPad the encoding when loading the file, as there is virtually no way to tell ANSI and UTF-8 apart. Preferably start TextPad and press CTRL+O - that way you can select an encoding ("ANSI") to load your BASIC source code file properly.
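To make the byte counts above concrete, here is a small Python illustration. It is only a sketch: it assumes that "ANSI" on the system in question means Windows-1252, which may not hold for every Windows locale.

```python
# Illustration only: the same two characters stored under two encodings.
text = "\u00ae\u00be"  # "®" (U+00AE) and "¾" (U+00BE)

utf8_bytes = text.encode("utf-8")
# Assumption: "ANSI" is taken to mean Windows-1252 here.
ansi_bytes = text.encode("cp1252")

print(utf8_bytes.hex(" "))  # c2 ae c2 be
print(ansi_bytes.hex(" "))  # ae be
```

The text is identical in both cases; only the bytes written to disk differ, which is why the encoding chosen when saving decides whether the 0xC2 bytes appear.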
I know which characters they are supposed to be. It is what happens to them I am concerned about. I need for them to simply be a single character in the source file (AE or BE) and not have extra bytes appended to them just because of Windows. I don't think TextPad should be changing anything in a source file that the user didn't specifically change. I copy the source file from the retro computer system and TextPad adds the extra codes without even letting me know it did it or asking me if I wanted them changed. It already has a character (? in a box) for characters it doesn't understand, so why can't it just use that and leave the character as it was?
Also, I have found a work-around. I use the hex editor XVI32 and I can strip the extra characters out with it. That helps, but it is an extra step I could live without.
Also, I started using UTF-8 encoding because I got tired of TextPad complaining that the characters weren't ANSI. Using UTF-8 encoding stopped that.
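The hex editor step could also be scripted. The following is a minimal Python sketch, not a TextPad feature: the function name `utf8_to_single_byte` is made up for illustration, and it assumes every character in the file lies in the range U+0000 to U+00FF, so that Latin-1 maps each one back to exactly one byte.

```python
from pathlib import Path


def utf8_to_single_byte(src: Path, dst: Path, encoding: str = "latin-1") -> None:
    """Re-encode a UTF-8 text file so that each character is one byte again.

    Hypothetical helper for illustration. Assumes all characters fit the
    target single-byte encoding (Latin-1 by default).
    """
    raw = src.read_bytes()
    # "utf-8-sig" also drops a leading byte order mark if one is present.
    # Raises UnicodeDecodeError if the file is not valid UTF-8.
    text = raw.decode("utf-8-sig")
    # Raises UnicodeEncodeError if a character does not fit the target encoding.
    # Working on bytes means the $0D line endings are written back unchanged.
    dst.write_bytes(text.encode(encoding))
```

Called as `utf8_to_single_byte(Path("SOKOBAN.B09"), Path("SOKOBAN_fixed.B09"))`, where the second file name is only an example, it would turn each 0xC2 0xAE pair back into a single 0xAE.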
Either your files are already wrongly encoded in UTF-8 (because 0xAE alone is not valid in UTF-8) or it's not UTF-8 to begin with.
TextPad changes nothing unless you save the file. Not looking at the encoding you use when saving the file is your fault - why does your own document class "B09" use a default encoding of UTF-8 instead of ANSI or DOS?
WayneCa wrote: ↑Thu Oct 19, 2023 4:52 pm
Looking at the file in a hex editor shows that the characters are saved in the file as 0xC2AE and 0xC2BE, I can strip the C2s from the file, but to ensure they do not return I save the files as UTF-8.
I have no idea how you can remotely achieve what you wrote: if you break UTF-8 encoding (removing one byte per character) then you're back at step 1 (original file/text encoding) and TextPad will, as per UTF-8, again save both characters with 2 bytes each.
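The round trip described here can be reproduced in a few lines of Python. This is a sketch of the reasoning only; it assumes the editor falls back to a single-byte code page (Windows-1252 is used as a stand-in) when the bytes are not valid UTF-8.

```python
stripped = b"\xae\xbe"  # the two bytes left after removing each 0xC2

try:
    stripped.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    # 0xAE is a continuation byte and cannot start a UTF-8 character.
    valid_utf8 = False

# Assumption: the bytes are then read as Windows-1252 ("ANSI").
as_text = stripped.decode("cp1252")

# Saving that text as UTF-8 puts the 0xC2 bytes back.
resaved = as_text.encode("utf-8")

print(valid_utf8)        # False
print(resaved.hex(" "))  # c2 ae c2 be
```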
Have you even tried opening the file in a running TextPad instance? Why not attaching examples of your files to your post so we have a chance to reproduce your issue?
I created this file, having exactly 0xAE 0xBE. Opens just fine in TextPad 8.4.2, even recognizing it as ANSI. Pressing F12 to save it with a different filename, and upon inspection it has no UTF-8 encoding either - both files are identical byte wise. Line breaks are Windows, tho, but that shouldn't matter:
AmigoJack wrote: ↑Mon Oct 23, 2023 1:06 am
This can't be consistent:
Either your files are already wrongly encoded in UTF-8 (because 0xAE alone is not valid in UTF-8) or it's not UTF-8 to begin with.
TextPad changes nothing unless you save the file. Not looking at the encoding you use when saving the file is your fault - why does your own document class "B09" use a default encoding of UTF-8 instead of ANSI or DOS?
In a previous response I stated: "Also, I started using UTF-8 encoding because I got tired of TextPad complaining that the characters weren't ANSI. Using UTF-8 encoding stopped that." I'm pretty sure the complaint was due to the characters being a single byte and not an integer value, based on what you said about C2 being a valid character in a previous response to me: "U+00AE and U+00BE are correctly encoded as 0xC2 0xAE and 0xC2 0xBE in UTF-8." However, using UTF-8 got rid of the "not ANSI" message I was getting.
WayneCa wrote: ↑Thu Oct 19, 2023 4:52 pm
Looking at the file in a hex editor shows that the characters are saved in the file as 0xC2AE and 0xC2BE, I can strip the C2s from the file, but to ensure they do not return I save the files as UTF-8.
I have no idea how you can remotely achieve what you wrote: if you break UTF-8 encoding (removing one byte per character) then you're back at step 1 (original file/text encoding) and TextPad will, as per UTF-8, again save both characters with 2 bytes each.
Have you even tried opening the file in a running TextPad instance? Why not attaching examples of your files to your post so we have a chance to reproduce your issue?
I created this file, having exactly 0xAE 0xBE. Opens just fine in TextPad 8.4.2, even recognizing it as ANSI. Pressing F12 to save it with a different filename, and upon inspection it has no UTF-8 encoding either - both files are identical byte wise. Line breaks are Windows, tho, but that shouldn't matter:
before.txt
after.png
Yes, using ANSI does leave the characters as they were originally. But the editor complains about them not being ANSI and I don't want to keep seeing that message. Is there a way to get that message to stop being displayed?
Also, my other question has not been addressed. It seems the only time the B09 document type is applied to a document is when the file hasn't been saved before. Once it has been saved it always shows up as Custom, whether I use the B09 or b09 extension. Are the two synonymous, and how can I get TextPad to see them as B09 files instead of Custom files?
As my previous question concerning document class assignment has not been answered, and since I also now have a new issue concerning document encoding on the same class, I thought I would post images here to provide detail of what I am asking.
What the encoding is set as in the configuration preferences:
Prefs1.png
What the document preferences shows:
Prefs2.png
What class the document preferences shows the file as being:
Prefs3.png
Note that the document preferences will not allow me to change the encoding, even though I re-saved the file (using save as) and ensured the encoding was set to ANSI. Note also that the document preferences show the document class as Custom rather than B09.
Last edited by AmigoJack on Tue Jul 16, 2024 7:35 pm, edited 1 time in total.
Reason:attaching files instead of relying on other hoster; list formatting
AmigoJack wrote: ↑Mon Oct 23, 2023 1:06 am
Why not attaching examples of your files to your post so we have a chance to reproduce your issue?
You didn't address that issue either. Even when making your post you did not use the "Attachments" tab to upload your pictures to this board - so most likely you don't know about it to begin with. If we had your text file(s) as binary safe attachment we could at least try to reproduce what you claim and what none of us can remotely imagine.
The file properties' default encoding being displayed is irrelevant, because you choose the text encoding when saving the file ("Save as..." dialog - please also start using quotation marks for captions and names) - keep an eye on which line ending and text encoding is displayed there and if you even need to adjust it.
I didn't know I was supposed to attach any files. I can provide the file named in the images.
Just because the encoding shown in the document properties is "irrelevant" doesn't mean it should be incorrect. It should reflect whatever the file's encoding is. I consider that an error.
I'm not sure what you are driving at with your last statement. I provided 3 images. If you are referencing the text I placed before each image, I'm not understanding how quotes would make it different, or better.
Anyway, I am attaching the file in question, but I'm not sure how to go about providing you with a copy of the B09 document class.
I had to rename the file to SOKOBAN_B09.txt before I could attach it. It will need to be renamed SOKOBAN.B09 before you can do anything with it.
I have taken screenshots of "Configure->Preferences->Document Classes->B09". I will also include the "b09.syn" file to make it complete. The other sections are immaterial as I didn't edit them.
Config1.png
Config2.png
Config3.png
As with the previous file, I had to rename the syntax file to b09_syn.txt in order to attach it. It will need to be renamed b09.syn before using it.
bbadmin wrote: ↑Wed Jul 17, 2024 8:05 am
I think you must be using .editorconfig files, as that results in a custom document class.
I don't even know what .editorconfig files are. I've never purposely used one.
This is what mine looks like now. I've edited the file since uploading a copy here, so the numbers are different, but the encoding and class are what matters.
Prefs4.png
And the Preferences tab under the document preferences shows:
Config4.png
If you can tell me how to access .editorconfig files I can check and see if there is one that I wasn't aware of.
Last edited by AmigoJack on Wed Jul 17, 2024 4:20 pm, edited 1 time in total.
Reason:attaching picture files instead of relying on other hoster; shortening full quote to relevant part
OK, it's strange. I closed out the file and reopened it. Now it shows ANSI and class B09 in the Document Preferences Document tab and ANSI encoding in the Document Preferences Preferences tab. I did the same thing yesterday and it didn't change anything.
WayneCa wrote: ↑Tue Jul 16, 2024 10:04 pm
I'm not understanding how quotes would make it different, or better.
It's in phrases like these (colored blue) where quotation marks would make it more exact, instead of leaving the reader to guess where a caption or name starts and how far it goes:
WayneCa wrote: ↑Tue Jul 16, 2024 6:54 pm
I re-saved the file (using save as) ... show the document class as Custom rather than B09.
Also please start attaching your pictures instead of hosting them elsewhere - keep an eye on all the edits I made to your posts (you see that remark at the bottom of each post). Also there's no need to full quote a whole post - just use the "Reply" button at the bottom or reduce quotes to relevant parts (as you did before).
OK, I will start attaching my images instead of hosting them on my website. I understand what you are saying about the quotes, so I will try to remember that.
On the "Custom" class, the class shown in Document Preferences Document tab was "B09" for about 5 minutes. Then it reverted back to "Custom". I looked up "editorconfig" in the help, learned where it was and found it checked (I never checked that box myself), so I unchecked it. I quit the editor and relaunched it, but it made no difference. I'll try closing and opening the file again, but I have little faith that it will make a difference.
It changes to "B09" until I switch to a different document. When I come back to the document in question it has reverted to "Custom".
There are only two ways that a document can have a custom document class:
1. If its settings are overridden by .editorconfig files.
2. If you change its settings via its Properties dialog box.
The first one can be disabled by unchecking "Enable .editorconfig" on Configure » Preferences » Editor.
In the second case, those changes only persist if the document is open in a workspace.
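To check whether an .editorconfig file is involved, one option is to look for one in the document's folder and in every folder above it, since the EditorConfig convention is to search parent directories. The Python sketch below does that; the function name `find_editorconfig_files` and the example path are hypothetical, and whether TextPad's lookup matches this exactly is an assumption.

```python
from pathlib import Path


def find_editorconfig_files(document: Path) -> list[Path]:
    """List every .editorconfig in the document's folder and its parent folders.

    Hypothetical helper for illustration; results are ordered nearest folder first.
    """
    start = document.resolve().parent
    found = []
    for folder in [start, *start.parents]:
        candidate = folder / ".editorconfig"
        if candidate.is_file():
            found.append(candidate)
    return found


# Hypothetical path; replace it with the real location of the document.
for path in find_editorconfig_files(Path("SOKOBAN.B09")):
    print(path)
```

If nothing is printed, no .editorconfig file sits anywhere above the document, which would point to the second cause in the list above.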