Hi, I encountered another problem in 8.1.2 (64Bit) related to UTF-8 encoding:
If I open a file with only one or two UTF-8 characters, the file is loaded as ANSI which leads to a broken character presentation. Even if the file open dialog is used and the charset is set to UTF-8 explicitly, the file is loaded as ANSI.
If the file has at least three UTF-8 characters, everything works fine.
Example with German umlauts:
ä => Ã¤
äö => Ã¤Ã¶
äöü => äöü
Best regards
Plasm
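The broken presentation described above is what you get when UTF-8 bytes are decoded with a single-byte "ANSI" code page. A minimal sketch to reproduce it outside the editor, assuming the ANSI code page is Windows-1252 (typical for a German Windows setup, but an assumption here); the helper name `misdecode_as_ansi` is made up for illustration:

```python
def misdecode_as_ansi(text: str, ansi_codec: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes as a single-byte code page.

    Assumption: "ANSI" means Windows-1252 (cp1252); other locales use
    different code pages and would show different garbage.
    """
    return text.encode("utf-8").decode(ansi_codec)


for sample in ("ä", "äö", "äöü"):
    # Each umlaut is two bytes in UTF-8, so it turns into two ANSI characters.
    print(sample, "=>", misdecode_as_ansi(sample))
```

This only shows what a wrongly detected file looks like; it says nothing about why the editor picks ANSI in the first place.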
Problem with 1 or 2 UTF-8 characters
To be clear: The problem occurs if there are only one/two UTF-8 characters amongst others.
Thus:
äbcdefghijklmnöpqrstuvwxyz => Ã¤bcdefghijklmnÃ¶pqrstuvwxyz (2x UTF-8)
äbcdefghijklmnöpqrstüvwxyz => äbcdefghijklmnöpqrstüvwxyz (3x UTF-8)
[No edit privilege, unfortunately]
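To check which side of the "two versus three" boundary a given file falls on, a small helper can count the characters that UTF-8 encodes with more than one byte. This is only a sketch for testing; the function name `count_multibyte_chars` is hypothetical and it assumes the input really is valid UTF-8:

```python
def count_multibyte_chars(data: bytes) -> int:
    """Count characters that need more than one byte in UTF-8."""
    # Raises UnicodeDecodeError if the bytes are not valid UTF-8.
    text = data.decode("utf-8")
    return sum(1 for ch in text if ord(ch) > 0x7F)


two = "äbcdefghijklmnöpqrstuvwxyz".encode("utf-8")
three = "äbcdefghijklmnöpqrstüvwxyz".encode("utf-8")
print(count_multibyte_chars(two))    # 2
print(count_multibyte_chars(three))  # 3
```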
- christiandittmann41
- Posts: 5
- Joined: Fri Jul 15, 2016 10:10 pm
- Location: Düsseldorf/Germany, La Nucia, Spain
one, to three utf chars
Hello!
I've tested your problem with Win10 and TP32. All is ok.
The error occurs only in the 64bit version.
So, the workaround is to use the 32bit version of TP.
Why do you think that you really need the 64bit version? This is ridiculous, no one edits such large files and in a dialog program speed is secondary...
So long
Christian, the kraut, from good old Germany
Re: one, to three utf chars
Thanks for this hint, although it doesn't make much sense that a build for a different platform should behave differently in its logic.
christiandittmann41 wrote: Why do you think that you really need the 64bit version?
Because the system is 64bit and every process not being 64bit needs to be adapted, hence running effectively slower.
christiandittmann41 wrote: no one edits such large files
I do (i.e. 2.6 GiB files) and I am someone.
christiandittmann41 wrote: in a dialog program speed is secondary
So the speed of your internet browser, your photo editor, your file manager and probably non-fullscreen games is not important to you either? I have my doubts.
Re: Problem with 1 or 2 UTF-8 characters
Plasm wrote:
Hi, I encountered another problem in 8.1.2 (64Bit) related to UTF-8 encoding:
If I open a file with only one or two UTF-8 characters, the file is loaded as ANSI which leads to a broken character presentation. Even if the file open dialog is used and the charset is set to UTF-8 explicitly, the file is loaded as ANSI.
If the file has at least three UTF-8 characters, everything works fine.
Example with German umlauts:
ä => Ã¤
äö => Ã¤Ã¶
äöü => äöü
Best regards
Plasm

My solution is to insert at the beginning of, for example, a PHP file:
<?php //éèÃÂ
...
?>
If the same sequence is placed too far from the beginning, the encoding could be incorrectly determined.
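To see how far from the beginning the third multi-byte character sits in a file, the bytes can be scanned for UTF-8 lead bytes. A rough sketch follows; the helper name `offset_of_nth_multibyte` and the sample characters "éèà" are made up for illustration, the input is assumed to be valid UTF-8, and how many bytes TextPad actually inspects is not known from this thread:

```python
from typing import Optional


def offset_of_nth_multibyte(data: bytes, n: int = 3) -> Optional[int]:
    """Return the byte offset of the n-th multi-byte UTF-8 character, or None."""
    seen = 0
    for offset, byte in enumerate(data):
        # Lead bytes of multi-byte sequences lie in 0xC2..0xF4;
        # continuation bytes (0x80..0xBF) are skipped.
        if 0xC2 <= byte <= 0xF4:
            seen += 1
            if seen == n:
                return offset
    return None


header = "<?php //éèà\n".encode("utf-8")
print(offset_of_nth_multibyte(header))  # 12

late = ("<?php\n" + "x" * 5000 + "éèà").encode("utf-8")
print(offset_of_nth_multibyte(late))    # 5010
```

A small offset means the comment trick puts the characters right at the start of the file; a large one is the "too far from the beginning" case.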
Problem still persists in 8.2.0 (64 Bit).
Test case:
- Create a new file
- Write: "äeiöu"
- Save the file as UTF-8 without BOM
- Close the file (or TextPad itself)
- Open the file by double-clicking on it, from the open dialog or via dragging it into TextPad (doesn't matter)
- Result: TextPad displays "Ã¤eiÃ¶u"
The file is saved correctly (tested with other editors). The problem occurs when opening the file.
If there are more than 2 UTF-8 characters, everything is fine. For example: "äeiöü" results in "äeiöü".
BTW: I saved the file as .txt. The Text document class has UTF-8 charset and no BOM as default settings, if that matters.
Best regards
Plasm
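To rule out the saving step, the test file can also be produced outside the editor. A minimal sketch, assuming Python is at hand; the file name `textpad-utf8-test.txt` and the helper `write_utf8_without_bom` are made up for illustration:

```python
import codecs
import tempfile
from pathlib import Path


def write_utf8_without_bom(path: Path, text: str) -> bytes:
    """Write text as UTF-8 without a BOM and return the bytes written."""
    # The "utf-8" codec never emits a BOM (that would be "utf-8-sig").
    data = text.encode("utf-8")
    path.write_bytes(data)
    return data


# Hypothetical location: the system temp directory.
sample = Path(tempfile.gettempdir()) / "textpad-utf8-test.txt"
raw = write_utf8_without_bom(sample, "äeiöu")
print(sample)
print(raw.hex(" "))                     # c3 a4 65 69 c3 b6 75
print(raw.startswith(codecs.BOM_UTF8))  # False
```

Opening the resulting file in the editor then exercises only the load path from the test case.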
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
In https://forums.textpad.com/viewtopic.php?t=13253 I suggested:
    Save your session in a workspace and open the file by opening the workspace.
Is that a suitable solution?
(I use workspaces for all my non-transient editing work.)