Hi all,
When 7.0.0 was introduced, "search in files" for an umlaut did not work at all. Some time later, in version 7.4.0 or 7.5.0, it worked correctly. Thank you for that!
Now there is another problem with umlauts in 8.0.0, which is a much bigger problem for me.
Some UTF-8 files which were displayed and saved correctly in 7.6.0 suddenly appeared in 8.0.0 as ANSI files. So the umlauts were destroyed, and with them the files.
I do not know which files were correctly recognized as UTF-8 and which were wrongly recognized as ANSI, because only some of them were affected. All of these files are UTF-8 files in 7.6.0, and all were saved without a BOM.
At the moment TP 8.0.0 is unusable for me because it destroyed some of my files before I found this bug. So I switched back to 7.0.6.
Regards
Horst ... hoping a solution will be found soon
Umlauts - never ending story
Hi all,
It is different and more complicated. After intensive research I discovered the following:
All the files I am talking about were saved as UTF-8 files by TP7.
#First
There is a boundary at about 4010 characters, up to which TP looks for umlauts. So I have two test files: after4010.txt has its first umlauts after character 4010, and before4010.txt has its first umlauts before character 4010.
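For anyone who wants to reproduce this, the two test files could be generated with a short script. This is only a sketch: the helper names `make_test_file` and `first_non_ascii_offset`, the filler text, and the offsets 5000 and 100 are my own assumptions, chosen so that the first umlaut falls clearly after or before the boundary.

```python
from pathlib import Path

# Hypothetical filler and payload lines, chosen only for illustration.
ASCII_LINE = "-- plain ASCII filler line without any umlauts --\n"
UMLAUT_LINE = "Größe, Äpfel, Übung, Straße\n"


def make_test_file(path, filler_chars):
    """Write a UTF-8 file without BOM: ASCII filler first, then one umlaut line.

    The filler is at least `filler_chars` characters long.
    """
    repeats = -(-filler_chars // len(ASCII_LINE))  # ceiling division
    text = ASCII_LINE * repeats + UMLAUT_LINE
    # The plain "utf-8" codec never writes a BOM.
    Path(path).write_bytes(text.encode("utf-8"))


def first_non_ascii_offset(path):
    """Return the byte offset of the first non-ASCII byte, or -1 if there is none."""
    data = Path(path).read_bytes()
    for offset, byte in enumerate(data):
        if byte >= 0x80:
            return offset
    return -1
```

Calling `make_test_file("after4010.txt", 5000)` and `make_test_file("before4010.txt", 100)` should give two files that differ only in where the first umlaut appears, and `first_non_ascii_offset` can be used to confirm the position.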
#Second
How a file with umlauts is displayed depends on how the file is opened. There are at least 4 ways of opening a file:
1. "drag and drop" from explorer to TP
2. opening by clicking on a result of "search in files"
3. "menu > file > open"
4. "menu > file
- "
#Third
When after4010.txt is opened by type 1 or 2, TP8 displays the file as ANSI, replaces the umlauts with their hex values, and saves it as ANSI. When after4010.txt is opened by type 3 or 4, the umlauts are displayed correctly and TP saves the file correctly as UTF-8.
#Fourth
When after4010.txt is opened by type 2, then closed and opened again by type 4, the umlauts are also displayed wrongly. Even if TP8 is closed and restarted, reopening by type 4 still shows the wrong umlauts. I am always talking about the same file, with no saving, just opening and closing it. Only after opening it by type 3, closing it, and then opening it by type 4 are the correct umlauts displayed.
#Sixth
The before4010.txt file has no problems displaying umlauts with any method of opening a file.
#Seventh
Unlike TP8, TP7 does not treat after4010.txt files differently depending on which of the 4 opening types is used.
So in TP8 it depends on how I open an after4010.txt file whether the umlauts are displayed correctly or not.
I have many files like after4010.txt, because they are code files containing German comments or German interface text in some parts. So it can easily happen that there is no comment in the first 4010 characters, followed by several comments with umlauts.
Knowing about the differences between the opening types helps a little. But TP8 is still unusable for me until types 1 and 2 behave like types 3 and 4, especially type 2, which I use very heavily.
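Files of the after4010.txt kind could probably be located in advance with a small script. This is a rough sketch, not anything TextPad provides: the names `is_at_risk` and `find_at_risk_files` are made up, and the window of 4096 bytes is an assumption based on the boundary of roughly 4010 characters that I observed.

```python
from pathlib import Path

WINDOW = 4096  # assumed size of the detection window, in bytes
UTF8_BOM = b"\xef\xbb\xbf"


def is_at_risk(data, window=WINDOW):
    """True if `data` is BOM-less UTF-8 whose first non-ASCII byte lies beyond the window."""
    if data.startswith(UTF8_BOM):
        return False  # a BOM identifies the encoding, no guessing needed
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8, so a different problem
    # Pure ASCII inside the window, but non-ASCII somewhere after it.
    return data[:window].isascii() and not data.isascii()


def find_at_risk_files(root, pattern="*.txt", window=WINDOW):
    """Return all files under `root` matching `pattern` that look at risk."""
    return sorted(
        p for p in Path(root).rglob(pattern)
        if p.is_file() and is_at_risk(p.read_bytes(), window)
    )
```

For example, `find_at_risk_files("C:/projects", "*.php")` (the path and pattern are placeholders) would list the files that should be opened with an explicit encoding until the behaviour is fixed.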
Btw.
I wrote that "search in files" with umlauts had worked since TP 7.4.0. This was completely wrong. It does not work in 7.x.x, and it works 'a little' in 8.0.0. TP 7.x.x does not find any word that contains umlauts. TP 8.0.0 finds words containing umlauts in files of the before4010.txt type but not in files of the after4010.txt type.
Thank you for reading
Horst
If a file does not start with a BOM, TextPad reads the first 4 KB and uses heuristics to determine if it contains any UTF-8, UTF-16 or UTF-16BE characters. If none are found, the file is assumed to be in the default system code page. This behaviour can be overridden by selecting the encoding on the Open File dialog box, or by setting the default encoding for the corresponding document class.
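To illustrate why a file whose first umlaut comes after the 4 KB mark ends up as ANSI, here is a minimal sketch of this kind of detection. It is not TextPad's actual code: the function name `guess_encoding`, the exact window of 4096 bytes, and `cp1252` as the default code page are assumptions, and the UTF-16 heuristics for files without a BOM are left out.

```python
import codecs

WINDOW = 4096  # assumed detection window ("the first 4 KB")

BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(data, default="cp1252", window=WINDOW):
    """Guess the encoding of `data` from a BOM or from the first `window` bytes."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    head = data[:window]
    if head.isascii():
        # No non-ASCII byte in the window: nothing to base a guess on.
        return default
    try:
        # final=False tolerates a multi-byte sequence cut off at the window edge.
        codecs.getincrementaldecoder("utf-8")().decode(head, final=False)
    except UnicodeDecodeError:
        return default
    return "utf-8"
```

With this sketch, `guess_encoding("ä".encode("utf-8"))` returns `"utf-8"`, while `guess_encoding(b"a" * 5000 + "ä".encode("utf-8"))` returns `"cp1252"` because the only umlaut lies outside the window.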
I hope this helps.
It works!
TP 8.0.1 now does the following:
1. opens files from 'find in files' => the encoding defined by the document class is used
2. detects files when the search string in 'find in files' contains umlauts AND displays all umlauts correctly in the result list
3. opens files by double click => the encoding defined by the document class is used
TP 7.6.1 still has the same problems with umlauts. But I have now switched to TP8.
Regards
Horst