Search in files - wrong encoding for German Umlauts

haeb · Post by **haeb** » Mon Aug 26, 2013 7:52 pm

Hi all,

there seems to be a bug in search in files.

When searching for a word that contains German umlauts, TextPad does not find the word if the character set of the file is UTF-8. It makes no difference whether the file is saved with or without a BOM.

If the file's character set is ANSI, TP does find the word.

When searching for another word that sits near the umlaut word in the UTF-8 file, TP finds the other word but displays the umlaut word in the wrong encoding.
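The two symptoms above (no match, and a garbled umlaut in the results) are both consistent with the search working on raw bytes under an ANSI assumption. This is only a guess at the mechanism, not TextPad's actual code; the word "Müller" and the sample line are made up for illustration. A minimal Python sketch:

```python
# Illustration only: NOT TextPad's code, just one plausible mechanism.
word = "Müller"  # hypothetical search term containing an umlaut

utf8_needle = word.encode("utf-8")    # bytes as stored in a UTF-8 file
ansi_needle = word.encode("cp1252")   # bytes as stored in an ANSI (Windows-1252) file

line = "test Müller".encode("utf-8")  # hypothetical line read from a UTF-8 file

# A byte-wise search with the ANSI form of the needle misses the UTF-8 line,
# while the UTF-8 form of the same word matches.
found_with_ansi_needle = ansi_needle in line
found_with_utf8_needle = utf8_needle in line

# Decoding the UTF-8 bytes as Windows-1252 yields the garbled display.
garbled = line.decode("cp1252")

print(found_with_ansi_needle, found_with_utf8_needle, garbled)
```

Under this assumption, a plain ASCII word such as "test" still matches because its bytes are identical in both encodings, which would explain why the neighbouring word is found.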

If somebody wants to see my search results, I can send a screenshot.

Win7 x64 TP 7.0.9 German

Regards
Horst

haeb · Post by **haeb** » Mon Aug 26, 2013 7:54 pm

Hi all,

ADDITION

I meant "search in files" (Ctrl+F5, labeled STRG+F5 on German keyboards), not searching within a single file.

Regards

criss · Post by **criss** » Mon Sep 02, 2013 9:48 am

Hi,

Keywords in a syntax definition that contain German umlauts (like üäöß) are not highlighted in UTF-8 files.

haeb · Post by **haeb** » Thu Jan 16, 2014 2:09 pm

Hi all,

this is still a bug in 7.1.0

You can't "search in files" for a string which contains an umlaut.

Horst

haeb · Post by **haeb** » Tue May 06, 2014 12:58 pm

Hi all,

... in 7.2.0 still a bug...

You can't "search in files" for a string which contains an umlaut.

Horst

kengrubb · Post by **kengrubb** » Tue May 06, 2014 11:31 pm

Works fine with Find, Find Next, and Find Previous. Does not work with Find In Files.

I also found the problem when searching for € (ASCII Hex 80)

The problem did not occur when searching for the DEL character (ASCII Hex 7F)
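The split between hex 7F and hex 80 matches the point where single-byte Windows-1252 and UTF-8 stop agreeing. The following sketch assumes "ANSI" here means Windows-1252 (code page 1252); the helper name `same_bytes_in_both` is made up for this example:

```python
def same_bytes_in_both(ch: str) -> bool:
    """True if the character has identical bytes in Windows-1252 and UTF-8."""
    return ch.encode("cp1252") == ch.encode("utf-8")

# Characters mentioned in this thread: DEL (hex 7F), the euro sign, and an umlaut.
for ch in ["\x7f", "€", "ü"]:
    print(repr(ch), ch.encode("cp1252"), ch.encode("utf-8"), same_bytes_in_both(ch))
```

Everything up to hex 7F is plain ASCII and is stored as the same single byte in both encodings. From hex 80 upward, UTF-8 uses multi-byte sequences, so a needle encoded one way can no longer match a file encoded the other way.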

Win 7 64-bit
TP 7.2.0 64-bit

haeb · Post by **haeb** » Thu Jul 10, 2014 1:55 pm

In 7.3.0 it does NOT work as expected!

Sorry for this post (which I have now corrected)!

See my new post from today.

haeb · Post by **haeb** » Thu Sep 11, 2014 5:06 pm

New tests...

"Search in files" does NOT work for any file type that contains umlauts, except for ANSI files.

It is somewhat confusing, but I will try to explain what works and what doesn't.

I tested a folder containing 7 files covering all available file types. Every file has just one line, which describes the file:
ANSI
utf-8 without BOM
utf-8 with BOM
Unicode
Unicode/Big Endian
Unicode/Big Endian, without umlauts
Unicode, without umlauts

When searching for the word "test", which sits next to the umlaut word in the file, TP finds it and displays the whole line. The umlaut word is shown in the wrong encoding:
http://haeberlen.org/privat/tp/textpad_ ... auts_e.PNG

Searching for the word "test" should find 7 files, but TP finds only 4 occurrences. Unicode or Unicode/Big Endian files which contain umlauts are not in the list. But if a Unicode file does not contain umlauts, TP finds it.

Confusing - Test it yourself:
http://haeberlen.org/privat/tp/testfiles.zip
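For anyone who cannot fetch the archive, a comparable set of seven files can be generated and searched with an encoding-aware reference search, to see what the expected result looks like. The file names, the sample text and the detection order below are assumptions, not the contents of testfiles.zip, and "Unicode" is taken to mean UTF-16 little endian with a BOM:

```python
import codecs
import tempfile
from pathlib import Path

# (file name, encoding kind, contains umlaut); all names are hypothetical.
SPECS = [
    ("ansi.txt", "ansi", True),
    ("utf8_no_bom.txt", "utf8", True),
    ("utf8_bom.txt", "utf8-bom", True),
    ("unicode.txt", "utf16-le", True),
    ("unicode_be.txt", "utf16-be", True),
    ("unicode_be_plain.txt", "utf16-be", False),
    ("unicode_plain.txt", "utf16-le", False),
]

def encode(text: str, kind: str) -> bytes:
    """Encode text the way each file type is assumed to be stored on disk."""
    if kind == "ansi":
        return text.encode("cp1252")
    if kind == "utf8":
        return text.encode("utf-8")
    if kind == "utf8-bom":
        return codecs.BOM_UTF8 + text.encode("utf-8")
    if kind == "utf16-le":
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    if kind == "utf16-be":
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    raise ValueError(kind)

def decode(raw: bytes) -> str:
    """Detect the encoding by BOM, then try UTF-8, then fall back to cp1252."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("cp1252")

def search(folder: Path, needle: str) -> list[str]:
    """Return the names of files whose decoded text contains the needle."""
    return sorted(p.name for p in folder.iterdir() if needle in decode(p.read_bytes()))

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name, kind, umlaut in SPECS:
        text = "test Müller" if umlaut else "test Mueller"
        (folder / name).write_bytes(encode(text, kind))
    hits_test = search(folder, "test")
    hits_umlaut = search(folder, "Müller")

print(len(hits_test), hits_test)
print(len(hits_umlaut), hits_umlaut)
```

With these assumed files, a search that decodes each file before comparing reports all seven files for "test" and the five umlaut files for "Müller", which is the behaviour expected from Find In Files.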

Horst
