Search in files - wrong encoding for German Umlauts

General questions about using TextPad


haeb
Posts: 22
Joined: Wed Oct 27, 2010 4:53 pm

Search in files - wrong encoding for German Umlauts

Post by haeb »

Hi all,

There seems to be a bug in "search in files".

When searching for a word that contains German umlauts, TextPad does not find the word if the character set of the file is UTF-8. It makes no difference whether the file is saved with or without a BOM.

If the file's character set is ANSI, TP does find the word.

When searching for another word that sits near the umlaut word in the UTF-8 file, TP finds the other word but displays the umlaut word in the wrong encoding.
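The symptoms above are consistent with the search pattern being compared as ANSI bytes against the raw UTF-8 bytes of the file. That is only an assumption about the cause, not something confirmed for TextPad. A minimal Python sketch of that mismatch, using a hypothetical umlaut word and Windows-1252 as the ANSI code page:

```python
# Assumption: the pattern is encoded as ANSI (Windows-1252) and matched
# byte-for-byte against the file content. This is a guess at the cause,
# not TextPad's actual implementation.
word = "Pr\u00fcfung"  # hypothetical umlaut word ("Pruefung" with u-umlaut)

utf8_file = f"test {word}\n".encode("utf-8")
ansi_file = f"test {word}\n".encode("cp1252")
ansi_pattern = word.encode("cp1252")

# The umlaut is one byte (FC) in Windows-1252 but two bytes (C3 BC) in UTF-8.
found_in_ansi = ansi_pattern in ansi_file
found_in_utf8 = ansi_pattern in utf8_file

# Showing the UTF-8 bytes as if they were ANSI yields the garbled display.
mojibake = utf8_file.decode("cp1252").strip()

print(found_in_ansi, found_in_utf8, mojibake)
```

Under this assumption the ANSI file matches, the UTF-8 file does not, and the neighbouring word "test" still matches because ASCII letters have the same bytes in both encodings.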

If somebody wants to see my search results, I can send a screenshot.

Win7 x64 TP 7.0.9 German

Regards
Horst
haeb
Posts: 22
Joined: Wed Oct 27, 2010 4:53 pm

Post by haeb »

Hi all,

ADDITION

I meant "search in files" (Ctrl+F5, labelled STRG+F5 on a German keyboard), not searching in a single file.

Regards
criss
Posts: 1
Joined: Mon Jun 24, 2013 6:58 am

the same problem with syntax definition

Post by criss »

Hi,

Keywords in a syntax definition are not highlighted in UTF-8 files when they contain German umlauts (like üäöß).
haeb
Posts: 22
Joined: Wed Oct 27, 2010 4:53 pm

Post by haeb »

Hi all,

This is still a bug in 7.1.0.

You can't "search in files" for a string that contains an umlaut.

Horst
haeb
Posts: 22
Joined: Wed Oct 27, 2010 4:53 pm

Post by haeb »

Hi all,

... still a bug in 7.2.0 ...

You can't "search in files" for a string that contains an umlaut.

Horst
kengrubb
Posts: 324
Joined: Thu Dec 11, 2003 5:23 pm
Location: Olympia, WA, USA

Post by kengrubb »

Works fine with Find, Find Next, and Find Previous. Does not work with Find In Files.

I also found the problem when searching for € (hex 80 in Windows-1252, which is outside ASCII).

The problem did not occur when searching for the DEL character (ASCII hex 7F).
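The boundary between hex 7F and hex 80 matches the point where UTF-8 stops being byte-identical to a single-byte ANSI code page. A small sketch, assuming Windows-1252 as the ANSI code page (the helper name is only illustrative):

```python
def same_bytes_in_utf8_and_ansi(ch: str) -> bool:
    """Return True if ch has identical bytes in UTF-8 and Windows-1252."""
    return ch.encode("utf-8") == ch.encode("cp1252")

# The euro sign is the single byte 80 in Windows-1252 but E2 82 AC in UTF-8.
euro_same = same_bytes_in_utf8_and_ansi("\u20ac")

# DEL (7F) is still in the ASCII range, so both encodings agree.
del_same = same_bytes_in_utf8_and_ansi("\x7f")

print(euro_same, del_same)
```

Any character at or below 7F encodes the same way in both, which would explain why only characters above that limit are affected.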

Win 7 64-bit
TP 7.2.0 64-bit
(2[Bb]|[^2].|.[^Bb])

That is the question.
haeb
Posts: 22
Joined: Wed Oct 27, 2010 4:53 pm

Post by haeb »

In 7.3.0 it does NOT work as expected!

Sorry for my earlier post (which I have now corrected)!

Please see my new post from today.
haeb
Posts: 22
Joined: Wed Oct 27, 2010 4:53 pm

Post by haeb »

New tests...

The "search in files" function does NOT work for any file type that contains umlauts, except for ANSI files.

It is somewhat confusing, but I will try to explain what works and what doesn't.

I tested a folder containing 7 files covering all available file types. Each file has just one line, which describes the file:
ANSI
utf-8 without BOM
utf-8 with BOM
Unicode
Unicode/Big Endian
Unicode/Big Endian without containing umlauts
Unicode without containing umlauts

When searching for the word "test", which sits next to the umlaut word in the file, TP finds it and displays the whole line. The umlaut word is shown in the wrong encoding:
http://haeberlen.org/privat/tp/textpad_ ... auts_e.PNG

Searching for the word "test" should find 7 files, but it finds only 4 occurrences. The Unicode and Unicode/Big Endian files that contain umlauts are not in the list. But if a Unicode file does not contain umlauts, TP does find it.

Confusing - Test it yourself:
http://haeberlen.org/privat/tp/testfiles.zip
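For anyone who cannot download the archive, a comparable set of seven files can be generated and searched with a short script. This is only a sketch: the file names and line contents are made up, "Unicode" is assumed to mean UTF-16 little endian with a BOM, and the search shown is an encoding-aware reference, not a reproduction of what TextPad does.

```python
import codecs
import tempfile
from pathlib import Path

def write_samples(folder: Path) -> None:
    """Write seven one-line files (names and contents are hypothetical)."""
    umlaut = "test \u00e4\u00f6\u00fc\n"
    plain = "test only\n"
    samples = {
        "ansi.txt": umlaut.encode("cp1252"),
        "utf8.txt": umlaut.encode("utf-8"),
        "utf8_bom.txt": codecs.BOM_UTF8 + umlaut.encode("utf-8"),
        "unicode.txt": codecs.BOM_UTF16_LE + umlaut.encode("utf-16-le"),
        "unicode_be.txt": codecs.BOM_UTF16_BE + umlaut.encode("utf-16-be"),
        "unicode_be_plain.txt": codecs.BOM_UTF16_BE + plain.encode("utf-16-be"),
        "unicode_plain.txt": codecs.BOM_UTF16_LE + plain.encode("utf-16-le"),
    }
    for name, data in samples.items():
        (folder / name).write_bytes(data)

def decode(data: bytes) -> str:
    """Decode by BOM first, then try UTF-8, then fall back to Windows-1252."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("cp1252")

def find_in_files(folder: Path, needle: str) -> list[str]:
    """Return the names of all files whose decoded text contains needle."""
    return sorted(
        p.name for p in folder.iterdir() if needle in decode(p.read_bytes())
    )

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    write_samples(folder)
    hits_test = find_in_files(folder, "test")
    hits_umlaut = find_in_files(folder, "\u00e4\u00f6\u00fc")

print(len(hits_test), len(hits_umlaut))
```

With the files decoded before matching, "test" is found in all seven files and the umlaut string in the five files that contain it, which is the result one would expect from "search in files".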

Horst