[\x7F-\xFF] does not appear to work

Posted: Fri Feb 08, 2019 6:54 am
by kengrubb
I'm trying to use the Regex Range of [\x7F-\xFF], but it does not appear to work. I can find some characters in the range, but not everything.

Here are two specifics.

I cannot find the left single quote (‘).
ALT-0145 on the keyboard.

I can find the latin small letter y with diaeresis (ÿ).
ALT-0255 on the keyboard.

\xFF works.

\x91 does not work.

This is perplexing me.

Posted: Fri Feb 08, 2019 10:24 am
by ben_josephs
Internally, TextPad stores characters with their Unicode values.

In the Windows Latin-1 (CP 1252) character set the character LEFT SINGLE QUOTATION MARK has the value \x91. But in Unicode (and ISO 8859-1) the values in the range \x80..\x9F are control codes, not printable characters. In Unicode the value of LEFT SINGLE QUOTATION MARK is U+2018, which lies outside the range \x7F..\xFF, so [\x7F-\xFF] does not match it. You can search for it using [\x{2018}].
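
To illustrate the point about \x91 outside TextPad, here is a minimal Python sketch. It is not TextPad's own behaviour, just a demonstration using Python's standard codecs of how the same byte maps to different Unicode values depending on the character set. The variable names are made up for the example.

```python
import unicodedata

# The single byte 0x91, as it would be stored in an 8-bit text file.
raw = b"\x91"

# Decoded as Windows-1252 it becomes a printable quotation mark.
as_cp1252 = raw.decode("cp1252")

# Decoded as ISO 8859-1 it keeps the value 0x91, which is a control code.
as_latin1 = raw.decode("latin-1")

print(hex(ord(as_cp1252)), unicodedata.name(as_cp1252))
print(hex(ord(as_latin1)), unicodedata.category(as_latin1))
```

The first line printed shows the code point U+2018; the second shows that U+0091 is in the control-character category (Cc), so a search for \x91 in Unicode terms is looking for that control code, not the quote.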

You can search for all characters with Unicode values \x7F and above with [\x7F-\x{FFFF}].
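
The difference between the two ranges can be sketched in Python as well. This is only an approximation of the TextPad search: Python's re module does not accept the \x{FFFF} form, so the sketch assumes \uFFFF as the equivalent, and the sample text is invented for the example.

```python
import re

# Sample text containing curly quotes (U+2018, U+2019),
# i with diaeresis (U+00EF) and y with diaeresis (U+00FF).
text = "plain \u2018quoted\u2019 na\u00efve \u00ff"

# The original range stops at U+00FF, so it misses the curly quotes.
narrow = re.compile(r"[\x7F-\xFF]")

# The wider range covers every value from U+007F up to U+FFFF.
wide = re.compile(r"[\x7F-\uFFFF]")

narrow_hits = narrow.findall(text)
wide_hits = wide.findall(text)

print([hex(ord(c)) for c in narrow_hits])
print([hex(ord(c)) for c in wide_hits])
```

The narrow pattern reports only the two characters at or below U+00FF, while the wide pattern also reports both quotation marks.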

Posted: Sat Feb 09, 2019 8:23 pm
by kengrubb
That is fascinatingly strange. And it works. Much appreciated.

It appears I'm going to have to become much more connected with UTF-8 and Unicode and BOMs, oh my.

Posted: Sat Feb 09, 2019 8:25 pm
by kengrubb
This also works:

[\x{007F}-\x{FFFF}]