Can I turn off multiline matching?

Posted: Wed Sep 23, 2020 10:43 pm
by ztodd
Can I turn off multiline matching for wildcards?

I.e., when I use .* in a regex, I want the .* to only match within one line.

Posted: Thu Sep 24, 2020 8:03 am
by AmigoJack
Can't reproduce in TextPad 8.4.2, and by regex default the dot matches anything but line breaks. How is it that you are having this problem at all?

Hint: using "Configure > Word Wrap" to display one original line as multiple virtual lines still doesn't make them multiple lines in reality - that's only a visual feature (as can be seen by the line numbers) and out of scope for regular expressions.

Posted: Thu Sep 24, 2020 8:54 am
by ztodd
Good question. I have Textpad 8.2.0 64-bit edition- about to upgrade it to see if that helps...

Oh-- I'm sorry- I did mis-state the problem a little bit. The issue is when I search for a negated character group- then it includes new lines. I.e., if my file is this :

ab
cd

and when I search for :

a[^x]*d

Then it selects both lines. I couldn't figure out how to also exclude new lines...

Never mind though- got it to work. I re-started Textpad, then tried searching for :

a[^x\n]*d

and that couldn't find a match.

I was sure I had tried that before restarting Textpad and it would still select both lines. Maybe restarting Textpad helped- or maybe I'm a little crazy.

Thanks. :)

Posted: Thu Sep 24, 2020 11:18 am
by AmigoJack
Again that is no TextPad related behaviour, but a default in regular expressions:
  • The dot matches anything but newlines.
  • Character classes match exactly what they list, or (when negated) everything they don't list, newlines included - that's why \n has to be added explicitly to the class, whether it should be included or excluded.
Example: using . is equivalent to [^\n\r]
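
For anyone who wants to see the difference outside TextPad, here is a minimal sketch using Python's re module as a stand-in. That is an assumption for illustration only: it is not TextPad's engine, and in Python the dot excludes only \n, not \r. The sample text is the two-line file from above.

```python
import re

# The two-line sample file from the question.
text = "ab\ncd"

# The dot does not cross the line break, so nothing is found.
dot_match = re.search(r"a.*d", text)
print(dot_match)

# A negated class excludes only what it lists (here: x),
# so it happily matches the newline and spans both lines.
negated_match = re.search(r"a[^x]*d", text)
print(repr(negated_match.group()))

# Adding \n to the negated class keeps the match on one line,
# so again nothing is found in this sample.
fixed_match = re.search(r"a[^x\n]*d", text)
print(fixed_match)
```

The second search is the only one that finds anything, and what it finds is the whole two-line text.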

Posted: Thu Sep 24, 2020 11:42 am
by ben_josephs
I think you meant that . (dot) is equivalent to [^\n\r] (any single character other than line feed or carriage return).
\p in a [...] character class means just p .

Posted: Thu Sep 24, 2020 1:06 pm
by AmigoJack
Oh yes, I meant \r instead of \p.

\p (and \P) are legit escape sequences for Unicode properties, if TextPad supported them.

Posted: Thu Sep 24, 2020 1:19 pm
by ben_josephs
TextPad does support them. The following are equivalent:
\pd
\p{digit}
[[:d:]]
[[:digit:]]
Each of them matches a single digit.

Posted: Thu Sep 24, 2020 8:54 pm
by AmigoJack
(this derails the topic, but)

I made my assumption from examples of something simple like \p{Pf} and \p{Greek} and \p{InHebrew}, which all gave me an error message about a wrong expression (instead of hinting it doesn't support the given property).

Neither the page I linked to nor Perl itself lists a property named "digit", and unsurprisingly \p{digit} fails to match 一�, but that's the whole point of Unicode properties: to match, e.g., numbers of all languages, not just to be another synonym for \d.
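
To illustrate the difference between "decimal digit" and "number in any script", here is a rough sketch in Python. It is an assumption-laden stand-in, not TextPad behaviour: Python's re has no \p{...} syntax, so the unicodedata module is used to show each character's general category, and Python's \d (for str patterns) is used as an example of a Unicode-aware digit class. The three sample characters are my own choice.

```python
import re
import unicodedata

# Sample characters (chosen for illustration):
samples = {
    "ascii_seven": "7",          # DIGIT SEVEN
    "arabic_three": "\u0663",    # ARABIC-INDIC DIGIT THREE
    "han_one": "\u4e00",         # CJK ideograph for "one"
}

# General category per character: Nd means "decimal digit".
categories = {name: unicodedata.category(ch) for name, ch in samples.items()}

# Does a Unicode-aware digit class match the character?
digit_hits = {name: bool(re.fullmatch(r"\d", ch)) for name, ch in samples.items()}

for name in samples:
    print(name, categories[name], digit_hits[name])
```

The Han character is a letter as far as its general category goes, so a digit class skips it even though it denotes a number; matching it needs a broader property than "digit".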