Just as an opening bracket "(" needs to be paired with a closing bracket ")", I need similar functionality for quote marks.
My file consists of lines like this:
question <tab> answer
question <tab> answer
question <tab> answer
On each line, I want to highlight any quote mark that is not part of an open/close pair.
I am dealing with a text file format that is intolerant of unclosed quote marks: on each line, every opening quote mark must have a matching closing quote mark.
At present I am just highlighting all quote marks of every type and checking them manually.
regex to find unmatched quote marks
-
- Posts: 2461
- Joined: Sun Mar 02, 2003 9:22 pm
If you search using this:
Code: Select all
^[^"\n]*\K"(?=(([^"\n]*"){2})*[^"\n]*$)
TextPad will find the first double quote on the next line containing unmatched double quotes.
If you search using this:
Code: Select all
^[^"\n]*(("[^"\n]*){2})*\K"(?=[^"\n]*$)
TextPad will find the last double quote on the next line containing unmatched double quotes.
With either of them, if you Mark All, TextPad will mark all lines containing unmatched double quotes.
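For checking a file outside TextPad, the same idea (a line is bad when it contains an odd number of double quotes) can be sketched in Python. This is only an illustration and not part of the answer above; the function name `lines_with_unmatched_quotes` and the sample text are made up. For each offending line it reports the columns of the first and the last double quote, which are the characters the two searches above would land on.

```python
def lines_with_unmatched_quotes(text, quote='"'):
    """Yield (line_number, first_col, last_col) for lines with an odd quote count.

    Hypothetical helper for illustration; line numbers are 1-based,
    columns are 0-based.
    """
    for number, line in enumerate(text.splitlines(), start=1):
        if line.count(quote) % 2 == 1:
            # An odd count means at least one quote has no partner.
            yield number, line.find(quote), line.rfind(quote)


# Made-up sample in the "question <tab> answer" layout.
sample = (
    'What is "regex"?\tA pattern language\n'
    'Who said "hello?\tNobody\n'
    'Plain question\tPlain "answer" with "one extra\n'
)

for number, first, last in lines_with_unmatched_quotes(sample):
    print(f"line {number}: first quote at column {first}, last at column {last}")
```

Counting is enough to flag a line; it cannot say which of the quotes is the stray one, which is also true of the regexes.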
-
- Posts: 34
- Joined: Sat Nov 03, 2007 3:04 am
thanks
ben_josephs, you are a prolific poster!
Many thanks, it works well.
I put the regex into an online tester and am stepping through its operation, trying to figure out how it works.
I will try replacing " with ' and running that regex to pick up more errors.
I'm presently just saving my TextPad regexes in a text file. Is there a better way to save and document these regexes?
Re: thanks
Quote:
an online tester
Don't be afraid to actually link the website you're using so others who read this get to know of it, too.
Quote:
I will try replacing " with ' and running that regex to pick up more errors.
Keep in mind that this only works if a single kind of quote character is used on the whole line. If you also want to detect apostrophes and double quotation marks mixed on one line, each with a count that is not a multiple of 2, then things will get complicated.
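One simple, admittedly limited, way to look at more than one quote character is to count each kind independently per line. The Python sketch below does only that; the function name `odd_quote_kinds` and the default set of characters are assumptions for illustration. It does not handle nesting of one kind inside the other, and an apostrophe used in a contraction such as "don't" will be reported as unmatched.

```python
def odd_quote_kinds(line, kinds=('"', "'")):
    """Return the quote characters that occur an odd number of times in line.

    Hypothetical helper: each kind is counted on its own, so nesting and
    contractions are not understood.
    """
    return [q for q in kinds if line.count(q) % 2 == 1]


for line in ['say "hi" and \'bye', '"a" and \'b\'', 'don\'t "stop']:
    print(repr(line), "->", odd_quote_kinds(line))
```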
Quote:
Is there a better way to save and document these regex?
Text files need discipline: if you want one expression on a line then strictly stick to it and don't indent it, as then nobody knows if leading/trailing whitespaces are part of it or not. That being said: do not use text file formats that come with their own syntax, requiring you to escape parts of your regex (i.e. XML, RTF...).
Code: Select all
^(?:(?:[^"\n]*"){2})*[^"\n]*\K"(?=[^"\n]*$)
^ for the line start anchor
(?: ... )* for any number of occurrences of
(?: ...){2} two occurrences of
[^"\n]*" which is any number of non-quote/non-linebreak chars followed by a quote
all this followed by
[^"\n]* any number of non-quote/non-linebreak chars
\K to exclude everything we found so far from the match
" the quote we want to find
(?=[^"\n]*$) any number of non-quote/non-linebreak chars till the line end as a lookahead
Thus, exactly the unmatched quote will be selected.
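For anyone who wants to step through this outside TextPad, here is a rough Python transcription of the same pattern. Python's standard `re` module does not support \K, so in this sketch the wanted quote is captured in group 1 instead; the names `LAST_UNMATCHED_QUOTE` and `last_unmatched_quote_positions` are made up for the example. The comments in the verbose pattern repeat the breakdown above.

```python
import re

# Same structure as the TextPad pattern, but with a capture group
# standing in for \K (an adaptation, not the original expression).
LAST_UNMATCHED_QUOTE = re.compile(
    r'''
    ^                          # line start anchor
    (?: (?: [^"\n]* " ){2} )*  # any number of pairs of quotes
    [^"\n]*                    # any number of non-quote/non-linebreak chars
    (")                        # group 1: the quote we want to find
    (?= [^"\n]* $ )            # no further quote until the line end
    ''',
    re.MULTILINE | re.VERBOSE,
)


def last_unmatched_quote_positions(text):
    """Return (line_number, column) of the last quote on each odd-count line.

    Line numbers are 1-based, columns are 0-based.
    """
    results = []
    for match in LAST_UNMATCHED_QUOTE.finditer(text):
        start = match.start(1)
        line_number = text.count("\n", 0, start) + 1
        column = start - (text.rfind("\n", 0, start) + 1)
        results.append((line_number, column))
    return results


print(last_unmatched_quote_positions('a "b" c\nd "e f\n"g" "h\n'))
```

Like the original, this reports the last quote on a line with an odd number of quotes; whether that particular quote is the one without a partner still has to be judged by eye.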
-
- Posts: 34
- Joined: Sat Nov 03, 2007 3:04 am
thanks all
@AmigoJack
That was the very website I used! Once I sort out how the regex works, I will try modifying it, as a learning experience. Thanks for the comments.
@MudGuard
Another example to learn from! Many thanks.
I am getting re-acquainted with TextPad, which I left years ago because of regex and Unicode weirdness. I will have more questions for you actual programmers later.
many thanks