RegEx for quote check in British typesetting

Posted: Wed May 01, 2013 7:56 pm
by geoffreykidd
Unlike us Yanks, the Brits use single quotes to delimit dialogue. This poses a challenge for detecting unbalanced quotation marks in their books. The following two paragraphs illustrate both a line with genuinely unbalanced quotes and one where the word "I'll" contains a single quote that should be disregarded.

‘’Yo, Emily,’ she nodded, passing the departmental secretary.

‘Hi! With you in a sec.’ Emily lifted her finger from the ‘mute’ button, went back to glassy-eyed attention. ‘Yes, I’ll send them up as soon as –’

The first paragraph has unbalanced quotes. I'd like to detect and bookmark such lines. I don't mind if the convention of writing a possessive as s followed by a single quote (e.g. James' dog) creates false positives.

Any help appreciated.

Posted: Wed May 01, 2013 10:13 pm
by ben_josephs
Does this do it?
^(?!(?:‘(?:[a-z][‘’][a-z]|[^‘’\r\n])*+’|(?:[a-z][‘’][a-z]|[^‘’\r\n]))*$).*

Yes.
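
For anyone who wants to try the same check outside the editor, here is a minimal Python sketch of the expression above. It rests on a few assumptions that are not stated in the thread: the search is case-insensitive (otherwise [a-z] would not cover the capital I in "I’ll"), and the possessive *+ is emulated with a lookahead plus backreference because Python versions before 3.11 do not accept possessive quantifiers. The names UNIT, UNBALANCED and unbalanced_lines are made up for illustration.

```python
import re

# One harmless unit: an apostrophe inside a word (I’ll), or any non-quote character.
UNIT = r"(?:[a-z][‘’][a-z]|[^‘’\r\n])"

# A line is "balanced" if it is made only of ‘...’ spans and harmless units.
# (?=(UNIT*))\1 stands in for the possessive UNIT*+ of the original pattern:
# the lookahead grabs the longest run once and the backreference consumes it,
# so the engine cannot back off and treat the apostrophe in I’ll as a closer.
UNBALANCED = re.compile(
    r"^(?!(?:‘(?=(" + UNIT + r"*))\1’|" + UNIT + r")*$).*",
    re.IGNORECASE,  # assumption: the original search was not case-sensitive
)


def unbalanced_lines(text):
    """Return the 1-based numbers of lines whose single quotes do not pair up."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if UNBALANCED.match(line)
    ]


sample = "\n".join([
    "‘’Yo, Emily,’ she nodded, passing the departmental secretary.",
    "‘Hi! With you in a sec.’ Emily lifted her finger from the ‘mute’ button, "
    "went back to glassy-eyed attention. ‘Yes, I’ll send them up as soon as –’",
    "James’ dog barked.",
])

print(unbalanced_lines(sample))  # → [1, 3]
```

Line 1 is the genuinely unbalanced paragraph. Line 3 is the expected possessive false positive, since the quote after the s is not followed by a letter.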

Posted: Wed May 01, 2013 10:37 pm
by geoffreykidd
I get a fair number of false positives, but then, I do for American-style typesetting, too. :)

This is gorgeous and will be a great timesaver.

You, sir, are a wizard, and I thank you.