macro with clipping?

General questions about using TextPad

aznap
Posts: 6
Joined: Wed Jul 02, 2008 10:56 am

macro with clipping?

Post by aznap »

As much as I love TextPad, I prefer reading Gutenberg books formatted.

I have been chugging along, tagging a few texts from Gutenberg for formatting, using the Clip Library to make sure I don't forget any of those pesky closing tags.

Often these texts show emphasis either through ALL CAPS or _faux_ underlining, which I change to more reader-friendly emphasis also using the Clip Library.

While the "underlined" text can be dealt with quite easily using a single search/replace for the whole book, I'm finding that changing the all-caps text is a bit more tedious.
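For readers following along outside TextPad, the single search/replace for _faux_ underlining could look roughly like this in Python. This is only a sketch: the pattern, the function name `italicize_underlined`, and the `<i>` tags are assumptions for illustration, and it presumes underscores come in pairs on one line.

```python
import re

# Hypothetical pattern: text between two underscores on the same line.
UNDERLINED = re.compile(r"_([^_\n]+)_")

def italicize_underlined(text, open_tag="<i>", close_tag="</i>"):
    """Replace _word_ style emphasis with tags (tags are an assumption)."""
    return UNDERLINED.sub(lambda m: f"{open_tag}{m.group(1)}{close_tag}", text)

result = italicize_underlined("a _faux_ underline and _two words_ here")
print(result)
```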

First, I have been having trouble coming up with a single regular expression that finds (a) every single all-caps word, whether it is followed by a space, punctuation, or a hard return, and (b) every group of all-caps words, under the same conditions. (I know this isn't the right forum for that problem; I'm still playing around with it and hope to figure out the solution on my own.)

Second -- and finally we get to the subject line -- after finding the all-caps words that need different emphasis, they need to be made either lower or title case as necessary, and the correct tags applied (either italic or bold, depending on the context).

I would love if it were possible to either

(preferably) -- make a macro that would both lowercase and then use a specific clipping

or if that is not possible (as I suspect)

to assign a keyboard shortcut to a specific clipping, so I am not jumping back and forth from the keyboard to the mouse constantly.

As I said, I have done a couple of these the hard way already -- meaning select the all caps words, then either Ctrl+L or Ctrl+Shift+U, as needed, and then mouse over to the Clip Library and double click on the correct clipping.

While the results are ok, I am pretty much determined to figure out a better way before I get stuck doing this to another long file.

I suppose another solution would be if I could create a macro with a hot spot like the clippings do, but I also suspect that is not possible in TextPad.

Any suggestions? Am I missing something?
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Re: macro with clipping?

Post by ben_josephs »

I don't fully understand the description of your problem, but these ideas may give you some pointers.

First, use POSIX syntax:
Configure | Preferences | Editor

[X] Use POSIX regular expression syntax
It makes the world go round a deal faster.

For all of this you must match case-sensitively:
[X] Match case
To find all fully-capitalised words you can use this regular expression:
\<[A-Z]+\>
or, if you want to exclude single-letter words, such as I, you can use
\<[A-Z]{2,}\>
To find all sequences of fully-capitalised words on a single line you can use something like this:
\<[A-Z]+\>([[:punct:][:space:]]+\<[A-Z]+\>)*
or, to exclude single-letter words standing alone,
\<[A-Z]{2,}\>|\<[A-Z]+\>([[:punct:][:space:]]+\<[A-Z]+\>)+
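The four expressions above can be tried outside TextPad with a rough Python translation. This is a sketch under assumptions: Python's `re` has no `\<`, `\>`, `[[:punct:]]` or `[[:space:]]`, so `\b`, `string.punctuation`, and space/tab stand in for them, and all variable names are made up for illustration.

```python
import re
import string

PUNCT = re.escape(string.punctuation)   # stand-in for [[:punct:]]
SEP = rf"[{PUNCT} \t]+"                  # punctuation, spaces, tabs; never a newline

all_caps_word = re.compile(r"\b[A-Z]+\b")
two_plus = re.compile(r"\b[A-Z]{2,}\b")
caps_run = re.compile(rf"\b[A-Z]+\b(?:{SEP}\b[A-Z]+\b)*")
# Python tries alternatives left to right, so the multi-word branch is
# listed first here; otherwise "HER ENEMIES" would stop at "HER".
caps_run_no_lone = re.compile(rf"\b[A-Z]+\b(?:{SEP}\b[A-Z]+\b)+|\b[A-Z]{{2,}}\b")

sample = "I said her motive was INTEREST not ATTACHMENT; from HER ENEMIES on HER OWN ACCOUNT."
words = all_caps_word.findall(sample)
runs = caps_run.findall(sample)
runs_no_lone = caps_run_no_lone.findall(sample)
print(runs_no_lone)
```

Python's regex matching is case-sensitive by default, which corresponds to ticking "Match case" in TextPad.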

To make them all lower case and enclose them in <caps> tags (you don't say what the tags should look like), you can use this:
Find what: \<[A-Z]+\>([[:punct:][:space:]]+\<[A-Z]+\>)*
Replace with: <caps>\L\0</caps>
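As a cross-check of what `\L\0` does in that replacement, here is a minimal Python sketch. The function name `tag_caps` is hypothetical, and the character classes are approximations of the POSIX ones rather than TextPad's exact behaviour.

```python
import re
import string

PUNCT = re.escape(string.punctuation)   # stand-in for [[:punct:]]
# A run of all-caps words on one line, separated by punctuation/spaces/tabs.
RUN = re.compile(rf"\b[A-Z]+\b(?:[{PUNCT} \t]+\b[A-Z]+\b)*")

def tag_caps(text, open_tag="<caps>", close_tag="</caps>"):
    """Lower-case each all-caps run and wrap it in tags (like \\L\\0)."""
    return RUN.sub(lambda m: f"{open_tag}{m.group(0).lower()}{close_tag}", text)

tagged = tag_caps("her motive was INTEREST not ATTACHMENT; from HER ENEMIES")
print(tagged)
```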
You can't specify optional new lines using TextPad's weak regular expression recogniser (although you can with WildEdit: http://www.textpad.com/products/wildedit/). Perhaps the easiest way to handle sequences of upper-case words that span lines is to handle the individual lines first, as above, and then deal with the spans:
Find what: </caps>([[:punct:][:space:]]*)\n([[:punct:]]*)<caps>
Replace with: \1\n\2
If you want to untag occurrences of tagged I on its own, use
Find what: <caps>i</caps>
Replace with: I
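The two clean-up steps (joining tagged runs that meet at a line break, then restoring a lone I) can be sketched in Python as follows. The helper names `merge_spans` and `untag_lone_i` are hypothetical, and the punctuation class is an approximation.

```python
import re
import string

PUNCT = re.escape(string.punctuation)   # stand-in for [[:punct:]]
# A closing tag, optional punctuation/space, a newline, optional punctuation,
# then an opening tag: the two tagged runs are really one run across lines.
SPAN_BREAK = re.compile(rf"</caps>([{PUNCT} \t]*)\n([{PUNCT}]*)<caps>")

def merge_spans(text):
    """Drop the inner tag pair so the run spans the line break."""
    return SPAN_BREAK.sub(r"\1\n\2", text)

def untag_lone_i(text):
    """Turn a tagged single-letter 'i' back into the pronoun."""
    return text.replace("<caps>i</caps>", "I")

tagged = "from <caps>her own</caps>\n<caps>account</caps>, if <caps>i</caps> may"
cleaned = untag_lone_i(merge_spans(tagged))
print(cleaned)
```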
Does this solve part of your problem?

As for the rest:
aznap wrote:they both need to be made either lower or title case as necessary, and made the correct tags applied (either italic or bold, depending on the context).
Can you explain this more clearly?
aznap
Posts: 6
Joined: Wed Jul 02, 2008 10:56 am

Re: macro with clipping?

Post by aznap »

ben_josephs wrote:Does this solve part of your problem?
I think it nearly gets me there (see below)
ben_josephs wrote:As for the rest:
aznap wrote:they both need to be made either lower or title case as necessary, and made the correct tags applied (either italic or bold, depending on the context).
Can you explain this more clearly?
Sorry for the first rambling question.

I don't think a search/replace by itself can do what I need, since some of the text needs to change from all caps to lower case, other text needs to change from all caps to title case, and still other all-caps phrases need to become mostly lower case except for a few proper nouns. I think these decisions cannot be automated.

That being said, a regular expression search that will simply find all caps words or phrases would help quite a bit.

After thinking about it some more, I don't think I need to be able to find all caps phrases that go from one line to another, since I have been doing these searches after removing all hard returns except between paragraphs.

I have been using PML to tag the text so it can be read in eReader, but the specific markup doesn't matter, it could just as easily be HTML.

I don't have it set up as POSIX, but continuing from your example I have found that:
Find what: \(\<[A-Z[:space:][:punct:]]+[A-Z]+\>\)
Replace with: <i>\L\1</i>
(or for PML -- Replace with: \\i\L\1\\i)
This both italicizes and lower-cases single words and phrases alike.

From this point I think it would be easy enough to then search for <i> (or \i in PML) and Ctrl+F to manually cap those words that need it with Ctrl+Shift+U.

Thus this mess from Common Sense:
Alas, we have been long led away by ancient prejudices, and made large sacrifices to superstition. We have boasted the protection of Great Britain, without considering, that her motive was INTEREST not ATTACHMENT; that she did not protect us from OUR ENEMIES on OUR ACCOUNT, but from HER ENEMIES on HER OWN ACCOUNT, from those who had no quarrel with us on any OTHER ACCOUNT, and who will always be our enemies on the SAME ACCOUNT. Let Britain wave her pretensions to the continent, or the continent throw off the dependance, and we should be at peace with France and Spain were they at war with Britain. The miseries of Hanover last war ought to warn us against connexions.

It hath lately been asserted in parliament, that the colonies have no relation to each other but through the parent country, i. e. that Pennsylvania and the Jerseys, and so on for the rest, are sister colonies by the way of England; this is certainly a very round-about way of proving relationship, but it is the nearest and only true way of proving enemyship, if I may so call it. France and Spain never were, nor perhaps ever will be our enemies as AMERICANS, but as our being the SUBJECTS OF GREAT BRITAIN.
after the two steps becomes:
Alas, we have been long led away by ancient prejudices, and made large sacrifices to superstition. We have boasted the protection of Great Britain, without considering, that her motive was \iinterest\i not \iattachment\i; that she did not protect us from \iour enemies\i on \iour account\i, but from \iher enemies\i on \iher own account\i, from those who had no quarrel with us on any \iother account\i, and who will always be our enemies on the \isame account\i. Let Britain wave her pretensions to the continent, or the continent throw off the dependance, and we should be at peace with France and Spain were they at war with Britain. The miseries of Hanover last war ought to warn us against connexions.

It hath lately been asserted in parliament, that the colonies have no relation to each other but through the parent country, i. e. that Pennsylvania and the Jerseys, and so on for the rest, are sister colonies by the way of England; this is certainly a very round-about way of proving relationship, but it is the nearest and only true way of proving enemyship, if I may so call it. France and Spain never were, nor perhaps ever will be our enemies as \iAmericans\i, but as our being the \isubjects of Great Britain\i.
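The first (automatic) step of that transformation can be reproduced in Python with a rough equivalent of the regex above. This is a sketch under assumptions: the lookarounds stand in for `\<` and `\>`, `string.punctuation` plus space/tab stand in for the POSIX classes, and `to_pml_italic` is a made-up name. Proper nouns still need re-capitalising by hand afterwards.

```python
import re
import string

PUNCT = re.escape(string.punctuation)
START = r"(?<!\w)(?=\w)"   # emulates \< (start of word only)
END = r"(?<=\w)(?!\w)"     # emulates \> (end of word only)
PHRASE = re.compile(START + rf"[A-Z{PUNCT} \t]+[A-Z]+" + END)

def to_pml_italic(text):
    """Lower-case each all-caps word or phrase and wrap it in PML \\i ... \\i."""
    return PHRASE.sub(lambda m: "\\i" + m.group(0).lower() + "\\i", text)

before = "her motive was INTEREST not ATTACHMENT; as AMERICANS, but as the SUBJECTS OF GREAT BRITAIN."
after = to_pml_italic(before)
print(after)
```

Plain `\b` is avoided at the start because in Python it also matches at the end of a word, which would let a match begin on the space before the capitals.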
This still doesn't quite get me where I wanted to be, but it is far, far better than where I was before. No kind of macro was going to be able to recognize and capitalize proper nouns for me in any case.

What I was hoping for was to make a couple macros which I could define keyboard shortcuts for, so one would make selected text lower case and italicized and another would make selected text lower case and bold.

ISTM that being able to use clippings either with a macro or a keyboard shortcut would be useful for other things as well.

No worries. This works.

Thanks!
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

aznap wrote:That being said, a regular expression search that will simply find all caps words or phrases would help quite a bit.
I gave you one.
aznap wrote:I don't have it set up as POSIX, but continuing from your example I have found that:
Find what: \(\<[A-Z[:space:][:punct:]]+[A-Z]+\>\)
Replace with: <i>\L\1</i>
(or for PML -- Replace with: \\i\L\1\\i)
Both italicizes and makes lower case both single words and phrases.
as does my suggestion.

Your regex excludes single-letter words standing alone. Otherwise it's equivalent to the one I suggested. Note that you don't need the outer parentheses in the regex; you can use \0 in the replacement expression to represent the entire matched text. Your regex is equivalent to
\<[A-Z[:space:][:punct:]]{2,}\>
which is simpler and clearer.
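A quick spot check of that equivalence in Python, on a few sample strings. It is not a proof, and the translation rests on assumptions: lookarounds emulate `\<` and `\>`, and `string.punctuation` plus space/tab emulate the POSIX classes. The samples avoid `_`, which Python counts as a word character.

```python
import re
import string

PUNCT = re.escape(string.punctuation)
START = r"(?<!\w)(?=\w)"   # emulates \<
END = r"(?<=\w)(?!\w)"     # emulates \>

original = re.compile(START + rf"[A-Z{PUNCT} \t]+[A-Z]+" + END)
simpler = re.compile(START + rf"[A-Z{PUNCT} \t]{{2,}}" + END)

samples = [
    "her motive was INTEREST not ATTACHMENT;",
    "from HER ENEMIES on HER OWN ACCOUNT, from",
    "if I may so call it",
    "the SUBJECTS OF GREAT BRITAIN.",
]
results = [(original.findall(s), simpler.findall(s)) for s in samples]
for first, second in results:
    print(first == second, first)
```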
aznap
Posts: 6
Joined: Wed Jul 02, 2008 10:56 am

Post by aznap »

ben_josephs wrote:as does my suggestion.

Your regex excludes single-letter words standing alone. Otherwise it's equivalent to the one I suggested. Note that you don't need the outer parentheses in the regex; you can use \0 in the replacement expression to represent the entire matched text. Your regex is equivalent to
\<[A-Z[:space:][:punct:]]{2,}\>
which is simpler and clearer.
You are right, yours works just as well and is a simpler expression.

I admit I didn't try that one because I was afraid it would capture trailing spaces or punctuation, which I didn't want. I didn't take into account that the end-of-word marker would avoid that unwanted error.
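The effect of the end-of-word marker can be seen by running the same pattern with and without it. A minimal Python sketch, assuming lookarounds as stand-ins for `\<` and `\>` and `string.punctuation` plus space/tab for the POSIX classes:

```python
import re
import string

PUNCT = re.escape(string.punctuation)
START = r"(?<!\w)(?=\w)"   # emulates \<
END = r"(?<=\w)(?!\w)"     # emulates \>
BODY = rf"[A-Z{PUNCT} \t]{{2,}}"

anchored = re.compile(START + BODY + END)
unanchored = re.compile(START + BODY)

sample = "on HER OWN ACCOUNT, but"
with_end = anchored.search(sample).group(0)
without_end = unanchored.search(sample).group(0)
print(repr(with_end), repr(without_end))
```

Without the end marker the greedy class swallows the comma and space; with it, the match has to back off until it ends on a letter.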

Thanks again. Learning to use a tool better makes it a better tool.