Community

Posted: **Thu Jan 07, 2021 2:46 pm**

Hello,

Windows 7, 64bit TP 8.50

I am working on a .txt file of a dictionary which I am converting to .htm and would like to have the first word of each main entry put into boldface.

I am using the TP Search and Replace tool, not Wild-Edit.

Main entries begin a new line.

The problem is that this tool thinks that Unicode characters are word breaks.

Any solutions?

Posted: **Thu Jan 07, 2021 8:16 pm**

What regex are you using to match words?
Please provide examples of words that this regex doesn't recognise as single words.

Posted: **Thu Jan 07, 2021 11:08 pm**

Mike Olds wrote: The problem is that this tool thinks that Unicode characters are word breaks.

So every character is treated as word break?
As every character is a Unicode character ...

Posted: **Fri Jan 08, 2021 11:21 pm**

Hello,

Sorry for the delay in responding. I do not get notices although I have it checked.

Sample lines:

:Akāca (adjective) [a + kāca] pure, flawless, clear D II 244; Snp 476; Ja V 203.

:Paricāreti [causative of paricarati]

I think I need to ask a different question, as the so-called regex I was using appears to capture only the first letter.

I was using (:\w)

All the relevant lines begin with :

What I meant by Unicode characters I realize now was too vague. I am speaking about the characters with diacriticals, including compound characters. And since my regex was unsuitable, there might not be any more of a problem than my ignorance.

So can you give me a regex that will capture the first word of a line?
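For readers unsure what "compound characters" means for `\w`, here is a minimal sketch in Python. It is only an illustration: Python's `re` module is not TextPad's regex engine, so TextPad may behave differently, and the helper name `word_chars` is made up for this example. It shows that a letter with a diacritic can be stored either as one precomposed code point or as a base letter plus a combining mark, and that in Python `\w` matches the former but not the bare combining mark.

```python
import re
import unicodedata

precomposed = "\u0101"   # "a with macron" stored as a single code point
decomposed = "a\u0304"   # "a" followed by COMBINING MACRON (two code points)

def word_chars(text):
    """For each code point in text, report whether \\w matches it (Python's re)."""
    return [bool(re.fullmatch(r"\w", ch)) for ch in text]

print(len(precomposed), word_chars(precomposed))
print(len(decomposed), word_chars(decomposed))

# NFC normalization folds the decomposed pair into the precomposed letter.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
```

If a file uses the decomposed form, normalizing it to NFC before searching is one way to keep a `\w`-based pattern from stopping in the middle of a word.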

Posted: **Fri Jan 08, 2021 11:28 pm**

Try
(:\w+)

Let us know whether that works.
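To see why the `+` matters, here is a small sketch in Python (an assumption for illustration only; TextPad uses its own regex engine, and the variable names are made up). Without a quantifier, `\w` matches exactly one word character, so the group holds just the colon and the first letter.

```python
import re

line = ":Akāca (adjective) [a + kāca] pure, flawless, clear"

# \w alone matches a single word character after the colon.
one_char = re.match(r"(:\w)", line).group(1)

# \w+ matches one or more word characters, i.e. the whole first word.
whole_word = re.match(r"(:\w+)", line).group(1)

print(one_char)
print(whole_word)
```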

Posted: **Fri Jan 08, 2021 11:36 pm**

Hello Ben,

Thank you for your quick response. Yes that seems to work.

I had just also found another which appears to work:

^(:\S+)

Posted: **Fri Jan 08, 2021 11:57 pm**

That matches sequences of any characters that aren't white space. Is that what you want?
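A rough sketch of the difference, again in Python rather than TextPad (so treat it as an approximation). The `<b>` replacement, the function name `bold_first_word`, and the sample line with a trailing comma are assumptions made up for this example, not taken from the dictionary file. `\S+` runs up to the next white space, so it also picks up punctuation attached to the word, while `\w+` stops at the first non-word character.

```python
import re

# Hypothetical sample line where punctuation follows the headword directly.
sample = ":Akāca, pure, flawless"

with_non_space = re.match(r"^(:\S+)", sample).group(1)   # includes the comma
with_word_chars = re.match(r"^(:\w+)", sample).group(1)  # stops before it

# Wrap the first word of every line that starts with ":" in <b>...</b>.
FIRST_WORD = re.compile(r"^(:\S+)", re.MULTILINE)

def bold_first_word(text):
    """Return text with the leading ':word' of each entry line in boldface."""
    return FIRST_WORD.sub(r"<b>\1</b>", text)

print(with_non_space)
print(with_word_chars)
print(bold_first_word(":Paricāreti [causative of paricarati]"))
```

Note that with the capture group as written the colon ends up inside the bold tags; moving it outside the parentheses would leave it unbolded.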

Posted: **Sat Jan 09, 2021 12:34 am**

Hello Ben,

Yes, sort of. There are some complications. But that seems to get what I need. (It captures 1, which is good.)

Thanks again for this help and for tolerating my ignorance. I'm getting way too old for this sort of thing, my mind just can't keep up.

Posted: **Sun Jan 10, 2021 3:21 pm**

Mike Olds wrote: Sorry for the delay in responding. I do not get notices although I have it checked.

Just visit this board regularly (e.g. every weekend) and look out for the "unread post" icon to easily spot activity that is yet unknown to you.

If you need an overview of all your own posts (to see topics you've created or participated in), just use the "Find all posts by Mike Olds" link in your public profile. You could bookmark it.


Capture first word on line: using Unicode
