HTML syntax refuses to highlight characters

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

circeus
Posts: 2
Joined: Sat Nov 06, 2004 2:25 am

HTML syntax refuses to highlight characters

Post by circeus »

For some reason, some character entities refuse to highlight in my HTML files. Syntax highlighting itself is working, and characters from the HTML characters library highlight just fine.

However, any character that is not in this list will never highlight, despite beginning with & and ending in a semicolon.

The problem seems to be that entity strings in HTML documents are NOT parsed generically; only those on some list (wherever it is) are actually recognized.

This is really annoying. The only remedy I have found so far was to use the CommentStartAlt and CommentEndAlt variables to define strings instead.
sbb
Posts: 1
Joined: Mon Nov 08, 2004 5:44 pm

Post by sbb »

I have been frustrated by the same issue for quite a while. I think the highlightable entity codes are compiled into the executable and have nothing to do with the HTML library file at all.

Searching the 4.7.2 executable in a hex editor, I found a table of HTML character entities. I found several characters such as harr, larr, rarr, etc., but did not find mdash or ndash, ldquo, lsquo, rdquo, rsquo.

It would be nice if the grammar allowed a way to customize or add or delete from the list of highlightable keywords. Perhaps in 4.7.4? 4.8? =)
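
For anyone who wants to repeat the hex-editor check without scrolling through the binary by hand, here is a minimal Python sketch. It is only an illustration: the function name and the candidate list are made up for this example, and it assumes the names are stored as plain ASCII or UTF-16LE text, which may not be how any given build stores them. A hit can also be a false positive, since a short name may occur inside a longer string.

```python
def find_entity_names(data: bytes, names):
    """Report, for each entity name, whether its bytes occur anywhere in data.

    Checks both an ASCII and a UTF-16LE encoding of the name, because
    Windows executables may store strings either way (an assumption here).
    """
    result = {}
    for name in names:
        ascii_hit = name.encode("ascii") in data
        utf16_hit = name.encode("utf-16-le") in data
        result[name] = ascii_hit or utf16_hit
    return result


# Names taken from the post above: three reported as found, six as missing.
CANDIDATES = ["harr", "larr", "rarr",
              "mdash", "ndash", "ldquo", "lsquo", "rdquo", "rsquo"]

# Usage (the path is hypothetical; point it at your own copy of the program):
#     with open(r"C:\path\to\TextPad.exe", "rb") as fh:
#         print(find_entity_names(fh.read(), CANDIDATES))
```

A name that comes back False is a reasonable candidate for "not in the built-in table", though this does not prove how the table is actually used.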
circeus
Posts: 2
Joined: Sat Nov 06, 2004 2:25 am

Post by circeus »

Can one remedy this through a whole new syntax file NOT named "HTML"?

I just thought this might work, although ironically maybe not in HTML documents. I can't test it here, unfortunately.
BCMadsen
Posts: 11
Joined: Tue Mar 09, 2004 12:03 am
Location: Los Angeles, CA

This is still a problem in TextPad 5.1

Post by BCMadsen »

... and it's something I'd really like to see treated as a high-priority issue that needs to get fixed.

Can we please add the tokens that "sbb" mentions in his Nov 08, 2004 post (mdash, ndash, ldquo, lsquo, rdquo, rsquo) to the set of HTML special characters that TextPad's syntax highlighting knows and recognizes?

If this is something that needs to get compiled into TextPad's source code, then this is my request that we do so.

Alternatively, if the set of HTML special character tokens is in some syntax highlighting file so that we can edit that list ourselves, that would be good too.

I am happy no matter how it happens, but I really need some way to get TextPad to light up those tokens when they show up in my HTML.

Many thanks in advance for your attention to this request.
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

The characters that are highlighted are listed in the syntax file that you have selected.
You can open the syntax file and edit it yourself to add/delete more strings and characters.

From the TextPad Help:
How to Customize Syntax Highlighting
To customize syntax highlighting for a document class:

Choose Preferences from the Configure menu, and the Preferences dialog box will be displayed.
Click the "+" next to Document Classes.
Click the "+" next to the document class you want to modify.
Select Syntax.
Check "Enable syntax highlighting".
Select a suitable syntax definition file from the list.
Click Apply or OK.
Notes:
When selecting and stepping over words, the default set of characters in a word are letters, numbers and those used in keywords in the syntax definition file. If you need to include any others, such as a hyphen, type them in the box labelled "Other characters in words".
If a suitable syntax file is not on the list, check if one is available from http://www.textpad.com/add-ons.
To create your own syntax definition file, see Syntax Definition Files.
To open the selected syntax definition file in an edit window, click the Open button. However, any changes made will not take effect until all documents of that class have been closed. If you do change a file in the SYSTEM or SAMPLES folders, save it to your personal Syntax folder, so that it will not be overwritten by future upgrades.
----------------------------
Note the Help article has active links to the referenced sections, like Syntax Definition Files.
Hope this was helpful.............good luck,
Bob
BCMadsen
Posts: 11
Joined: Tue Mar 09, 2004 12:03 am
Location: Los Angeles, CA

Post by BCMadsen »

Bob Hansen wrote: The characters that are highlighted are listed in the syntax file that you have selected.
You can open the syntax file and edit it yourself to add/delete more strings and characters.
Although that sounds great in theory, it doesn't work in practice, because even the entities that DO highlight correctly aren't listed in that file.

The file in question is html.syn, installed by default in C:\Program Files\TextPad 5\system\html.syn. If you open that file, you'll see that none of the HTML special characters that DO get highlighted are listed there.

Where in that file do you see &quot;, for example? Or &cent;? Or &eacute;? Or &uuml;?

They aren't there. And yet they're being highlighted by TextPad when it encounters them. How does it know which tokens to highlight in this way?

But TextPad does NOT highlight other valid HTML characters, such as &ldquo; or &ndash;, and I can't see any reason why not.

I have noticed that these tags DO appear in htmlchar.tcl, installed by default in C:\Program Files\TextPad 5\Samples\htmlchar.tcl. Sadly, however, adding &ldquo; and &rdquo; to that file has no effect, so that's also not the place where these tags need to go.

Wherever that list is, I want to add to it, and neither html.syn nor htmlchar.tcl is the place to do it.

Although I see your point, and although I do know how to edit syntax highlighting files myself, this isn't a problem that can be solved in that way.

There's another solution, and from what I can tell, it's something that needs to get resolved with a source code change inside TextPad itself.

I'd love to learn that I'm wrong about that.
BCMadsen
Posts: 11
Joined: Tue Mar 09, 2004 12:03 am
Location: Los Angeles, CA

One more thing I forgot to point out

Post by BCMadsen »

Re-reading sbb's earlier post, I'm reminded that he or she stated:
sbb wrote: Searching the 4.7.2 executable in a hex editor, I found a table of HTML character entities. I found several characters such as harr, larr, rarr, etc., but did not find mdash or ndash, ldquo, lsquo, rdquo, rsquo.

It would be nice if the grammar allowed a way to customize or add or delete from the list of highlightable keywords. Perhaps in 4.7.4? 4.8? =)
This adds support for my suspicion that the list of HTML codes I'm looking for is actually compiled into TextPad and is not configurable by us, the end users, in those syntax files.

If those tokens are defined in source code, as appears to be the case, let's please augment that list to include ldquo and so on.
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

Where in that file do you see &quot;, for example? Or &cent;? Or &eacute;? Or &uuml;?
I don't have access to TextPad right now, but check the syntax file and see if there is a "&" for the character start value. I believe that will produce the highlighting that you are seeing.

Hmmm...
But TextPad does NOT highlight other valid HTML characters, such as &ldquo; or &ndash;, and I can't see any reason why not.
This would mean my earlier suggestion is not correct ..... stranger and stranger.....
Hope this was helpful.............good luck,
Bob
BCMadsen
Posts: 11
Joined: Tue Mar 09, 2004 12:03 am
Location: Los Angeles, CA

Post by BCMadsen »

Bob Hansen wrote: I don't have access to TextPad right now, but check the syntax file and see if there is a "&" for the character start value. I believe that will produce the highlighting that you are seeing.
Yes, html.syn does show the following:

CharStart = &
CharEnd = ;

... but:
Bob Hansen wrote:
But TextPad does NOT highlight other valid HTML characters, such as &ldquo; or &ndash;, and I can't see any reason why not.
This would mean my earlier suggestion is not correct ..... stranger and stranger.....
Yes, you see what I mean. I can see where the design of syntax highlighting starts down the road of identifying everything between & and ; as a character that should be highlighted, but it's not happening, at least not in the case of those few items that sbb mentioned up above.

I'm not really sure what the CharStart and CharEnd designators are accomplishing, so I don't want to change them, but they aren't enough by themselves to highlight all of the valid HTML character tokens.
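
To make the behaviour described above concrete, here is a rough Python model of it. This is a guess at the observable behaviour only, not TextPad's actual code: the set of "built-in" names is a hypothetical stand-in containing a few entities of the kind reported as working in this thread, and the regular expression is an assumption about what counts as a token between CharStart (&) and CharEnd (;).

```python
import re

# Hypothetical stand-in for the built-in table; the real list is unknown.
BUILTIN_ENTITIES = {"quot", "cent", "eacute", "uuml", "harr", "larr", "rarr"}

# CharStart = & and CharEnd = ; delimit a candidate token.
CANDIDATE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def classify_entities(text, known=BUILTIN_ENTITIES):
    """Split the &name; tokens in text into (highlighted, not_highlighted).

    A token counts as highlighted only if its name is in the known set,
    which mirrors what users in this thread observe.
    """
    highlighted, plain = [], []
    for match in CANDIDATE.finditer(text):
        target = highlighted if match.group(1) in known else plain
        target.append(match.group(0))
    return highlighted, plain


sample = "&quot;Hi&quot; &ldquo;there&rdquo; &ndash; caf&eacute;"
print(classify_entities(sample))
```

If the delimiters alone decided the highlighting, the second list would always be empty. The fact that it isn't is what points to a fixed table somewhere.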
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

I just checked the registry settings for Helios and see nothing that would provide answers, so the control does appear to be part of the source code.
Hope this was helpful.............good luck,
Bob
BCMadsen
Posts: 11
Joined: Tue Mar 09, 2004 12:03 am
Location: Los Angeles, CA

Post by BCMadsen »

Bob Hansen wrote: I just checked the registry settings for Helios and see nothing that would provide answers, so the control does appear to be part of the source code.
I'd be very grateful if you'd please pass this request along to the folks who maintain the source code and ask them to treat it as a bug report. It would be enormously helpful to me (and others, presumably) to get this fixed.
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

I'm just an end user like yourself. Comments here are for all to see and read.

I have no influence. If I did, we would have had macro editing years ago.

But, despite some unfulfilled wishes, I still love TextPad. Cannot work without it.
Hope this was helpful.............good luck,
Bob
BCMadsen
Posts: 11
Joined: Tue Mar 09, 2004 12:03 am
Location: Los Angeles, CA

Post by BCMadsen »

Bob Hansen wrote: I'm just an end user like yourself. Comments here are for all to see and read.
Oh! I saw you were a moderator here, and sort of assumed you were close to the powers that be.

My bad for jumping to conclusions. Sorry about that.
Bob Hansen wrote: I have no influence. If I did, we would have had macro editing years ago.
ROFL -- I heard that. :)
Bob Hansen wrote: But, despite some unfulfilled wishes, I still love TextPad. Cannot work without it.
Yeah, I hear ya. Me too.
Bob Hansen
Posts: 1517
Joined: Sun Mar 02, 2003 8:15 pm
Location: Salem, NH

Post by Bob Hansen »

Oh! I saw you were a moderator here, and sort of assumed you were close to the powers that be.
I volunteered some years ago since I am here so often. Just try to help eliminate incoming spam postings....
Hope this was helpful.............good luck,
Bob
YuanhaoYoung
Posts: 3
Joined: Thu Nov 22, 2012 5:47 am

Post by YuanhaoYoung »

I realize this is years late, but on the off-chance someone finds it useful...

I had this same problem until I tried a workaround. First I deleted the existing HTML document class in TextPad. Then I created a new one called "Html prEpub" (I'm editing HTML for converting to ePub). I added the *.htm and *.html file patterns to this class and chose the html.syn file for syntax highlighting. Then I made changes to html.syn to add the entities I needed, like &hellip; and &ldquo; and &rdquo;.

Loading an HTML file, I can get &ldquo; and &rdquo; to highlight properly now. It's possible there's an easier way to do this, but this definitely seems to work.
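
If you try this workaround, it may help to know which entities your files actually use before editing the syntax file. The following is a small Python sketch for that, with the caveat that it only collects the names: how the entries should be written inside html.syn is not spelled out in this thread, so that part is left to you. The function name is made up for the example; `html.entities.name2codepoint` is the standard-library table of HTML 4 entity names.

```python
import re
from html.entities import name2codepoint

# Matches &name; tokens; the exact token pattern is an assumption.
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def entities_used(html_text):
    """Return the sorted, de-duplicated named entities found in html_text.

    Only names present in Python's HTML 4 entity table are kept, so
    typos such as &bogus; are skipped.
    """
    names = {m.group(1) for m in ENTITY.finditer(html_text)}
    return sorted("&%s;" % name for name in names if name in name2codepoint)


# Usage (the file name is hypothetical):
#     with open("chapter1.html", encoding="utf-8") as fh:
#         for entity in entities_used(fh.read()):
#             print(entity)
```

Running it over a few representative documents gives a short list to compare against what already highlights.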