I regularly have to parse new editions of a database for use in my system, and one problem I have to solve is getting rid of any characters higher than decimal 126 (hex 7E), because my system chokes when it sees them. This doesn't need to be fancy; just replacing each of them with a tilde would be fine.
Try as I might, though, I haven't figured out a way to scan through the whole database and find these offending characters. I thought maybe [\x73-\x255] would work, but it doesn't seem to. Any ideas?
Cleaning hi-ascii out of a database
Re: Cleaning hi-ascii out of a database
[\x7e-\xff]
You got mixed up between hex and decimal.
Re: Cleaning hi-ascii out of a database
You're right. I wrote it wrong in the message. The way you wrote it, though, is the way I tried and the way that doesn't work. So it doesn't change anything with respect to my problem or my real question.
Re: Cleaning hi-ascii out of a database
From the TextPad help: [:token:] will search for "Any of the characters defined on the Syntax page for the document class, or in the syntax definition file if syntax highlighting is enabled for the document class."
However, this does not seem to work; maybe I'm not setting it up right. But if all the punctuation characters in use were added to the Syntax page, then a search on [^[:alnum:][:cntrl:][:token:]] should do it.
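If the editor's regex engine keeps refusing the range, one fallback is to do the cleanup outside TextPad. The following is only a minimal sketch in Python, assuming the database can be read as raw bytes; the function names find_high_bytes and replace_high_bytes are made up for illustration and are not part of any tool mentioned in this thread. It treats every byte above hex 7E (so starting at 7F) as offending, lists where those bytes are, and swaps each one for a tilde.

```python
import re

# Every byte above 0x7E (decimal 126): DEL (0x7F) plus the range 0x80-0xFF.
HIGH_BYTES = re.compile(rb"[\x7f-\xff]")


def find_high_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte above 0x7E (hypothetical helper)."""
    return [(m.start(), m.group()[0]) for m in HIGH_BYTES.finditer(data)]


def replace_high_bytes(data: bytes, replacement: bytes = b"~") -> bytes:
    """Return a copy of data with every byte above 0x7E replaced (hypothetical helper)."""
    return HIGH_BYTES.sub(replacement, data)


if __name__ == "__main__":
    # Made-up sample data: two Latin-1 accented letters and one DEL character.
    sample = b"caf\xe9 na\xefve\x7f ok~"
    print(find_high_bytes(sample))     # [(3, 233), (7, 239), (10, 127)]
    print(replace_high_bytes(sample))  # b'caf~ na~ve~ ok~'
```

Because this works byte by byte, a character stored as several bytes (for example in UTF-8) would come out as several tildes, which should be harmless if all that matters is that nothing above 7E survives.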