Copy&Paste in Hex view

Ideas for new features

daemonui
Posts: 9
Joined: Mon Jun 05, 2006 1:19 pm

Post by daemonui »

Another improvement would be to allow copying in hex view (showing a warning if the block contains zero bytes, then replacing the zeros with spaces). Pasting in hex view (overwriting) would also be nice.

And dynamically switching between text and hex view would also be great.

Again looking for Unicode support ;-)

Harald
hillsc
Posts: 35
Joined: Thu Jun 19, 2003 2:00 pm

Post by hillsc »

I would definitely love to be able to toggle between text/hex views.

I don't deal with Unicode files as a rule, but the ones I've come across, TextPad can open, read, and write. I know others have asked for "Unicode support". What does that really mean?
daemonui
Posts: 9
Joined: Mon Jun 05, 2006 1:19 pm

Post by daemonui »

@hillsc

See these links for what Unicode is:
http://www.unicode.org/standard/WhatIsUnicode.html
http://en.wikipedia.org/wiki/Unicode

The problem in TP is that characters are only displayed correctly if they are in the current codepage of the operating system.
If you open a file containing Chinese Unicode text on an English system, you get '?' characters replacing the original text.
When you save such a file, TP saves the '?' characters, which results in a corrupt file.
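
A minimal sketch of the '?' substitution described above, not TextPad's actual code. It assumes cp1252 as the "current codepage" of an English system, and `save_in_codepage` is a hypothetical helper name.

```python
# Illustration only: cp1252 stands in for the codepage of an English system.
text = "确定"  # "OK" in Chinese


def save_in_codepage(s: str, codepage: str = "cp1252") -> bytes:
    """Encode text in an 8-bit codepage; unmappable characters become '?'."""
    return s.encode(codepage, errors="replace")


saved = save_in_codepage(text)      # what would end up on disk
reloaded = saved.decode("cp1252")   # what you get when reopening the file

print(saved)             # the original characters are gone
print(reloaded == text)  # the round trip does not restore them
```

Once the question marks are written out, nothing in the file records which characters they replaced.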

Because Windows XP/2000/NT use Unicode natively, they could handle the characters correctly, but only if TP used the Unicode functions.
Even Win9x has some, though very limited, Unicode support.

An example: "确定" means "OK", and if you have international support installed you will see two ideograms.
Select it, then copy and paste it into the Windows editor, Word, or Excel, and it will be unchanged. But paste it into TP and (unless you are on a Chinese operating system) it will show "??".

If you have to deal with international files (like me), you will run into trouble.

BTW: The hex representation (UTF-16 little-endian) of "确定" is "6E 78 9A 5B"; for "OK" it's "4F 00 4B 00".
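
For anyone who wants to check these bytes, here is a small sketch. It assumes the values quoted above are UTF-16 little-endian without a byte order mark, and `utf16le_hex` is a made-up helper name.

```python
def utf16le_hex(s: str) -> str:
    """Return the UTF-16LE bytes of s as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in s.encode("utf-16-le"))


print(utf16le_hex("确定"))  # 6e 78 9a 5b
print(utf16le_hex("OK"))    # 4f 00 4b 00
```

The zero bytes in "OK" are the high bytes of U+004F and U+004B. Both ideograms have non-zero high bytes, which is why they cannot fit in a single 8-bit character.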
hillsc
Posts: 35
Joined: Thu Jun 19, 2003 2:00 pm

Post by hillsc »

I definitely know what unicode is. I just didn't know what TP's problem was with them, since I've been able to read and write them before. Now I see why it wasn't a problem for me... ;)

I tried some French and sure enough... I couldn't even type "regardé".

What an extraordinary limitation which I had no idea existed!
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

If the script is set to Western (Configure | Preferences | Document Classes | <Class> | Font | Script or View | Document Properties | Font | Script) TextPad has no problem with àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿœ and many other characters. All these are in the Western (or WinLatin1 or CP1252) character set.

Have a look in the help under How To ... | Edit Text | Type International Characters.

The problem is that TextPad renders an entire document in a single 8-bit character set. Unicode is a 21-bit character set. TextPad can't display, say, Greek and Arabic characters together in one document.
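
A rough way to see the single-character-set limitation outside TextPad, assuming the Windows codepages cp1253 (Greek) and cp1256 (Arabic) as the 8-bit sets in question. `fits` is a hypothetical helper.

```python
greek = "αβγ"
arabic = "ابت"
mixed = greek + " " + arabic


def fits(s: str, codepage: str) -> bool:
    """True if every character of s can be encoded in the given codepage."""
    try:
        s.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


print(fits(greek, "cp1253"), fits(arabic, "cp1256"))  # each fits its own set
print(fits(mixed, "cp1253"), fits(mixed, "cp1256"))   # the mix fits neither
print(fits(mixed, "utf-8"))                           # a Unicode encoding holds both
```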
daemonui
Posts: 9
Joined: Mon Jun 05, 2006 1:19 pm

Post by daemonui »

ben_josephs wrote: If the script is set to Western (Configure | Preferences | Document Classes | <Class> | Font | Script or View | Document Properties | Font | Script) TextPad has no problem with àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿœ and many other characters. All these are in the Western (or WinLatin1 or CP1252) character set.

Have a look in the help under How To ... | Edit Text | Type International Characters.

The problem is that TextPad renders an entire document in a single 8-bit character set. Unicode is a 21-bit character set. TextPad can't display, say, Greek and Arabic characters together in one document.

Would you suggest not using international characters (especially characters outside the 1252 codepage) and not mixing languages? But that's exactly what I need.

And that's only one part of the problem. Try searching for Unicode text.
For "Sample", the correct regular expression would be "S\0a\0m\0p\0l\0e\0", which doesn't work.
"S.a.m.p.l.e." works, and "S.?a.?m.?p.?l.?e.?" would match both single-byte and Unicode text.
But this doesn't help much for characters >255.
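
To show what these three patterns do on raw bytes, here is a sketch using Python's `re` module rather than TextPad's regex engine. It assumes the Unicode file is UTF-16 little-endian without a byte order mark and the single-byte file is cp1252; the variable names are made up.

```python
import re

ansi = "Sample".encode("cp1252")      # 6 bytes
utf16 = "Sample".encode("utf-16-le")  # 12 bytes, each letter followed by 00

# Exact pattern: every letter followed by a zero byte.
exact = re.compile(rb"S\x00a\x00m\x00p\x00l\x00e\x00")
# Workaround: any byte after each letter (DOTALL so '.' also matches 0x0A).
any_byte = re.compile(rb"S.a.m.p.l.e.", re.DOTALL)
# Optional byte after each letter: matches single-byte and UTF-16 text.
either = re.compile(rb"S.?a.?m.?p.?l.?e.?", re.DOTALL)

for name, pattern in [("exact", exact), ("any_byte", any_byte), ("either", either)]:
    print(name, bool(pattern.search(ansi)), bool(pattern.search(utf16)))

# Characters above 255 have no letter to anchor on; the raw bytes must be spelled out.
ideograms = re.compile(re.escape("确定".encode("utf-16-le")))
print(bool(ideograms.search("按确定键".encode("utf-16-le"))))
```

The last pattern only works because the exact byte sequence is known in advance, which is the limitation for characters >255.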
ben_josephs
Posts: 2461
Joined: Sun Mar 02, 2003 9:22 pm

Post by ben_josephs »

My posting was a reply to hillsc's assertion that he can't enter the letter 'é'. If you're using WinLatin1 there is no problem doing that.

Unfortunately, you can't mix characters from different 8-bit character sets. That's a restriction of TextPad. It translates each character in a Unicode file into its nearest equivalent, if there is one, in the character set of the current document. It converts all other characters into question marks.

The problem with searching for text in a Unicode file doesn't arise when viewing the file as text, where "Sample" matches "Sample". Unfortunately, TextPad's binary mode knows nothing of the multibyte characters of which you speak and doesn't recognise the two-byte sequence for U+0053 as the character 'S'.