Copy&Paste in Hex view

daemonui · Post by **daemonui** » Mon Jun 05, 2006 1:29 pm

Another improvement would be to allow copying in hex view (show a warning if the block contains zero bytes, then replace the zeros with spaces). Pasting in hex view (overwriting) would also be nice.

Dynamically switching between text and hex view would also be great.

Again, I'm still looking for Unicode support.

Harald

hillsc · Post by **hillsc** » Mon Jun 05, 2006 9:18 pm

I would definitely love to be able to toggle between text/hex views.

I don't deal with unicode files as a rule. But the ones I've come across, Textpad can open, read, and write them. I know others have asked for "unicode support". What does that really mean?

daemonui · Post by **daemonui** » Mon Jun 05, 2006 11:35 pm

@hillsc

See these links for what Unicode is:
http://www.unicode.org/standard/WhatIsUnicode.html
http://en.wikipedia.org/wiki/Unicode

The problem in TP is that characters are only displayed correctly if they are in the current code page of the operating system.
If you open a file containing Chinese Unicode text on an English system, you get '?' characters replacing the original text.
When you save such a file, TP saves the '?' characters, which results in a corrupt file.

Because Windows XP/2000/NT use Unicode throughout, they can handle the characters correctly, but only if TP uses the Unicode functions.
Even Win9x has some, but very limited, Unicode support.

An example: "确定" means "OK", and if you have installed international support you will see two ideograms.
Select it, copy it and paste it into the Windows editor, Word or Excel, and it will be unchanged. Paste it into TP (unless you are on a Chinese operating system) and it will show "??".

If you have to deal with international files (like me), you will run into trouble.

BTW: The hex representation (UTF-16, little-endian) of "确定" is "6E 78 9A 5B"; of "OK" it's "4F 00 4B 00".
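
Those byte values can be checked with a few lines of Python. This is only a sketch: it assumes the file is UTF-16 little-endian without a byte order mark, and `utf16le_hex` is a helper name made up for illustration.

```python
def utf16le_hex(text: str) -> str:
    """Return the UTF-16LE bytes of text as space-separated hex pairs."""
    # Assumption: little-endian UTF-16 with no byte order mark.
    return " ".join(f"{b:02X}" for b in text.encode("utf-16-le"))


print(utf16le_hex("确定"))  # 6E 78 9A 5B
print(utf16le_hex("OK"))    # 4F 00 4B 00
```

Every ASCII character gets a trailing 00 byte in this encoding, which is why "OK" takes four bytes.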

hillsc · Post by **hillsc** » Tue Jun 06, 2006 4:18 pm

I definitely know what unicode is. I just didn't know what TP's problem was with them, since I've been able to read and write them before. Now I see why it wasn't a problem for me...

I tried some French and sure enough ... I couldn't even type "regardé"

What an extraordinary limitation which I had no idea existed!

ben_josephs · Post by **ben_josephs** » Tue Jun 06, 2006 8:50 pm

If the script is set to Western (Configure | Preferences | Document Classes | <Class> | Font | Script or View | Document Properties | Font | Script) TextPad has no problem with àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþÿœ and many other characters. All these are in the Western (or WinLatin1 or CP1252) character set.

Have a look in the help under How To ... | Edit Text | Type International Characters.

The problem is that TextPad renders an entire document in a single 8-bit character set. Unicode is a 21-bit character set. TextPad can't display, say, Greek and Arabic characters together in one document.
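
The single-character-set limit can be illustrated outside TextPad. The sketch below uses Python's built-in Windows code page codecs (`cp1252` Western, `cp1253` Greek, `cp1256` Arabic) and says nothing about how TextPad is implemented; `encodable` is a hypothetical helper.

```python
import sys


def encodable(text: str, code_page: str) -> bool:
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


greek, arabic = "αβγ", "ابت"

for cp in ("cp1252", "cp1253", "cp1256"):
    # Columns: Greek alone, Arabic alone, both in one document.
    print(cp, encodable(greek, cp), encodable(arabic, cp), encodable(greek + arabic, cp))

# The highest Unicode code point, U+10FFFF, needs 21 bits.
print(sys.maxunicode.bit_length())  # 21
```

Greek fits only in cp1253 and Arabic only in cp1256, so none of the three code pages can hold the mixed text.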

daemonui · Post by **daemonui** » Wed Jun 07, 2006 12:21 am

Are you suggesting not to use international characters (especially characters outside the 1252 code page) and not to mix languages? But that's exactly what I need.

And that's only one part of the problem. Try to search for Unicode text.
The correct regular expression for "Sample" would be "S\0a\0m\0p\0l\0e\0", which doesn't work.
"S.a.m.p.l.e." works, and "S.?a.?m.?p.?l.?e.?" would match both 8-bit and Unicode text.
But this doesn't help much for characters above 255.
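
For comparison, here is roughly what such a byte-level search looks like in Python's `re` module rather than in TextPad's regex engine. It is a sketch that assumes UTF-16 little-endian data; `utf16le_pattern` and `auto_pattern` are names invented for this example.

```python
import re


def utf16le_pattern(text: str) -> re.Pattern:
    """Compile a bytes pattern matching text as encoded in UTF-16LE."""
    # re.escape guards bytes that happen to be regex metacharacters,
    # e.g. the 0x5B ('[') in the encoding of "定".
    return re.compile(re.escape(text.encode("utf-16-le")))


def auto_pattern(ascii_text: str) -> re.Pattern:
    """Match ASCII text stored either as 8-bit text or as UTF-16LE.

    Same idea as "S.?a.?m.?p.?l.?e.?", but allowing only an optional
    NUL byte after each character instead of any byte.
    """
    parts = [re.escape(ch.encode("ascii")) + rb"\x00?" for ch in ascii_text]
    return re.compile(b"".join(parts))


data = "Sample 确定".encode("utf-16-le")

print(utf16le_pattern("Sample").search(data) is not None)  # True
print(utf16le_pattern("确定").search(data).start())         # 14
print(auto_pattern("Sample").search(b"A Sample") is not None)  # True
print(auto_pattern("Sample").search(data) is not None)         # True
```

Encoding the search string first also covers characters above 255. A plain byte search can still match at an odd offset, straddling two characters, so a careful version would check that the match starts on an even byte.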

ben_josephs · Post by **ben_josephs** » Thu Jun 08, 2006 12:57 pm

My posting was a reply to hillsc's assertion that he can't enter the letter 'é'. If you're using WinLatin1 there is no problem doing that.

Unfortunately, you can't mix characters from different 8-bit character sets. That's a restriction of TextPad. It translates each character in a Unicode file into its nearest equivalent, if there is one, in the character set of the current document. It converts all other characters into question marks.
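
The question-mark substitution is easy to reproduce in Python. This is only an analogy, not TextPad's actual code: Python's `errors="replace"` handler substitutes '?' for every character the code page lacks and does not attempt any nearest-equivalent mapping. `save_in_code_page` is a hypothetical helper.

```python
def save_in_code_page(text: str, code_page: str = "cp1252") -> bytes:
    """Encode text in an 8-bit code page, replacing unmappable characters with '?'."""
    return text.encode(code_page, errors="replace")


print(save_in_code_page("regardé"))  # b'regard\xe9'
print(save_in_code_page("确定"))      # b'??'

# The substitution is lossy: decoding the saved bytes gives "??", not "确定".
print(save_in_code_page("确定").decode("cp1252") == "确定")  # False
```

Once the file has been written this way, the original characters cannot be recovered from it.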

The problem with searching for text in a Unicode file doesn't arise when viewing the file as text: there, "Sample" matches "Sample". Unfortunately, TextPad's binary mode knows nothing of the multibyte characters of which you speak and doesn't recognise the two-byte sequence for U+0053 as the character 'S'.