8.0.2 How to get Tool Output to show Unicode?
How can I get Tool Output to show Unicode characters?
Currently it shows Unicode characters as multi-character gibberish, which obstructs search, and I have found no solution in the preferences.
Non-output windows using the same font show the characters fine.
Last edited by chrisjj on Thu Nov 03, 2016 12:53 pm, edited 4 times in total.
AmigoJack wrote: What do I have to do to even reconstruct your situation?
Run any tools that output unicode.
AmigoJack wrote: Maybe the source is the problem, not the output. Maybe the font being used is not capable to show the correct characters.
The source in the same font shows the characters. Evidence added to post.
chrisjj wrote: Run any tools that output unicode.
So far I can't reproduce it:
- using CMD (what most people still call "DOS") with UTF-16 works fine:
- using PHP with UTF-8 works fine:
AmigoJack wrote: That's why I ask for a reconstructable example
None was needed to answer the question "How can I get Tool Output to show Unicode characters?", as your useful answer demonstrates.
chrisjj wrote: Run any tools that output unicode.
AmigoJack wrote: So far I can't reproduce it:
- using CMD (what most people still call "DOS") with UTF-16 works fine:
Thanks. I'll try that here.
AmigoJack wrote: And how should anyone have known you wanted to use the command prompt?
To my knowledge TextPad does not offer a command prompt. Do you mean the Run command?
AmigoJack wrote: Even now it's not sure if you need it, or if you're executing a program on its own. Those are all different things.
Different as far as getting Tool Output to show Unicode?? Wow. That had never occurred to me. Thanks for the warning. I'll stick with the Run command for now.
chrisjj wrote: Thanks. I'll try that here.
Following your example of php.exe in Run, I still get the failure:
Attempting "> out.txt" in Run had no effect, but with a Tool, showing the same output via a file in a regular window succeeds:
I wonder whether yours is a reconstructable example. It may depend on the PHP version or the script, neither of which you've declared.
chrisjj wrote: TextPad does not offer a command prompt. Do you mean the Run command?
By "command prompt" I mean the only one existing in the system, not Textpad's. This corresponds to CMD.EXE on current Windows versions. In Textpad's "Run" dialog or "Tools > Add" menu this is the (wrongly titled) "DOS command" checkbox/item. In your "Run" screenshot you haven't ticked it, but in your "Tools" configuration I see you must have chosen "DOS command" previously, as you can't change "CMD.EXE" as the command.
That's knowledge outside of Textpad: either you are in the command prompt already, where you want to issue a command like DIR, or you want to start an EXE file (which can be the command prompt as well). How Textpad starts CMD.EXE on its own when you use the "DOS command" option is yet unknown to both of us, so the better approach is to do it on your own. Which also makes Unicode support available.
chrisjj wrote:
Code: Select all
utf8_encode('á')
Why would you encode a text that is already encoded in the PHP file itself? Do you see my file calling that function, or do you see the characters directly?
Long story short:
- Your PHP should contain only this (the closing PHP tag is not needed):
Code: Select all
<?php echo 'Tus labios me dirán';
- Save the file with the encoding UTF-8 and no Unicode BOM. That's how my files were saved/encoded.
- Run again thru PHP.EXE directly.
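To verify the "no Unicode BOM" step, a minimal sketch that tests whether a byte string starts with the UTF-8 BOM bytes EF BB BF. The helper name hasUtf8Bom is made up for illustration; for a real file one would pass it the result of file_get_contents().

```php
<?php
// Hypothetical helper: true if the byte string starts with the UTF-8 BOM.
function hasUtf8Bom(string $bytes): bool
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0;
}

// Demonstration on in-memory strings rather than a file on disk.
var_dump(hasUtf8Bom("\xEF\xBB\xBF<?php echo 'x';")); // bool(true)
var_dump(hasUtf8Bom("<?php echo 'x';"));             // bool(false)
```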
chrisjj wrote: TextPad does not offer a command prompt. Do you mean the Run command?
AmigoJack wrote: By "command prompt" I mean the only one existing in the system, not Textpad's. This corresponds to CMD.EXE on current Windows versions. In Textpad's "Run" dialog or "Tools > Add" menu this is the (wrongly titled) "DOS command" checkbox/item.
Ah, you mean the command interpreter. Yes, my original example used the command interpreter, via TextPad's Tools, DOS command option. For the avoidance of doubt, I'm not using a command prompt.
AmigoJack wrote: How Textpad starts CMD.EXE on its own when you use the "DOS command" option is yet unknown to both of us, so the better approach is to do it on your own. Which also makes Unicode support available.
I don't know what you mean by "do it on my own". If you mean using TextPad Run, well, there's been no evidence in this thread indicating Run and Tool differ in Unicode availability.
chrisjj wrote:
Code: Select all
utf8_encode('á')
AmigoJack wrote: Why would you encode a text that is already encoded in the PHP file itself?
The text is not already encoded as UTF-8. It is encoded as ANSI. So I used run-time encoding to get test output that is UTF-8.
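The run-time conversion described here can be sketched independently of how the script file itself is saved, by writing á as an explicit byte. This assumes the mbstring extension is loaded; utf8_encode() performs the same ISO-8859-1 to UTF-8 conversion but is deprecated as of PHP 8.2.

```php
<?php
// "á" as the single byte it occupies in an ANSI (ISO-8859-1 / Windows-1252) file.
$ansi = "\xE1";

// Convert to UTF-8 at run time (assumes the mbstring extension).
$utf8 = mb_convert_encoding($ansi, 'UTF-8', 'ISO-8859-1');

echo bin2hex($ansi), ' -> ', bin2hex($utf8), "\n"; // e1 -> c3a1
```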
AmigoJack wrote: Long story short:
- Your PHP should contain only this (the closing PHP tag is not needed):
Code: Select all
<?php echo 'Tus labios me dirán';
- Save the file with the encoding UTF-8 and no Unicode BOM. That's how my files were saved/encoded.
- Run again thru PHP.EXE directly.
Thanks. For me that fails:
Does it work for you?
I get the same results as you. And if I modify the code to the following, then the output is the correct one:
Code: Select all
<?php
echo 'グリ';
echo 'Tus labios me dirán';
The output is - in any case - UTF-8. Ã¡ is UTF-8 for á, displayed as ANSI (you see that yourself when you compare the binary view of your UTF-8 saved file with the tool output). That means: save your tool output in a file with ANSI encoding, then open the file again by specifying UTF-8 as encoding (instead of Default). Now you should see what you always expected.
If you run the code from above (with the two Katakanas) then the tool output "magically" recognizes UTF-8 and displays it accordingly.
I can only assume that á alone is not enough for Textpad to think the encoding is meant to be UTF-8 - it just thinks it's ANSI. But my two additional Katakanas are enough as an indication to UTF-8. But we can trick Textpad into recognizing UTF-8 right off the start without displaying characters. Use this PHP file:
Code: Select all
<?php
echo "\xEF\xBB\xBF"; // UTF-8 BOM
echo 'Tus labios me dirán';
Now this finally produces Tus labios me dirán for me as well.
I guess the "tool output" document behaves just as any other document as well: it tries to guess the encoding, and can fail to do so. Textpad could have an option for every tool run where you can choose a specific encoding of the output (or leave it to "automatic"), so the tab displaying the output knows the correct encoding (just like you can choose the encoding when opening a file).
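The kind of guessing described here could look something like the sketch below. This is a hypothetical heuristic for illustration only, not TextPad's actual algorithm, which this thread does not reveal; the function name guessEncoding and its return labels are made up, and the mbstring extension is assumed.

```php
<?php
// Hypothetical heuristic, NOT TextPad's real detection logic.
function guessEncoding(string $bytes): string
{
    // A leading BOM is an explicit marker.
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8 (BOM)';
    }
    // Pure 7-bit text carries no evidence either way.
    if (preg_match('/[\x80-\xFF]/', $bytes) !== 1) {
        return 'ASCII';
    }
    // High bytes that form valid UTF-8 sequences suggest UTF-8; otherwise assume ANSI.
    return mb_check_encoding($bytes, 'UTF-8') ? 'UTF-8' : 'ANSI';
}

echo guessEncoding("dir\xC3\xA1n"), "\n"; // UTF-8
echo guessEncoding("dir\xE1n"), "\n";     // ANSI
```

Under this particular sketch a lone UTF-8 á would already count as UTF-8, so whatever TextPad 8.0.2 does for Tool Output is evidently something different.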
AmigoJack wrote: That means: save your tool output in a file with ANSI encoding, then open the file again by specifying UTF-8 as encoding (instead of Default). Now you should see what you always expected.
No, that's not what I always expected. What I expected was correct display in the Tool Output window.
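For reference, the save-and-reopen step quoted above is just a reinterpretation of the same bytes. A minimal sketch, assuming the mbstring extension and that "ANSI" means Windows-1252 on the machine in question:

```php
<?php
$utf8 = "dir\xC3\xA1n"; // "dirán" as UTF-8 bytes

// Reading the UTF-8 bytes as Windows-1252 turns á into two characters.
$shown = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');
echo $shown, "\n"; // dirÃ¡n (on a terminal that displays UTF-8)

// Saving that text as Windows-1252 restores the original bytes,
// which display correctly once they are read as UTF-8.
$restored = mb_convert_encoding($shown, 'Windows-1252', 'UTF-8');
var_dump($restored === $utf8); // bool(true)
```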
AmigoJack wrote: I can only assume that á alone is not enough for Textpad to think the encoding is meant to be UTF-8
It is not alone. See the setting I already posted:
That should be more than enough.
AmigoJack wrote: - it just thinks it's ANSI. But my two additional Katakanas are enough as an indication to UTF-8. But we can trick Textpad into recognizing UTF-8 right off the start without displaying characters. Use this PHP file:
OK, so my options to get correct display of the valid UTF-8 sent to Tool Output include:
1 Changing my program to mix some Japanese in with my output Spanish
2 Changing my program to add bytes to the output that are not recommended by the Unicode Standard and are illegal in some major applications e.g. https://tools.ietf.org/html/rfc7159#section-8.1
3 Save and reopen the output in an editor window, manually changing the encoding.
And my options do not include setting Tool Output default encoding to UTF-8 http://i.imgur.com/4UyFdFs.png
Thanks for your help.
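On option 2 above: the cost of emitting a BOM shows up as soon as the output is fed to a strict consumer. A small sketch using PHP's own json_decode(), which to my knowledge rejects a leading BOM; the helper name stripUtf8Bom is made up for illustration.

```php
<?php
// Hypothetical helper: drop a leading UTF-8 BOM if present.
function stripUtf8Bom(string $bytes): string
{
    return strncmp($bytes, "\xEF\xBB\xBF", 3) === 0 ? substr($bytes, 3) : $bytes;
}

// JSON text preceded by the same BOM bytes the workaround script emits.
$json = "\xEF\xBB\xBF" . '{"text":"dir\u00e1n"}';

var_dump(json_decode($json));                     // NULL: the BOM is not valid JSON
var_dump(json_decode(stripUtf8Bom($json))->text); // string(6) "dirán"
```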
chrisjj wrote: my options do not include setting Tool Output default encoding to UTF-8
Yes. Looks like you found a bug from the very start, and I didn't strip my first attempts down to accents only.
The Document Class setting seems to have no effect at all. Maybe it even has no effect with any Document Class? If I set it to UTF-8 for Java files and save the text dirán in v.java, close the file, then open it again the text will be interpreted as ANSI, not UTF-8, despite being a .java file to which the Document Class settings should apply. I'll create a separate topic for this.
At least now I'm more confident in what to do and what to expect from 8.0.2.
chrisjj wrote: OK, so my options to get correct display of the valid UTF-8 sent to Tool Output include:
[...]
2 Changing my program to add bytes to the output that are not recommended by the Unicode Standard and are illegal in some major applications e.g. https://tools.ietf.org/html/rfc7159#section-8.1
This fails in a Tool: http://forums.textpad.com/viewtopic.php?t=13016