Replace western text with Japanese text?

General questions about using WildEdit

Moderators: AmigoJack, bbadmin, helios, Bob Hansen, MudGuard

UkSeo
Posts: 4
Joined: Sat Dec 11, 2004 2:36 pm

Post by UkSeo »

Hello,
I need to do mass search-and-replace operations where western text in HTML files gets replaced by Japanese text, and later Cyrillic, Baltic etc.
This obviously touches on the encoding setting in the WildEdit Replace panel. The problem is that whatever setting I choose, it either says:
Character conversion: Illegal input sequence/combination of input units, or:
Character conversion: Unmappable input sequence

With some files, however, it did work, choosing either Shift_JIS or UTF-8 as the encoding option.
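
A rough Python analogue of the two messages may help when reading them. This is only a sketch: the assumption here is that "illegal input sequence" means the file's bytes are not valid in the selected encoding, and "unmappable" means a character has no representation in the target encoding. The helper names are made up for illustration and are not part of WildEdit.

```python
# Shift_JIS bytes for a short piece of Japanese text.
sjis_bytes = "ナビゲ".encode("shift_jis")


def illegal_input(data, encoding):
    """Return True if the bytes are not a legal sequence in `encoding`."""
    try:
        data.decode(encoding)
        return False
    except UnicodeDecodeError:
        return True


def unmappable(text, encoding):
    """Return True if some character cannot be represented in `encoding`."""
    try:
        text.encode(encoding)
        return False
    except UnicodeEncodeError:
        return True
```

Read this way, a file works only when the selected encoding both matches the bytes already in the file and can represent the replacement text.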

Any help and ideas much appreciated!
bbadmin
Site Admin
Posts: 882
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

Hello,

WildEdit reads and writes each file using the single character encoding you select, so both the existing text and the replacement text must be representable in that encoding. If you want to replace English text with Japanese, etc., you will need to convert your files to UTF-8 before running WildEdit.

Don't forget to change the encoding declared in the HTML files as well (the http-equiv meta tag). For example, change:

Code:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
to

Code:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
in the head section.
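
For many files, the charset change above could be scripted. The following is a minimal sketch, assuming each file declares its charset in a single meta tag; `set_declared_charset` is a hypothetical helper name, and the regular expression is a simplification, not a full HTML parser.

```python
import re

# Matches a <meta ...> tag up to and including "charset=" (plus an optional
# opening quote), then captures the charset name itself.
_CHARSET = re.compile(
    r'(<meta\b[^>]*\bcharset\s*=\s*["\']?)([A-Za-z0-9._:-]+)',
    re.IGNORECASE,
)


def set_declared_charset(html, new_charset="utf-8"):
    """Rewrite the first charset declaration found in a meta tag."""
    return _CHARSET.sub(lambda m: m.group(1) + new_charset, html, count=1)
```

This only changes the declaration. The bytes of the file still have to be converted separately, otherwise the declaration and the content disagree.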

Keith MacDonald
Helios Software Solutions
UkSeo
Posts: 4
Joined: Sat Dec 11, 2004 2:36 pm

Post by UkSeo »

Dear Keith,

thanks for your reply. I'm not really sure what to make of it though.
Some of my files are shown by TextPad as being Unicode, some as ANSI. The problem with WildEdit occurs with both types.
What I can obviously do is use numeric character references like &#12490 ;&#12499 ;&#12466 ; (spaces added to prevent rendering); search and replace with WildEdit and other tools then works just fine. The problem with that approach is that the source files cannot be read by humans that way. So what I need is real Japanese text in the source files, like ナビゲ
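
The two representations mentioned above are mechanically interchangeable once a file is in an encoding that can hold Japanese. A minimal Python sketch follows; the function names are made up for illustration. Only decimal references above the ASCII range are decoded, so markup-significant references such as `&amp;` or `&#60;` are left alone.

```python
import re

_NUMERIC_REF = re.compile(r"&#(\d+);")


def decode_numeric_refs(text):
    """Turn decimal references for non-ASCII characters into real characters."""
    def repl(match):
        code = int(match.group(1))
        # Leave ASCII, surrogates and out-of-range values untouched.
        if 127 < code <= 0x10FFFF and not 0xD800 <= code <= 0xDFFF:
            return chr(code)
        return match.group(0)
    return _NUMERIC_REF.sub(repl, text)


def encode_numeric_refs(text):
    """Turn every non-ASCII character into a decimal character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

For example, `decode_numeric_refs("&#12490;&#12499;&#12466;")` gives the readable text ナビゲ, and `encode_numeric_refs` reverses it.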

I am aware that the problem might very well be on my side, and may not be related to WildEdit specifically. I would still appreciate ideas and input.

Thanks again!
bbadmin
Site Admin
Posts: 882
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

Unicode files work for me. There is only a problem when a file's actual encoding does not match the encoding selected in WildEdit. For Unicode files created on MS Windows, the selected encoding should be UTF-16LE.

Note that unlike WildEdit, TextPad works internally in a specified code page, so it will not be able to open Unicode files containing characters from more than one code page.
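
To find out which encoding to select for a given file, a quick check of its first bytes often helps. This is a heuristic sketch in Python, not what WildEdit or TextPad do internally: it looks for a byte order mark, then tries strict UTF-8, and otherwise falls back to a legacy encoding that the caller has to supply (`shift_jis` here is just an assumed default). UTF-32 is not handled.

```python
import codecs


def pick_encoding(data, legacy="shift_jis"):
    """Guess a decoder name for raw file bytes (heuristic only)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"          # UTF-8 with a byte order mark
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"             # this codec reads the BOM to pick endianness
    try:
        data.decode("utf-8")        # strict: raises on any invalid sequence
        return "utf-8"
    except UnicodeDecodeError:
        return legacy               # cannot be detected reliably; caller decides
```

Pure ASCII files come back as "utf-8", which is harmless because ASCII is a subset of UTF-8.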

Keith MacDonald
Helios Software Solutions
Last edited by bbadmin on Tue Dec 14, 2004 12:05 pm, edited 1 time in total.
UkSeo
Posts: 4
Joined: Sat Dec 11, 2004 2:36 pm

Post by UkSeo »

>it will not be able to open Unicode files containing characters from different code pages.

That might be the explanation.
I have tested creating Unicode files with Japanese text and you are right, I can work with them in WildEdit without problems.
So if I have files authored as ANSI that contain Japanese text, and I save them as Unicode / UTF-8 files, the Japanese text still will not render in WildEdit?
So I'll have to find a way to convert my files to Unicode. Is there an easy way to go about that?
I am, however, not sure if this is really safe for all browsers, e.g. older IE versions seem to support Unicode less than perfectly.


Btw, I have found a way around the problem for now: I do the Japanese replacements I need in one file manually in TextPad, then open that file as a trial file in WildEdit. The Japanese text renders like this: X?g?b?N?z??????A????? (lots of strange characters with accents etc.)
I then copy and paste those replacements from the trial file into the replace field and let it run. When I open the changed files in TextPad, I have perfect Japanese text, ANSI encoded. Certainly not the best method, though.
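
A likely reason this workaround holds up is that the garbled characters are the same bytes viewed through a single-byte encoding, so pasting them back writes the original bytes unchanged. The sketch below illustrates that idea in Python under two assumptions: that the "ANSI" Japanese files are Shift_JIS, and that `latin-1` stands in for whatever single-byte view was actually in effect. The function names are made up for illustration.

```python
def as_single_byte_text(text, multibyte="shift_jis", single_byte="latin-1"):
    """Show how multibyte text looks when each byte is read as one character."""
    return text.encode(multibyte).decode(single_byte)


def back_to_real_text(garbled, multibyte="shift_jis", single_byte="latin-1"):
    """Reverse the view: the same bytes, read with the right encoding again."""
    return garbled.encode(single_byte).decode(multibyte)
```

Three Japanese characters become six odd-looking single-byte characters, and the round trip restores them exactly because no byte was changed along the way.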
bbadmin
Site Admin
Posts: 882
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

I've uploaded a command line tool that can be used to convert the encoding of files. It can be downloaded from here:

www.textpad.com/download/wildedit/uconv.zip (346KB)

Extract the contents of the zip file into your WildEdit installation folder, which already contains some DLLs it requires. To run it, start a command prompt, then type:

Code:

CD folder-containing-your-files
"C:\Program Files\WildEdit\uconv" -f windows-1252 -t UTF-8 -o newfile.html oldfile.html
The supported from (-f) and to (-t) encodings are any of those on the drop-down combobox in WildEdit. (Note that case is significant.) Just run uconv with parameter "-h" to get a listing of its other options.
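
For readers without uconv at hand, the same kind of conversion can be sketched in Python. This is an alternative technique, not how uconv works internally, and Python's codec names are not guaranteed to match the names in WildEdit's drop-down list. `convert_bytes` and `convert_file` are hypothetical helper names.

```python
from pathlib import Path


def convert_bytes(raw, from_enc, to_enc):
    """Decode raw bytes strictly, then encode them in the target encoding."""
    # Strict decoding raises UnicodeDecodeError if from_enc is the wrong guess,
    # which is safer than silently writing damaged text.
    return raw.decode(from_enc).encode(to_enc)


def convert_file(src, dst, from_enc="windows-1252", to_enc="utf-8"):
    """Convert one file, working on bytes so line endings are left untouched."""
    Path(dst).write_bytes(convert_bytes(Path(src).read_bytes(), from_enc, to_enc))
```

A call such as `convert_file("oldfile.html", "newfile.html")` mirrors the uconv command line shown above.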

If you have a lot of files to convert, set up all the uconv commands in a batch file, then run that. To get a directory listing of the relevant files in TextPad, choose Run from the Tools menu, and set up the command as follows:

Code:

Command: DIR
Parameters: /b *.html
Initial folder: folder-containing-your-files
DOS Command: checked
Capture output: checked
Copy and paste the relevant lines from the Command Results window to your batch file, then edit it to insert all the uconv commands.
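
Editing the listing into uconv commands by hand gets tedious for large folders, so the batch file could also be generated. The Python sketch below uses only the uconv options mentioned above (-f, -t, -o) and the install path from the example. Writing the converted files into a `utf8` subfolder is an assumption made here to avoid overwriting the originals, and file names are assumed not to contain `%` or quote characters.

```python
import ntpath  # Windows-style path joining, regardless of where this runs

UCONV = r"C:\Program Files\WildEdit\uconv"  # path taken from the example above


def uconv_commands(filenames, from_enc="windows-1252", to_enc="UTF-8",
                   out_dir="utf8"):
    """Build batch-file lines, one uconv call per listed file name."""
    lines = ["@echo off", f'if not exist "{out_dir}" mkdir "{out_dir}"']
    for name in filenames:
        name = name.strip()
        if not name:            # skip blank lines pasted from the listing
            continue
        target = ntpath.join(out_dir, name)
        lines.append(
            f'"{UCONV}" -f {from_enc} -t {to_enc} -o "{target}" "{name}"'
        )
    return lines
```

Feeding it the lines captured from `DIR /b *.html` and saving the joined result as a .bat file gives a batch file to run from the folder containing the files.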

Keith MacDonald
Helios Software Solutions
Last edited by bbadmin on Tue Dec 14, 2004 3:03 pm, edited 1 time in total.
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

I just downloaded the tool (I hope that was ok) and tried it.

I got a message box saying

The Dynamic Link Library icuuc28.dll was not found in the path [followed by the path variable of my system]

OK, I searched for it and found it in the WildEdit folder. So I moved uconv there.

Now it is missing icuin28.dll
I did not find it on my system here ...
bbadmin
Site Admin
Posts: 882
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

Andreas,

Thanks for reporting the missing DLL. It was in another folder on my search path, so I did not notice that it was required. I have uploaded uconv.zip again, with icuin28.dll, so you (and anyone else) are welcome to use it.

Keith MacDonald
Helios Software Solutions
MudGuard
Posts: 1295
Joined: Sun Mar 02, 2003 10:15 pm
Location: Munich, Germany

Post by MudGuard »

Thanks - now it works! :D


except that
uconv -s -h
still prints the help message although it clearly says that -s suppresses messages
:D :lol: :wink: :P 8) :D :lol:
(too much wine in me, getting silly)
UkSeo
Posts: 4
Joined: Sat Dec 11, 2004 2:36 pm

Post by UkSeo »

Fantastic, many thanks! I'm going to try it later today, as I'm knee-deep in Portuguese right now.