UTF-8 problems with RTL languages

john1857
Posts: 13
Joined: Wed Oct 09, 2019 6:02 pm

Post by john1857 »

I'm trying to prepare property files for internationalising a Java application, and I've noticed some issues with TextPad's handling of UTF-8 text in right-to-left languages (Arabic, Hebrew).

The issues I'm seeing are:

1) A sequence of words is displayed left-to-right rather than right-to-left, even though the characters within each word are correctly displayed right-to-left.

2) Selecting the leftmost character in a word and copying it actually results in the first (rightmost) character being copied.

3) Cursor movement is erratic -- pressing right-arrow moves the caret from left to right, but it sometimes ends up in the middle of a character. I thought this might be because, when moving right over the leftmost character, the caret advances by the width of the first (rightmost) character in the word. However, the line length is also wrong. Consider this line:

background\ colour = צבע רקע
(TextPad displays the two Hebrew words in the opposite order to the correct order shown here.) When you press End to go to the end of the line, the caret stops about one character width short of the end. Some characters seem to have a width of zero when you move the caret across them.
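
For reference, the stored (logical) order of that line can be checked from Java itself. This is only a sketch: the class and method names are made up for illustration, and it says nothing about how TextPad lays out text internally. It uses java.util.Properties and java.text.Bidi from the standard library to show that the line parses to the expected key and value, and that it resolves to one left-to-right run followed by one right-to-left run.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.Bidi;
import java.util.Properties;

// Hypothetical class name, for illustration only.
class BidiOrderSketch {

    // The sample line in logical (storage) order. The Hebrew letters are
    // written as Unicode escapes so the source looks the same in any editor.
    static final String HEBREW = "\u05E6\u05D1\u05E2 \u05E8\u05E7\u05E2";
    static final String LINE = "background\\ colour = " + HEBREW;

    // Parse the line as a property file entry. A Reader is used because
    // Properties.load(InputStream) decodes its input as ISO-8859-1.
    static String loadValue() {
        Properties props = new Properties();
        try {
            props.load(new StringReader(LINE));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The escaped space in the file becomes a plain space in the key.
        return props.getProperty("background colour");
    }

    // Number of directional runs found by the bidirectional algorithm.
    static int runCount(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).getRunCount();
    }

    // Embedding level of one character: even means LTR, odd means RTL.
    static int levelAt(String text, int index) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).getLevelAt(index);
    }

    public static void main(String[] args) {
        Bidi bidi = new Bidi(LINE, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        for (int i = 0; i < bidi.getRunCount(); i++) {
            System.out.printf("run %d: chars %d..%d, level %d%n",
                    i, bidi.getRunStart(i), bidi.getRunLimit(i), bidi.getRunLevel(i));
        }
        System.out.println("value has " + loadValue().length() + " chars");
    }
}
```

Because both Hebrew words and the space between them fall in the same right-to-left run, an editor applying the bidirectional algorithm should draw that run right to left as a whole, so the word stored first (צבע) ends up rightmost.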

I know TextPad is coming rather late to the party with Unicode, so although I prefer TextPad overall, I still end up needing Notepad++ when dealing with text in multiple scripts, since Notepad++ handles it correctly (and also has text direction commands to display a file as LTR or RTL). This is rather a pity, and I hope that some more effort can be put into Unicode support to fix these problems.

Post by john1857 »

The erratic cursor movement seems to be font-dependent. It works correctly with Courier New, for example, but not with some other monospaced fonts, e.g. Consolas. It also works correctly with some proportionally spaced fonts, e.g. Arial. Arial and Courier New both have "complex scripts" support. It looks like the "complex scripts" font associated with Consolas is a proportionally spaced font whose widths do not match the characters as they are displayed, and TextPad does not provide any way to change it.