Unicode?

General questions about using TextPad

CWBillow
Posts: 102
Joined: Thu Nov 06, 2003 11:59 pm
Location: Chula Vista California

Unicode?

Post by CWBillow »

I am a complete novice with this...

Are there advantages or disadvantages to editing files in Unicode as opposed to ASCII?

That being said, if a file is in Unicode, is the default UTF-8 or UTF-16?

Regards,
Chuck Billow
bbadmin
Site Admin
Posts: 854
Joined: Mon Feb 17, 2003 8:54 pm

Post by bbadmin »

Keeping it simple:

ASCII characters can be represented in 7 bits, so there are 128 different ones (codes 0 to 127). This is generally sufficient for US English texts and most programming languages.

ANSI characters are stored in 8 bits (1 byte). The first 128 are the same as ASCII, while the remaining ones differ from one code page to another (e.g. é and ï in Western European code pages). However, languages such as Chinese and Japanese use two bytes for most characters.
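
To make the point about the upper range concrete, here is a small Python sketch. It assumes Windows code pages 1252 (Western European) and 1251 (Cyrillic) as stand-ins for "ANSI", and Shift-JIS as an example of a Japanese double-byte encoding; those particular choices are only for illustration and are not something TextPad prescribes.

```python
# One byte above the ASCII range, interpreted under two different "ANSI" code pages.
raw = b"\xe9"

western = raw.decode("cp1252")   # Western European code page
cyrillic = raw.decode("cp1251")  # Cyrillic code page

# Same byte on disk, two different characters on screen.
print(western, cyrillic)

# Pure ASCII bytes decode identically under both code pages.
assert b"TextPad".decode("cp1252") == b"TextPad".decode("cp1251")

# A Japanese character takes two bytes in a double-byte encoding such as Shift-JIS.
hiragana_a = "あ".encode("shift_jis")
print(len(hiragana_a))
```

This is why an ANSI file only displays correctly when the reader guesses the same code page the writer used.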

Unicode assigns each character a code point between 0 and 1,114,111 (U+10FFFF), which fits in 21 bits, so there could theoretically be 1,114,112 different ones, with the first 128 being the same as ASCII.

So, if you need to edit text files in more than one language, Unicode is the answer. Files can be stored in UTF-8 or UTF-16. The advantage of UTF-8 is that files containing only ASCII characters don't take up any extra space, but characters with code points > 127 are encoded in 2 to 4 bytes. In UTF-16, characters with code points < 65,536 are stored in 2 bytes, otherwise 4 bytes are required, but that is very rare.
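
The size trade-off between the two encodings can be checked with a few lines of Python. This is only a sketch: the helper name `encoded_sizes` and the sample characters are made up for illustration, and `utf-16-le` is used so that the optional 2-byte marker at the start of a UTF-16 file is not counted.

```python
def encoded_sizes(text):
    """Return (bytes in UTF-8, bytes in UTF-16) for the given text."""
    # "utf-16-le" encodes without a leading byte order mark,
    # so only the characters themselves are measured.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# An ASCII letter, an accented letter, the euro sign, and an emoji
# (the emoji has a code point above 65,535).
for ch in ["A", "é", "€", "😀"]:
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

For text that is mostly ASCII, UTF-8 comes out at roughly half the size of UTF-16; for text that is mostly East Asian, UTF-16 tends to be the smaller of the two.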

Google will turn up plenty of reading matter on the subject, but this should be enough for you to decide if you actually need to use Unicode.

Keith MacDonald
Helios Software Solutions
AmigoJack
Posts: 515
Joined: Sun Oct 30, 2016 4:28 pm
Location: グリーン ヒル ゾーン

Re: Unicode?

Post by AmigoJack »

I tried to find similar questions (along with answers) which may help you in understanding what you asked:
CWBillow wrote: Are there advantages or disadvantages to editing files in Unicode as opposed to ASCII?
Are there advantages or disadvantages to using a 64bit operating system as opposed to 32bit? [1]
CWBillow wrote: if a file is in Unicode, is the default UTF-8 or UTF-16?
If a file is an MP4, is the default resolution 800x600 or 1920x1080? [2]

  1. Yes, both have both. 32bit operating systems suffice for most needs if you need only 3.5 GiB RAM or less, while 64bit operating systems have to draw a line somewhere on which legacy stuff is not supported anymore (e.g. 16bit executables, or 32bit drivers).
  2. There is no default: the file is just a file - its content has no default. Video files have a section that defines the output resolution; UTF-8, UTF-16 and UTF-32 files might also start with a recognition marker (the byte order mark, or BOM), but that is optional.
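
For the curious, that optional marker can be looked for with a few lines of Python. This is a rough sketch, not how TextPad or any other editor actually detects encodings: the function name `detect_bom` is made up, and the returned labels are simply Python codec names.

```python
import codecs

# Checked in this order on purpose: the UTF-32 LE marker (FF FE 00 00)
# begins with the UTF-16 LE marker (FF FE), so UTF-32 must be tested first.
_MARKERS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return a codec name if the bytes start with a known marker, else None."""
    for marker, name in _MARKERS:
        if data.startswith(marker):
            return name
    return None  # no marker: the encoding has to be guessed or known in advance


print(detect_bom("hello".encode("utf-8-sig")))  # UTF-8 written with a marker
print(detect_bom(b"hello"))                     # plain bytes, no marker
```

Since the marker is optional, a `None` result says nothing about the encoding; it only means the file does not announce it.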