
mIRC & UTF-8

mIRC has supported UTF-8 encoded Unicode text since version 6.17.
With the continued development of mIRC, Khaled has been improving its UTF-8 support, adding the font linking feature and more.
This page is a quick reference on UTF-8 in general and on using UTF-8 with mIRC in particular.
We hope you find the information here useful.

UTF-8 Quick Reference.

UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode created by Ken Thompson and Rob Pike. It is able to represent any universal character in the Unicode standard, yet the initial encoding of byte codes and character assignments for UTF-8 is coincident with ASCII (requiring little or no change for software that handles ASCII but preserves other values). For these reasons, it is steadily becoming the preferred encoding for email, web pages, and other places where characters are stored or streamed.
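The ASCII compatibility described above can be checked directly. The following is a minimal Python sketch, not part of mIRC; the helper name `is_ascii_compatible` is made up for illustration. It compares the UTF-8 and ASCII encodings of a string and reports whether they are byte-for-byte identical.

```python
def is_ascii_compatible(text: str) -> bool:
    """Return True when the UTF-8 and ASCII encodings of text are byte-identical."""
    try:
        return text.encode("utf-8") == text.encode("ascii")
    except UnicodeEncodeError:
        # At least one character lies outside the 128 US-ASCII characters.
        return False


# Pure ASCII text encodes to exactly the same bytes in both encodings.
print(is_ascii_compatible("Hello, mIRC!"))

# A string containing U+00E9 (e with acute accent) is not pure ASCII:
# the ASCII letters keep their single-byte values, and only the accented
# letter becomes a multi-byte sequence.
print(is_ascii_compatible("caf\u00e9"))
print("caf\u00e9".encode("utf-8"))  # b'caf\xc3\xa9'
```

Software that only understands ASCII will still read the first three bytes of the last example correctly, which is why existing ASCII tools usually need little or no change.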

UTF-8 uses one to four bytes (strictly, octets) per character, depending on the Unicode symbol. Only one byte is needed to encode the 128 US-ASCII characters (Unicode range U+0000 to U+007F). Two bytes are needed for Latin letters with diacritics and for characters from Greek, Cyrillic, Armenian, Hebrew, Arabic, Syriac and Thaana alphabets (Unicode range U+0080 to U+07FF). Three bytes are needed for the rest of the Basic Multilingual Plane (which contains virtually all characters in common use). Four bytes are needed for characters in other planes of Unicode.
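The ranges above map directly onto a small lookup. The sketch below is a Python illustration, with `utf8_length` being a hypothetical helper name; it returns the number of bytes UTF-8 needs for a given code point and cross-checks the result against Python's built-in encoder for a few sample characters.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for one Unicode code point."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0x7F:       # US-ASCII
        return 1
    if code_point <= 0x7FF:      # Latin with diacritics, Greek, Cyrillic, Hebrew, Arabic, ...
        return 2
    if code_point <= 0xFFFF:     # rest of the Basic Multilingual Plane
        return 3
    return 4                     # other planes


# Sample characters: A, e-acute, Cyrillic Ya, euro sign, musical G clef.
for ch in "A\u00e9\u042f\u20ac\U0001d11e":
    # The table lookup should agree with the real encoder.
    assert utf8_length(ord(ch)) == len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X} -> {utf8_length(ord(ch))} byte(s)")
```

Note that the surrogate range U+D800 to U+DFFF falls inside the three-byte band numerically, but those values are not characters and a conforming encoder rejects them; the sketch does not special-case them.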

Four bytes may seem like a lot for one character (code point). However, code points outside the Basic Multilingual Plane are generally very rare. Furthermore, UTF-16 (the main alternative to UTF-8) also needs four bytes for these code points. Whether UTF-8 or UTF-16 is more efficient depends on the range of code points being used. However, the differences between different encoding schemes can become negligible with the use of traditional compression systems like DEFLATE. For short items of text where traditional algorithms do not perform well and size is important, the Standard Compression Scheme for Unicode could be considered instead.
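To see how the comparison plays out for a particular piece of text, one can simply measure it. The following Python sketch (the function name `encoded_sizes` is made up for this example) reports the size of a string in UTF-8 and UTF-16, before and after compression. It uses the standard `zlib` module, which wraps DEFLATE data in a small zlib header and checksum, so the compressed figures include a few bytes of overhead.

```python
import zlib


def encoded_sizes(text: str) -> dict:
    """Return byte counts for text in UTF-8 and UTF-16, raw and compressed."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # little-endian, no byte order mark
    return {
        "utf-8": len(utf8),
        "utf-16": len(utf16),
        "utf-8 + deflate": len(zlib.compress(utf8)),
        "utf-16 + deflate": len(zlib.compress(utf16)),
    }


# ASCII-heavy text favours UTF-8; CJK text favours UTF-16.
# A code point outside the Basic Multilingual Plane takes four bytes in both.
for sample in ("hello", "\u65e5\u672c\u8a9e", "\U0001d11e"):
    print(ascii(sample), encoded_sizes(sample))
```

For very short strings like these, the compressed sizes are usually larger than the raw ones because of the fixed overhead, which is the situation where the paragraph above suggests looking at the Standard Compression Scheme for Unicode instead.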

More UTF-8 documentation is available on Wikipedia.


Basic Instructions for UTF-8 (Unicode) support in mIRC

To enable UTF-8 support in mIRC, do the following:

  1. Download the Fixedsys Excelsior font and install it in your Windows fonts directory.
    • This is one of many Unicode fonts. Fixedsys Excelsior looks much like the standard Fixedsys font but supports Unicode for many languages.
    • Note: the downloaded file should be unzipped to your Windows 'Fonts' directory.
  2. Configure Unicode for messages.
    • In mIRC, do the following:
      • Press Alt+O, go to IRC » Messages, then enable the checkboxes beside the following:
      • MultiByte Display
      • MultiByte Editbox
      • UTF-8 Display
  3. Configure Unicode for channel and other windows in mIRC.
    • In mIRC, do the following:
      • Menubar » View » Font
      • Select: Fixedsys Excelsior v2.00
      • Size: 11
      • UTF-8: Display and encode
      • Set as default channel font
If you still have trouble displaying Unicode after following the instructions above, stop by the #mIRC channel on DALnet and ask us for more help, or post your question on our message board.
 
