Title: Unicode tip #9 - Console Output
Author: Bob Swart
Posted: 12/11/2008 12:29:22 PM (GMT+1)

First the bad news: console I/O does not support reading (UTF-16) Unicode strings, and writing only supports AnsiStrings. As soon as you call Write or Writeln, the contents of a (Unicode) string are converted to AnsiString when needed and then written to the output.
This also means that any Text file I/O needs to be rewritten using streams or other techniques. However, since a UTF8String is also an AnsiString (with the 65001 code page specified), there is a good workaround for writing to console output, provided you set the console code page to UTF-8 and use a font that can display the Unicode characters (Lucida Console, for example):
program ConsoleUTF8;
(*$APPTYPE CONSOLE*)
uses
  Windows, SysUtils;

begin
  SetConsoleOutputCP(65001); // switch the console output code page to UTF-8
  // Writeln('[наименование проекта]'); // normal Unicode string, now "????"
  Writeln(UTF8String('[наименование проекта]')); // UTF-8 Cyrillic "hack"
  Readln
end.
This produces Cyrillic characters on the standard output if you use Lucida Console as the console font (just try it: copy and paste the code into the Delphi 2009 IDE). Note that Lucida Console cannot display all Unicode characters: Chinese characters and the clef are not shown, but Cyrillic characters at least display without problems.
Note that I do not have to write the BOM to the output. You may want to write it if you plan to save the console output to a text file and read it afterwards. That way, you can choose a font afterwards and also see the Chinese or clef characters without problems, provided they were written as UTF-8.
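As a rough sketch of the BOM idea, the program below writes the three UTF-8 BOM bytes only when standard output has been redirected to a disk file. The redirection check via GetFileType and the program name ConsoleUTF8Bom are my own assumptions for illustration and are not part of the original tip. I have not verified the behavior on every Windows version.

```delphi
program ConsoleUTF8Bom;
(*$APPTYPE CONSOLE*)
uses
  Windows;

const
  // The UTF-8 byte order mark
  UTF8BOM: array[0..2] of Byte = ($EF, $BB, $BF);

var
  StdOut: THandle;
  Written: DWORD;
begin
  SetConsoleOutputCP(65001);
  StdOut := GetStdHandle(STD_OUTPUT_HANDLE);
  // Assumption: a BOM is only useful when the output ends up in a file,
  // for example when started as "ConsoleUTF8Bom > output.txt"
  if GetFileType(StdOut) = FILE_TYPE_DISK then
    WriteFile(StdOut, UTF8BOM, SizeOf(UTF8BOM), Written, nil);
  Writeln(UTF8String('[наименование проекта]'));
end.
```

The BOM is written through the raw handle before the first Writeln, so it ends up at the very start of the redirected file.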
As long as we convert UTF-16 Unicode Strings to UTF-8 before writing to Text files, and don’t forget to use the UTF-8 BOM as prefix, this will work fine for writing files with Unicode UTF-8 output.
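A minimal sketch of that approach, using a TFileStream instead of Text file I/O, could look as follows. The file name output.txt and the helper WriteUTF8Line are hypothetical names chosen for this example. Other techniques would work just as well.

```delphi
program WriteUTF8File;
(*$APPTYPE CONSOLE*)
uses
  Classes, SysUtils;

const
  // The UTF-8 byte order mark, written once as prefix of the file
  UTF8BOM: array[0..2] of Byte = ($EF, $BB, $BF);

// Hypothetical helper: converts a UTF-16 string to UTF-8 and writes it,
// followed by a line break, to the stream
procedure WriteUTF8Line(Stream: TStream; const S: string);
var
  U: UTF8String;
begin
  U := UTF8Encode(S + sLineBreak);
  if Length(U) > 0 then
    Stream.WriteBuffer(U[1], Length(U));
end;

var
  F: TFileStream;
begin
  F := TFileStream.Create('output.txt', fmCreate); // assumed file name
  try
    F.WriteBuffer(UTF8BOM, SizeOf(UTF8BOM));
    WriteUTF8Line(F, '[наименование проекта]');
  finally
    F.Free;
  end;
end.
```

Because the conversion to UTF-8 is done explicitly before the bytes reach the stream, no implicit AnsiString conversion can replace the characters with question marks.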
This tip is the 9th in a series of Unicode tips taken from my Delphi 2009 Development Essentials book published earlier this week on Lulu.com.