

Showing posts from April, 2008

Opening files with Unicode (Japanese/Chinese) characters in the filename using Perl

(Read a complete article on the matter and much more:
"Unicode issues regarding the Windows OS file system and their handling from Perl".)

Currently there is no way to manipulate a file whose name contains Unicode characters using Perl's built-in functions.
The Perl 5.10 todo wish list states that functions like chdir, opendir, readdir, readlink, rename, rmdir, etc.
"could potentially accept Unicode filenames either as input or output".
Windows stores filenames as UTF-16LE, but the console 'dir' command only returns ANSI names, so Unicode characters are replaced with "?".
This happens even if you invoke the console with the Unicode switch (cmd.exe /u), change the code page to 65001 (UTF-8 on Windows), and use the Lucida Console TrueType font, which supports Unicode.
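
The "?" substitution can be reproduced in plain Perl with the core Encode module. This is only an illustration of a lossy conversion to an ANSI code page, not the call the console itself makes: cp1252 is assumed as the ANSI code page, and to_ansi is a hypothetical helper name.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: convert a Unicode filename to an assumed ANSI
# code page (cp1252). With Encode's default CHECK value, characters
# that have no mapping are replaced by the substitution character "?".
sub to_ansi {
    my ($name) = @_;
    return encode('cp1252', $name);
}

# "\x{65E5}\x{672C}\x{8A9E}" is the Japanese word for "Japanese";
# none of its three characters exist in cp1252.
print to_ansi("\x{65E5}\x{672C}\x{8A9E}.txt"), "\n";   # prints ???.txt
```

A name made only of characters that cp1252 does cover comes through the conversion intact, which is why the problem shows up only with Japanese, Chinese and similar names.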
A workaround is to use the COM facilities provided by Windows (in this case Scripting.FileSystemObject), which provide a much higher level of abstraction, or to use the Win32 API calls.
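
A minimal sketch of the COM route, assuming a Windows machine with the Win32::OLE module installed. It is not necessarily the code used in the full article, and the folder path is a hypothetical placeholder. The file names never pass through Perl's built-in file functions: Scripting.FileSystemObject lists the folder and opens each file itself.

```perl
use strict;
use warnings;
use Win32::OLE qw(in);

# Exchange strings with COM as UTF-8 instead of the ANSI code page,
# so names outside the ANSI range are not reduced to "?".
Win32::OLE->Option(CP => Win32::OLE::CP_UTF8);

my $fso = Win32::OLE->new('Scripting.FileSystemObject')
    or die "Cannot create Scripting.FileSystemObject: " . Win32::OLE->LastError;

my $dir    = 'C:\\unicode_test';    # hypothetical folder, adjust as needed
my $folder = $fso->GetFolder($dir)
    or die "Cannot open folder: " . Win32::OLE->LastError;

my $count = 0;
for my $file (in $folder->Files) {
    $count++;

    # Let the COM object open the file by its full path (1 = ForReading).
    my $stream = $fso->OpenTextFile($file->Path, 1) or next;
    my $first  = $stream->AtEndOfStream ? '' : $stream->ReadLine;
    $stream->Close;

    printf "file %d: %d bytes, first line has %d characters\n",
        $count, $file->Size, length $first;
}
print "$count file(s) read through Scripting.FileSystemObject\n";
```

The sketch prints sizes and counts rather than the names themselves, because writing the names to the console runs into the same display limitation described above.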
I tried to read a file…