From: Patrycjusz R. Łogiewa (silverdr_at_inet.com.pl)
Date: 2005-05-11 13:59:10
On 2005-05-11, at 13:21, Marko Mäkelä wrote:
> On Wed, May 11, 2005 at 12:57:19PM +0200, Patrycjusz R. Łogiewa wrote:
>>> (Oh well, newer file systems have problems with file names as well.
>>> Take the case-insensitive Apple HFS+, for example. If I have
>>> understood the technical notes correctly, the case-folding will
>>> effectively treat strings consisting of non-Latin letters as empty
>>> strings. This would mean that e.g., Greek, Russian or Japanese users
>>> would have to give Latin names to their documents.)
>>
>> Disagreed.
>>
>> This kind of case-insensitivity as used in HFS+ has its own name,
>> which I forgot but it certainly doesn't affect using non-Latin
>> filenames.
>
> I was referring to this technical note:
> http://developer.apple.com/technotes/tn/tn1150.html
>
> It seems that I confused HFS+ with HFS:
Might be, but even in the old days of MacOS 7.1, I used a Chinese
version, which allowed me to use file names in Chinese characters. It
was more or less all about the "scripts" they introduced somewhere
around that time.
>
> "The problem with using non-Roman scripts in an HFS file name is that
> HFS compares file names in a case-insensitive fashion. The
> case-insensitive comparison algorithm assumes a MacRoman encoding.
> When presented with non-Roman text, this algorithm fails in strange
> ways. The upshot is that HFS decides that certain non-Roman file names
> are duplicates of other file names, even though they are not
> duplicates in the source encoding."
Yeah, the remnants of the "scripts" are still biting us today. I have
to deal with them on a daily basis, as I also happen to have a
"non-Roman" language as my native one. Anyway, at least the
filesystem-related problems are mostly gone today.
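
To make the failure described in the technote concrete, here is a rough
sketch in Python. It is not the real HFS comparison routine (which works
from a MacRoman table); as a stand-in it folds case byte by byte, touching
only ASCII letters, and Shift-JIS is picked purely as an example of a
multi-byte encoding. The function names are made up for illustration.

```python
# Simplified stand-in for a case-insensitive comparison that assumes a
# single-byte Latin encoding. This is NOT the actual HFS algorithm.
def naive_fold(name: bytes) -> bytes:
    # bytes.lower() maps only the ASCII bytes A-Z to a-z and leaves
    # every other byte value untouched.
    return name.lower()


def names_collide(a: bytes, b: bytes) -> bool:
    # Two names are treated as duplicates if they match after folding.
    return naive_fold(a) == naive_fold(b)


# Two different two-byte Shift-JIS characters whose second bytes happen
# to be the ASCII codes for 'A' (0x41) and 'a' (0x61).
first = b"\x83\x41"
second = b"\x83\x61"

# The names differ in the source encoding ...
distinct_in_source = first.decode("shift_jis") != second.decode("shift_jis")
# ... yet the byte-wise fold makes them look like duplicates.
collide_after_fold = names_collide(first, second)

print(distinct_in_source, collide_after_fold)
```

The fold cannot tell a letter from the second half of a two-byte character,
so two names that are distinct to the user end up comparing as equal.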
> Anyway, I think that it is a bad idea to disallow a large group of
> characters or strings in a file system.
Perfectly agreed.
> Unix-like file systems do it nicely: the only disallowed characters
> are the directory separator and NUL, the end-of-string marker in C
> library routines. Well, Commodore takes this further, treating the
> file name as an arbitrary binary string.
Which is quite logical IMHO - yet Unix-alikes somehow have to deal with
the directory hierarchy, hence the restriction on the separator.
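
For illustration, the Unix rule quoted above can be sketched as a check on
a single path component, treated as raw bytes. The function name is
hypothetical, and rejecting the empty name is an extra assumption of this
sketch, not part of the rule as stated.

```python
def is_valid_unix_component(name: bytes) -> bool:
    # Assumption of this sketch: an empty component is not a usable name.
    if not name:
        return False
    # Per the rule quoted above, only the directory separator '/' and
    # the NUL byte are rejected; any other byte value is accepted.
    return b"/" not in name and b"\x00" not in name


# Non-Latin names pass, since the check never interprets the bytes.
print(is_valid_unix_component("Ελληνικά.txt".encode("utf-8")))
# A name containing the separator is rejected.
print(is_valid_unix_component(b"dir/file"))
```

Because the check looks only for two byte values and never tries to decode
or case-fold the rest, the encoding of the name does not matter to it.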
--
As we all know, Linux is only free if your time has no value - Jamie
Zawinski
Message was sent through the cbm-hackers mailing list