Sunday 21 June 2009

I'm not saying Apple are evil, but...

I don't use the word "hate". It is too strong for most of my emotions. If I did, however, I would most likely be using it right about now, with regard to HFS.
When copying the files for the disc (with hfsutils, see previous post), there are several modes that can be used to copy the files: raw, binhex, macbinaryii, text. There is also an auto mode, which tries to make an intelligent guess as to which mode should be used. Unfortunately, it doesn't get it right for the T7G data files: it tries to copy them as text. They need to be copied with the raw mode, but the binary (which also, helpfully, contains another selection of required files) cannot be copied in raw mode. That can be copied with auto (which guesses correctly that it should be copied with the macbinaryii mode).
I think before I dig too deeply into support for the Mac version I'm going to have to learn a little more about how HFS works, and why this confusion arises. I remember a Mac-based friend trying to explain to me (about 15 years ago) all about resource forks, data forks, etc. I just sat there thinking "surely a file is a file? It starts, has binary data in it, then ends, and different files are interpreted in different ways". I wish I was still in contact with that friend so I could phone him and get him to repeat the conversation...

Edit: It seems I'm not alone in my view of HFS: http://www.engadget.com/2008/02/05/linus-torvalds-calls-apples-file-system-utter-crap/

6 comments:

Fingolfin said...

It is sad when people start to hate or defame what they do not understand :-(. Especially in the days of Google and Wikipedia, which make it soooo easy to find out what HFS does, what resource vs. data forks are, what binhex means etc. ;).

A file is a file you say -- of course! On HFS just as on any other *file* system. But what *is* a file? On most classical unix filesystems, and also on FAT, a file is a set of meta-data (a filename, modification dates, access permissions, ...) plus a single data stream.

The main difference with HFS is that there are *two* data streams (and lots more meta-data, like type/creator codes that allow associating files with programs without the hack called "file extensions"). One data stream is the "data fork", which resembles what you probably call a "file" (doing so, you do forget about meta-data, though ;). The other fork, the "resource fork", does not contain arbitrary data, but rather structured data. A bit like in an IFF file. That makes it possible to write programs which read and display this data in a generic fashion, too.
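To make "structured data" a little more concrete, here is a minimal Python sketch that lists the resource types stored in a resource fork. It assumes the classic layout (a 16-byte header of four big-endian offsets and lengths, then a resource map whose type list holds 8-byte entries) and that the fork has already been extracted into a byte string. The function name list_resource_types is made up for this illustration, and this is not ScummVM code.

```python
import struct

def list_resource_types(fork: bytes) -> dict:
    """Return {type code: number of resources} for a classic resource fork.

    Assumed layout: header = data offset, map offset, data length,
    map length (all big-endian 32-bit); the map stores the offset of
    its type list at byte 24.
    """
    if len(fork) < 16:
        raise ValueError("too short for a resource fork header")
    _data_off, map_off, _data_len, map_len = struct.unpack(">IIII", fork[:16])
    if map_len < 30 or map_off + map_len > len(fork):
        raise ValueError("resource map does not fit inside the fork")
    res_map = fork[map_off:map_off + map_len]

    # Offset of the type list, measured from the start of the map.
    (type_list_off,) = struct.unpack(">H", res_map[24:26])
    # The list starts with "number of types minus one" (0xFFFF means none).
    (count_minus_1,) = struct.unpack(">H", res_map[type_list_off:type_list_off + 2])
    num_types = (count_minus_1 + 1) & 0xFFFF

    types = {}
    for i in range(num_types):
        entry = type_list_off + 2 + 8 * i
        # Each entry: 4-byte type code, count minus one, reference list offset.
        code, n_minus_1, _ref_off = struct.unpack(">4sHH", res_map[entry:entry + 8])
        types[code.decode("mac_roman")] = n_minus_1 + 1
    return types
```

Because every resource fork shares this map, a generic tool can enumerate the contents of any file without knowing which program wrote it.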

There is nothing evil about that; it's actually a quite useful feature, but of course it does have the "drawback" of not mapping 1-1 on the filesystems used on most other systems. So, to transfer a file from MacOS HFS with a resource fork to one of these systems, you need to somehow merge the two data streams into a single one. That's what the BinHex and MacBinary encodings were good for.
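As a rough illustration of that merging, the sketch below splits a MacBinary blob back into its two forks. It assumes the usual MacBinary layout: a 128-byte header with the filename length at byte 1, the type and creator codes at bytes 65 to 72, big-endian fork lengths at bytes 83 to 90, and each fork padded to a multiple of 128 bytes. The name split_macbinary is hypothetical, and the CRC and version fields are deliberately ignored.

```python
import struct

HEADER_SIZE = 128

def _padded(length: int) -> int:
    """Round a fork length up to the next multiple of 128 bytes."""
    return (length + 127) & ~127

def split_macbinary(blob: bytes) -> dict:
    """Split a MacBinary blob into metadata, data fork and resource fork."""
    if len(blob) < HEADER_SIZE or blob[0] != 0 or blob[74] != 0:
        raise ValueError("does not look like a MacBinary header")
    name_len = blob[1]
    if not 1 <= name_len <= 63:
        raise ValueError("implausible filename length")

    # Both fork lengths are stored as big-endian 32-bit integers.
    data_len, rsrc_len = struct.unpack(">II", blob[83:91])
    data_start = HEADER_SIZE
    rsrc_start = data_start + _padded(data_len)
    if rsrc_start + rsrc_len > len(blob):
        raise ValueError("fork lengths exceed the file size")

    return {
        "name": blob[2:2 + name_len].decode("mac_roman"),
        "type": blob[65:69],
        "creator": blob[69:73],
        "data_fork": blob[data_start:data_start + data_len],
        "resource_fork": blob[rsrc_start:rsrc_start + rsrc_len],
    }
```

This is the single-stream view that a filesystem with one stream per file ends up storing: the metadata first, then the data fork, then the resource fork.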

This problematic interoperability is probably the primary reason that Apple stopped using the resource fork system. This eased cross platform compatibility of files, at the cost of sacrificing some truly useful system features :/. Many mac users and developers were really unhappy about this loss.

By the way, most modern file systems (i.e. *not* ext3 ;), and also many old ones (like the VMS file system), do allow associating multiple data streams with a single file, e.g. NTFS or ZFS. Many now also offer additional metadata support.


As for Linus Torvalds: He's a bright guy, but very often behaves like an utter troll. That comment you link to is a perfect example of that. If you look at it closely, not a single factual argument is given. Sad :(.

clone2727 said...

The only thing about HFS that has ever really bothered me is that some QuickTime videos have the mdat section in the resource fork. Those videos are unplayable on non-Mac systems without modifications. Of course the mdat section can just be appended to the end of the data fork ;)

Unknown said...

Thank you for the explanation, and apologies for seeming closed-minded and such: the post was written after a long night of copying the same disc multiple times, so naturally I felt a little frustrated!
I see your point, it does seem like it has useful features. The main point of my post (which I admit possibly got lost in my rant) is that I need to understand it better before starting to make the T7G Mac binary readable by ScummVM. I'll have to take a look at the way other engines do it.

I guess my post could be re-written as "why can't everyone do everything the same way?" (rhetorical ;-).

Eugene Sandulenko said...

scumm/he/resource_he.cpp contains the MacResExtractor class, which reads requested sections from MacBinary files.

So what you need is to move this class over to the common/ directory; it is pretty generic.

Unknown said...

Wow, ScummVM really does have everything... I'll give that a go, thanks!

Fingolfin said...

Note: There is a difference between MacBinary and Binhex. I think we only support the former.

The main difference is that binhex encodes the data into pure ASCII text, using only 7-bit characters (thus making it safe for channels that are not "8bit clean", say old modems, etc.). MacBinary encodes into a binary data stream and as such is a lot more space efficient (and hence in later days, Mac users tended to prefer it).
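The space difference can be estimated with a small Python sketch. BinHex 4.0 stores 6 bits of payload per printable character, so base64 is used here purely as a stand-in with the same 3-bytes-to-4-characters ratio. It is not a BinHex encoder: real BinHex has its own alphabet, run-length compression and CRCs. The helper name ascii_armor_cost is made up for this example.

```python
import base64

def ascii_armor_cost(payload: bytes) -> tuple:
    """Return (raw size, encoded size, whether the output is 7-bit safe).

    base64 stands in for any 6-bits-per-character text encoding;
    it is NOT the BinHex alphabet or file format.
    """
    encoded = base64.b64encode(payload)
    is_7bit = all(byte < 0x80 for byte in encoded)
    return len(payload), len(encoded), is_7bit

# 768 bytes covering every possible byte value, three times over.
print(ascii_armor_cost(bytes(range(256)) * 3))  # (768, 1024, True)
```

A text encoding of this kind grows the payload by about a third, whereas MacBinary adds only a 128-byte header plus padding to the raw forks.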

There are excellent Wikipedia articles on both, though, which are probably more accurate than my fading memories, too ;).