Slides from my ILC09 lightning talk

April 1st, 2009

I gave a lightning talk at ILC09 on threads and GC implementation in Clozure CL.

I put the slides and the one-page paper (as seen in the ILC09 proceedings) at http://www.clozure.com/~rme/.

CCL on x8632: more registers, please

September 19th, 2008

The biggest problem with porting CCL to the x8632 architecture is that the architecture has so few registers. I need to talk a little bit about how CCL’s GC works in order to explain why the small number of registers is a problem.

CCL is designed to use pre-emptively scheduled OS threads. This means that a context switch can happen at any instruction boundary. Since other threads might allocate memory, this means that a GC can happen between any two instructions, too.

(The following summarizes information found in the Implementation Details of Clozure CL chapter of the CCL manual.)

The GC in CCL is precise. That is, it believes that it can always tell whether a register or a stack location contains a lisp value or just raw bits. In order to enable it to do this, we have to adopt and follow strict conventions on register and stack usage.

What we do is to partition the machine registers into two sets: one that will always contain raw, unboxed values (“immediates”), and another that will always contain tagged lisp objects (“nodes”).

This works fine on architectures that have a reasonable number of registers. On x8632, we’re so register-starved that it’s impossible to get by with a static partitioning of registers.

We therefore keep a bit mask in thread-local memory that indicates whether a given register is a node or an immediate, and have the GC consult these bits when it runs. This allows us to switch the class of a register at run time.

The default register partitioning looks like this:

  • We have a single “immediate” register. EAX is given the symbolic name %imm0.
  • There are two “dedicated” registers. ESP and EBP have dedicated functionality dictated by the hardware and calling conventions.
  • The remaining 5 registers are “node” registers (%temp0, %temp1, %arg_y, %arg_z, and %fn). We don’t use the x86 string instructions which implicitly use ESI and EDI.

Most of the time, all we need to do is to steal a node register and mark it as an immediate for a couple of instructions. Typically this is because we need to index some foreign pointer, or use MUL or DIV to produce extended-precision results.

Here’s an example of a case where we have to do this.

(defx8632lapfunction %%get-unsigned-longlong ((ptr arg_y) (offset arg_z))
  (trap-unless-typecode= ptr x8632::subtag-macptr)
  (mark-as-imm temp0)
  (let ((imm1 temp0))
    (macptr-ptr ptr imm1)
    (unbox-fixnum offset imm0)
    (movq (@ (% imm1) (% imm0)) (% mm0)))
  (mark-as-node temp0)
  (jmp-subprim .SPmakeu64))

The mark-as-imm macro expands to something like this:

(andb ($ bit-for-temp0) (@ (% :rcontext) x8632::tcr.node-regs-mask))

Here, :rcontext is the register that points to a block of thread-local storage (the thread context record). On x8632, it’s an otherwise useless segment register, typically %fs. (We’d be in real trouble if we had to dedicate a GPR to point to thread-local storage on x8632.)
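To make the bookkeeping concrete, here is a minimal sketch in C of the per-thread register mask described above. It is a model, not CCL's code: the struct, the register numbering, and the function names are all made up for illustration, and it assumes that marking a register as an immediate means clearing its bit (ANDing with the complement) while marking it as a node sets the bit again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical numbering of the eight x86 GPRs; not CCL's actual constants. */
enum reg { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, NUM_REGS };

#define REG_BIT(r) ((uint8_t)(1u << (r)))

/* Default partitioning from the list above: EAX is the immediate register,
   ESP and EBP are dedicated, and the other five registers hold nodes. */
#define DEFAULT_NODE_MASK \
    (REG_BIT(ECX) | REG_BIT(EDX) | REG_BIT(EBX) | REG_BIT(ESI) | REG_BIT(EDI))

/* Stand-in for the thread context record; the real one has many more fields. */
struct tcr {
    uint8_t node_regs_mask;   /* bit set => register holds a tagged lisp object */
};

/* Tell the GC to treat register r as raw bits (assumed: clear its bit). */
static void mark_as_imm(struct tcr *t, enum reg r)
{
    assert(r < NUM_REGS);
    t->node_regs_mask &= (uint8_t)~REG_BIT(r);
}

/* Tell the GC to treat register r as a node again (assumed: set its bit). */
static void mark_as_node(struct tcr *t, enum reg r)
{
    assert(r < NUM_REGS);
    t->node_regs_mask |= REG_BIT(r);
}

/* What the GC would ask when it scans a suspended thread's registers. */
static int gc_treats_as_node(const struct tcr *t, enum reg r)
{
    assert(r < NUM_REGS);
    return (t->node_regs_mask & REG_BIT(r)) != 0;
}
```

In this model, the example function corresponds to calling mark_as_imm(&t, ECX) before the pointer arithmetic and mark_as_node(&t, ECX) afterwards. Each update is a single read-modify-write of one byte, which matters because a GC can happen between any two instructions.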

In simple cases like this, there’s actually another alternative. We don’t use the x86 string instructions, so the direction flag in EFLAGS is otherwise unused. So, what we do is to say that if DF is set, then %edx is an immediate register. So, if we used temp1 (aka EDX) instead of temp0 (aka ECX) in the example above, we could actually replace the mark-as-imm/mark-as-node with the (presumably cheaper) std/cld instruction pair.
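Seen from the GC's side, the direction-flag convention might look roughly like the following C sketch. The saved_context struct, the EDX_BIT value, and the function name are hypothetical; the only hardware fact relied on is that DF is bit 10 of EFLAGS. The sketch also assumes that DF overrides whatever the mask says about EDX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EFLAGS_DF (1u << 10)   /* the direction flag is bit 10 of EFLAGS */
#define EDX_BIT   (1u << 2)    /* hypothetical mask bit for EDX (temp1) */

/* Hypothetical snapshot of the state the GC sees for a suspended thread. */
struct saved_context {
    uint32_t eflags;          /* saved EFLAGS register */
    uint8_t  node_regs_mask;  /* per-thread node/immediate mask */
};

/* Returns nonzero if the GC should treat EDX as a tagged lisp object.
   Assumed convention: DF set (after std) means EDX holds raw bits;
   DF clear (after cld) means the mask decides. */
static int edx_is_node(const struct saved_context *ctx)
{
    assert(ctx != NULL);
    if (ctx->eflags & EFLAGS_DF)
        return 0;
    return (ctx->node_regs_mask & EDX_BIT) != 0;
}
```

The appeal of this variant is that std and cld take no operands and do not touch memory, so the thread-local mask is left alone in the common case.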

In fact, I should probably make that change…

Anyway, many’s the time I wished for just two more registers. I thought about sending a bug report to Intel, but I didn’t figure that I’d get a response.

Clozure CL 1.2 released

September 18th, 2008

Clozure CL 1.2 is out now.  It runs on x86-64 and PowerPC processors, under Mac OS X, Linux, and FreeBSD.  (I continue to be surprised by how many people think it runs only on Macintosh systems.)  This is the first official release in over two and a half years.

Fast compiler, native threads, convenient FFI, Unicode, generational GC, etc.  See http://trac.clozure.com/openmcl

One major feature that will be in Clozure CL 1.3 is support for the 32-bit x86 platform.  In fact, an experimental 32-bit lisp is already in the trunk for Darwin/x86.  I worked on the 32-bit Intel port.

It’s probably a little unusual for software to be ported from x86-64 back to x8632.  Anti-progress, as it were.

The existence of an x8664 port made the job quite a bit simpler: one major benefit was that there was already a working assembler (and disassembler, too).  I was also able to use the existing low-level x86-64 assembly language code as a model for what the corresponding 32-bit version should look like.

Another thing I had going for me was that the lisp already ran on the 32-bit PowerPC, so the word size issues were mostly ironed out.

I didn’t really know (or care to know) the x86 architecture all that well before I started working on the port.  I think other architectures (SPARC, MIPS, PowerPC, …) are much nicer targets.

However, the hardware engineers at Intel and AMD are brilliant, and it’s impossible to ignore the performance of the x86 chips that they build.  You just have to hold your nose, study the architecture manuals, and get on with it.

After doing the port, I find it funny that I look on x86-64 as some sort of Nirvana.  (I mentioned this on a private IRC channel, and got the reply “It’s not THAT bad.  Of the 8 or so architectures that I can think of that’re still in use, it’s in the top 7.”)

I’m afraid that it might be a bit boring to read about issues that face the lisp implementer when targeting x8632, but maybe I’ll write a follow-on post with some more details if there’s any interest.

Clozure CL (née OpenMCL)

October 19th, 2007

OpenMCL is getting renamed to Clozure CL. Now that regular MCL is going to be released as open source software, this is probably a good idea. It’s already the case that many people think that OpenMCL runs only on the Macintosh. In reality, it runs on PowerPC hardware under Mac OS X and Linux PPC, and on x86-64 hardware under Mac OS X, Linux, and FreeBSD.

Anyway, OpenMCL has had a kind of proof-of-concept Cocoa-based development environment for a while now, but it’s starting to get some attention.

A demonstration version of the Clozure CL development environment for the PowerPC Macintosh is available as a double-clickable application. (There’s an x86-64 version, too, but it won’t be released until Leopard comes out.)

See the announcement.

Playing sounds from the command line

June 5th, 2007

On the NeXT machine, there was a command called sndplay that would play .snd files from the command line.

It’s not too tough to put together a similar one for Mac OS X. We can play many more kinds of sounds than the old NeXT sndplay command did. On the other hand, since NSSound uses QuickTime to play some media formats, a run loop is required for sounds to keep playing, so that requires a few gyrations.

sndplay.m

(See also this cocoa-dev message)

Command-line compiling of Cocoa code

May 30th, 2007

I find that tools like Xcode are often too heavyweight when trying out little fragments of code. In these cases, it can be simpler to use the traditional Unix tools to edit, compile, and run tiny test programs. (Of course, it might be simpler for me because I’m used to the traditional Unix way.)

For instance, say that you have created a category on NSMutableArray that adds a method to reverse the contents of the array, and you’re ready to test it out.

You could create an Xcode project for this (you’d use the Foundation Tool template), but it’s also possible to use a Unix text editor (like emacs or vi) to create the file that contains your category, together with a simple main() function. (Here is an example file.)

There are basically two tricks to know. The first is that you need to create an autorelease pool before you call any methods, or else you’ll see warning messages about leaking objects. The other trick is how to compile the file, and that is simply

cc file.m -framework Foundation

Now, just run a.out. Debug, edit source, re-compile, repeat.

The Pleasure of Interactivity

February 21st, 2007

What’s so great about Lisp? This question is frequently asked by people who wonder what such a weird-looking language could offer.

The usual response is often “macros.” As a one-liner, this is probably a fair answer. However, it’s not very enlightening, since the questioner isn’t going to have any idea what a Lisp macro is, or what it can do. Telling someone that penicillin is great because it is an antibiotic isn’t very useful if the questioner has no idea what bacteria are or what role they play in causing illness.

The thing *I* like best about using Lisp is that it’s interactive.

My current project is a Mac OS X application written in Objective-C. I am using Apple’s Xcode development environment, which contains numerous fancy features.

Yet, I still have an emacs running, talking to an inferior lisp via SLIME.

Sometimes I use the lisp as a calculator. Sometimes I’ll get a function working in Lisp and then re-write it in C for insertion into my application. Sometimes I’ll re-write some troublesome C function in Lisp and debug it from the lisp.

Having the ability to do this sort of work interactively and incrementally is a real pleasure. No recompiling files, no special debugger commands, just the whole Lisp environment at your fingertips all the time.

Probably prime numbers

February 9th, 2007

I was browsing through a copy of the New Turing Omnibus, and ran across the article on detecting primes. I grabbed an algorithms text, and implemented the Miller-Rabin primality test in Common Lisp.
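For readers who haven't seen the test, here is a minimal sketch of Miller-Rabin in C. It is not the Common Lisp version mentioned above, and it swaps in a simplification: instead of choosing random bases, it uses the fixed bases 2, 7, and 61, which are known to give a correct answer for every n below 4,759,123,141 and therefore for every 32-bit input. The function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute base^exp mod m by repeated squaring; 64-bit intermediates
   keep the 32-bit products from overflowing. */
static uint32_t pow_mod(uint32_t base, uint32_t exp, uint32_t m)
{
    uint64_t result = 1, b = base % m;

    assert(m > 1);
    while (exp > 0) {
        if (exp & 1)
            result = result * b % m;
        b = b * b % m;
        exp >>= 1;
    }
    return (uint32_t)result;
}

/* Miller-Rabin with fixed bases; returns 1 if n is prime, 0 otherwise. */
static int is_prime_u32(uint32_t n)
{
    static const uint32_t bases[] = { 2, 7, 61 };
    uint32_t d;
    int s = 0;
    size_t i;

    if (n < 2)
        return 0;
    if (n % 2 == 0)
        return n == 2;

    /* Write n - 1 as d * 2^s with d odd. */
    d = n - 1;
    while (d % 2 == 0) {
        d /= 2;
        s++;
    }

    for (i = 0; i < sizeof bases / sizeof bases[0]; i++) {
        uint32_t a = bases[i] % n;
        uint64_t x;
        int r, composite = 1;

        if (a == 0)              /* n divides the base, so n is 7 or 61 */
            continue;
        x = pow_mod(a, d, n);
        if (x == 1 || x == n - 1)
            continue;            /* this base does not witness compositeness */
        for (r = 1; r < s; r++) {
            x = x * x % n;
            if (x == n - 1) {
                composite = 0;
                break;
            }
        }
        if (composite)
            return 0;            /* base a proves that n is composite */
    }
    return 1;
}
```

With random bases the same loop gives the usual "probably prime" answer, where each extra round lowers the chance of being fooled by a composite. A Lisp version also gets bignums for free, so it is not limited to 32-bit inputs the way this sketch is.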

Determining the default route

February 5th, 2007

A subscriber on cocoa-dev@lists.apple.com was asking whether there was some way to programmatically determine the default route.

The cheap and sleazy way would be to read the output of netstat, but that’s not very aesthetically appealing.

Here is some sample code, derived from the source for netstat, which will find and print the IPv4 default route.

Getting disk insertion/removal notifications

February 5th, 2007

On Mac OS X, there is a daemon called diskarbitrationd that can notify interested clients of the appearance of disks and filesystems. Users talk to diskarbitrationd via the Disk Arbitration framework.

For some reason, the current developer documentation doesn’t have much to say about this framework. It therefore appears to be necessary to grovel through the headers to figure out how to use it. Fortunately, the headers are well-commented.

I was able to throw together a trivial program in about 15 or 20 minutes from first looking at DiskArbitration.h. Maybe this will help someone out there.

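#include <stdio.h>
#include <DiskArbitration/DiskArbitration.h>

/* Guard against a NULL name; passing NULL to %s is undefined behavior. */
static const char *bsd_name(DADiskRef disk)
{
    const char *name = DADiskGetBSDName(disk);
    return name ? name : "(no BSD name)";
}

/* Called when a disk appears. */
static void hello_disk(DADiskRef disk, void *context)
{
    printf("disk %s appeared\n", bsd_name(disk));
}

/* Called when a disk disappears. */
static void goodbye_disk(DADiskRef disk, void *context)
{
    printf("disk %s disappeared\n", bsd_name(disk));
}

int main(void)
{
    DASessionRef session;

    session = DASessionCreate(kCFAllocatorDefault);
    if (session == NULL) {
        fprintf(stderr, "couldn't create a Disk Arbitration session\n");
        return 1;
    }

    /* A NULL match dictionary means "tell me about every disk". */
    DARegisterDiskAppearedCallback(session, NULL, hello_disk, NULL);
    DARegisterDiskDisappearedCallback(session, NULL, goodbye_disk, NULL);

    DASessionScheduleWithRunLoop(session,
        CFRunLoopGetCurrent(), kCFRunLoopDefaultMode);

    /* Runs until the run loop is stopped; callbacks fire from here. */
    CFRunLoopRun();

    CFRelease(session);
    return 0;
}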