Concurrency in newLISP

Threaded applications are considered to be so difficult to implement properly that programmers are often encouraged to avoid them unless absolutely necessary. This is because of the difficulty in synchronizing data between threads. There are several techniques to accomplish this. Typical solutions involve sharing an area of memory or a pipe (a channel that maps one process’ input channel to another’s output channel) in order to pass state between the two threads. The difficulty with this solution is that trouble can arise if two threads attempt to write to that same area of shared memory at the same time. One solution to this is to use a semaphore to signal threads that it is safe to write to shared memory. Other solutions involve locking the memory with a mutex lock (mutually exclusive lock).

On unix-based systems, newLISP uses the operating system’s fork to create a child process. State may be shared between two processes using either pipes or semaphores and shared memory. It is slightly faster to use shared memory and semaphores, but more difficult to track and coordinate. The process of forking the evaluation of an expression and accessing the result can be made easier with a couple functions to handle the messy business of coordinating access to shared memory. The first, which we will call spawn (since I like Erlang), will fork a child process that will evaluate an expression, allocate a shared memory address, and create a semaphore to signal the process.

(define-macro (spawn)
  (letn ((expr (args 0))
         (sem (semaphore))
         (res (share))
         (pid (fork (begin
                      (let ((res-val (eval expr)))
                        (semaphore sem -1)
                        (share res (source 'res-val))
                        (exit))))))
    (list pid sem res)))

Notice that block that the function is actually forking. After evaluating the expression passed to spawn, it blocks by setting the semaphore to -1. A positive value will cause it to unblock and evaluate (share res (source ‘res-val)). (source) serializes the value of ‘res-val into a string (which will actually assign res-val when it is evaluated). It then exits. The function itself returns a list of the pid of the forked process, the semaphore and the shared memory address.

The spawn process can be used like so: (setq proc (spawn (+ 3 6)))

Then we need a function that will read in the result from the spawned process. It must block until the process is complete, and then attempt to read in the value from the shared memory. Since we are using fork (which is unix-only, sorry!), we may as well clean up the shared memory afterward (which also only works on unix).

(define (receive proc , res-val serialized-result)
  (let ((pid (proc 0)) (sem (proc 1)) (res (proc 2)))
       (semaphore sem 1)
       (wait-pid pid)
       (setq serialized-result (share res))
       (share nil res) ; release shared memory
       (semaphore sem 0) ; release semaphore
       (if (catch (eval-string serialized-result) 'res-val)
           res-val 'invalid-result)))

Here, we pass receive the result of the spawn function. The value returned will be either the result of evaluating the expression passed to spawn or the symbol, ‘invalid-result. This function first unblocks the spawned process, allowing it to store its result in shared memory, blocking until it completes. It access the shared memory and then releases the shared memory address and semaphore.

The last statement is slightly tricky (if you are unfamiliar with newLISP error handling). It attempts to evaluate the serialized result pulled from shared memory. In the case of our example above, (setq proc (spawn (+ 3 6))), this would expand to something like (set 'res-val 9). Remember that our statement above used res-val to store the result for serialization. To avoid side effects, we declare res-val locally in our arguments (placing local variables after a comma in a lambda list is a newLISP convention to create local, nil variables; it is no different than declaring them as nil in a let statement or using the function local) and catch the result in it. This is a bit redundant. It would expand like, (catch (eval-string (set 'res-val 9)) 'res-val), but the effect is the same and the overhead of an extra set is negligible.

These functions do not provide the power of Erlang’s concurrent syntax (especially since they use OS threads, which are inefficient, although this is mitigated somewhat by the size and speed of newLISP). Instead, they achieve the desired result of simplifying the spawning of a child process (non-blocking) and accessing the result. For many purposes, the complexity of formulating expressions in a concurrent fashion outweighs the gains (which sometimes are not so great) of concurrent programming. By simplifying and automating the process, we negate that particular difficulty and can use the method that provides the most efficiency.

Well, on unix, anyway.

Leave a comment | Trackback
Sep 18th, 2007 | Posted in Programming
No comments yet.