Tuesday, July 24, 2012

Simple multithreading using clone()

For a bit of hacking I want to do, I need two concurrent processes sharing the same memory, so I looked into how to do it (this is all under Linux). My first stab was the old SYSV shared memory interface, but that was kind of ugly; then I looked at pthreads and thought the same. Isn't there a simple way to do fork() without splitting the memory?

Turns out, Linux does have such a mechanism, in the clone() function call. However, there are some pitfalls that one needs to be aware of. Let's see the code first, though.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <linux/sched.h>

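Pretty standard stuff, but note the last one: I picked it rather than <sched.h> (which the man page names), since with a plain #include <sched.h> you don't get the necessary constants defined! The man page's header does provide them, along with a prototype for clone() itself, but only if _GNU_SOURCE is defined before the first #include. Note that <linux/sched.h> only carries the constants; it does not declare clone().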
Next, we need some memory in global space:
#define STACKSIZE 256
int globalint;
char stack[STACKSIZE];
The lone int is the only memory that I'm using to communicate between the threads for now. The stack, however, is used only by the child thread/process. It is visible to the parent, but it's not easy (or reliable) to use it to communicate with the child. Now let's define the code for the child process.
int secondp(void *arg)  /* clone() expects an int (*)(void *); arg is unused here */
{
  int i;

  for (i=0; i<1000; i++) {
    globalint = i;
    usleep(60000);
  }
  return 0;
}
Simple enough: every 60 ms, set the global variable to the value of the local one being incremented. Note I am NOT "incrementing" the global, since that would be a read-modify-write operation, which gets computer science people all excited in multiprogramming situations! (Or it did, decades ago.)
Now let's have the main program.
int main()
{
  int i, j, sppid;

  for (i=0; i<STACKSIZE; i++) stack[i] = 0x55;
I initialize the stack so I can observe what happened to it during execution, filling it with a simple 010101... pattern. Next, the interesting bit.
  sppid = clone( secondp, &stack[STACKSIZE], CLONE_VM, NULL );
  if (sppid==-1) {
    perror("clone");  /* report why clone() failed */
    exit(1);
  }
  printf("clone pid 0x%08x\n", sppid );
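The first line creates the child process, given the pointer to the function as the first argument. The second argument is what that process gets as a stack - but note that the pointer points to the TOP of the stack (here, the address just past the end of the array)! That is because the stack grows downward, towards lower addresses. This is not x86 specific: according to the clone() man page, stacks grow downward on every processor Linux runs on except HP PA-RISC, so ARM and amd64 behave the same way.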
The third argument, the options, is what creates the magic to make this shared memory scheme work. Without CLONE_VM, clone() behaves more like fork(), giving the child a copy (-on-write) of the parent memory space, which is exactly what I don't want. Other options allow you to copy or share specific elements, like the file I/O table etc. Read the documentation. Lastly, the fourth argument is passed to the child function as a void * - useful in many instances, but not used here, hence the NULL.
I finish up the program with code that actually demonstrates that things are happening as expected:
  for (i=0; i<10; i++) {
    printf("%d\n", globalint);
    fflush(NULL);
    usleep(1000000);
  }

  for (i=0; i<16; i++) {
    printf( "%04x :", i<<4 );
    for (j=0; j<16; j++) {
      printf( " %02x", stack[j+(i<<4)]&(0xff) );
    }
    printf( "\n" );
  }

  return 0;
}
So, for 10 seconds, the value of the global integer is printed; after that, I dump the stack in a hexdump fashion. This is what I get on my Ubuntu Netbook:
clone pid 0x000005e3
0
16
33
49
66
83
99
116
132
149
0000 : 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55
0010 : 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55
0020 : 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55
0030 : 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55
0040 : 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55
0050 : 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55
0060 : 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55 55
0070 : 55 55 55 55 55 55 55 55 55 55 55 55 48 f2 0e 08
0080 : e0 8e 04 08 00 00 00 00 6f 58 08 08 cd 84 05 08
0090 : 08 f2 0e 08 00 00 00 00 55 55 55 55 55 55 55 55
00a0 : 55 55 55 55 55 55 55 55 00 00 00 00 00 87 93 03
00b0 : 55 55 55 55 55 55 55 55 55 55 55 55 03 8f 04 08
00c0 : 60 ea 00 00 55 55 55 55 55 55 55 55 55 55 55 55
00d0 : 55 55 55 55 55 55 55 55 55 55 55 55 a6 00 00 00
00e0 : 55 55 55 55 00 01 00 00 00 00 00 00 2e 96 05 08
00f0 : 00 00 00 00 55 55 55 55 55 55 55 55 55 55 55 55
Clearly, the global int is modified by the child process while being read by the parent. The stack display is interesting, showing clearly how it's being filled towards lower memory: the lowest byte written is at offset 0x7c, i.e. 132 bytes below the top. An x86 expert could probably explain exactly what is there and why the stack is not written to contiguously (presumably, some of it is allocated but never written to).
Thus, the three main things to keep in mind when using clone() are:

  • Make sure you use the right sched.h file.  (You'll notice this at compile time)
  • Choose the proper options to clone()
  • Make sure you know which way the stack grows, and how big it'll get!  If you get this wrong, it'll clobber the parent variables - if you use malloc() instead, you'll more likely get a memory fault (which is preferable since it's slightly easier to diagnose).
Enjoy!