Calculating hierarchical animation transformation matrices

This is a short post with my thoughts on what’s the best way to design a transformation hierarchy for animation, and how the “ideal design” changes over time.

Bottom-up lazy evaluation magic

For a long time, I was a big fan of backwards (bottom-up) lazy evaluation of transformation hierarchies. Essentially having an XFormNode class for each node in the hierarchy with a get_matrix(long msec) function which calculates the current node’s transformation matrix for the current time, then calls parent->get_matrix(msec) and returns the concatenated matrix.

Matrix4x4 XFormNode::get_matrix(long msec) const
{
    Matrix4x4 xform;
    calc_matrix(&xform, msec);

    if(parent) {
        xform = parent->get_xform(msec) * xform;
    }
    return xform;
}

Of course, such a scheme would be wasteful if these matrices where calculated every time get_matrix functions are called. For instance if a node is part of a hierarchy, then its get_matrix will be called when we need to draw the object corresponding to this node, and also every time the get_matrix of any one of its descendants is called, due to the recursive bottom-up evaluation of matrices outlined previously. If one considers the posibility of drawing an object multiple times for various multi-pass algorithms the problem gets worse, with the limiting worse case scenario being if we’re doing ray-tracing which would require these functions to be called at the very least once per ray cast.

It follows then, that such a design goes hand in hand with lazy evaulation and caching of calculated node matrices. The XFormNode class would hold the last requested time and corresponding matrix, and when get_matrix is called, if the requested time matches the last one, we just return the cached matrix instead of recalculating it.

const Matrix4x4 &XFormNode::get_matrix(long msec) const
{
    if(msec == cached_msec) {
        return cached_matrix;
    }

    calc_matrix(&cached_matrix, msec);
    cached_msec = msec;

    if(parent) {
        cached_matrix = parent->get_xform(msec) * cached_matrix;
    }
    return cached_matrix;
}

This worked nicely for a long time, and it worked like magic. At any point I could just ask for the transform at time X and would get the matrix automatically, including any effects of hierarchy, keyframe interpolations, etc. It’s all good… until suddenly processors stopped getting faster any more, moore’s law went belly up, and after the shock passed we all sooner or later realised that single-threaded graphics programs are a thing of the past.

Multithreading pissed on my rug

In the brave new multithreaded world, lazy evaluation becomes a pain in the ass. The first knee-jerk reaction is to add a mutex in XFormNode, and lock the hell out of the cached matrices. And while that might be ok for an OpenGL program which won’t have more than a couple of threads working with the scene database at any point (since rendering itself can only be done safely from a single thread), it throws out of the window a lot of concurrency that can take place in a raytracer where at any point there could be 8 or more threads asking for the matrix of any arbitrary node.

A second way to deal with this issue is to have each thread keep its own copy of the matrix cache, keeping it in thread-specific memory. I’m shamed to admit I never got around to doing any actual performance comparisons on this, though I’ve used it for quite some time in my programs. In theory it avoids having to wait for any other thread to access the cache, so it should be faster in theory, but it needs a pthread_getspecific call in every get_matrix invocation which comes with its own overhead.

const Matrix4x4 &XFormNode::get_matrix(long msec) const
{
    MatrixCache *cache = pthread_getspecific(cache_key); // cache_key created in the constructor for each node
    if(!cache) {
        // first time we need a cache for this thread we'll have to create it
        cache = new MatrixCache;
        pthread_mutex_lock(&cache_list_lock);
        cache_list.push_back(cache);
        pthread_mutex_unlock(&cache_list_lock);
        pthread_setspecific(cache_key, cache);
    }

    if(msec == cache->msec) {
        return cache->matrix;
    }

    calc_matrix(&cache->matrix, msec);
    cache->msec = msec;

    if(parent) {
        cache->matrix = parent->get_xform(msec) * cache->matrix;
    }
    return cache->matrix;
}

This works fine, and although we managed to avoid blocking concurrent use of get_matrix, we had to add some amount of overhead for thread-specific storage calls, and the code became much more complex all over the place: invalidations must also access this thread-specific storage, we need cleanup for the per-thread MatrixCache objects, etc.

Return of the top-down evaluation

So nowadays I’m starting to lean more towards the simpler, less automagic design of top-down evaluation. It boils down to just going through the hierarchy once to calculate all the matrices recursively, then at any point we can just grab the previously calculated matrix of any node and use it.

void XFormNode::eval(long msec)
{
    calc_matrix(&matrix, msec);

    if(parent) {
        matrix = parent->matrix * matrix;
    }

    for(size_t i=0; i<children.size(); i++) {
        children[i]->eval(msec);
    }
}

The simplicity of this two-pass approach is hard to overlook, however it’s just not as good for some things as my original ideal method. It works fine for OpenGL programs where it suffices to calculate transformations once per frame, it even works fine for simple raytracers where we have again a single time value for any given frame. However it breaks down for ray-tracers doing distribution ray tracing for motion blur.

No rest for the wicked

The best way to add motion blur to a ray tracer is through a monte-carlo method invented by Cook, Porter and Carpenter, called “distribution ray tracing“. In short, when spawining primary rays we have to choose a random time in the interval centered around the frame time and extending to the past and future, as far as dictated by the shutter speed of the camera. This time is then used both to calculate the position and direction of the ray, which thus might differ between sub-pixels if the camera is moving, and to calculate the positions of the objects we’re testing for intersections against. Then if we cast many rays per pixel and average the results, we’ll get motion blur on anything that moves significantly while the virtual shutter is open (example from my old s-ray renderer).

It’s obvious that calculating matrices once per frame won’t cut it with advanced ray-tracers, so there’s no getting rid of the complexity of the lazy bottom-up scheme in that case. Admittedly, however, caching won’t do much for us either because every sub-pixel will request the matrix at a different time anyway, so we might as well just calculate matrices from scratch all the time, and skip the thread-specific access overhead. The jury is still out on that one.

Do you have a favourite design for hierarchical animation code? Feel free to share it by leaving a comment below!

OpenGL stereoscopic anaglyphs and patents

An anaglyph is a combination of two images into one, in such a way that they can later be separated by viewing the image through appropriately colored transparent filters. The objective is to present slightly shifted views of the same 3D environment to each eye, in order to achieve depth perception (i.e. really perceive the 3rd dimension).

anaglyph glasses

I’ve never dealt with anaglyphs in the past, but during my recent week-old obsession with stereoscopy, I’ve stumbled upon a pair of free anaglyph viewing glasses (made out of cardboard and cellophane of course). So I couldn’t help but try to find out how I can use them with my own programs.
Read the rest of this entry »

Raytracing Anamorphic Images

A long time ago, I stumbled upon a couple of strikingly odd images on Jim Arvo’s web site, which are apparently called “anamorphic”. The idea behind an anamorphic image, is that it’s distorted in such a way, that its true shape can be seen only when viewed in a particular manner. In the case of these images, you’re supposed to print the image and place a highly reflective cylindrical object, such as a chrome pipe, at a specific marked location in order to see the geometric shapes correctly.

I kept the images back then, with the purpose of either finding an appropriate cylindrical object, or raytracing them to see what they look like, but for some reason I’ve forgotten all about them until I accidentally found them again yesterday, in a dusty corner of my filesystem.

So I decided to hack some code to add perfect (x^2 + y^2 = r^2) cylinder primitives to my raytracer and do a couple of renderings with those images texture-mapped onto a ground quad (I could just do the same thing with another raytracer such as pov-ray but where’s the fun in that?).

So anyway here are the anamorphic images along with the renderings (click on the images for the full rendering):

Introductory OpenGL tutorials continued

Just a short notice, the second part of my “introduction to 3D graphics with OpenGL” series should be available as we speak. This time, we’ll perform the full set of transformations that we described while discussing the rendering pipeline in the previous issue. We’ll use the matrix stack to separate the model from the view parts of the modelview matrix, and render multiple objects properly. And finally we’re going to explain the mathematical model of shading and illumination, and we’ll apply lighting to our object in order to increase the realism of our simple 3D environment tremendously.

So, go and grab a copy of the november-december issue of the greek linux format magazine, and let me know what you think. As always I look forward to your comments, suggestions, corrections, etc.

By the way, due to popular demand, I will upload the first tutorial of the series in a couple of weeks, after the previous issue of linux format goes out of circulation.

Introductory OpenGL tutorials

I recently started writing a series of introductory tutorials about graphics programming with OpenGL, for the greek linux format magazine.

The articles are written for the complete begginer, who hasn’t had any previous exposure to graphics programming. However, familiarity with the C programming language is definitely required.

What I’m aiming for, is to thoroughly explain the underlying theory, in order to provide a stepping stone for someone who would like to eventually delve deeper into graphics algorithms, rather than just present raw examples for doing this and that with OpenGL.

In any case, the first article of the series will be published in the september-october issue of the greek linux format magazine, which should be available during the next few days. Any feedback, is greatly appreciated.

Follow

Get every new post delivered to your Inbox.