OpenGL video editing hack

video post result shot
Hello, just a quick hack report cause I really liked how useful it turned out to be.

It all started when I located a small program I wrote, for an interesting coursework assignment, back when I did my graphics MSc at the University of Hull. I wanted to upload a video capture to youtube to show 4rknova who’s going through the same MSc course right now.

So I did capture the video, and saved it as an image sequence for further editing, because I wanted to add titles at the bottom describing what was demonstrated at each part of the video (the program was basically a sequence of arbitrary shader effects).

But how was I supposed to add the captions? The thought of wrestling with one of those fucking GUI video editing programs made me cringe. They are all slughish, heavy and unweildy, and I always have to fight for a few hours to do even simple things. I was more inclined to use ffmpeg from the command line, but then adding transitions to the captions, like having them fade and slide in from below would be a complete pain in the ass. Btw take a look at the final video to understand what I was trying to achieve with the caption transitions. Should be really simple right?

But then it dawned … hey, I could easily write that transition in a couple lines of C/OpenGL code instead of fighting with all those video editing programs! I started by writing a simple program that iterates over an input image sequence, opening each one in turn and then feeding them one by one to a dlopened plugin which could do whatever it likes with the frame. When that plugin processing function returns, I just dump the image back to the disk. It’s that simple!

Then the plugin was almost trivial as well. I just drop the frame into the OpenGL framebuffer and use my new text rendering library to draw the captions at the appropriate times with the appropriate alpha and position. The timing was derived by a simple event script file containing the frame numbers where each part starts and ends. I fed the script into my new event sequencing (demosystem) library which gives me back a nice linearly increasing [0, 1] value for each event (part) during the time when it is active according to the script. That makes it a piece of cake to fiddle with trig and some factors here and there to transition my captions just the way I wanted.

Here’s the code in case you want to play around with it:

I’m impressed, it’s fucking awesome and powerful to edit videos like that, and I’m definitely going to use it again.

Stereoscopic fun on iOS

[Edit: this app is now available on the appstore, and has a dedicated web page]

The fun never stops with stereoscopic rendering. I posted previously about my earlier attempts with anaglyphs and shutter glasses, and all that was really fun, but not without drawbacks. Shutter glasses are awesome, but the only computer I have with a stereo output is an old SGI workstation, which isn’t up to the task to render modern 3D graphics, and doesn’t event give me stereo OpenGL visuals and a depth buffer at the same time. Anaglyph glasses are cheap and work everywhere, but they mess up the colors and they have a serious problem with ghosting, ruining the stereoscopic effect.

My3D ipod stereo tunnel
So, it was with great enthousiasm that I learned there’s a cheap and simple stereoscopic viewing contraption for the iphone produced by hasbro. It’s sortof like a viewmaster, only instead of cardboard reel with stereoscopic pictures, it has a place to attach an iphone or ipod touch on the back of it, using it as the source of the stereoscopic image presented to the user. What needs to be done iphone-side is simple enough. Just make it display a stereo pair side by side in a split-screen. The only drawback of this approach, is that since the iphone display is split in half, the achievable aspect ratio is slightly less than 1 which has an impact on immersion, making the perception more like looking through a squarish window into the 3D world rather than being surrounded by it. Still very impressive for a 28 dollar plastic widget.

Buying this apparatus gave me the final push I needed to get onto iOS programming. I find Objective-C unspeakably ugly and the Apple APIs needlessly convoluted, which was why I kept pushing this back, but I really wanted to see my code in glorious stereoscopic … glory, so I bit the bullet and ported over the stereoscopic tunnel program I’ve written originally for the SGI when I bought the shutter glasses.

The result is awesome; full stereo 3d without color degradation on modern programmable graphics hardware. Unfortunately one has to use the crippled version of OpenGL that’s become so popular on mobile devices lately: OpenGL ES 2.0 (see webgl post for my rant on that issue), but it was easy enough to make a wrapper that brings back immediate mode and the matrix stack.

In case you’d like to play around with the code, here’s a tarball. Feel free to use it under the terms of the GPLv3. It includes an Xcode project that compiles it for the iphone and a makefile for normal systems. If you run the program on your iphone tap anywhere on the screen to go to the options GUI to enable stereo rendering or change between the simple and the normal-mapped tunnel (keys s and b on the PC version).

WebGL hacks

webgl julia quaternion raytracerThat’s it, I finally found something fun in web development! I never thought I’d live to see the day when I would feel the motivation to learn javascript, but here we are. WebGL is fun, because you can do all the things you could with regular OpenGL, but now you can send URLs to all your friends to show off. I can’t say I liked javascript really, but I guess it’s passable as long as you can avoid the horrible conventions people have established for pretending to write object-oriented code with it.

So what I did, after experimenting to see how WebGL and javascript programming works, is a port of a GPU-raytracer for 4D quaternion Julia fractals, and a simple 360-panorama viewer. You can find those on my webgl hacks page I put up yesterday.

About WebGL itself now, I’m really disappointed they chose to base it on OpenGL ES 2.0, which is the bastard child of a slashed down OpenGL subset initially spec’ed for fixed point embedded devices, and Khronos’ OpenGL >= 3 d3d10-buttlicking madness. I understand why they chose that, because they intend to have WebGL easily implementable on mobile phones and tablets, but I’m still disappointed.

For those of you not well versed in the differences between the various OpenGL versions that suddenly crept up when Khronos group took control of OpenGL and apparently surrendered it over to inmates of the nearest insane asylum, I’ll give you a short overview of what sucks in OpenGL >= 3.x, OpenGL ES 2.0, and WebGL:

  • No fixed function pipeline. Yeah I know shaders are awesome, I love them too. But it’s convenient to be able to put a goddamn texture on a quad without having to write a bloody shader for it. OpenGL is not just used for video games you know.
  • No immediate mode (glBegin). Again, yes immediate mode is slow if you use it to draw multimillion vertex meshes, but having to make a vertex buffer for a quad representing a button in a GUI or a simple overlay, is insanity.
  • No matrix stack. Obviously when I’m writing a full 3D engine, with hierarchical keyframe animation, I have to ignore the matrix stack and write my own quaternion/matrix code. But for everything else, the OpenGL matrix functions are unbelievably useful.
  • No GL_QUADS.

So anyway, while I was playing around with it these past few days, I had to bring back a little bit of sanity to WebGL. For that reason I wrote SaneGL, a small piece of code that implements immediate mode drawing, and the OpenGL matrix stack on top of WebGL. I bundled that along with a small matrix math library and some helper functions for WebGL programs in a project called webgl-tools, which you can find in my mercurial repository:

Oh by the way, if you’re one of those misguided sods that keep using windows, and you try to run any webgl apps right now you will probably be disappointed. In an unprecedented inspiration of pure stupidity, both firefox4 and chrome chose to implement WebGL over Direct3D by default on windows, using a project called ANGLE. The reason for that, they say, is that most graphics card vendors provide buggy OpenGL implementations on windows, so apparently it makes sense to write an even more buggy OpenGL->D3D translator and use that.

Initially I thought that ANGLE fails to translate huge shaders such as the one on my fractal raytracer, but in fact it seems to fail on pretty much everything, complex or trivial. The only way for windows users to use WebGL at the moment until mozilla and google comes to their senses and make ANGLE a fallback for known buggy OpenGL implementations instead of the default choice, is to go and force the browsers to use OpenGL instead. On firefox you can do that by setting the about:config variable “webgl.prefer-native-gl” to true, while chrome requires the command-line argument: –use-gl=desktop.

On GNU/Linux, as long as you have an nvidia card everything should be peachy from the get-go. Other cards are apparently blacklisted by firefox, so you’ll have to set webgl.force-enable to true, and pray to Odin.

Escaping glutMainLoop

Let’s say you’re writing a distinctly glut-like window-system abstraction library for OpenGL context creation, event handling, etc. For those not familliar with the way one uses OpenGL to draw graphics, what happens is you talk to the native window system (X11, Win32 API, etc) to create a window and process events, then you create an OpenGL context and you bind it to that window using again platform-specific calls (GLX, WGL, etc).

So let’s say you’re writing that code, but you decided your library will allow the user to keep control of the main loop, so you provide a funcion called something like process_events to run a single iteration of your event processing, so that the user may call it in a loop. How do you implement that on top of glut, which has a single glutMainLoop function that doesn’t ever return?

By the way, for those qurious on why would you do that in the first place, the reason to write a glut backend for this library, would be as a catch-all fallback to be able to run on platforms for which no native backend is yet written.

On GNU/Linux systems generally we don’t have the original GLUT, but rather FreeGLUT, which is nice enough to provide a glutMainLoopEvent function which runs a single iteration of the event loop, so we just call that from process_events and we’re done. But I have actually written an X11/GLX backend for my library, so I don’t need GLUT there, I need it on other systems. So how to break the chains of glutMainLoop and return after each iteration of the event handling loop?

The solution is obvious, use setjmp/longjmp. In process_events we call setjmp which obviously returns 0 the first time around, in which case glutMainLoop is called. Now glut enters its infinite loop and waits for events from the window system. As soon as all pending events are processed, or if there are no events to be processed, it calls our idle callback, then when that returns it loops back to the top again, and again, and again.

Of course we set up an idle callback that doesn’t actually return. It calls the user’s idle callback if there’s one registered, and then calls longjmp which unwinds the stack until we end up back into process_events at which point setjmp returns non-zero and we return execution to the user.

One minor issue that needs to be addressed is that since we set an idle function, if the user didn’t set one with our library, we’re wasting cpu cycles busy-looping because GLUT will never block waiting for events when there’s an idle callback. This again is easily remedied. If the user didn’t register an idle function with us, we don’t actually set our idle function to GLUT a-priori, instead we wait for one of the other event callbacks to trigger, and at the end of those callbacks we set the idle callback, and remove it again when it gets called, before longjmping back to the user.

Here are a few snippets of the actual code demonstrating the above:
Read the rest of this entry »

Linuxtrack 6dof headtracking for wine games

linuxtrack-wine logoSince I was a little kid, I always loved airplanes. When I became a little older, mainly during the 90s I used to play a lot of flight simulators on my computer, I even had a set of decent flight controls (stick/throttle), but for some reason I dropped that hobby for many years. Until I very recently picked it up again.

One really important thing that changed during my abstinence from flight simulators, a huge change that transformed the whole experience, was the almost universal adoption of 6dof headtracking for looking around as you fly!

Now people are able, with simple intuitive movements of their head to be able to look outside as they fly above that beautiful lake, “check six” to effectively maneuver to avoid an enemy plane in a dogfight, or follow the runway with their own eyes as the airplane turns slowly into final approach to line up perfectly for landing! Even better, since 6dof headtracking includes translation as well as rotation, the user can look around an obstacle blocking the view, to see for instance a pesky instrumment in the panel that’s partly hidden behind the stick, or a plane in formation which happens to fly just where the canopy frame happens to have a metal support bar. Just moving the head a bit to the left or the right does the trick… Unbelivable!

Instrumental for the universal adoption of 6dof headtracking among flight simulator users and developers, is a company called NaturalPoint who sells a complete head-tracking system called TrackIR, that includes an infrared high framerate camera, markers that the user attaches to their heads, and supplies an API to game developers to access their headtracking data easily. Now that set doesn’t come cheap, so there’s the necessary free alternative out there, that works with a simple (or even better modified) webcam, called freetrack. The main problem with both of those as you might have guessed, is that they only work on windows.

After the first dissapointment, I obviously had to have that functionality, so I decided to start hacking my old 3dof headtracking experiment to make it 6dof and connect it somehow with games running through wine. However, while I was researching how to do that, I stumbled upon the linux-track project, which does exactly what I needed, but it only worked with a native GNU/Linux flight simulator called x-plane.

So, with only a small piece of the puzzle missing, I went on and wrote a program that emulates the TrackIR API which is supported by many windows games, but feeds them data from linuxtrack instead. Currently I’m happily playing IL-2 Sturmovik and Falcon4 AF through wine, with full head-tracking support, enjoying the virtual view from my cockpit.

This new project of mine is called linuxtrack-wine and is available under GNU GPLv3.

I also had to do a hardware hack, to convert my old flight controllers from gameport to USB, but that’s much less interesting, and I’m too lazy to write about it right now :)

Wacom Hacking

Recently, I acquired a rather old Wacom Intuos A5 tablet, for reasons I won’t delve into at the moment.

Wacom Intuos A5 Serial

It’s an old serial model, and although I don’t have a serial port anymore, I do have a USB-to-serial adapter, which thought would probably work fine with the driver produced by the linuxwacom project, since they do mention support for serial tablets on their website.

I plugged it in, followed the linuxwacom documentation about setting everything up in xorg.conf, and fired up the X server. Nothing happened… So I started digging in the wacom driver source.

First Obstacle: Prolific PL2303 USB Serial Port

The first problem I encountered was with my USB-serial adaptor. Apparently, the wacom driver used a TCIOCGSERIAL ioctl, which is supposed to return information on a serial device, to determine if it’s talking to a serial or a USB device. If the ioctl fails, it thinks it’s not talking to a serial device, and starts talking USB HID to the tablet.

Unfortunately, the pl2303 driver in the linux kernel (as of version that I tested) does not implement the aforementioned ioctl, even though it’s supposed to present a proper serial port to the system. So I had to hack the kernel, and add that ioctl in the pl2303 driver (patch against

(edit: my pl2303 patch got included in linux 2.6.34, so you don’t need the patch any more)

Sadly however, although now the wacom driver realized that it was dealing with a serial device, the tablet still didn’t work…

Second Obstacle: The linuxwacom Driver

After a lot of digging around the source of multiple wacom driver versions, it turns out the guys at the linuxwacom project removed the code for old serial tablets from the driver sometime last year!

I tried installing a few older versions of the driver (anything prior to 0.9.1 had the serial code), but they would not compile with the much newer headers in my system. Apparently some changes with the XInput extension broke the old driver. So I started hacking again.

First I tried back-porting the XInput changes from the newer wacom driver to the old 0.8.4 version, but although I managed to compile it, it would segfault upon being loaded by the X server. Evidently I screwed something up during the backport, which I could have probably fixed eventually, but I realized that it’s a bad idea to get stuck to the old wacom driver.

So instead I started from scratch in the opposite direction, re-integrating the old serial code from 0.8.4 to the most recent 0.10.4 wacom driver.


And indeed after a few hours of hacking, I managed to get my serial Wacom Intuos to work perfectly! Here’s the patch in case anyone needs it.

I must say it’s an impressive input device, pressure-sensitivity, tilt sensing, everything works perfectly. The only problem is that I still can’t draw any more than I could on paper :)

Stereoscopic OpenGL part2

Me with my shutter glassesMy obsession with stereoscopic rendering continues unabated. It’s just so fucking cool to write a bit of code and have 3D objects pop outside of your monitor and float above your keyboard.

Fact is, I couldn’t settle with my crummy anaglyph glasses see previous post. I had to try out proper shutter glasses and quad-buffer OpenGL visuals.

Thanks to nvidia’s policy of supporting quad-buffer visuals and stereo ports only on expensive Quadro graphics boards, and the proliferation of flat panels which are entirely unsuitable for use with shutter glasses due to ridiculously low refresh rates, I couldn’t do that with my PC. On the other hand, my trusty Silicon Graphics Octane2 workstation was more than up to the task as it comes with a stereo synchronization port and quad-buffered OpenGL support out of the box.

So off I go to ebay, where I bought the cheapest lcd shutter glasses I could find, the ASUS VR100 glasses which came once upon a time bundled with some expensive ASUS TNT2 graphics cards as a high-end gimmick.

SGI to ASUS VR100 adaptor circuitConnecting these glasses to SGI workstations has been done before and it was a piece of cake to follow that guy’s schematic and construct the necessary circuit to translate the signals from the SGI stereo port to those required by the shutter glasses.

The only problem I’ve had, is that my Octane2 has the low-end V6 graphics option, which apparently doesn’t provide a z-buffer when using stereo visuals.

3D tunnelNow I didn’t feel like z-sorting all polygons like the good old days when z-buffering was too expensive to use on underpowered PCs while doing software polygon rendering, so I tried to figure out a couple of graphics hacks that I could do which wouldn’t require a z-buffer to look right. So I came up with this swirling tunnel, and a simple wireframe teapot.


Get every new post delivered to your Inbox.