Striate Cortex: Advanced Pixel Manipulation


As with most Processing experiments, the Striate Cortex Series started with a small investigative sketch. Triggered by a book I was reading called Cutting Edges: Contemporary Collage, I set out to take the technique of collage into digital generative territory. The code I’ve written, which so far has led to two short films, actually consists of two separate sketches at the moment. Whether or not the result can be called collage, it’s definitely a unique and completely dynamic mixture of still images. In fact, when viewing the final films, one hardly notices that these are just moving pixels from a series of static images. All the images used in this project have been released under the Attribution 2.0 Generic Creative Commons licence. For full attribution see this document.

The big picture
In this post I’ll give a global description of the whole project and dive a little deeper into the code of the second, smaller app. Note that the forthcoming Processing 2.0 release will bring changes to the XML classes, so the required approach may change. Let me start with an overview of the steps from start to finished films, covering both apps. Steps 1 to 11 take place in the first app; the subsequent steps happen in the second app. And on a side note: in real life, that second stage happened a few months later. ;-)

Simplified steps
1. The user types one or more search terms and presses enter to start the search.
2. Based on these search terms and several other criteria (such as a Creative Commons license), the program searches the Flickr database through the Flickr API (a rough sketch of steps 2 to 4 follows this list).
3. The Flickr API returns an XML document with all the requested information.
4. The program reads the XML, interprets the information and checks the dimensions of the images.
5. 100 thumbnails are downloaded for the approved images (see screenshot above).
6. As the thumbnails are downloaded they appear in the main view.
7. The user can select these thumbnails and/or search using new search terms.
8. As thumbnails are selected they are sorted to the left and the high resolution images are downloaded.
9. When the high resolution image has been downloaded, pixels start moving in the background.
10. The user can influence the speed, direction and volatility of the pixel movement using mouse controls.
11. The user can output to an image sequence. Information regarding source images is saved to a text file.
12. The user chooses two image sequences and can set their respective offsets.
13. One image sequence runs in the background, one in front. The front sequence runs through a cartesian-to-polar transformation.
14. The radius of the front sequence is animated using trigonometry.
15. The transparency of both sequences is tweaked to give the final film a transcendental feeling.
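
To make steps 2 to 4 a little more concrete, here is a minimal sketch of the request-and-parse stage, written against the pre-2.0 XMLElement class. The search parameters, the YOUR_API_KEY placeholder and the searchFlickr() function are illustrative assumptions, not the app’s actual code.

// Hypothetical sketch of steps 2-4: query the Flickr REST API
// and parse the returned XML (pre-2.0 XMLElement class)

void searchFlickr(String terms) {
  String url = "http://api.flickr.com/services/rest/"
    + "?method=flickr.photos.search"
    + "&api_key=YOUR_API_KEY" // replace with a real Flickr API key
    + "&license=4"            // 4 = Attribution License on Flickr
    + "&per_page=100"
    + "&text=" + terms;
  XMLElement xml = new XMLElement(this, url);
  XMLElement[] photos = xml.getChildren("photos/photo");
  for (XMLElement photo : photos) {
    // build the 100px thumbnail URL from the photo's attributes
    String thumb = "http://farm" + photo.getStringAttribute("farm")
      + ".static.flickr.com/" + photo.getStringAttribute("server")
      + "/" + photo.getStringAttribute("id")
      + "_" + photo.getStringAttribute("secret") + "_t.jpg";
    println(thumb);
  }
}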

The first app
As the number of steps suggests, the bulk of the coding went into this app. An important characteristic of this sketch is that a lot of the external communication (XML requests, downloading images) is done using concurrent threads. This allows the main application to keep running without having to wait for a response from Flickr or for 100 thumbnails to finish downloading. Once the external communication is finished and the high resolution images are in memory, the application turns these static images into moving collages through manipulation of the pixel arrays. The main focus of my learning process for this app was in fact the communication with Flickr, which required the threading solutions mentioned.
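
As a minimal illustration of that threading approach, a download thread could look like the sketch below. The ThumbnailDownloader class and the thumbnails list are made-up names for this example; the app’s actual code is more elaborate.

// Hypothetical sketch of a concurrent thumbnail downloader; Java's
// Thread class keeps the draw() loop responsive while images load

ArrayList<PImage> thumbnails = new ArrayList<PImage>();

class ThumbnailDownloader extends Thread {
  String[] urls;

  ThumbnailDownloader(String[] urls) {
    this.urls = urls;
  }

  public void run() {
    for (String url : urls) {
      PImage thumb = loadImage(url);
      // synchronize, because draw() reads this list on the animation thread
      synchronized (thumbnails) {
        thumbnails.add(thumb);
      }
    }
  }
}

// usage: new ThumbnailDownloader(urls).start();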

The second app
Although smaller in scope and complexity, the second app gave the output that necessary creative push, making it visually interesting enough to release. This code actually came about while working on another topic from my (seemingly ever growing) to-do list: cartesian-to-polar transformation. While such transformations are not that hard, doing them within the pixel array can be a challenge. The reason is that the pixel array is inherently cartesian, so there has to be not only a transformation but also some kind of re-interpretation. Let me give working code examples of what I started with and then shortcut to where I ended up.

// Example 1: input-based cartesian-to-polar

PImage inputBased;

void setup() {
  PImage input = loadImage("input.jpg");
  size(input.width, input.height, P2D);
  inputBased = inputBasedPolar(input, 0.5, 0.25);
}

void draw() {
  image(inputBased, 0, 0);
}

PImage inputBasedPolar(PImage input, float factor, float density) {
  PImage output = createImage(input.width, input.height, RGB);
  input.loadPixels();
  output.loadPixels();
  // density is the sampling step; values below 1 oversample the
  // input to reduce gaps in the polar output
  for (float y=0; y<input.height; y+=density) {
    float r = y * factor;
    for (float x=0; x<input.width; x+=density) {
      // map x to an angle (reversed) and rotate by -HALF_PI
      // so the starting angle points up
      float q = map(x, output.width, 0, 0, TWO_PI) - HALF_PI;
      int polarX = int(r * cos(q)) + output.width/2;
      int polarY = int(r * sin(q)) + output.height/2;
      polarX = constrain(polarX, 0, output.width-1);
      polarY = constrain(polarY, 0, output.height-1);
      int outputIndex = polarX + polarY * output.width;
      int inputIndex = int(x) + int(y) * input.width;
      output.pixels[outputIndex] = input.pixels[inputIndex];
    }
  }
  output.updatePixels();
  return output;
}

I started with the input-based cartesian-to-polar transformation. As the name suggests, it starts from the input XY and via calculations ends up at the output XY. Although this code works, there is a major inefficiency at its core: the pixel density changes when moving from cartesian to polar, resulting in gaps (transparent pixels) in the output image. There are at least two solutions. The first is to increase the sampling density. Although this makes the output look good, it also increases the inefficiency of the program. The second solution is to use color interpolation to ‘fill the gaps’. I’ve implemented both solutions. However, as said, both are relatively resource-intensive due to their inefficiencies. So if speed is important (which it is when working with image sequences), one might look at an alternative option: output-based cartesian-to-polar transformation.
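
As a short aside before moving on: a naive version of that second, gap-filling solution could look like the sketch below. This is a minimal illustration with a hypothetical fillGaps() helper, not the exact implementation I used. It assumes the gaps still hold the initial pixel value of 0 that createImage() gives them.

// Hypothetical gap-filling pass: replaces marker pixels with a blend
// of their horizontal neighbors (a single naive pass; the real fix
// could also look at vertical neighbors or do proper interpolation)

PImage fillGaps(PImage img, int marker) {
  img.loadPixels();
  for (int y=0; y<img.height; y++) {
    for (int x=1; x<img.width-1; x++) {
      int i = x + y * img.width;
      if (img.pixels[i] == marker) {
        // blend the left and right neighbors to fill the gap
        img.pixels[i] = lerpColor(img.pixels[i-1], img.pixels[i+1], 0.5);
      }
    }
  }
  img.updatePixels();
  return img;
}

// usage: inputBased = fillGaps(inputBased, 0); // createImage() initializes pixels to 0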

// Example 2: output-based cartesian-to-polar

PImage outputBased;

void setup() {
  PImage input = loadImage("input.jpg");
  size(input.width, input.height, P2D);
  outputBased = outputBasedPolar(input, 0.5);
}

void draw() {
  image(outputBased, 0, 0);
}

PImage outputBasedPolar(PImage input, float factor) {
  PImage output = createImage(input.width, input.height, RGB);
  color black = color(0);
  input.loadPixels();
  output.loadPixels();
  for (int y=0; y<output.height; y++) {
    for (int x=0; x<output.width; x++) {
      // coordinates relative to the center of the output image
      int my = y - output.height/2;
      int mx = x - output.width/2;
      float angle = atan2(my, mx) - HALF_PI;
      float radius = sqrt(mx*mx + my*my) / factor;
      // map the polar coordinates back to a cartesian input position
      float ix = map(angle, -PI, PI, input.width, 0);
      float iy = map(radius, 0, output.height, 0, input.height);
      int inputIndex = int(ix) + int(iy) * input.width;
      int outputIndex = x + y * output.width;
      if (inputIndex <= input.pixels.length-1) {
        output.pixels[outputIndex] = input.pixels[inputIndex];
      } else {
        output.pixels[outputIndex] = black;
      }
    }
  }
  output.updatePixels();
  return output;
}

Output-based cartesian-to-polar transformation works the other way around: it starts from the output XY and via calculations ends up at the input XY. This is much more efficient, since the calculation is done for exactly the number of pixels that you need. No more, no less. Not only is it more efficient (read: faster), it also looks better, since every output pixel gets a color and no gaps appear. So it is truly a win-win option. To make it really lightning fast, one can pre-calculate a lot of the necessary math and store it in so-called lookup tables (LUTs). When you apply this technique to my code, the whole cartesian-to-polar transformation can be done in real time with no noticeable effect on the framerate. Awesome! To be more specific: the inputBased code runs below 5 fps, the regular outputBased around 20 fps and the outputBasedLUT above 200 fps. I think those numbers speak for themselves. The price you pay for pre-computation is of course a loss of flexibility (there are some workarounds to counter this though; see the sketch after Example 3). So which route you take depends on your needs. But it’s always good to have different options, right?

// Example 3: output-based cartesian-to-polar using LUTs

PImage input, outputBasedLUT;
int[][] LUT;

void setup() {
  input = loadImage("input.jpg");
  size(input.width, input.height, P2D);
  calculateLUT(0.5, input.width, input.height);
}

void draw() {
  // lookup tables are so fast that - when using them -
  // the pixel manipulation can be run in realtime without
  // any noticeable cost to the sketch's frameRate
  outputBasedLUT = outputBasedLUT(input);
  println(int(frameRate));
  image(outputBasedLUT, 0, 0);
}

void calculateLUT(float factor, int w, int h) {
  LUT = new int[w][h];
  int pL = w * h - 1; // index of the last pixel in the input image
  for (int y=0; y<h; y++) {
    for (int x=0; x<w; x++) {
      // coordinates relative to the center of the output image
      int my = y - h/2;
      int mx = x - w/2;
      float angle = atan2(my, mx) - HALF_PI;
      float radius = sqrt(mx*mx + my*my) / factor;
      float ix = map(angle, -PI, PI, w, 0);
      float iy = map(radius, 0, h, 0, h);
      int inputIndex = int(ix) + int(iy) * w;
      if (inputIndex <= pL) {
        LUT[x][y] = inputIndex;
      } else {
        LUT[x][y] = -1; // marker: no valid input pixel for this output pixel
      }
    }
  }
}

PImage outputBasedLUT(PImage input) {
  PImage output = createImage(input.width, input.height, RGB);
  color black = color(0);
  input.loadPixels();
  output.loadPixels();
  for (int y=0; y<output.height; y++) {
    for (int x=0; x<output.width; x++) {
      int outputIndex = x + y * output.width;
      // all the math was done once in calculateLUT(); here we only copy pixels
      int inputIndex = LUT[x][y];
      if (inputIndex == -1) {
        output.pixels[outputIndex] = black;
      } else {
        output.pixels[outputIndex] = input.pixels[inputIndex];
      }
    }
  }
  output.updatePixels();
  return output;
}
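
As for those flexibility workarounds: one simple option (an illustrative assumption, not necessarily what my code does) is to recompute the LUT only at the moment the parameter actually changes, for example on a key press, instead of every frame.

// Hypothetical flexibility workaround: the LUT only depends on the
// factor, so recompute it on demand instead of once and never again

float currentFactor = 0.5;

void keyPressed() {
  if (key == '+') {
    currentFactor += 0.1;
    calculateLUT(currentFactor, input.width, input.height);
  } else if (key == '-') {
    currentFactor = max(0.1, currentFactor - 0.1);
    calculateLUT(currentFactor, input.width, input.height);
  }
}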

The future
I have a few projects lined up that I intend to finish and release. But once I have time to return to this one, there are a few things I’d like to work on. Improve the pixel manipulations of the first app (they are highly unoptimized right now). Clean up the code and add some features, such as including the search terms in the text file (don’t know how I missed that one). And most importantly, see if it’s possible to integrate all of the code from these two apps into a single application that is capable of performing all the steps from start to finish efficiently (and is perhaps suitable for public release). Note that these two videos were basically made from test footage while writing and playing around with code. Now that I have a clearer view of where to end up, optimizing the code may be easier. I hope this blog post gave you a bit of insight into the making of these two videos. See you around!

For high resolution screenshots from the Striate Cortex Series check out the Flickr set.

Comments
6 Responses to “Striate Cortex: Advanced Pixel Manipulation”
  1. jaygo says:

    Thanks Amnon! Please post more! I have only recently started learning P5 and I love this approach. With Processing 2, how does this improve your pixel manipulation?

    • Amnon says:

      Thanks jaygo.

      Processing 2.0 will most likely not improve the speed of pixel manipulations directly. The speed improvements in 2.0 will come from the inclusion of a new OPENGL renderer based on the GLGraphics library, so it’s mainly the 3D geometry stuff that will benefit in an easy way.

      However, I do know that the current GLGraphics library facilitates the use of GL shaders. Of course, working with fragment/pixel shaders would be a great opportunity to take pixel manipulations like these to the next level, since those shaders run on the GPU, making them lightning-fast.

      Unfortunately I will have to learn GLSL first, which is the C-like shading language that these shaders are written in. I want to and I probably will, but it’ll take time. One more thing on the list. ;D

      • grag (@grag) says:

        Shaders can let you do the pixel manipulation code on the GPU, right? If you pass the values as a uniform texture?

        This looks lovely though. Did you render to video, or can it do this in real time?

  2. Amnon says:

    You are right. With shaders all the pixel manipulations are so much faster and you could use multiple input textures. As described in the blog post, the making of this was done in several stages, so it’s neither one program nor realtime. If I were aiming for a realtime program like this today, I would try to implement it in shaders as you said.

  3. deano says:

    These are beautiful! I will take a look at the code soon.
    And thanks for helping out at P5 forums.
    d.
