Feb 13, 2010

Belousov-Zhabotinsky reaction with hexagonal cellular automaton on a GPU

1. Introduction

This is going to be a description of a CA resembling (to some extent) the behavior of real-life Belousov-Zhabotinsky chemical reaction. To give it more isomorphic properties, we will model it on a hexagonal grid, thus naming the whole thing as BZ HEX.

You might want to read this first to get the general idea how a shader program is organized, and some basics of ARB assembly language.

2. CA description

The rules of BZ HEX cellular automaton are logical extension of the rules of Brain's Brain CA: instead of having one passive, one semi-active and one active state, we will have, say, 11 states. Let's pick arbitrary integers N & K: 0<=N<=10, 0<K<=6, and then define BZ HEX CA rules as follows:

  • The lattice has hexagonal structure, i.e. every cell has 6 neighbors

  • Every cell has a state 0..10, and state 10 is called active

  • States 0..(N-1) are called passive, and states N..9 are called semi-active

  • If a cell is active, it remains active on the next step

  • If a cell is passive, it increases its value by one on the next step

  • If a cell is not passive and >=K active neighbors, it becomes passive with initial value of 0, else it increases its value by one

One can vary N and K (as well as total number of allowed states), but for this article we'll use N=4, K=2, 10-state CA.

Note

This is not a canonical representation of rules describing so called "excitable media", but they result in pretty much the same behavior when simulating. To give you a general idea of more conventional model, I'll use an example from the book by Toffoli and Margolus.

Imagine a colony of tubeworms, where each worm sits in a lattice cell. A tubeworm is active, when it has extended its tentacles and searches for food. When number of active worms around any given worm exceeds some limit, that worm pulls his tentacles back it and goes into passive state. Nothing can bring it back from the passive state, except its internal timer.

Authors then proceed with various criteria for passive-active switch, and draw parallels with famous Belousov-Zhabotinsky chemical reaction.

3. Moore neighborhood on a hexagonal grid

The CA we are going to use employs Moore neighborhood trivially tuned for a hexagonal grid - every hexagon has 6 neighbors (as opposed to 8 neighbors for every rectangle on a rectangular grid).

Now, we need to decide how to map our hexagonal lattice to rectangular pixels of a texture used by GPU. In principle, it maps 1:1 to the rectangular grid with some weird neighboring rules:

Figure 2. 1:1 hexagonal to rectangular grid mapping


As you can see, calculating coordinates of neighbors is different for odd and even lines. Also, such mapping will give a somewhat distorted image on the screen (i.e. compound hexes won't look like hexes). Thus, let's employ a different technique, which gives more convenient image, but effectively reduces the number of simulated cells by a factor of 2:

Figure 3. Alternative mapping of hexagonal to rectangular grid


In the picture above, every 2-pixel "brick" corresponds to a single hexagon. Obviously, a brick has 6 adjacent bricks as neighbors, and both pixels of a brick should be identical.

4. CA shader

4.1. Hexagonal neighbors coordinates

Every "brick" is composed from 2 texels, thus calculating coords of texels corresponding to the neighboring bricks is simple:

Figure 4. Calculating coords of 6 hexagonal neighbors for a texel


The picture above shows how to find neighbors for the left texel of a brick. If we do it the same way for the right texel, every "brick" would behave as if it were intact - both halves would have same value as long as their initial values were the same. And, of course, we will take care of bricks being intact during initializing.

4.2. Representing cell's value

A pixel color has four components: R, G, B, A, where A stands for alpha. This component is used for alpha blending. We don't need alpha blending, so it has been turned off:

gl.glDisable(GL.GL_BLEND);

But we still can use A component to store some information, if needed. We need to encode possible cell states [0,1..10] in the pixel's color, and every of four color components is a float. Thus let's represent cell states as [0.0, 0.1, ... 1.0].

Then the blending is turned off, A component doesn't really affect pixel's color, so we'll have to propagate it to RGB components, e. g. like this:

MOV result.color.rgba, cur_clr.aaaa;           # propagate A component to RGB

This way, A component holds true value on which we perform all the calculations, but we have the freedom to apply any post-processing steps on it before moving it to RGB components. We'll need this feature later.

Note

In ARB assembly language, referencing four components of 4d vector can be done in two equivalent ways: v.rgba and v.xyzw. It doesn't matter, if vector v holds coordinate or color, rgba and xyzw are totally interchangeable.

4.3. Implementing CA rules

To implement BZ HEX, first thing we'll need is obviously the counting of active neighbors. This is very similar to what we've done earlier, with hexagonal neighbors tuning applied:

Example 1. Calculating the number of active neighbors for BZ HEX

TEMP shift;
MOV  shift, program.env[1];                     # coord shift is passed from the outside, x - positive, z - negative

#  x.x.   remember this 'brick' pattern we use to emulate hexagons on a rectangular-grid texture
# x.x.x.  x mark pixels in question - center pixel and its neighbors
#  x.x.

TEMP max_coord;
SUB  max_coord.xy, 1, shift.xyzw;
MOV  max_coord.z, 1;

# let's get number of neigbours which are active (1.0 or more) 
TEMP n_neig;                                    # holds the number of active neigbours
TEMP sum_neig;                                  # temporary sum holder

TEMP cur_pos;
TEMP cur_clr;

# go through sum_neigbours, 'w' component is significant one
TEMP ge_512;                                    # flag for out-of-bounds situation
TEMP over;

MAD  cur_pos, shift.zwww, 2, fragment.texcoord; # (-2,  0) - remember, shift.z is the negative shift component
                                                # MAD - multiply and add, so cur_pos.x = texcoord.x + 2*shift.z
  ADD over.x, max_coord.z, cur_pos.x;
  CMP cur_pos.x, cur_pos.x, over.x, cur_pos.x;  # usual wrap-aroung for X axis on torus
TEX  cur_clr, cur_pos, texture, 2D;             
SGE  sum_neig.x, cur_clr.wwww, 0.95;            # compare w against 0.95 to guard from possible floating-point errors

MAD  cur_pos, shift.xwww, 2, fragment.texcoord; # (+2,  0)
  SUB over.x, cur_pos.x, max_coord.z;
  SGE ge_512, cur_pos.x, max_coord.z;
  CMP cur_pos.x, -ge_512, over.x, cur_pos.x;    # wrap-around
TEX  cur_clr, cur_pos, texture, 2D;
SGE  sum_neig.y, cur_clr.wwww, 0.95;            # note that we write to the next component of sum_neigh, y one

ADD  cur_pos, fragment.texcoord, shift.zzww;    # (-1, -1)
  CMP cur_pos.x, cur_pos.x, max_coord, cur_pos.x;
  CMP cur_pos.y, cur_pos.y, max_coord, cur_pos.y;
TEX  cur_clr, cur_pos, texture, 2D;
SGE  sum_neig.z, cur_clr.wwww, 0.95;            # z component used

ADD  cur_pos, fragment.texcoord, shift.xzww;    # (-1, -1)
  SGE ge_512, cur_pos.x, max_coord.z;
  CMP cur_pos.x, -ge_512, 0, cur_pos.x;
  CMP cur_pos.y, cur_pos.y, max_coord, cur_pos.y;
TEX  cur_clr, cur_pos, texture, 2D;
SGE  sum_neig.w, cur_clr.wwww, 0.95;            # store to w component

DP4  n_neig, sum_neig, 1;                       # sum first 4 neighbours
                                                # DP stands for 'dot product', i.e. n_neig will have 4 components,
                                                # each equal {x, y, z, w} X {1, 1, 1, 1} = x*1 + y*1 + z*1 + w*1

# let's reuse sum_neigh since we stored partial result for 4 neighbors in n_neig
ADD  cur_pos, fragment.texcoord, shift.zxww;    # (-1, +1)
  CMP cur_pos.x, cur_pos.x, max_coord, cur_pos.x;
  SGE ge_512, cur_pos.y, max_coord.z;
  CMP cur_pos.y, -ge_512, 0, cur_pos.y;
TEX  cur_clr, cur_pos, texture, 2D;
SGE  sum_neig.x, cur_clr.wwww, 0.95;            # reuse x component

ADD  cur_pos, fragment.texcoord, shift.xxww;    # (+1, +1)
  SGE ge_512, cur_pos.x, max_coord.z;
  CMP cur_pos.x, -ge_512, 0, cur_pos.x;
  SGE ge_512, cur_pos.y, max_coord.z;
  CMP cur_pos.y, -ge_512, 0, cur_pos.y;
TEX  cur_clr, cur_pos, texture, 2D;
SGE  sum_neig.y, cur_clr.wwww, 0.95;            # reuse y component

DP4  sum_neig, sum_neig, {1, 1, 0, 0};          # sum 2 remaining neighbours, note zeros in z and w to discard 2 values
ADD  n_neig, n_neig, sum_neig;                  # now, n_neig holds the number of active neighbours


Second thing to implement is the remaining rules from BZ HEX rule set, which is easy having n_neig at hand:

Example 2. State-changing logic for BZ HEX

TEMP num_ge_two;                                # flag signalling when a value is >= 2
SGE  num_ge_two, n_neig, 2;                     # num_ge_two = (n_neig >= 2.0)

TEX  cur_clr, fragment.texcoord, texture, 2D;   # read current cell's color
TEMP is_notpass;                                # flag signalling the cell is semi-active or active
SGE  is_notpass, cur_clr.wwww, 0.35;            # is_notpass = (state >= 4)

MUL  is_notpass, is_notpass, num_ge_two;        # reuse is_notpass: is_notpass == (num >= 2) && (cell was not passive)
ADD  cur_clr.w, cur_clr, 0.1;                   # implement increasing rule,
                                                # active remains active here, but >1.0

# normalize active cells which are >1.0 here, but in fact you could save a few instructions and remove it
TEMP res_clr;
SGE  res_clr, cur_clr, 0.95;                    # check if active cell's value is too high
CMP  cur_clr.w, -res_clr, 1, cur_clr;           # then normalize if needed

CMP  cur_clr.w, -is_notpass, 0, cur_clr;        # the only exception to the increasing rule is is_notpass flag,
                                                # when is_notpass == true, go to the passive state 0


Now, cur_clr.w holds new cell's state, and all we need to do is to propagate it to RGB to show on screen. But, before we go, let's normalize resulting color, dropping off any floating point errors. We don't the error to accumulate, and the A component to be as close to "discrete" [0.0, 0.1, 0.2 .. 1.0] values as possible:

Example 3. Normalizing resulting value of a cell's state

# normalize color to be closer to n*0.1
SUB cur_clr, cur_clr, 0.05;                     # let's assume absolute value of error on each step is < 0.025
MUL cur_clr, cur_clr, 10;                       # e.g. ~0.2 becomes ~1.95
FLR cur_clr, cur_clr;                           # FLR is "floor", so ~1.95 becomes 1
ADD cur_clr, cur_clr, 1;                        # 1 -> 2
MUL cur_clr, cur_clr, 0.1;                      # gets exactly 0.2

# now we can propagate the A component
MOV result.color, cur_clr.aaaa;


4.4. Applying anti-flickering

At this point, CA implementation works fine, but it is not very pleasant to watch. You can observe swirling spirals, but everything flickers so much, it's like looking right into the eyes of the hypnotoad.

This is where our decision to store the cell's value in the A component comes handy. As long as we don't mess up our finely calculated A component of the resulting color, we can do whatever we want with actually displayed RGB components on every simulation step.

First of all, let's note that according to the rules, every cell tends to simply increase it's state by 0.1 on each step. Instead, let's try to adjust displayed image in a way that most of the pixels keep their color, which must reduce flickering for sure. It's easy to do: instead of merely propagating A component to RGB, we should subtract 0.0 on the first step, 0.1 on the second and so on. We'll just need to pass one more external parameter to the ARB program on every step, so it would know which step it is calculating now (Java, jogl):

Example 3. Passing color-rotating parameter on each step

// somewhere during initialization
int _num_step = 0;
float _pr_param[][] = new float[11][];            // since there are 11 states in [0,1..10]

for (int i=0; i<_pr_param.length; i++) {
  float f = i * 0.1f;
  _pr_param[i] = new float[] {f, f, f, 0.0f};     // note 0.0 in A component
                                                  // as an extra measure to protect A from screwing up
}

// on every step
gl.glProgramEnvParameter4fvARB(GL.GL_FRAGMENT_PROGRAM_ARB, 0, _pr_param[_num_step % _pr_param.length], 0);
renderScene();
_num_step++;


We can now recover this passed parameter in the ARB program and apply the color transformation:

Example 4. Applying color-shifting to the result color

# apply color-shifting anti-flickering trick, 
# cur_clr.w holds true state of the cell, don't mess it up
SUB  res_clr.xyz, cur_clr.wwww, program.env[0];   # instead of simply propagating, subtract first
                                                  # note .xyz, w is not used as an extra measure to keep w intact
ADD  cur_clr.xyz, 1.1, res_clr;                   # resulting color should be in the range of [0.0 .. 1.0]
CMP  cur_clr.xyz, res_clr, cur_clr, res_clr;      # if res_clr was <0, adjust cur_clr to be in [0.0 .. 1.0] range

# ... normalizing goes here, skipped ...
MOV  result.color, cur_clr;                       # finally, write the resulting color


After this transformation you'll note that evolutions become smoother indeed. The only remaining cause of flickering will be black color turning to white and vise versa on every step. No problem, it's easy to deal with (as long as W component stays intact):

Example 5. Applying black=white transformation

# even less flickering by making black == white
SUB  res_clr.xyz, cur_clr, 0.05;                  # res_clr is <0 if cur_clr is black
CMP  cur_clr.xyz, res_clr, 1.0, cur_clr;          # adjust cur_clr to be white if it were black

# ... normalizing goes here, skipped ...
MOV  result.color, cur_clr;                       # finally, write the resulting color


5. Further reading

5.2. Cellular automata in general

5.3. Proprietary sources

No comments:

Post a Comment