AI, Pacman & Python…

Collision Detection Sketch

I’m taking an Artificial Intelligence class this semester, and the class uses Pacman (in Python) as a pedagogical tool for learning AI. What a great idea! The projects typically start off by asking you to play a game of Pacman, and then dive into the content.

Here is one of the projects, if you are interested.

Also, if you want to use this system at your school, I think they are actively seeking other instructors to try it. Contact Dan Klein if you are interested in more information.

Tagged Tags: , , on February 13, 2009 at 12:18 pm

Cheap Collision Detection in Processing

Collision Detection Sketch

In the process of writing the BUGS! sketch I stumbled onto a quick and simple collision detection algorithm, so I figured I would share it. For the purpose of this article I will be assuming 2D, however it should work in 3D just as well.

The detection system is implemented in Processing, which is basically Java specialized for multimedia. The core algorithm is independent of any Processing specific features and it should be easy to port to any other language.

Motivation

I made a 2D flocking/gravitation visualization, and I wanted the particles to “follow” each other, but only when an object is close to another object. The code developed here is a cleaned up and simplified version of this original sketch. You may want to take a quick look at the final product before starting to get an idea of what I am trying to accomplish.

Particles are updated on every frame, and then drawn. During the update step, each particle can move freely, but then must look for nearby particles that it could possibly follow. The brute force method would be to compare all combinations of objects and check for a collision, requiring 2^(n-1) comparisons (where n is the number of particles), resulting in O(2^n) time complexity just for the collision detection phase. But time is precious and I wanted to allow for a large number of particles, so O(2^n) was unacceptable. I figured, if two particles are more than 2x the particle size away from each other, they don’t need to be tested at all. This is the core of the Algorithm that follows.

The Uniform Grid Algorithm

The technique I stumbled upon is acutally well known and documented; it is known as a uniform grid [1]. Since particles only collide when in the same general area, the simulation area can be decomposed into small sectors:
Sectors
This way, only particles occupying the same sector need to be compared, drastically reducing the total number of comparisons.

Notice that in the example, the number of comparisons for the red dot is reduced from 7 to 2; the total number of comparisons is reduced from 128 to 6! This example glosses over the issue of particle size, which will increase the number of required comparisons, but it does illustrate the general performance improvement.

Data Structure

The speedup obtained by the uniform grid is almost entirely based on a simple data structure. Below is a high level definition of the algorithms components, as well as descriptions of how they will work:

  • Particle: An object. In this simulation a particle is simply a circle.
  • Grid: The collection of all sectors, which together composes the simulation space.
  • Sector: A square area that may or may not contain particles. The size of each sector should be the maximum size of the largest particle at any rotation. This ensures that any particle will occupy at most 4 sectors.
  • The sector a particle occupies is represented by an x,y coordinate pair, just like a regular point. The sector is uniquely determined by the x,y world coordinate of the particle, divided by the sector size. For example, if the particle world location is (110,210), and the sector size is 10, then the sector location is (11, 21).
  • All particles are bounded by axis aligned bounding boxes (AABBs). An AABB is a rectangle bounding the entire particle. The location of the current particle is determined by the four corners of the AABB:
    AABB
    So a single object can exist in up to 4 different sectors on the grid at once.
  • When a particle moves, its status in the grid must also be updated. For this reason, finding sectors in the grid must be *extremely* fast. (An alternative approach is to *not* update the particles as they move, but rather to rebuild the grid from scratch at every frame. It’s easy to argue that this may actually be faster than updating the Grid, due to the overhead of finding the particle to remove it during the update phase.)

Collision Queries

Queries are the primary action performed on the grid. A query uses the grid data structure to retrieve candidates for collision tests.

  • Given a point P=(x,y) in world coordinates, a query begins by translating the world coords into sector coords.
  • The sector contents are particles, and each particle is compared to P
  • The comparison to P is done by component wise clamping P to the AABB of the current object, to produce the point Q:
    Clamping P to produce Q
    The distance (d) between P and Q is the distance from the given point to the AABB. If P = Q, then the point P is inside the AABB.
  • Once the AABB test passes, we know that the point and the particle are very close, but not necessarily colliding. Since my simulation uses circles, my final detail test is: dist(P, center(Particle)) < radius. In words: test if the distance from point P to the center of the circle is less than the radius of the Particle being tested.

Notice that the lookup of particles is amost effortless. The major work here is simply testing the point P against the particle in two phases. The first phase is a broad phase using AABBs, and the second phase is a detail phase using the actual geometry of the object.

You may be thinking that a two stage testing process is overkill for testing circles, and you are right. It would actually be faster to test the circle directly, but I’m assuming here that you may want to use geometry other than circles, in which case you would do an expensive polygon test in the detail phase.

Implementing the Grid

The sectors compose an infinite grid that covers the world. A flyweight hash table is used to index the grid, since an infinite list would take a **really** long time to search.

The grid is a 2-dimensional array, the index into the array is a sector coordinate moded by the sector size. This gives the illusion of an infinitely large array, but actually only contains (sector size)^2 elements. This is like a circular array in two dimensions, resulting in a toroidal array.

The down side is that many sectors map to the same array entry. To accomidate for this, each entry in the array is an ArrayList of sectors. To find a specific sector, the list is searched until the requested sector is found. This search adds an additional O(n) lookup overhead, however, in the average case, n (the number of sectors contained in the array entry) is very small, so the slowdown is minimal. The public Grid members are as follows:


class Grid {
int sectorWidth
int sectorHeight

void addParticle(Particle obj)
void removeParticle(Particle obj)

Sector getSector(float x, float y)
ArrayList getSectors(Particle obj)
}

Now that objects can be stored, and retrieved, it’s time to add the ability to query it.

Putting the Grid to Use

I’ve created a small simulation of bouncing particles exibiting the collision detection system and the uniform grid. In the simulation each particle is given a random velocity (vx, vy) which describes their x and y movement in terms of pixels per second. During each frame, the following actions are performed, given the time elapsed since the last frame, dt:


1. Find the new x,y coords of the particle using:
x = dt * vx + x
y = dt * vy + y
2. Calculate a small step size dx,dy by dividing
the movement by 5
3. Step from the current position to the new position
using dx,dy
4. At each step, check for collisions
5. If a collision is detected, step back one step and
invert the velocity of both particles

The reason for stepping slowly between the current position to the new position is to avoid tunneling, in which the step size of a particle is larger than the particle itself and allows it to step through another particle.

View the uniform grid in action!

Problems

When optimizing collision detection, there is a great deal of give and take, every optimization you make results in new errors in the detection mechanism. The same is true for this method. The problem of tunneling can still occur, even with the additional precautions I’ve taken here. This can be seen in my simulation when two particles get “stuck” together. They become stuck due to the fact that they one particle has actually stepped *inside* another particle. However, because tunneling is really more of an animation issue, I’m not going to attempt to solve it here.

The second problem is if you have objects of highly varying sizes. A uniform grid works best for objects of similar size, when the size varies greatly, the grid becomes inefficient. To solve this problem, use a hierarchical grid instead [2].

The final problem is that the collisions are not very realistic. This is not a problem with the detection system at all, but a problem with how velocities are calculated during a collision. A much nicer way to do this would be to use a concept from physics known as an elastic collision. The formula for an elastic collision is simple, however, I didn’t want to muddy the waters here with any addition details unrelated to collision detection.

Conclusion

The uniform grid concept greatly improves performance of collision detection by using spatial locality of the particles being tested. I hope this tutorial has provided a good starting point for a fast collision detection system, despite some minor issues. Perhaps in the future, I will continue to build upon this framework and refine some of the problems presented here.

References:

[1] C. Ericson. Real Time Collision Detection, Morgan Kaufmann Publishers, pages 285-290, 2005.
[2] C. Ericson. Real Time Collision Detection, Morgan Kaufmann Publishers, pages 300-306, 2005.

Tagged Tags: , , , on January 14, 2009 at 12:53 pm

Finding Interesting Fractals

The fractal generator I posted visualizes sin(z)*c type fractals, where c is some constant. The constant c is actually where all the action happens, because if you choose the wrong c, you get a very boring fractal (as I noted on the project, the c value for the fractal I posted came from Paul Bourke). The difficulty of finding these constants gave me a new idea, use the fractal generator to find interesting constants. So here is my fractal finder algorithm:


1. Specify a sample fractal area, such as 10 x 10 pixels
2. Select a minimum and maximum range for the constant
(real and imaginary parts)
3. Loop over every possible constant in this range and
process the 10 x 10 fractal
4. Record the activity of the generated fractal
5. Display constants with activity greater than some
threshold

So “activity” should probably be defined better. I didn’t want to get crazy and calculate the true sobel-style activity in the image, because this would take forever to process a large number of fractals. Instead, since the actual output for each pixel is in the range [0, 255], I stored a count of these values into an array of 255 elements. So if the value 15 comes out once, array[15] is incremented. As each fractal is processed, the activity of the fractal is the number of array elements greater than zero.

One problem with this method is that all elements in the array could have a value of 1, while all the true activity takes place in array[255]. Also, with an array of 10×10 (which I’ve been using for testing), the maximum possible activity is 100. Although, this could easily be resolved by using a larger sample size, say 16×16.

I’m still playing with this, but I’ll post the code soon.

Tagged Tags: , , on August 21, 2008 at 8:20 pm

Sobel Edge Detector in VB.NET

Sample Output
I recently stumbled onto a C tutorial on edge detection and decided to implement the algorithm in VB.NET. Edge detection is a machine vision technique that attempts to identify interesting parts of an image, such as where one object ends and another begins. One way of finding these areas is to search for sharp changes in intensity between each pixel and its neighbors. A source image is processed and an output image is created that highlights the edges found in the source. The output image is a visualization of what the algorithm detected and ultimately these results can be applied to solve a specific problem.

Although the background and mathematical basis for the algorithm are interesting, I’m only going to discuss them briefly and focus on the implementation and performance issues in .NET. If you are curious about the details behind the algorithm, you should check out the original articles I found.

The Algorithm

To find relative changes in intensity level, the algorithm processes the image one pixel at a time. It looks at the change in intensity to the left and right of the current pixel, stores it and then checks the change in intensity in the pixels above and below the current pixel and stores it as well. The actual process happens one dimension at a time for each pixel (horizontal and then vertical), and then combines the result into two dimensional pixel data, the output image.

As each pixel is encountered, its neighboring pixel’s intensity levels get calculated and then subtracted from the neighbor pixel on the opposite side. The resulting sum of all the pixels is the relative change in intensity at that location. If opposite neighbor pixels are the same color, when subtracted from each other the result will be zero (black). If the two neighboring pixels were radically different intensity levels the output would be greater than zero. The mechanism that averages the pixels is a weighted matrix, one for vertical (the xMask) and one for horizontal (the yMask):

Dim xMask(,) As Single _
= New Single(,) {{-1, 0, 1}, _
{-2, 0, 2}, _
{-1, 0, 1}}

Dim yMask(,) As Single _
= New Single(,) {{1, 2, 1}, _
{0, 0, 0}, _
{-1, -2, -1}}

Each element in the matrices represents a bordering pixel, with the center element being the current pixel. The numbers in each matrix is the weighting of the importance of the pixel at that location, so pixels directly above or beside the current pixel are weighted heavier (denoted above with a 2 instead of a 1) than diagonal pixels. Notice that in both matrices, the current pixel is ignored (set to zero).

For the vertical yMask, the strictly horizontal elements are zero, and the opposite is true for the xMask. The elements of each matrix are multiplied by the border pixels intensity levels, and then the results are summed. Since the opposite sides are also opposite signs, the sum is the change in intensity we were looking for. The final step is to add the absolute value of the horizontal and vertical differences in intensity. This process is actually a rough approximation of the mathematical gradient of the image.

First Implementation: GDI+

To get things started, I wanted to keep the details of working with the image data to a minimum, so I created the algorithm to work with the infinitely slow GDI+ Bitmap object using the GetPixel and SetPixel methods. The implementation is straight forward:

1. Create X and Y loops to scan across
each pixel in the source image
2. Create I and J loops to process the
eight border pixels for the current pixel (X,Y)
3. Get the current border pixel
Intensity(X + I, Y + J) : 1/3 * (R + G + B)
using Bitmap.GetPixel on the source image
4. Multiply the intensity by the appropriate mask
5. Clamp the output value to [0, 255]
6. Write the output pixel value to the output
image (Bitmap.SetPixel)

Below is sample output of the initial implementation:
Sample Output
This process typically runs in about 3K pixels /second, which sounds fast… but actually takes about 160 seconds to process an 800 x 600 pixel image (480K pixels). So for real-time processing, this method is absolutely out. Although it’s very slow, this method works as expected and was useful to me as a reference renderer as I tested new approaches.

Take Two: Direct Pixel Access

The GDI Bitmap object offers a handy function, Lock/UnlockBits(), that returns the raw bytes of memory composing the Bitmap object. By using this function, it is possible to read all pixels in one call, process them, and then write them back in a single call. This is much, much faster than using Get/SetPixel() methods which operate on a single pixel for each read and write. The intense down side of LockBits is that you no longer have friendly access to the pixel by X and Y coordinates, and documentation is pretty bad.
When calling LockBits(), you specify what format you want the data to be returned in. The pixels get returned as a contiguous array of bit data, and you are charged with picking it apart. For my purposes, I’ve forced the format to always be 24 bit RGB. The function returns a one dimensional array of bytes. Each pixel is encoded according to the format specified, so in my case, there are 3 bytes for each pixel (R, G and B). To emulate two dimensions, a “stride” value is given, which lets you know how many bytes there are per line, along with a “height” which is the total number of scan lines. So each pixel can still be accessed with X and Y coordinates by using the following formula:

Pixel.Red = Array[stride * Y + X * 3]
Pixel.Blue = Array[stride * Y + X * 3 + 1]
Pixel.Green = Array[stride * Y + X * 3 + 2]

Notice that the Y value is multiplied by the width of the scan line and X is multiplied by the amount of byte data per pixel, the 3 here is for RGB. The first byte at this location is red, the next byte is green and the last byte is blue, which is why 0, 1 and 2 are added to X.
To make this logic less painful, I created a wrapper class for the bitmap object which has its own GetPixel and SetPixel methods. This class locks the bits on the image when it loads and then operates on the array. It implements IDisposable, and when Dispose is called, it calls UnlockBits on the original bitmap image, committing all changes to the image at once.
This implementation runs at around 50Kp/s, a huge improvement of over the original. Processing the pixels in blocks greatly improved performance, but this is still a little too slow for any real-time application. The 800×600 image still takes about 10 seconds to process with this new method.

Take Three: Divide and Conquer

The next approach I took was to reduce the input data by splitting the image in two. The actual split is done by creating Rectangle objects and then passing these rectangles to LockBits when retrieving the image data.
Since I wanted to test different numbers of splits, it became very important to *neatly* keep track of work units, that is, what part of the image was actually being processed. Also, I had another idea in mind for the next implementation, so I wanted this idea of work units to be reusable. To facilitate this, I created a class called ImageWorkUnit with the following properties:

Image: The input Bitmap which is being processed
WorkArea: The Rectangle of the area to process
Result: The output Bitmap shared by all workers

Using this new work unit class, I then created a method to split the image into multiple work units (the number of units is variable) and then a loop to process each loop sequentially. Right now, stop and make a quick estimate of how fast this new method will run using two units (splitting the image in half). Before I ran this new implementation, I made an approximation of no more than few percent speed increase. I was wrong. This method runs at a blazing 450Kp/s, a vast improvement over the direct pixel access method.
As happy as I was with the result, it was a little disturbing and I decide to do some tests. I varied the number of divisions, and much to my surprise, using only one division it ran at the same speed, 450Kp/s. This implied that the speed increase was not caused by dividing the image, but by something else. Here is the original processing loop:

For y As Integer = 0 To inImg.Height - 1
For x As Integer = 0 To inImg.Width – 1
...
Next
Next

Now here is the new processing loop for the work unit based implementation:

For y As Integer = 0 To area.Height - 1
For x As Integer = 0 To area.Width - 1
...
Next
Next

The speed increase was due to the fact that I was no longer accessing the Height and Width properties of the image (it was also accessed in the body of the loop). It turns out, the Height and Width properties of the Bitmap object are not exactly optimized. By simply not accessing these properties inside the loop, I got a speed up of about 900%.

Take Four: Use the Cores Luke

In the final implementation, I created a new class to spawn separate threads to process the image in parallel. This function gets the processor count, and then splits the image that many times and spawns threads to process the chunks.
Oddly, this method runs at either 500Kp/s *or* around 700Kp/s. The discrepancy is because of the thread scheduler. In the new class I created, I simply spawn threads, fire them off and leave it up to the thread scheduler to pick a core to execute on. If both threads execute on the same core, the result is 500Kp/s, when they happen to run on different cores, the speed up to 700Kp/s occurs. I’m not exactly sure how to get around this in managed code – if you have any ideas, please post a comment and let me know.
So with the fastest algorithm, and my fingers crossed, it can process that 800×600 image in about 0.6 seconds (down from 2.5 minutes), which is actually a reasonable rate for real time applications.
Download the Sobel edge detector code and sample images.

Tagged Tags: , , , , on March 22, 2008 at 6:52 pm

Simple Recursive Blob Detection

Continuing my quest for useless VB.NET code, I am going to delve into blob detection. When I say blob, I am not referring to Binary Large OBjects (like database BLOBs), but blobs in the visual sense – a contiguous area of one specific color (in the simplest case). A good example of blob detection is the magic wand tool in a paint program. When you click an area all similar and contiguous pixels are included to create a selection area. Another is optical character recognition (OCR), where the blobs that are found are possible characters.
The type of blob recognition I am going to describe is a very simple algorithm that does not use calculus – this is a straight forward recursive algorithm, however it may not be the best candidate for the job.

The Goal

The goal of this algorithm is to be able to identify the blobs in the following black and white image.
Source imageSource with blobs selected
The first image above is the source image, and the second is the source with each pixel outlined in black and the blobs outlined in red. I will define a blob as a collection of pixels that share any of the 8 possible common borders. Notice that a blob could be defined using only 4 borders (that is, don’t count diagonal boarders), which would change the top two blobs in the example.

The Algorithm

The algorithm is simply going to count the number of pixels in the blob. This could easily be extended to actually collect the pixel locations to create a region. Keep in mind that in this example black pixels are “on” and white pixels are “off”. The basic idea is as follows:

1. To initialize, create a copy of the image, which will
be altered to mark the progress of the algorithm
2. Base Cases
a. If a pixel is off, return zero
b. If a pixel is out of bounds, return zero
3.Recursive Case
a. If a pixel is on
i. Turn off the current pixel
ii. Return 1 plus the sum of all 8 surrounding
pixels

And that’s the entire algorithm.

An Example: BlobSize(6, 1)

An example might help to clear up any confusion. Let’s get the size of the blob located at pixel (6, 1). The following image is a visual representation of what the algorithm is going to do in the first two recursive calls.
BlobSize at 6,1BlobSize at 7,2
Assume our function is BlobSize(x as Integer, y as Integer) and returns an Integer (the number of pixels in the blob), and we are calling BlobSize(6, 1). We go back to the algorithm: step 1, this pixel is on and it is within the boundaries of the image, so we are not at a base case – skip to step 2. Step 2, the pixel is on, so turn it off and then return 1 + the sum of all surrounding pixels, that is:

Return _
BlobSize(5, 0) + BlobSize(6, 0) + BlobSize(7, 0) + _
BlobSize(5, 1) + 1 + BlobSize(7, 1) + _
BlobSize(5, 2) + BlobSize(6, 2) + BlobSize(7, 2)

Notice how each term in the return statement corresponds to a pixel from the 3×3 box in the image above (the 1 being the current pixel). The results from pixels 1, 2, 3, 4, 5, 6, and 7 will all be base cases and return 0 since they are off. But notice that pixel 8 is another recursive case, where it’s surrounding 8 pixels will be tested and the blob will expand. This function will ultimately return 5.
Here is the code for BlobSize():

Private Function BlobSize(ByVal image As Bitmap, _
ByVal x As Integer, ByVal y As Integer) _
As Integer
If x < 0 Or x >= image.Width Or _
y < 0 Or y >= image.Height Then
Return 0
End If
Dim c As Color = image.GetPixel(x, y)
If c.R = 255 And c.G = 255 And c.B = 255 Then
Return 0
End If
image.SetPixel(x, y, Color.White)
Dim size As Integer = 1
For i As Integer = -1 To 1
For j As Integer = -1 To 1
If i = 0 And j = 0 Then Continue For
size += BlobSize(image, x + i, y + j)
Next
Next
Return size
End Function

The Problem

Here is a comparison of the blob size (n) to the number of recursive calls (r):

n | r
---------------
5 | 41
6 | 49
13 | 105
1125 | 9001
1350 | 10801
2925 | 23401

The performance of this algorithm is 8n + 1 or O(n) in Big O notation. This causes a problem because the algorithm is recursive. The problem occurs for larger values of n. On my machine, the upper limit was 26705 recursions on a 196 x 156 pixel, solid black image. At this point the system ran out of stack space and I got a StackOverflowException.

Conclusion

This algorithm will work for images no larger than 3338 pixels on my machine; however the stack space can possibly vary from machine to machine. There are several sneaky tricks that could be done to make it work on larger images, but the bottom line is that this algorithm is not well suited for the task.
Ultimately, you must resort to using calculus to solve this problem with larger image sizes. I will show an implementation using the Gaussian or Laplaceian method some time in the future.

You can download the code and test images for BlobSize() here.

Tagged Tags: , , on February 9, 2008 at 10:00 am

Modeling with Euler’s Method

Recently, I have been learning how to create mathematical models in code so I figured I would share a little. Models are used to show how some phenomenon in life behaves and are typically based on an equation that represents the thing to be modeled. So the trick is to find a way to solve those equations in code.

Euler’s method (pronounced Oiler) is one way of doing this, and I will show a very simple example using the following differential equation:

dy/dx = 2x

When solved by integration, the actual solution is:

y = x ^ 2

So that is the value that we want to obtain from the program after it completes. The Euler method requires initial known values for a starting point and then determines the slope of the tangent line at that point in order to find the next *approximate* point on the curve. The initial x, y values can be arbitrary (x=1 and y=1 are used in the example), but it must be a solution to the system – using x=1, y=8 would be bad since x^2=8 is false.

Since this is an approximation, the the final result will always be off by some margin of error (for example, for xMax=2, y should equal 4, since 2 ^ 2 = 4, but when you run this code, you will see that it is off by a small amount). This error is a function of the step size, dx. Smaller step sizes are generally more accurate. In the example below, a step size of 0.01 is used, which results in an error of 0.25% and a step size of 0.005 results in an error of 0.103%.

In the code below, xMax is the ending x-value that will be approximated, dx is the step size, and x & y are the initial known values. The code was written to be run as a console application in VB.NET 2005.


Sub Main()
Dim x As Double = 1
Dim y As Double = 1
Dim xMax As Double = 2
Dim dx As Double = 0.01

Console.WriteLine("X , Y -- % Error")
Do While x < xMax
Euler(x, y, dx)
Console.WriteLine( _
"{0:f2} , {1:f2} -- Error: {2:f3}%", _
x, y, (1 - (y / x ^ 2)) * 100)
Loop

Console.Read()
End Sub

Sub Euler(ByRef x As Double, _
ByRef y As Double, _
ByVal dx As Double)
Dim slope As Double = 2 * x
Dim change As Double = slope * dx
y += change
x += dx
End Sub

Here is a sample run:

X , Y -- % Error
...
1.91 , 3.64 -- Error: 0.249%
1.92 , 3.68 -- Error: 0.250%
1.93 , 3.72 -- Error: 0.250%
1.94 , 3.75 -- Error: 0.250%
1.95 , 3.79 -- Error: 0.250%
1.96 , 3.83 -- Error: 0.250%
1.97 , 3.87 -- Error: 0.250%
1.98 , 3.91 -- Error: 0.250%
1.99 , 3.95 -- Error: 0.250%
2.00 , 3.99 -- Error: 0.250%

As you can see from the sample run, the output claims that:

x ^ 2 = y
2 ^ 2 = 3.99

illustrating the approximation error. However, models are not supposed to be exact, they are supposed to be a representation that exhibits behavior similar to the thing your are trying to analyze. Also, there are much better approximation methods (such as Runge-Kutta). You wouldn't want to really use this method to solve the equation in this example, this was just to establish how the concept works. As the complexity of the model grows, this method becomes much more attractive.

This basic framework can be used to model all sorts of neat stuff. Coming up, I will post an example model for a cooling liquid using this method in conjunction with Newton's law of cooling, and then gravity and falling objects.

You can read more about Euler's method here.

Tagged Tags: , , , on January 20, 2008 at 11:57 pm