useless rants

7Oct/110

Game programming in Haskell part 2 : dealing with object interactions in a high-level and composable way.

My first post was about heterogeneous collections in Haskell. I saw this as useful, because I thought of my game objects as sharing a single property, the ability to update themselves in reaction to outside events. I ended up with something along the lines of this:

class Game g where
  handleEvent :: Event -> g -> g
 
data GameWorld where
  GameWorld :: (Game g) => [g] -> GameWorld

The issue with this design comes from the game objects not being able to interact with each other, as the Game class only tells me that they can handle external events, and nothing else. The list structure doesn't tell me anything more about which object should know what about other objects, so I came up with the idea of dependencies between objects.

I will introduce my ideas through an example. Let us consider a simplified shoot them up game, with a single player ship advancing through a 2D world. The ship must avoid enemy bullets in order to not get destroyed. We can describe at a high level of abstraction how our game objects would interact with each other:

In this diagram, nodes are game objects, and arrows from an object to another denote a dependency from the source object to the target object. This means that the target object must know the information about the source object in order to update itself according to the action denoted by the label on the arrow. In Haskell, this is how we would translate it:

type Dependency a b = a -> b -> b

For instance, we would define the dependency between time and a single bullet in the following way:

type Time = Float
 
data Bullet = Bullet {
  bulletPosition :: (Float, Float),
  bulletSpeed :: Float,
  bulletAngle :: Float,
  bulletHitboxSize :: (Float, Float)
}
 
moveBullet :: Dependency Time Bullet
moveBullet dt bullet@(Bullet (x, y) speed angle _) = 
  bullet {
    bulletPosition = (x + cos angle * speed * dt, y + sin angle * speed * dt)
  }

But we're not just interested in moving a single bullet, we'll want to move a collection of bullets. It turns out, any dependency modifying a single type can easily be turned into a dependency modifying a list of said type:

toMany :: Dependency a b -> Dependency a [b]
toMany f a = map (f a)
 
moveBullets :: Dependency Time [Bullet]
moveBullets = toMany moveBullet

Similarly, we can define the dependency between a single bullet and the player ship in the following way:

data Ship = Ship {
  shipPosition :: (Float, Float),
  shipDirection :: Float,
  shipSpeed :: Float,
  shipHitboxSize :: (Float, Float)
}
 
destroyOnCollision :: Dependency Bullet (Maybe Ship)
destroyOnCollision bullet maybeShip = do
  ship <- maybeShip
  if bullet `collidesWith` ship
  then Nothing
  else Just ship
 
collidesWith :: Bullet -> Ship -> Bool
(Bullet (bx, by) _ _ (bw, bh)) `collidesWith` (Ship (sx, sy) _ _ (sw, sh)) =
  not (bx + bw < sx || by + bh < sy || sx + sw < bx || sy + sh < by)

Here, we denote the possibility of the ship being destroyed by wrapping it into a Maybe datatype. This in turn allows us to use the Maybe monad to elegantly handle the case where our ship is already destroyed, so it stays destroyed. Otherwise, we check for collision between the two object's hitboxes. Now we would like to turn this dependency into one which would check for collision with a list of bullets, namely the one produced by the moveBullets dependency. In order to do this, we must first turn this dependency into one which depends on a list of bullets, instead of just one:

fromMany :: Dependency a b -> Dependency [a] b
fromMany f as b = foldl (\b' a -> f a b') b as
 
destroyOnAnyCollision :: Dependency [Bullet] (Maybe Ship)
destroyOnAnyCollision = fromMany destroyOnCollision

Though the interface of fromMany is quite similar to toMany's one, it is quite different under the hood. Indeed, toMany doesn't have to care in what order the dependencies are applied, because the updated objects are independent from each other. But in this case, the same object is being updated many times through dependencies from multiple objects, and as such we must choose an order in which to execute them. Using foldr instead of foldl could change the meaning of fromMany, if the order of computations is significant.

As you can see, toMany and fromMany only require map and foldr respectively, so we can generalize them in the following way:

toFunctor :: (Functor f) => Dependency a b -> Dependency a (f b)
toFunctor g a = fmap (g a)
 
fromFoldable :: (Foldable f) => Dependency a b -> Dependency (f a) b
fromFoldable g fa b = foldl (\b' a -> g a b') b fa

This means that any Functor that is also a Foldable can be used on both the source and the target of a dependency. This includes most collection datatypes, and Maybe.

Coming back to our shoot them up, we now would like to compose moveBullets and destroyOnAnyCollision into a bigger dependency which would move bullets then check for collision with the ship. We can do this in the following way:

compose :: Dependency a b -> Dependency b c -> Dependency (a, b) c
compose f g (a, b) c = g (f a b) c
 
moveBulletsThenDestroyShip :: Dependency (Time, [Bullet]) (Maybe Ship)
moveBulletsThenDestroyShip = compose moveBullets destroyOnAnyCollision

Composition of dependencies does not quite work like function composition or otherwise arrow composition. If it was the case, we would expect our resulting dependency to have type Dependency Time Ship. It turns out that we cannot abstract out the middle datatype, because we need one to update in the first place before feeding the result into the second dependency. It follows from this that Dependency is not a Category and therefore not an Arrow. But it is powerful enough that we can compose them into larger graphs of dependencies. Let us define the dependency between time and the player ship, and we'll see how we can make it fit with what we already have:

moveShip :: Dependency Time Ship
moveShip dt ship@(Ship (x, y) speed angle _) =
  ship {
    shipPosition = (x + speed * cos angle * dt, 
                          y + speed * sin angle * dt)
  }
 
moveShipIfAlive :: Dependency Time (Maybe Ship)
moveShipIfAlive = toFunctor moveShip

So, we need to somehow take moveShipIfAlive and moveBulletsThenDestroyShip and compose them into a larger dependency. A simple way involves first merging their target object, then merging their source object.

mergeTarget :: Dependency a c -> Dependency b c -> Dependency (a, b) c
mergeTarget f g (a, b) c = g b (f a c)
 
moveBulletsAndShipThenDestroyShip' :: Dependency (Time, (Time, [Bullet])) 
                                                 (Maybe Ship)
moveBulletsAndShipThenDestroyShip' = 
  mergeTarget moveShipIfAlive moveBulletsThenDestroyShip

This merges their target object, turning them into a single dependency. The merging is done so the left dependency is applied before the right one, so we move the ship before checking for collision. We are getting closer to closing the triangle in the diagram shown earlier, but we are not quite there yet. Merging the source Time parameters is a bit trickier, as Time isn't the same as (Time, [Bullet]), so we'll need to introduce a slightly more powerful merging function:

mergeSourceWith :: (c -> (a, b)) -> Dependency (a, b) d -> Dependency c d
mergeSourceWith f g c d = g (f c) d
 
mergeSource :: Dependency (a, a) b -> Dependency a b
mergeSource = mergeSourceWith (\a -> (a, a))
 
moveBulletsAndShipThenDestroyShip :: Dependency (Time, [Bullet]) (Maybe Ship)
moveBulletsAndShipThenDestroyShip = 
  mergeSourceWith (\(dt, bullets) -> (dt, (dt, bullets))) 
  moveBulletsAndShipThenDestroyShip'

As you can see, mergeSourceWith allows us to combine the inputs of our dependency into one in a very specific way.

Note that we could have also started by merging the source of our two dependencies to get a single one, then merging the targets. But merging targets of a single dependency is quite tricky, and may not behave as one would expect. To get the same result, we would need a mergeTarget' to look like this:

mergeTarget' :: Dependency a (b, b) -> Dependency a b
mergeTarget' f a b =
  let leftDep a' b' = fst $ f a' (b', b)
      rightDep a' b' = snd $ f a' (b, b') in
  rightDep a (leftDep a b)

While this works fine if the parameter dependency is the result of a simple merging of sources, I don't know what it would do on a dependency that is not. This is why I think the other way around is a better way to create our result.

Right now, our game is not very interesting, as it doesn't interact with the outside world. Next time, I will describe how user inputs and other events, as well as rendering, playing sound and writing data to the disk fit into a graph of dependencies.

30Aug/110

Game programming in Haskell – part 1

As a personal project, I am currently developing a simple 2D game in Haskell. My goal is to create a manic shooter, also known as danmaku, a specific kind of 2D shoot them up game, and also to learn how to "design in the large" using Haskell. There are some game libraries and frameworks for Haskell out there, and I decided to use Gloss, a very simple library for drawing vector graphics, which also offers convenient datatypes for manipulating points, extents and pictures. It also abstracts IO, looping, input handling and rendering from the programmer, only requiring a few pure functions describing how the game behaves. In this post, I will describe the problems I have met and the design patterns I used to solve them.

One of the first hurdle I encountered was that I needed to keep a collection of game objects, sharing common behaviors such as the ability to be rendered or to handle game events, but not necessarily the same type. One very basic way to do this in Haskell would be to use a single datatype which would be the sum of those datatypes:

data GameObject = Ship ShipData | Bullet BulletData ...

This approach has the advantage of not requiring any fancy language extension to work and it allows downcasting, but it is not very extensible, because I would have to manually add every new object type to this sum type. Also, it kind of defeats the purpose of static typing, because I would have to check at runtime whether an object has the expected constructor or not, without any static guarantees.

In OOP, this problem would easily be solved by upcasting game objects to a common interface. In Haskell, existential datatypes allow us to do just that. The trick for this is to quantify the type variable on the right side of the data declaration with a forall, so the type variable doesn't appear on the left hand side, and therefore isn't set at the type level. This allows us to use any datatype as the argument to the constructor, and still get the same datatype. If we allowed any datatypes, they would become essentially useless, as we wouldn't be able to do anything with them afterwards. This is why we add an additional constraint: the datatype has to instantiate the GameObject type class.

data AnyObject = forall go . GameObject go => AnyObject go
 
class GameObject go where
  currentFrame :: go -> Picture
  actionHandler :: Action -> go -> go
  destroyed :: go -> Bool
 
instance GameObject AnyObject where
  currentFrame (AnyObject obj) = currentFrame obj
  actionHandler action (AnyObject obj) =
    AnyObject $ actionHandler action obj
  destroyed (AnyObject obj) = destroyed obj

Datatypes wrapped with the AnyObject constructor must instantiate the GameObject class, which is checked statically. However, once a datatype is wrapped with AnyObject, there is no way to downcast it back to the original datatype, the only information we have available is that it instantiates the GameObject type class. This allows us to make AnyObject an instance of GameObject.

In the same way, Gloss enforces that the game state must be encapsulated in a single datatype, which isn't very convenient, as some parts of the game may be different enough to warrant different datatypes. Using another existential datatype providing only the needed behavior for the main loop allows us to do just that, so we can use different datatypes for levels and menus for instance. I don't know how I will handle transitioning from one game state datatype to another yet, though.

class GameWorld gw where
  gameObjects :: gw -> [AnyObject]
  keyConfig :: gw -> [((Key, KeyState), Input)]
  globalActionHandler :: Action -> gw -> gw
 
data AnyWorld = forall gw . GameWorld gw => AnyWorld gw
 
instance GameWorld AnyWorld where
  gameObjects (AnyWorld gw) = gameObjects gw
  keyConfig (AnyWorld gw) = keyConfig gw
  globalActionHandler action (AnyWorld gw) =
    AnyWorld $ globalActionHandler action gw

Apparently, this same issue can be addressed with Generalized Abstract Data Types (GADTs), however I am not very familiar with those and don't understand them completely.

I am currently at the point where I have some basic game objects defined which are only mutated on collisions, user inputs and every frame. While this is sufficient to do pretty much everything I need, it is still very inconvenient for programming more complicated stuff in game, for instance creating new objects on specific events, or planning sequences of game events. I will try to look into a way to make this more convenient next time.

2Jun/110

A proof that Legendre’s inner product is indeed an inner product

I'm currently studying numerical integration, and we came across inner products for \mathbb{R}-vector spaces. Let E be a \mathbb{R}-vector space, then a form E \times E \rightarrow \mathbb{R} is an inner product if and only if it is bilinear, symmetrical, positive and definite. Let \mathcal{P} be the vector space of all real-valued polynomial functions, then Legendre's inner product is defined by:

\langle u, v \rangle = \int_{-1}^{1} u(x)v(x)dx

Polynomial functions are continuous over \mathbb{R}, and therefore integrable over [-1 ; 1], so this product is well defined. It is symmetrical by commutativity of the product in \mathbb{R}, bilinear because integration is linear, and positive because squares are positive in \mathbb{R}. These proofs are short and elementary, so I will not detail them further. Definiteness is the tricky part. Our product is definite if and only if:

\int_{-1}^1 u^2(x) dx = 0 \; \Rightarrow \; \forall x \in \mathbb{R} \; (u(x) = 0)

This result may seem obvious at first glance - if its integral is 0, then it must be equal to 0 - but this is a trap. We require that our polynomial be equal to 0 over not only [-1 ; 1], but \mathbb{R}. Our proof is going to rely on properties of roots and continuity of polynomials.

We will reason by contradiction, and we assume that our polynomial, which we are going to call u, is not the constant zero polynomial.

The idea of the proof is to show that there is at least a \lambda in [-1;1] such that u^2(\lambda) \neq 0, and that by continuity there is an interval containing \lambda contained in [-1; 1] which has a non-zero integral.

u^2 is a product of polynomial functions, and is therefore a polynomial function. By assumption u is not the constant zero polynomial, and so u^2 is not the constant zero polynomial, and therefore has a finite number of roots. No matter what those roots are, we can choose \lambda \in [-1 ; 1] such that \lambda is not a root of our polynomial, and therefore u^2(\lambda) > 0.

u^2 is a polynomial functions, and therefore it is continuous on \lambda. By definition of a continuous function, this means that for all \epsilon > 0, there exists \eta > 0 such that for all x \in [\lambda - \eta ; \lambda + \eta], |u^2(x) - u^2(\lambda)| \leq \epsilon. We are only interested in the part of this interval that's in [-1; 1], so we let [a ; b] = [\lambda - \eta ; \lambda + \eta] \cap [-1; 1]; because this inverval contains \lambda, and \eta \neq 0, we have a \neq b.

We must now show that \int_a^b u^2(x) dx is greater than zero. In order to do this, we define e(x) = u^2(x) - u^2(\lambda) which is the "error" we make by approximating. We then have:

\int_a^b u^2(x) dx = \int_a^b u^2(\lambda) + e(x) dx = u^2(\lambda)(b - a) + \int_a^b e(x) dx

By continuity, we have for all x \in [a;b], e(x) \geq -\epsilon, and therefore we have:

-\epsilon(b - a) \leq \int_a^b e(x) dx

As this is valid for all \epsilon > 0, this is also valid if we choose \epsilon \in ]0;u^2(\lambda)[, in which case we have:

-u^2(\lambda)(b - a) < -\epsilon(b - a) \leq \int_a^b e(x) dx

Which by transitivity, and by adding u^2(\lambda)(b - 1) to both side of the inequation, gives us:

u^2(\lambda)(b - a) + \int_a^b e(x) dx > 0

It follows from this that \int_{-1}^1 u^2(x) dx > 0, which contradicts our assumption that u is not the constant zero polynomial. So it follows that u = x \mapsto 0, and therefore Legendre's inner product is definite.

Filed under: maths No Comments
18Apr/110

Bézier curves in Haskell part 2

You can find the source for this article here.

Let \gamma be a Bézier curve over a polygon P = (P_0, ..., P_n), and let \gamma' be its derivative. To construct \gamma', we must first choose an origin O. As a result we are now working in a vector space. We define a polygon Q = (Q_0, ..., Q_n) where Q_i - O = \Delta P_i = P_{i + 1} - P_i. Then \gamma' is the Bézier curve over Q, up to an homothety of ratio n.

> bezierDerivative1 :: (AffineSpace p, AffineSpace (Diff p),
>                       VectorSpace (Diff p), VectorSpace (Diff (Diff p))) =>
>                      Seq p -> Scalar (Diff (Diff p)) -> Diff p
> bezierDerivative1 polygon t =
>     let n = length polygon - 1
>         polygon' = fmap (uncurry (.-.)) $ pairs polygon
>         bezier1 = bezierSeq polygon' t in
>     sumV (take n (repeat bezier1))
>
> pairs :: Seq a -> Seq (a, a)
> pairs = fst . foldl toPair (empty, Nothing)
>     where toPair (ps, Nothing) p0 = (ps, Just p0)
>           toPair (ps, Just p0) p1 = ((p0, p1) <| ps, Just p1)

Now that's some serious type level fun. Here the AdditiveGroup module doesn't provide functions for multiplying by integers, so I had to roll my own. The pairs function just pairs up the elements of the sequence into a sequence of pairs. Let's try this with an example:

> examplePolygon :: Seq (Double, Double)
> examplePolygon = fromList [(1, 1), (-3, -5), (-3, 5), (1, -1)]

So we have this curve for the Bézier curve over examplePolygon:

and its first derivative:

Now how do we generalize this to the rth derivative of a Bézier curve, where r is a natural number? As the derivative is also a Bézier curve, we can apply the same reasoning to it. The rth derivative of \gamma is the Bézier curve over the polygon defined by the differences of the control points of the r-1th derivative, up to an homothety of ratio \prod_{i \in \{r+1, ..., n\}}\; i.

So we first define a function giving us the list containing the polygon for each derivative:

> polygon' :: (AdditiveGroup p) => Seq p -> [Seq p]
> polygon' = iterate (fmap (uncurry (^-^)) . pairs)

And we use it to define our rth derivative easily:

> bezierDerivative :: (AdditiveGroup p, AffineSpace p, VectorSpace (Diff p)) 
>                  => Int -> Seq p -> Scalar (Diff p) -> p
> bezierDerivative r polygon t =
>     let n = length polygon - 1
>         polygon_r = polygon' polygon !! r
>         bezier_r = bezierSeq polygon_r t in
>     sumV (take (product [r+1..n]) (repeat bezier_r))

So we have, for r \in \{0, ..., 2\} over examplePolygon:

I think there are many improvement to be done here, most notably the final homothety is quite ugly and inefficient. I'll try to find a better solution.

Next time, we're going to look at B-spline curves.

16Apr/110

Proof that the factorial of a sum is divisible by the product of the factorials

I am currently teaching myself group theory using the wonderful free course notes by J.S. Milne. It doesn't have any prerequisites beyond some linear algebra and a touch of set theory, yet it is very comprehensive and sometimes challenging, in a good way.

After going through the first chapter, I stumbled upon the following exercise, which on first glance has nothing to do with group theory:

Let n = n_1 + ... + n_r be a partition of the positive integer n. Use Lagrange’s theorem
to show that n! is divisible by \prod_{i=1}^r n_i!.

Or less formally, show that the factorial of a sum is divisible by the product of the factorials. Thankfully the following hint is given, to show how this proposition can be proven in the language of group theory:

The symmetric group S_n contains a subgroup that is a direct product of subgroups S_{n_1}, ..., S_{n_r}.



S_k has order k!, so proving this would indeed prove our proposition. First thing to notice, the S_{n_i}s are clearly not subgroups of S_n as such, as the permutation do not have the right domain and codomain. So we first have to identify them to subgroups of S_n through an injective group homomorphism. Also, we want our images of the S_{n_i}s through this homomorphism to form a direct product, so we will keep this in mind while designing our homomorphism.

The winning strategy here is to notice that \{1, ..., n\} is a disjoint union of \{1, ..., n_1\}, \{n_1 + 1, ..., n_1 + n_2\}, up to \{(\sum_{i=1}^{r - 1} n_i) + 1, ..., \sum_{i=1}^r n_i \}. So we can map our permutations over disjoint subsets of \{1, ..., n\}. We formalize this with with the following map for all i \in \{1, ..., r\}:

\begin{array}{rcl} \alpha_i : S_{n_i} & \rightarrow & S_n \\ s_i & \mapsto & \begin{array}{rcl} \{1, ..., n\} & \rightarrow & \{1, ..., n\}\\ j & \mapsto & \begin{cases} s_i(j) & if \; j \in \{d_i + 1, ..., d_i + n_i\} \\ j & otherwise \end{cases} \end{array} \end{array}


where d_i = \sum_{k=1}^{i - 1} n_k. Let j \in \{1, ..., n\} and let (s_i, s'_i) \in (S_{n_i})^2. If j \in \{d_i + 1, ..., d_i + n_i\}, then we have:

\alpha_i(s_is'_i)(j) = (s_is'_i)(j)=\alpha_i(s_i)\alpha_i(s'_i)(j)


Otherwise, we have:

\alpha_i(s_is'_i)(j) = j = \alpha_i(s_i)\alpha_i(s'_i)(j)


Therefore \alpha_i is a group homomorphism, and Im(\alpha_i) is a subgroup of S_n. \alpha_i is injective, because if the image of two permutations are equal for all j \in \{1, ..., n\}, then they are equal for all j on the subset they operate on, and therefore by definition of \alpha_i the two permutations are equal.

For all i in \{1, ..., r\}, let S'_{n_i} = Im(S_{n_i}), and let G = S'_{n_1}...S'_{n_r}. We must now prove that G with the group structure on S_n is a direct product of the S'_{n_i}s.

Let i \in \{1, ..., r\}, then every permutation in S'_{n_i} operates only on a subset of \{1, ..., n\} over which no other permutation in S'_{n_1}...S'_{n_{i - 1}}S'_{n_{i +1}}...S'_{n_r} operates other than the identity on \{1, ..., n\}. Therefore their intersection is the trivial group.

Let s_i \in S'_{n_i} and g \in G. If we look at gs_ig^{-1} over \{d_i + 1, ..., d_i + n_i\}, then it's a composition of permutations and therefore a permutation over \{d_i + 1, ..., d_i + n_i\}. If we look at gs_ig^{-1} over the rest of its domain, then it's the identity. Therefore gs_ig^{-1} \in S'_{n_i} by definition of S'_{n_i}, so S'_{n_i} is normal in G.

G is therefore the direct product we were looking for. It is a subgroup of S_n and has order \prod_{i = 1}^{r} n_i!, and therefore Lagrange's theorem states that it must divide n!.

I must say I am amazed by the how powerful group theory is. Groups are at first glance very simple mathematical objects, and I certainly didn't expect such a rich theory to come out of them. And this is only the first chapter of the textbook I'm studying, so I'm very excited about what's following. Maybe I'll post some other interesting proofs.

Filed under: maths No Comments
15Apr/110

Bézier curves in Haskell part 1

This article was intended to be a literate Haskell program, but WordPress decided otherwise. Oh well. You can still download the compressed source file here.

I'm currently studying Bézier curves, so I figured I would implement what I know.

I know of two algorithms for constructing Bézier curves: De Casteljau's algorithm and using Bernstein polynomials. The algorithm using Bernstein polynomials is more efficient, but propagates more errors during floating point operations. De Casteljau's algorithm also provides wonderful geometric insight into the nature of Bézier curves, so this is the one I'm going to use.

You can construct a Bézier curve over any affine space, and the AffineSpace typeclass allows us to work in complete generality, so this is what we're going to do.

bezier, bezier' :: (AffineSpace p, VectorSpace (Diff p)) => 
                   [p] -> Scalar (Diff p) -> p

So far it's only a bit of type gymnastic in order to work with affine spaces.

De Casteljau's algorithm goes as follows: let p = (p_0, ..., p_n) a polygon in an affine space, and let \gamma be the bézier curve over p. To compute \gamma(t) where t \in [0 ; 1], we first compute the points (1 - t)p_i + tp_{i+1} for i \in \{0, ..., n - 1\}. We then iterate this process over the points we get, until we get only one point which is \gamma(t).

This algorithm is very imperative in nature, so we will derive a recursive algorithm from it. If we look at the formula for \gamma the algorithm gives us, we notice that the final affine linear interpolation occurs over two other Bézier curves evaluated at t. And we also notice that a Bézier curve over a single point is this same point.

So our first version simply does recursion over the polygon, exploiting the fact that a Bézier curve over (p_0, ..., p_n) can be expressed in function of the Bézier curve over (p_1, ..., p_n) and the Bézier curve over (p_0, ..., p_{n-1}):

bezier' [p] _ = p
bezier' polygon t = 
    let poly0 = reverse . tail $ reverse polygon
        poly1 = tail polygon in
    alerp (bezier' poly0 t) (bezier' poly1 t) t

Here alerp p p' t varies from p to p' as t varies from 0 to 1. We test it with an example polygon:

examplePolygon :: [(Double, Double)]
examplePolygon = [(0, 0), (1, 1), (2, 0)]
 
plotBezier f = do 
    let i = [0,0.001 .. 1]
        points = map f i
        (xs, ys) = unzip points
    plotWindow xs ys

And then running the following in ghci:

plotBezier (bezier' examplePolygon)

gives us the following result:

Now, bezier' is not without fault, as we must remove the last element of the list, which is highly inefficient for large lists (hence the ugly double reverse, which is in O(n) anyway). In practice, Bézier curves are often used with small polygons and then attached to other low degree bezier curves to form larger curves, so this is not a big deal for this use case. But we cannot let such inefficiency pass by innocently.

Basically, we need a data structure which allows efficient removal at the beginning or the end of the sequence. This is precisely what the Seq data structures offers:

bezier polygon t =
    bezierSeq (fromList polygon) t
 
bezierSeq :: (AffineSpace p, VectorSpace (Diff p)) => 
             Seq p -> Scalar (Diff p) -> p
bezierSeq polygon t =
    let poly0 :> _ = viewr polygon
        p :< poly1 = viewl polygon in
    if null poly1
    then p
    else alerp (bezierSeq poly0 t) (bezierSeq poly1 t) t

The Seq datatypes guarantees constant-time removal of the first or last element. Otherwise the algorithm is precisely the same as in bezier'.

Next time, we will look at derivatives for Bezier curves.

10Apr/100

A* search for a mobile robot

As a student in computer science, I was involved until very recently into a software development project. With a team of 4 classmates, we were in charge of implementing the A* path finding algorithm for a mobile robot. This robot is scheduled to compete in the "Coupe de France de robotique", a robotics competition, and is being built jointly by students in electronics, mechanics and computer science.

Our goal for this project was clearly defined. We had to provide a simple C API for searching an optimal path between two points on a map, taking into account obstacles in the way using the A* search algorithm. As the robot has to run without any external help, our program had to be embedded on a 8-bit microcontroller clocked at 16 mHz with only 64 kB of RAM and 128 kB of ROM, no dynamic allocation and no floating-point operations. It was quite a challenge to work with those constraints, considering the overall complexity of the algorithm.

Here is a simple explanation of the algorithm in python-like pseudo-code:

# returns the path between the nodes start and goal in the graph
def astar_search(start, goal):
    # the open list contains the nodes to be explored
    open_list = [start]
    # the closed list contains the nodes already explored
    closed_list = []
    # the algorithm stops when there are no more nodes to explore
    # (which means there is no possible path between start and goal)
    # or when we are exploring the goal node
    while open_list:
        # the best node in the open_list is the one with the smallest F value
        # (see the F function for details)
        # the current node is the node we explore at this iteration
        current = pop_best(open_list)
        # if the best node is the goal node, then we found our path
        # and reconstruct it by following parent nodes
        if current is goal:
            return reconstruct_path(current)
        # otherwise, we explore its neighbors 
        # (its adjacent nodes in the graph)
        for neighbor in neighbors(current):
            # if the neighbor is in the closed list, it was already 
            # explored so there is no need to look at it
            if not neighbor in closed_list:
                # if the neighbor is already in the open list, it
                # means we found a new path to access this
                # neighbor. If the new path is better, we need
                # to update the G value of this node and his parent
                # node so we only keep the best path
                if neighbor in open_list:
                    # the new G value is the G value of the current
                    # node plus the cost to go from the current node
                    # to the neighbor
                    if new_G(current, neighbor) < G(neighbor):
                        set_G(neighbor, new_G(current, neighbor))
                        set_parent(neighbor, current)
                # otherwise, we initialize its G and H value then
                # add it to the open list for further exploration
                else:
                    set_G(neighbor, new_G(current, neighbor))
                    set_H(neighbor, heuristic_distance(neighbor, goal))
                    set_parent(neighbor, current)
                    insert(open_list, neighbor)
        # now that we explored the current node,
        # we add it to the closed list so we don't
        # explore it a second time
        insert(closed_list, current)
    # if we get at this point, it means there is no path
    # between start and goal, so the algorithm fails
    return []
 
# F returns the sum of the distance from the start to
# the node (G) and the estimated remaining distance
# between the node and the goal node (H)
def F(node, goal):
    return G(node) + heuristic_distance(node, goal)

Now if you don't get everything in the algorithm, just keep in mind it explores the nodes the same way Dijkstra's algorithm does, except it favors nodes which are closer to the goal during its exploration. It's usually faster than Dijkstra's algorithm, but it consumes more memory to keep the G and H values for each node and won't work well on labyrinth-like graphs, where the heuristic distance estimate will generally lead the search in the wrong direction. It fits to the particular case of our robot, since memory is relatively abundant compared to clock speed. Also, the competition rules state that every match opposing two robots lasts 90 seconds, so computing time is an important factor. Performance was our #1 goal.

In order to get the fastest possible algorithm, it was very important to choose the right data structure for the open list, as it is where most of the execution time is consumed. A quick look at the algorithm tells us that we need 2 operations on it, insert and pop_best. So this is basically a minimum priority queue, where the node are ordered by F value. Additionally, as most efficient data structures need to keep nodes in a certain order, we need to be able to re-balance the open list when the G value of a node is updated.

Most papers on A* told us that a binary heap is the best way to implement the open list, as it insert nodes and delete the best node in O(log(n)) complexity. Additionally, the heap can be efficiently put into a simple array, which is nice considering we don't have dynamic memory allocation. But to be sure it was a good solution, we put it to the test alongside a sorted array and a simply linked list, which both insert in O(n) complexity and pop the best node in O(1) complexity. Surprisingly, the sorted array performed significantly better on our workstations than the binary heap, with the linked list seriously behind. The most likely explanation is that the sorted array benefits more from cache, but as there is no cache on the microcontroller we stuck with the binary heap. Now what this tells us, is that a simple sorted array will perform better than the more complex binary heap in your everyday computer, at least for graphs of moderate size (our test was performed on thousands of randomly generated 64 * 64 grids).

Often, you will read papers suggesting to use the same data structures for both the open and closed list. This is a really bad idea, as aside from insertion, the lists just don't share the same operations. You just need to add nodes to the closed list and check whether a node is in the closed list or not. The best and simplest way to implement it, is simply to keep a bit for each node which is at 1 if the node is in the closed list and at 0 otherwise (it's also a good idea to do this for the open list, as it significantly speeds up lookup too). At first we tried to use the same data structure for both list, and we were surprised to see that 99% of the execution time was in looking up nodes in the closed list. Indeed, the closed list quickly grows huge, and as you need to identify the node by its coordinates (not its F value) the lookup is in O(n) complexity and performed for every single neighbor.

Another important parameter for performance is the choice of the heuristic distance estimate. The first thing you usually think of is the euclidean distance:

euclideanDistance(dx, dy) = \sqrt{dx^2 + dy^2}

This is the straight-line distance between two points on a plane. Now aside from the expensive square root operation (which can be avoided by using the square of this distance AND the square of the G value in the F function), when your algorithm only looks at the 8 (or 4) adjacent neighbors of the current node it's simply not worth it, as it will usually be much lower than the distance the algorithm will find by exploring neighbor by neighbor (as it cannot move in straight line on a grid-based graph). Our tests showed very clearly that it explored way too many nodes, in average 1500~ nodes to find a path between (0,0) and (50,50) on randomly generated grids with 10% of obstacle cells. And the end result isn't even worth it, it finds path identical to the Manhattan distance with 8 directions:

manhattan8(dx, dy) = min(dx, dy) + \sqrt{2} | dx - dy |

This is the distance when you can only move in 8 directions on the grid, which is the case of our algorithm. This is basically the perfect heuristic. Under the same tests as the euclidean distance, it found paths just as good, while only exploring 600~ nodes on average. It's also much simpler to compute. Another possiblity is the Manhattan distance with 4 directions:

manhattan4(dx, dy) = dx + dy

It's basically an even tradeoff compared to manhattan8. It explores slightly less nodes on average, and finds slightly worse paths. Overall, manhattan8 was the obvious choice, but if you look at a different number of neighbors (24 or 4 for instance) other heuristics may be worth trying.

Another issue we ran into, was that the robot cannot turn while moving. It has to stop, turn, then move into a straight line. Our grid representation of the map was therefore quite inefficient, as our path was a list of 5 cm * 5 cm contiguous squares. The best way to resolve this problem was to change the representation of the map. Not using a grid at all, but a polygonal map or navigation meshes, and if you are thinking about implementing the A* algorithm I highly suggest you look into this first.

This was of course much more complex to implement, and we simply didn't have time to put it into application, so we stuck with a grid map, and built a post-processing smoothing algorithm. It's largely inspired by the "line of sight" algorithm used in video games to check whether it is possible to go from one point to another in a straight line without meeting an obstacle. Basically, it starts at the initial node, then tries to draw a "ray" between this node and the next node. If there was no obstacle, it does the same with the next node in the path, up until there is an obstacle and we start over with this node as the start node. The result is, we only keep the key points in our path, and it usually greatly simplifies the path to a few straight lines.

In the end, we somewhat miraculously managed to finish the job on time, and it actually worked on the robot after only 3 hours of work on it. We couldn't use the actual robot until the last day before the deadline, so it was extremely uncertain whether our program would run at all on it, or even perform well enough. In the end, it took between 2 and 3 seconds for the robot to compute the path with our algorithm, and he moved from the start point to the goal flawlessly, avoiding the specified obstacles. Overall it was a big success.

Filed under: algorithms No Comments
13Feb/100

Prüfer sequence and Haskell

A Prüfer sequence is a very compact data structure for a labeled tree. For a tree with n nodes, it only takes a list of n - 2 integers to describe it fully.

Trees in maths are usually represented as T = (X, A), where X is a set of nodes, and A is a set of couples (n1, n2) representing an edge between the nodes n1 and n2. There is a simple, iterative algorithm to convert a tree from the T = (X, A) form to the corresponding Prüfer sequence, which is what we are going to implement in Haskell.

X = {1,2,3,4,5,6}
A = {(1,5),(2,5),(3,5),(5,4),(4,6)}
Prüfer sequence: (5,5,5,4)

First, we define the data type for trees:

data Tree a = Tree [a] [(a, a)]
              deriving Show

The first line defines a new data type called Tree. This is kind of similar to a structure definition in C. [a] stands for a list containing elements of type a, and (a, a) stands for a pair of types a, therefore [(a, a)] stands for a list of pairs of types a. Here, 'a' stands for 'any type', so Tree is basically what java-heads would call a generic class, where 'a' is '<T>'.

As you may have guessed, [a] stands for the X set, and [(a, a)] for the A set. Don't pay too much attention to the second line, it simply allows us to display the contents of a Tree easily.

Let's take a look at the algorithm to convert a tree into a Prüfer sequence:

function prufer (tree T):
S = the empty set
C = the empty list
while T has more than 2 nodes left:
j = the leaf with the smallest number
k = the only node adjacent to j
add j to S
add k to S
remove j from T
end while
return C

This algorithm looks quite haskell-unfriendly at first glance. It has sequential operations, traditional iteration and mutable state, all of which don't exist in haskell. Thankfully, it's quite easy to convert such a simple iterative algorithm into a recursive algorithm. But before diving straight into it, it is a good idea to define some functions for manipulating trees:

size :: Tree a -> Int
size (Tree a _) = length a

Size returns the number of nodes in the tree. The first line may look quite scary at first, but it's only a type signature for the function. It means that the function size takes a Tree of any type, and returns an integer. It's very similar to the f : \mathbb{R} \longmapsto \mathbb{R^+} notation in maths.

The second line is fairly straightforward. It simply returns the length of the A list of the tree.

contains :: (Eq a) => a -> (a, a) -> Bool
contains x (y, z) =
    x == y || x == z

This function has a slightly more complicated type signature than the previous one. '(Eq a) =>' is a type constraint, so the function only accept types that have the '==' operator defined for them.

'a -> (a, a) -> Bool' means that the function takes one parameter of a type 'a', a pair of types 'a', and returns a boolean value. So contains takes a value and a pair, and returns true if the pair contains the value, false otherwise. The body of the function itself is self-explanatory if you have had any experience with a C-like language.

adjacent :: (Eq a) => Tree a ->a -> a
adjacent (Tree a x) n =
    let pair = head (filter (contains n) x) in
    if fst pair == n
    then snd pair
    else fst pair

The adjacent function is a bit more complicated. It takes a tree and a node n, and returns a node adjacent to n.
The expression 'let x = expr1 in (expr2 that uses x)' simply replaces x in expr2 by expr1, kind of like a macro in C/C++. It's called let-binding, and it's very useful for making the code clearer and more concise.

The filter function is very straightforward to use: filter p l returns all the elements of the list l that verify the predicate p. Here, it returns the list of pairs from the X set containing n, which in natural language means the full list of the edges adjacent to n.
Head then simply takes the first element of the list, as we only need one node.

The following if p then x else y sequence should be quite clear, you just need to know that fst and snd return the first and second value from a pair respectively.

degree :: (Eq a) => Tree a -> a -> Int
degree (Tree _ x) n =
    foldr (+) 0 (map (\y -> if contains n y then 1 else 0) x)

The degree function returns the number of edges adjacent to a node in the graph. It uses 2 other list processing functions, somewhat similar to filter.

The map function take a function f and a list l, applies f to every element of the list, then returns the new list. Here we use a lambda expression to define the function to apply. A lambda expression is a quick way to define anonymous functions. It's actually quite simple: '\y -> x' is the equivalent to the mathematical notation f(y) = x.

Here, the map function turns the list of edge into a list of Integers, 1 if the edge is adjacent to the node n, 0 otherwise. Afterwards, we use the foldr function: it basically replaces the commas in the list by the operator given, so 'foldr (+) 0 [1,2,3,4]' processes 0+1+2+3+4 = 10. Now it should be quite clear how this function works :).

leaves :: (Eq a) => Tree a -> [a]
leaves (Tree a x) =
    filter (\y -> degree (Tree a x) y == 1) a

The leaves function returns the list of the leaves in the tree. If you're unfamiliar with graph theory, a leaf is a node with only one other adjacent node, which also means its degree is 1. Here we use the filter function to return only the elements which have a degree of 1. Nothing new here.

rm :: (Eq a) => Tree a -> a -> Tree a
rm (Tree a x) n =
    Tree (filter (/= n) a) (filter (not . contains n) x)

The rm function removes a node n from the given tree, then returns the new tree.
So we return a Tree with his A set filtered so n is not there anymore, and it's X set filtered so there are no edges containing n anymore. The trick to the last part is to use the dot '.' operator, which is the function composition operator in haskell, exactly like f \circ g in maths.
This particular function should be extremely intuitive to understand :).

Now that we wrote all the building blocks, we can finally write the function to convert from a regular tree to a Prüfer sequence:

prufer :: Tree Int -> [Int]
prufer tree =
    prufer' tree []
    where prufer' t s | size t == 2     = []
                      | otherwise       =
                          let j         = minimum (leaves t)
                              k         = adjacent t j
                              new_t     = rm t j
                              new_s     = s ++ [j] in
                          [k] ++ prufer' new_t new_s

The type signature means our function takes a tree containing integers, and return a list of integers, which corresponds to the C list. Note that we didn't specify integers in our previous functions because it wasn't necessary, here we need a tree of n nodes with its nodes numbered from 1 to n.

We start the function by calling a helper function, prufer', which takes a tree of integers t and the s set. The s set is empty at the beginning, so we give it the empty list '[]'.

We then use a where clause to define the prufer' function. It works exactly like 'let ... in', except after the expression. The prufer' function is a recursive function, so we define a base case where the function will stop recursing, and a general case where the function will keep on recursing. For that we use a '|' guard to define a condition, exactly like an 'if ... then ... else' block would.

In the base case, when the size of t is lower or equal than 2, we stop the recursion by returning the empty list. In the general case, which is every other case, we start by defining the values we are going to use thanks to let-binding:

  • j, the leaf of t with the smallest value
  • k, the only node adjacent to j
  • new_t, the current tree t but without j
  • new_s, the current s list with j added (++ is the operator for concatenating lists)

As you can see, our previously defined functions do all the job for us, no need for things to be overcomplicated :).
Then, we simply add k to the C list, and call the function again on the new tree and the new S set.

And it's done! Open the file with your favorite haskell interpreter and see for yourself:

*Main> test
Tree [1,2,3,4,5,6] [(1,5),(3,5),(2,5),(5,4),(4,6)]
*Main> prufer test
[5,5,5,4]
*Main>

Here is the source code if you want to look at it more closely, I even included the function to turn a Prüfer sequence back into a regular tree, but I won't detail its inner workings here, 1500~ words are more than enough for such a boring topic :).

6Feb/100

A mathematical introduction to Haskell

Haskell is a functional programming language. This means that your program is not a sequence of actions which are executed one after another, but rather an expression which is evaluated. There is a wide variety of functional programming languages, such as good old Lisp and its dialects, OCaml, and even Python to some extent.

But Haskell is quite different from those, as it is what we call a pure functional programming language. An interesting consequence of this, is that functions in haskell are pretty much the same as their maths counterpart. No side effects, always return the same value for the same parameters. This makes haskell a very efficient programming language for maths.

Let's take a simple example: the factorial function. In maths, you could clumsily define it like this:

factorial(x) = 1 \cdot 2 \cdot ... \cdot (x - 1) \cdot x

From this basic definition, you can deduce a more formal one:

For x<1, factorial(x) is undefined.

For x=1, factorial(x) = 1

For x>1, factorial(x) = x factorial(x - 1)

The haskell factorial function can be defined in the same way:

factorial 1 = 1
factorial x = x * factorial (x - 1)

The first case, x < 1 is undefined. As such, we don't define it.
The second case, x = 1, makes it return 1. We don't need x to define it, so x doesn't appear.
The third case, x > 1, makes it return x * factorial(x - 1), as per the mathematical definition.

Straightforward, isn't it :) ?

Now for something slightly more complicated: the binomial coefficient. It is most commonly noted \binom{n}{k} in English-speaking countries, but out of personal preference and programming convenience I will use C^k_n instead.

So, let's start by a quick mathematical definition. First, there are a few particular cases:

C^n_n = C^0_n = 1

For  k > n, C^k_n = 0

Then, the general case has multiple definitions. The most well known, and also the most useful for small n values is:

C^k_n = \frac{n!}{k!(n - k)!}

A recursive definition is:

C^k_n = \frac{n}{k} C^{k - 1}_{n - 1}

It's usually a bit of a headache to program the binomial coefficient function.
Using the first definition, you have to use factorials, which will reach the upper limit for integers quite quickly, as 70! is greater than the estimated number of atoms in the universe.

The second one uses recursion, which is not handled very well by most programming languages. It may also be necessary to use floating point numbers, which means an inaccurate result.

But it's a piece of cake in haskell. Recursion is the only way to iterate over something in this language, so the interpreters and compilers optimize it very well. Here is one way to do it, using the second definition:

c n 0 = 1
c 0 k = 0
c n k = n * c (n - 1) (k - 1) `div` k

This one requires a bit more explanation:
The cases n = k and k > n are not explicitly specified, but will eventually reach k = 0 and n = 0 respectively thanks to recursion. It would be, of course, possible to write it down directly and get a slight increase in performance, but I like to keep things as short and concise as possible :).

div is the operator for euclidean division in haskell. It is a regular, prefix haskell function. This means the third line could just as well be:

c n k = div (n * c (n - 1) (k - 1)) k

The back quotes in haskell, when placed around a prefix function, turn it into an infix
function. The opposite is to put the function between parentheses, for instance:

a + b

is equivalent to:

(+) a b

This is mostly a cosmetic choice though. It helps readability in this particular case.

Note that I didn't write:

c n k = (n `div` k) * c (n - 1) (k - 1)

Prematurely dividing will lead to a dramatic loss in accuracy, as \frac{n}{k} probably won't be an integer. In this case the euclidean division will only return the floor of the real quotient.

Dividing by k after the product has been evaluated is best. It guarantees the result is exact, as the binomial coefficient is always an integer.

Haskell can be used for any purpose, not just maths. The X window manager xmonad, the version control system Darcs, and the first-person shooter Frag for instance where all written in haskell.

Getting around the lack of traditional loops, sequencing, and mutable state is quite a challenge for any beginner to intermediate programmer. Some of the essential tools for this, such as monads, are so abstract that simply understanding what they are about is difficult. To be honest, I still don't understand how it works myself :).

Regardless, haskell is one of the coolest programming languages out there, and any self-respecting programmer should give it a try.

31Jan/100

Conway’s game of life

The game of life is a cellular automaton. It's basically a grid where each cell can have either of 2 states: alive or dead. You give the game an initial state, and it evolves according to 3 simple rules:

  1. Any live cell with fewer than two live neighbors dies
  2. Any live cell with more than three live neighbors dies
  3. Any dead cell with exactly three live neighbors resuscitates

Simple initial states can lead to surprisingly complex results. For instance, the r-pentomino (also known as f-pentomino) only has 5 live cells at the beginning, but it takes 1103 generations for it to stabilize, and ends up with a population of 116.

r-pentomino

r-pentomino

It generates 6 gliders along the way. Gliders are what we call spaceships, patterns that indefinitely move in a given direction. The glider is also the smallest spaceship in the game, and the most well known pattern, as it as been proposed as the hacker emblem (in the original meaning of computer hacker).

A glider

glider

There are of course many other cellular automatons, which use different rule sets for the death and birth of cells. Day & Night is one of them, and it looks really cool!

Day & Night

I actually wrote 2 cellular automaton implementations, one in Python and the other in C#. I wrote the python one in one evening and 70 lines of code, and the C# one took me a few days and more than 130 lines of code. This says a lot about the expressiveness of both languages, but I'm digressing :) .

You can download the sources here and here. Just keep in mind the Python version won't run in windows, because it uses the ncurses library.

Filed under: Uncategorized No Comments