Tag: Math

  • The Bhuddabrot

    Authors Note:

    This blog post got out of hand. There is a lot of detail. Feel free to skim and if a heading catches your eye, read deeper. There are pictures along the way.

    Introduction

    This is a follow-up post to my post about Playing With The Mandelbrot Set.

    To set the stage, the Mandelbrot set is the set of all points on the complex plane where the function below continues to infinity without the result exceeding 2. It is the set of all complex numbers n applied to this function, where both the real and imaginary parts stay constrained.

    f(n+1) = n^2 + c

    The Bhuddabrot is the inverse. It is the set of all numbers where the function outlined above does escape the bounds of 2. The image is a representation of the complex plane, where each iteration of every point is plotted in sequence. In the Mandelbrot calculation each pixel represents the number of iterations for that position, where here each pixel is a hit-counter describing the number of times a trajectory calculation has hit that point regardless of its starting position. The brighter the pixel, the more times it was “hit” by a Mandelbrot trajectory.

    The important distinction is that the Mandelbrot shows the contours of the shape where the Bhuddabrot shows all the data points generated by doing the Mandelbrot calculation. They are simply different ways of displaying the same data.

    Computation Challenges

    I was struggling with having some noise in the image. The problem is less apparent at lower resolutions, but at high resolutions the noise is a problem. It was clear I needed to iterate on the code to resolve these issues, which I initially suspected of being floating point errors.

    The problem is that a full-resolution render of the image takes my laptop a smidge under 4-days, and that is before cranking the quality slider. Making a code change then waiting for literal days to see what (if any) impact your change made is not going to let this project conclude anytime soon. Sounds like a side-quest!

    The good news is that each calculation point is independent of each other. In theory it should be possible to break the work up to one “work unit” per “input-pixel” and then map the output heat-map onto an image.

    Swoole

    My first stop on my optimization journey was Swoole. Swoole is a tool that allows you to write concurrent code. Using my previous render (let’s say 4 days) as a benchmark, and assuming I was able to divide the work evenly among CPU cores, a c8g.12xlarge AWS EC2 instance with 48 cores would require 2 hours. This napkin maths out to about $4 per run. If I forget to turn the machine off the $2/hr charge is going to get spendy quick.

    Costs aside, maybe I could go slower on smaller instances, what would this look like?
    Well we would need a “work dispenser” that would yield work units from a generator. We would also need each worker to send it’s results back via some sort of channel. The last step would be to create an image out of a reduction of all the results.

    I had a quick chat with Greg in the PHP Freelancer’s Guild Discord and we came to the conclusion that because of the necessity of the shared data, Swoole may not be the best fit for the project. With Swoole we must scale vertically and the type of work doesn’t fit well.

    If you have an idea for how to solve this problem with Swoole, leave a comment!

    Queues And Workers

    What if, instead of having a shared work dispenser, we broke the script up into a few distinct steps.

    1. Pre-calculate the work to be done and insert each work unit into a queue.
    2. Workers pick up the work from the queue and write the resulting trajectory coordinates into a shared database.
    3. Finally, the database is queried for all coordinates in the image and some post-processing is done.

    Let’s do some napkin math on what each simulation would cost.

    • 6x c8g.2xlarge instances for two hours = $3.828 per run
    • 1x db.t3.small MariaDB instance = $36.32 per month

    Ok, so the numbers don’t look great. One way or another the compute is going to cost about $4/run, but here we also have the added overhead of the database.

    The other drawback is there will necessarily be a network round-trip request for each work unit. Millions of work units is a lot of round-trip requests.

    Trade Offs

    With the Swoole method I’m limited by the number of cores on any given physical machine. With the Queue route I’m limited by the network overhead to distribute the work. Both methods will cost about the same (minus the database).

    The advantage of the queue worker method is that I can scale elastically. In theory I could set up an auto-scaling-group to hold a baseline (or even zero) number of workers, and set it to scale up to a bunch of workers to consume the workload.

    In theory, at least, if I went the queue route I could tune how quickly an image is generated by tuning the maximum number of instances it is allowed to spin up. Each image would still have the same overall price tag, but I would have a control for how quickly I can iterate.

    The Code

    OK, let’s talk about the code.

    We are calculating a $trajectory for a $mandelbrot calculation which moves through a series of complex $points. If we exceed our maximum number of iterations let’s return false as this point is inside the Mandelbrot set. Otherwise, keep appending the $points within the bounds of the set to our $trajectory until we find a point outside the set.

    It may make more sense to just look at the code:

    public static function trajectory(Mandelbrot $mandelbrot): false|array
    {
        $trajectory = [];
    
        foreach ($mandelbrot->sequence() as $nth => $point) {
            if ($nth >= $mandelbrot->maxIterations) {
                return false;
            }
    
            if ($point->getReal()->abs()->isGreaterThanOrEqualTo(2) || $point->getImaginary()->abs()->isGreaterThanOrEqualTo(2)) {
                break;
            }
    
            $trajectory[] = $point;
        }
    
        return $trajectory;
    }

    Noisy Image

    The image generated so far has a bit of noise in it. It’s not as pretty as the images you can find on Wikipedia, for example. This simply will not do, as this is supposed to be print-quality and “noise” is not print quality.

    I suspect the noise is coming from floating point errors. So I spent some time implementing my library I link to below, rewriting everything to use my implementation instead. The gist of the change is that instead of working with ~14 decimal digits, I’m working with 64 and rounding the remainder off. I’m sure nothing bad will happen that will be talked about later in this blog post.

    The initial results were promising, here are side-by-side 250x250px renders of using the built-in math and the version using fancy math.

    Bhuddabrot image with included noise
    Bhuddabrot image with much of the noise removed

    The Impossible Task

    Here’s the line of thought. I need to do arbitrary complexity calculations to get the really precise outline of a shape with an infinite curve.
    The problem: You cannot have both.
    The reason is simple. Calculating precisely is necessary to get a clear image. This blows up your memory usage because suddenly you are n out of 5000+ iterations deep and you are out of RAM because you are working with numbers of >=n*2 number of decimal places. A technically solvable problem but not in my lifetime.
    I wonder if this is how Physicists feel about trying to capture an electron, you can either get the speed or velocity but not both. Here you can have to tell a lie to get an image or accept the limitations of floating point number capabilities. That, or use all the compute power in the universe to calculate the shape of an infinite curve.

    I’m not opposed to telling a small lie and putting a disclaimer on the finished product. Or I could be honest and go to print with the errors included. But if I had to tell a small lie, let’s make it as small as possible within reason. I’m not afraid of a small compute bill. It’s entertainment dollars.

    The Actual Code

    The first thing I had to do was figure out how to do arbitrarily precise calculations on complex numbers. Turns out brick/math doesn’t support complex numbers. That’s OK, I can bring my own.

    The next stop was to remove all references to division. The division path will blow up your memory usage if not carefully contained. The answer is to use rationals. “Oh boy I’m smart” I thought.
    The immediate next step is to realize that some numbers are going to be irrational by the nature of what we are trying to draw. Suddenly not feeling so smart.
    Some massaging between rationals where appropriate and decimals where irrational got me over the hurdle. “It’s only a small lie” I tell myself.

    Elevating To The Cloud

    We are not just throwing compute muscle behind a problem, we are being smart about it.

    The Work Queue

    Database Queue Driver

    Right out of the box, having needed a database to store the intermediate results anyways, let’s try with the database queue driver for Laravel.

    The first thing I noticed is that the throughput of the job to queue the work is terrible. TERRIBLE. It’s worse than local development on a sqlite database.

    $ time php artisan app:queue-bhuddabrot-jobs 1000 1000
    real    255m40.108s

    Memory Store Queue Driver

    A relational database is a bit overkill for the problem we are trying to solve. We are creating a big heap of work, then pulling the work off the queue as fast as we can.
    While this is doable with a relational database, we quickly hit a ceiling regarding how quickly we can pull items off the top of the pile.

    Here’s the catch though, adding and subtracting work from the queue is not the slowest part of the operation. Running the actual calculations is the slowest part, so while there is room for improvement optimization efforts here are chipping away at a relatively small slice of the total time spend.

    That said, the final render has 243 million jobs to process. If each job is 1.5kb that adds up to 364.5GB just to hold the tasks.

    So I don’t blow the budget in cache storage space, I wrote some code to throttle the queue size. This means the process is tied to how fast the work can be done, but it’s not the slowest part of the work anyways.

    Infrastructure Benchmarks

    To get some numbers I launched two servers, a c8g.medium and a c8g.large – each was configured to run with 4 workers. That means that there are 8 workers total spread across 3 compute cores.

    The “uppening” was upon us. If this worked then all I’d have to do is create an auto scaling group of a proper size, worker number per core ratio, and maybe bump the database specs.

    A 1k render of the Bhuddabrot image

    It ended up taking 10 hours left to generate a 1,000px by 1,000px image seems alright. It’s a start.

    The next day I ran a similar test, but I tweaked the RGB number of iterations. I set up an auto-scaling group to aim for 65% utilization on a maximum of 3 c8g.large spot-instance nodes, plus the c8g.medium “work dispensing” server. All (up to) 16 workers writing to a t4g.small database back end. This test took around 7 hours from start to finish.

    Scaling this up won’t be a problem. The reality is I just need to tweak the auto-scaling group maximum node count to make it go faster. Maybe tune it to buy spot instances so I don’t pay full price.

    And Then Shit Hit The Fan

    I queued up the final render. Set the auto-scaling group to allow up to 20 spot instances. A week and $75 later the progress bar say’s I’m at 10%. The easiest 10%. An estimated $750+ compute bill for a pretty render is not in the budget at the moment.

    According to the AWS billing dashboard, over the course of about 8 days I had sucked down 1368 computer hours. That’s 8 weeks of computer time.

    So I called up my buddy Chris for help. Surely I’m using the wrong tool for the job, he’s smart, he’ll have an opinion.

    We talked about GPU computing, but PHP doesn’t run on GPUs. That, and doing arbitrary precision math on a different computing architecture sounds like a lot of learning and this project is this close to being done, I don’t want to start over from scratch.

    Back To The Drawing Board

    I stood back and beheld my creation. It had a work dispenser, a caching layer, work servers, and a database layer. What if, just like how lawn mower race drivers strip parts to go faster, we stripped parts off this monstrosity and see how fast we can make it go.

    Web-sockets Are The Answer

    I slapped together some code to act as a “library” of sorts such that a worker could “check out” a unit of work.

    Then I put together a basic websocket server that would distribute work from the library. Once all the work results are reduced down to the resulting image, it saves it.

    Finally, a client script that connects to the server, checks out work, and returns the necessary calculations.

    Further Optimizations

    There were several optimizations along the way. The biggest optimization one can make is to simply “not do the work” in the first place.

    I thought more about the work being done. Each pixel input coordinate must have its calculation run three times with different iteration count limits. This is how you get the red, green, and blue channels.
    What if, instead of doing the same work 3 times with different iteration counts, we do the work once and count the outputs? Then color the red, green, and blue channels depending on the iteration count.

    This optimization alone cut the work to render each image to a third. Being able to calculate all color channels from a single trajectory calculation was a big milestone.

    Noisy Image Part 2

    Remember how I joked about how “this rounding decision surely won’t haunt us later?” Well now is later and this is that section.

    This one took some dialogue to work out. The new system was blazingly fast but the image noise came back. I hunted for (and found) all sorts of bugs in the math libraries but render after render gave the same noise.

    I was running simulations at 500x500px. Occasionally I’d run a 1000x1000px calculation but the noise is present there too.
    On happenstance, I wanted an intermediate quality image so I ran 768×768 and the noise was gone. There might be something here.

    I consulted with ChatGPT and it helped guide me to the answer. It turns out that the overlaying grid is due to a small bias introduced with the combination of using a base-10 numbering system, and the necessity of rounding to the nearest base-10 decimal. This bias makes it subtly more likely that an escaping trajectory will land along a divisible-by-10 pixel space.

    The solution is so simple. The render cannot be of a size divisible by 10. Illustrated below is a 500px and a 502px wide image. Note that not only do the lines disappear, but it is a much sharper image.

    500x500px Bhuddabrot with overlaying grid.
    502x502px Bhuddabrot with no overlaying grid visible.

    The Final Render

    Through the magic of “publishing when it’s done,” you didn’t have to wait for the week or so it took for the final render to render.

    A server monitoring dashboard observing several client workers.

    Anyways, enough stalling, here’s the final print, framed and ready to hang on the wall:

    Me sitting in front of a framed poster.
  • Playing With The Mandelbrot Set

    Playing With The Mandelbrot Set

    Prologue

    I have been fascinated with fractals ever since I was little. My first experience with them was at the Renaissance Fair where a vendor was selling prints of his artist renditions of fractals.

    I have been thinking that I kind of want one of those prints now, but since I am extra I wanted to do it myself. After all a little math isn’t that scary.

    The Math

    Iteration

    The first thing to understand is iteration. Imagine you have a sequence of numbers, where each number depends on the number that came before it, that’s the idea.

    Lets imagine our function takes two inputs, n which describes the “nth” place and a constant c. Our function would return it’s previous state squared plus the constant.

    f(n+1) = n^2 + c

    For a c of 1 you would get: 0, 1, 2, 5, 26… a sequence that continues to infinity.
    For a c of -1 you would get: 0, −1, 0, −1, 0… a sequence that does not continue to infinity.

    Exit Conditions

    I won’t pretend to know the math behind it, but we know that if a sequence ever reaches the absolute value of 2 then it will continue to infinity forever.

    Lets use this criteria to define a new function. This new function will take a constant c and return the length of the sequence of numbers that stays below 2. Since this could repeat forever, lets define a maximum number of iterations.

    The Code

    There are many iterations (heh) of the code. But lets start with what we have covered so far:

    // Define a sequence given a starting constant
    function sequence($constant)
    {
        $next = $constant;
        while (true) {
            yield $next;
            $next = $next**2 + $constant;
        }
    }
    
    // Count the iterations in a given sequence
    function iterations()
    {
        foreach (sequence() as $nth => $value) {
            if ($nth > MAX_ITERATIONS) {
                break;
            }
            if ($value >= 2) {
                return $nth;
            }
        }
    
        return 0;
    }

    It’s Never That Easy

    We are not working with your regular numbers here. We are working with complex numbers. A complex number is a number that has both a real and an imaginary component. For example, 1+2i is a complex number.

    So let’s rewrite our function to take complex numbers instead. To accomplish this I am using the markbaker/complex-functions library.

    function sequenceWithComplexNumbers($real, $imaginary) {
        $next = $test = new Complex\Complex($real, $imaginary);
        while (true) {
            yield $next;
            $next = $next->pow(2)->add($test);
        }
    }
    
    function countIterationsOfSequence($sequence) {
        foreach ($sequence as $nth => $value) {
            if ($nth >= MAX_ITERATIONS) {
                break;
            }
            if ($value->getReal() >= 2 || $value->getImaginary() >= 2) {
                return $nth;
            }
        }
    
        return 0;
    }

    Let’s Use OOP

    We are starting to work with enough code that encapsulating it into a class is starting to make sense. Let’s create a class that takes an input real and imaginary number, as well as a maximum number of iterations for the sequence. This class should represent both the sequence of numbers and the iteration count of the sequence. This isn’t the whole code but describes the interface:

    interface Mandelbrot
    {
        public function __construct(
            private float $real,
            private float $imaginary,
            private int $maxIterations,
        ) {}
    
        public function iterations(): int;
    
        public function sequence(): Generator;
    }

    Generating The Image

    Since we are working with complex numbers we can fill in the complex number plane. We can say that the X axis is the real component, and the Y axis is the imaginary component. For each X and Y, we:

    1. Create a Mandelbrot class instance for the given coordinate.
    2. Generate the sequence given our positions complex c constant.
    3. Count the iterations in the sequence.
    4. Color the pixel in depending on the number of iterations.

    For point #4 there, that’s the only real challenging part. I am using 256 iterations, or counting up to 0xff in hex. We can convert the number to grayscale by converting the iteration count from decimal to hexadecimal, then concatenating itself three times. This gives us 0x000000 to 0xffffff to work with, which is the entire gray color spectrum.

    There may be a follow-up blog post about adding color, but for now lets stick with gray. Oh, and I am using the intervention/image library for generating images.

    $width = $input->getArgument('width');
    $height = $input->getArgument('height');
    $realStart = -2;
    $realEnd = 1;
    $imaginaryStart = -1;
    $imaginaryEnd = 1;
    $filename = $input->getArgument('filename');
    
    $canvas = ImageManagerStatic::canvas($width, $height, '#ffffff');
    
    foreach (range(0, $width) as $x) {
        foreach (range(0, $height) as $y) {
            $mandelbrot = new Mandelbrot(
                $realStart + ($x / $width) * ($realEnd - $realStart),
                $imaginaryStart + ($y / $height) * ($imaginaryEnd - $imaginaryStart),
                MAX_ITERATIONS,
            );
            $iterations = $mandelbrot->iterations();
            $color = $iterations > 0 ? dechex(MAX_ITERATIONS - $iterations) : "00";
            $colorCode = "#$color$color$color";
            $canvas->pixel($colorCode, $x, $y);
        }
    }
    
    $canvas->save($filename, 100, 'png');

    Wrapping Things Up

    The code presented here is just one iteration that got me to the end result. The actual code uses the symfony/console component to handle inputs and outputs for generating a customizable image. This isn’t my finest work yet, I may publish the full code someday.

    The featured image on this post is a 4k-resolution image of the set, generated by this code. It took an hour or two to generate. My real goal for the project was a print-quality poster version, which unfortunately is a little too large for distribution on a blog, but that took a few days to generate.