Optimising – Harder, Better, Faster, Stronger

Well hello there, reader. It’s time to talk about optimisation – the process of realising that what you did the first time was terrible and finding a new way that runs a hundred times better. Fortunately, in the particular example I’m about to go through, I’m not optimising my own code, but a program provided to me.

The idea behind this program is that it renders an image of various 3D shapes using a series of ray traces. For each and every pixel in the window, the program performs a ray trace that returns the colour of that particular pixel, one by one. Considering there are a few thousand pixels involved, this takes a while. By default, the original took 80.157 seconds to render the entire image on my machine. That’s my time to beat.

First and foremost, I improved the overall performance of the program not by changing the actual logic of the code, but how the machine runs it. Whilst the machine would typically run each iteration of the loop one after another, by implementing parallel processing it can now run multiple iterations at the same time. This alone cut the time to 27.130 seconds – 33.85% of the original time.

Beyond that, my primary method to improve the efficiency was to make educated guesses at what the colour of certain pixels was actually going to be. Effectively, the colour of 1 in 3 pixels is determined by the colour of the two pixels either side of it.

That is to say, if we have pixels A, B, and C in order: if A is red and C is blue, the program assumes that B will be purple. Unlike that example, in practice A and C are typically a similar shade of blue, grey, or green, so B becomes a similar shade that blends in well.

Essentially, the core loop looks like this:

	// Primary loop through all screen pixels.
	#pragma omp parallel for // Run the loop with parallel processing.
	for (int y = 0; y < windowHeight; y++)
	{
		#pragma omp parallel for
		// Checking 3 pixels at a time...
		for (int x = 0; x < windowWidth; x += 3)
		{
			// Retrieve the colours of pixels A and C. The maths below converts
			// pixel coordinates (0 to width) into camera coordinates (-1 to 1).
			kf::Colour outputA = g_scene.trace(float(x - windowWidth / 2) / (windowWidth / 2),
				-float(y - windowHeight / 2) / (windowHeight / 2) * (float(windowHeight) / windowWidth));
			kf::Colour outputC = g_scene.trace(float(x + 2 - windowWidth / 2) / (windowWidth / 2),
				-float(y - windowHeight / 2) / (windowHeight / 2) * (float(windowHeight) / windowWidth));
			kf::Colour outputB;

			// Average pixels A and C to create B.
			outputB.r = (outputA.r + outputC.r) / 2;
			outputB.g = (outputA.g + outputC.g) / 2;
			outputB.b = (outputA.b + outputC.b) / 2;

			// Output code...
		}
	}

To look at this more directly, here is a comparison between the original and my optimised scene. See any differences?

Raytracer Comparison

Now, with a filter that highlights those differences. Doesn’t look too bad, right?

Raytracer Differences

As you can see, whilst there are a lot of small places that are a different colour, the difference is slight. However, if we up the contrast as much as possible, effectively turning each pixel into a pass/fail, it becomes less than ideal.

Raytracer Contrast

By using this method, the overall time dropped to 13.690 seconds – 17.08% of what it once was. Whilst it’s by no means the fastest possible method, and it doesn’t render a perfect duplicate, it’s still a hell of a lot better than it once was.
