Saturday, 21 October 2017

An Introduction to WebGL

This month we had Carl, a regular member and graphics professional, give us an introduction to WebGL.


Carl's event page is here: [link], slides directly [here].

The video recording of the meet up is here: [link].


What is WebGL?

WebGL brings two worlds together - the web and GPU accelerated graphics.

The web is one of the most open and successful technology platforms on the planet. The number of people using the web every day is in the billions. And it all works (almost perfectly) with any device we're using - smartphones, tablets, laptops, tiny sensors, big cloud servers - and with any software - browsers like Firefox, Chrome and Safari, as well as a huge ecosystem of web software libraries.

A key reason it all just works is that the technical standards by which the web works are open, and largely driven by the community. They are not secret, proprietary, or driven by a small number of powerful corporations. The open source and open standards movements have become very important in today's digital world.

On the other hand, the history of GPU accelerated graphics isn't such a fairy tale. Early computers found driving a graphics display a very intensive task. Moving this load away from the main computer's processor to specialised graphics processors was an obvious step. For several years, these specialist graphics processors remained proprietary, with little interoperability. Then standards emerged allowing programmers to code once, and expect their programs to run on different computers with different graphics hardware acceleration. A non-profit industry group called the Khronos Group looks after a very popular API called OpenGL, the leading API for hardware accelerated 3D graphics. Many vendors of GPU hardware have implemented support for OpenGL for many years.


WebGL can be thought of as a smaller version of OpenGL designed to be used as a web technology, through Javascript, and viewable on any modern web browser - on a smartphone or a laptop.


Big Picture

It helps to understand the several technology bits that work together.


You can see that the web browser contains a javascript engine. This is the same engine that runs normal javascript associated with a web page. The WebGL API is a javascript API, and so should not be too alien for web developers to pick up.

That API essentially feeds data and program source code to the GLSL compiler. Let's explain this a bit more. The fast hardware that is the GPU doesn't talk javascript. It does however understand a language called the OpenGL Shading Language (GLSL). GLSL is similar in many ways to C/C++ and is compiled before the GPU can run it. The WebGL API simply passes the text of our GLSL programs to the graphics drivers to compile and run.

You can explore the reference for the WebGL javascript API here.


Lingo: Fragments, Vertices and Shaders

There's lots of unfamiliar language in the world of coding GPUs, and that can be a barrier to newcomers. Carl introduced the most important concepts.

If you remember that a GPU is designed to accelerate potentially complex and detailed 3-dimensional graphics, then it makes sense that the processing must be more constrained than the kind of things we can do on a normal general purpose CPU. Not everything we can do with a CPU can easily be done with a GPU, but what a GPU can do, it can do very fast, and it can do lots in parallel.

Viewed positively, these constraints can be seen as a pipeline of how information about a scene is processed into images on a display. Here's a simplified WebGL pipeline.


Let's talk that pipeline through:

  1. At the start we have data, numbers, which describe simple shapes. Complex objects, like trees, faces, buildings, are made of these simple shapes, or primitives as they're called. A triangle is a very simple primitive, described by three corners. That's all that's required to define a triangle. Other primitives include points and lines.
  2. A program that runs on the GPU transforms this data into the corners of a shape. A fancy word for corners is vertices, and just one is a vertex. That small program can run very fast on the GPU, and in fact, the GPU can transform lots of data in parallel (lots at the same time). That program is called a vertex shader. Confusing name, but there you are. The vertex shader can do things to the corners, like move them around in space, in effect moving the shape.
  3. The next step is to render that shape, that triangle for example, to a display made of pixels. Working out which pixels the shape covers is called rasterisation. Each covered pixel is then coloured in. It could be a red triangle, or a colour gradient, a texture or something else. Again, a small program on the GPU does this very fast, and is highly parallel. That small program is called a fragment shader. Again, not the clearest of names, but there you go.


That pipeline makes sense, and everything we do must conform to that pipeline, if we want fast acceleration of graphics by the GPU. In short,
The vertex shader works on the shape corners,  
the fragment shader works on the pixels.


WebGL Simple Example

Carl explained and illustrated a simple WebGL program with vertex and fragment shaders, with data passed through javascript.

What I'll do here is try to use that knowledge from Carl's talk to create my own first WebGL code. It's a good way to see the basics of how we structure our code, and also see the basics working, learning by doing.

I'm following the 2D coloured triangle example at WebGL Academy, a site recommended by Carl.

The first thing we need is a web page element to draw on. A HTML canvas element makes sense.

<canvas id='my_canvas'></canvas>

That creates a canvas with identifier 'my_canvas'.  Now we work entirely in javascript.

The main thing we need to do is create a canvas context. Just like many technology frameworks, a context is a way of creating a bubble for your scene, separate and safe from other bubbles.

var html_canvas = document.getElementById('my_canvas');
var GL = html_canvas.getContext('webgl');

The html_canvas variable is just the HTML canvas we created, obtained by the identifier we gave it, my_canvas. The variable GL is the "webgl" context of the canvas, an object that supports WebGL. For older versions of IE and Edge we need the older name "experimental-webgl".
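
One way to cope with more than one context name is to try each in turn. The following is only a sketch: the helper name getWebGLContext is made up for this post, and it assumes the older name is "experimental-webgl".

```javascript
// Hypothetical helper (not part of WebGL itself): try the standard
// context name first, then the older experimental name.
function getWebGLContext(canvas) {
    var names = ['webgl', 'experimental-webgl'];
    for (var i = 0; i < names.length; i++) {
        var context = canvas.getContext(names[i]);
        if (context) {
            return context;
        }
    }
    return null; // WebGL is not available on this canvas
}

// In a web page this would be used as:
// var GL = getWebGLContext(document.getElementById('my_canvas'));
```

If the helper returns null, the page can show a friendly message instead of failing on the first GL call.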

Now we need to set up the shaders, the small GPU programs.

First let's set up the vertex shader, the code that operates on all the corners of our objects. Our code is very simple:

attribute vec2 position;
void main(void) {
    gl_Position = vec4(position, 0.0, 1.0);
}

Let's explain it. Remember this isn't javascript, this is the GLSL language which is similar to C.
  • The first line creates a variable called position. It is of type vec2 which is simply a data structure for 2 numbers, so 2D coordinates. There is also something else, attribute, which is a type qualifier that tells the GLSL compiler that this variable is pulled into the GPU from a data buffer. That's how we'll pass vertex coordinate data to the shader. The type qualifier for a value that stays the same for every vertex is uniform; here we have a different value for each vertex, which needs attribute.
  • The next line declares a new function main(), which is the main entry point into the GPU code. The name main() is used in many languages to identify the first entry point into executing code.
  • The content of that main() function currently has only one instruction. It sets the gl_Position variable to that position variable, but expands it from 2 numbers to 4, by adding a 0.0 and a 1.0. The 0.0 is the coordinate along the third dimension, so a measure of depth. The 1.0 is, simplistically, a scaling factor. The gl_Position is a special variable, which the later stages of the pipeline use to work out which pixels the shape covers. So this is an opportunity to transform (translate, rotate, other) the positions of the vertices, but we haven't here, we've kept them as they are.

Now let's look at the fragment shader, which takes the output of the vertex shader.

precision mediump float;
void main(void) {
    gl_FragColor = vec4(0.,0.,0., 1.);
}

This is simple code again:

  • The first line sets the precision to be used for floating point numbers in the fragment shader. Medium precision mediump is faster than high precision and good enough for textures and colours.
  • A main() function is declared as an entry into the executed code.
  • This main() does only one thing, it sets the gl_FragColor special variable to a four number vector vec4. These 4 numbers describe a colour using RGB and an alpha channel (translucency), so (0, 0, 0, 1) is black. 

The fragment shader is called for every pixel (fragment) inside the triangle described by the vertices that emerge from the vertex shader, which itself gets them from the data we provide through javascript.
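
The shaders reach WebGL as plain text, so they need to live in javascript strings at some point. How the full example stores them isn't shown here, so the following is just one possible way, assuming the variable names shader_vertex_source and shader_fragment_source used in the compile step.

```javascript
// GLSL source held as ordinary javascript strings, one array entry per line.
// This storage approach is an assumption; only the variable names come
// from the compile step in this post.
var shader_vertex_source = [
    'attribute vec2 position;',
    'void main(void) {',
    '    gl_Position = vec4(position, 0.0, 1.0);',
    '}'
].join('\n');

var shader_fragment_source = [
    'precision mediump float;',
    'void main(void) {',
    '    gl_FragColor = vec4(0.,0.,0., 1.);',
    '}'
].join('\n');
```

Joining with newlines keeps the line numbers in any compiler error messages matching the lines of the shader.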

How do we compile this GLSL code? The steps are simple but kinda boring boilerplate code. The following shows the exact same approach needed for both shaders.

var shader_vertex = GL.createShader(GL.VERTEX_SHADER);
GL.shaderSource(shader_vertex, shader_vertex_source);
GL.compileShader(shader_vertex);

var shader_fragment = GL.createShader(GL.FRAGMENT_SHADER);
GL.shaderSource(shader_fragment, shader_fragment_source);
GL.compileShader(shader_fragment);

First a shader is created from the context, of the type required (vertex, fragment). Then the source code is associated with it, then it is compiled, with the result remaining in the shader. That's a lot of boring code, but that's all that's happening.
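
One thing the boilerplate doesn't do is tell us when a shader fails to compile. A possible wrapper is sketched below. The helper name compile_shader is made up for illustration, and the GL object is passed in as a parameter so the same function serves both shader types.

```javascript
// Hypothetical helper: the same three calls as above, plus a check of
// the compile status so that GLSL mistakes are reported, not ignored.
function compile_shader(GL, type, source) {
    var shader = GL.createShader(type);
    GL.shaderSource(shader, source);
    GL.compileShader(shader);
    if (!GL.getShaderParameter(shader, GL.COMPILE_STATUS)) {
        var log = GL.getShaderInfoLog(shader); // compiler's error text
        GL.deleteShader(shader);
        throw new Error('Shader failed to compile: ' + log);
    }
    return shader;
}

// In a web page this would be used as:
// var shader_vertex = compile_shader(GL, GL.VERTEX_SHADER, shader_vertex_source);
// var shader_fragment = compile_shader(GL, GL.FRAGMENT_SHADER, shader_fragment_source);
```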

We then need to create a WebGL program and attach these compiled shaders. Again, boilerplate code.

var shader_program = GL.createProgram();
GL.attachShader(shader_program, shader_vertex);
GL.attachShader(shader_program, shader_fragment);

Almost there. We now link the variables in javascript to those in the shaders, so we can pass data through the connection. You can see the javascript js_position associated with the GLSL position variable.

GL.linkProgram(shader_program);
var js_position = GL.getAttribLocation(shader_program, "position");
GL.enableVertexAttribArray(js_position);
GL.useProgram(shader_program);
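
Linking can fail too, for example if the two shaders don't agree with each other. As a sketch only, the create, attach and link steps could be gathered into one function that checks the result. The name link_program is made up for illustration.

```javascript
// Hypothetical helper: create a program, attach both compiled shaders,
// link, and report the linker's message if linking fails.
function link_program(GL, shader_vertex, shader_fragment) {
    var program = GL.createProgram();
    GL.attachShader(program, shader_vertex);
    GL.attachShader(program, shader_fragment);
    GL.linkProgram(program);
    if (!GL.getProgramParameter(program, GL.LINK_STATUS)) {
        throw new Error('Program failed to link: ' + GL.getProgramInfoLog(program));
    }
    return program;
}

// In a web page this would be used as:
// var shader_program = link_program(GL, shader_vertex, shader_fragment);
```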

We're done with shaders now. Let's look at creating the data that describes the triangle so we can pass it through to the WebGL pipeline.

var triangle_vertex_data_js = [-1, -1, 1, -1, 0, 1];
var triangle_vertex_data_gl = GL.createBuffer();
GL.bindBuffer(GL.ARRAY_BUFFER, triangle_vertex_data_gl);
GL.bufferData(GL.ARRAY_BUFFER, new Float32Array(triangle_vertex_data_js), GL.STATIC_DRAW);

That looks complicated, but again it's boilerplate code. What's happening is that a javascript array is created with a list of the corner coordinates. The first corner is at (-1, -1). Next a GL buffer is created and bound as the current ARRAY_BUFFER. Then the data is copied into that buffer after being converted to Float32 numbers.

We now have to tell WebGL which of these points, and in which order, make a face. In this easy example, just the first, second and final third corners make a triangle face. The code mirrors the previous one, create the javascript array of data, create a GL buffer, bind it, and fill it with data.

var triangle_faces_js = [0, 1, 2];
var triangle_faces_gl = GL.createBuffer();
GL.bindBuffer(GL.ELEMENT_ARRAY_BUFFER, triangle_faces_gl);
GL.bufferData(GL.ELEMENT_ARRAY_BUFFER, new Uint16Array(triangle_faces_js), GL.STATIC_DRAW);

Now all that's left is to set up the scene and draw it.

GL.clearColor(0.0, 0.0, 0.0, 0.0);
var do_drawing = function() {
    GL.viewport(0.0, 0.0, html_canvas.width, html_canvas.height);
    GL.clear(GL.COLOR_BUFFER_BIT);

    GL.bindBuffer(GL.ARRAY_BUFFER, triangle_vertex_data_gl);
    GL.vertexAttribPointer(js_position, 2, GL.FLOAT, false, 4*2, 0);
    GL.bindBuffer(GL.ELEMENT_ARRAY_BUFFER, triangle_faces_gl);
    GL.drawElements(GL.TRIANGLES, 3, GL.UNSIGNED_SHORT, 0);

    GL.flush();
    window.requestAnimationFrame(do_drawing);
};
do_drawing();

Let's explain the key points in the code:

  • The first line sets the colour used when a buffer is cleared. We set it to colourless and transparent.
  • A new function is created, called do_drawing(), which does the actual drawing. It is called many times, to enable animation, if that is desired. The window.requestAnimationFrame() is how modern browsers allow custom code to be called whenever the browser is ready to draw a new animation frame. We're not actually doing animation here because every frame is the same drawing.
  • Inside the do_drawing() function, we set a viewport to the size of the html canvas and clear it. We then bind the triangle vertex buffer and describe how its data feeds the position attribute, bind the faces buffer, and draw the triangle.
  • The GL.flush() causes all queued commands to be executed, in case they are waiting in a queue somewhere in the browser or GPU driver - which can happen. It's a bit like writing data to disk, it doesn't always get to disk, until forced to by a flush or sync. This command queuing is good for performance.

The full code for this WebGL example is always on GitHub at: https://github.com/algorithmicart/webgl/blob/master/index.html

The result of all this is a simple black triangle on a pink canvas, scaled to the size of the available canvas. Here's a screenshot showing the browser window decoration too:


Finally, a WebGL rendered object, from data that was sent through the GPU accelerated pipeline!

Let's add some colour, not because the black triangle is boring, but to illustrate how GLSL on the GPU can do some of the work.

The first thing we need to do is declare new variables for colour in the vertex and fragment shaders. The changes to the vertex shader are:

attribute vec2 position;
attribute vec3 colour;
varying vec3 vColour;
void main(void) {
    gl_Position = vec4(position, 0.0, 1.0);
    vColour = colour;
}

In the vertex shader we set an attribute colour to allow data to be passed from javascript. We also set a varying vColour, which is a variable used to pass a value from the vertex shader on to the fragment shader, and has no link to anything outside GLSL such as javascript data. For each triangle corner, the vertex shader sets vColour to the colour which will be passed from javascript as data.

The fragment shader changes are:

precision mediump float;
varying vec3 vColour;
void main(void) {
    gl_FragColor = vec4(vColour, 1.);
}

This again declares vColour as a varying, with the same name and type as in the vertex shader, which is how the two are connected. The fragment shader simply sets the colour of the pixel to vColour, not black as it did before.

All we need to do now, is actually create the javascript data and pass it through. Here are the changes.

var triangle_vertex_data_js = [
   -1, -1,
   0, 0, 1,
   1, -1,
   1, 1, 0,
   0, 1,
   1, 0, 0];

The triangle data now contains rgb colour values, not just the coordinates of the corners.

var do_drawing = function() {
    GL.viewport(0.0, 0.0, html_canvas.width, html_canvas.height);
    GL.clear(GL.COLOR_BUFFER_BIT);

    GL.bindBuffer(GL.ARRAY_BUFFER, triangle_vertex_data_gl);
    GL.vertexAttribPointer(js_position, 2, GL.FLOAT, false, 4*(2+3), 0);
    GL.vertexAttribPointer(js_colour, 3, GL.FLOAT, false, 4*(2+3), 2*4);
    GL.bindBuffer(GL.ELEMENT_ARRAY_BUFFER, triangle_faces_gl);
    GL.drawElements(GL.TRIANGLES, 3, GL.UNSIGNED_SHORT, 0);

    GL.flush();
    window.requestAnimationFrame(do_drawing);
};

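The do_drawing() function now has two changes, because the javascript data structure has changed. Each vertex now takes up 2+3 numbers of 4 bytes each, so the step from one vertex to the next is 4*(2+3) bytes. The position data starts at the beginning of each vertex, and the colour data starts 2*4 bytes in. The js_colour location is obtained in the same way as js_position, but for the colour attribute. More details here.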
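
To see where those numbers come from, here is a small sketch that works out the step size (stride) and the starting offsets from the number of floats each attribute uses. The function name interleaved_layout is made up for illustration.

```javascript
// Each number in the buffer is a 32-bit float, so 4 bytes.
var FLOAT_BYTES = Float32Array.BYTES_PER_ELEMENT;

// Hypothetical helper: given how many floats each attribute uses per
// vertex, work out the byte offset of each one and the total stride.
function interleaved_layout(sizes) {
    var offset = 0;
    var attributes = sizes.map(function (size) {
        var entry = { size: size, offset: offset };
        offset += size * FLOAT_BYTES;
        return entry;
    });
    return { stride: offset, attributes: attributes };
}

// 2 floats of position followed by 3 floats of colour per vertex.
var layout = interleaved_layout([2, 3]);
// layout.stride is 20, the same as 4*(2+3)
// layout.attributes[1].offset is 8, the same as 2*4
```

The stride and offset are the last two arguments of GL.vertexAttribPointer(). The colour attribute also needs its own location, obtained with GL.getAttribLocation(shader_program, "colour") and switched on with GL.enableVertexAttribArray(), mirroring what was done for position.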

The results are rather nice:


We only specified the colours of the corners, so why is the inside of the triangle coloured at all? More to the point, why is it shaded using smooth colour transitions? The reason is that WebGL interpolates the value of a varying, here vColour, between the vertices for every pixel inside the triangle.
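
The blending itself happens in GPU hardware, but the idea can be sketched in plain javascript. Each pixel gets three weights, one per corner, which add up to 1, and its colour is the weighted mix of the corner colours. This is a simplified illustration for our flat 2D triangle, not the GPU's actual code, and the function name interpolate_colour is made up.

```javascript
// Mix three corner colours using three weights that add up to 1.
// A pixel close to a corner has a weight near 1 for that corner.
function interpolate_colour(colours, weights) {
    var result = [0, 0, 0];
    for (var corner = 0; corner < 3; corner++) {
        for (var channel = 0; channel < 3; channel++) {
            result[channel] += weights[corner] * colours[corner][channel];
        }
    }
    return result;
}

// The corner colours from the triangle data: blue, yellow, red.
var corner_colours = [[0, 0, 1], [1, 1, 0], [1, 0, 0]];

// Exactly on the first corner we get that corner's colour back.
var at_first_corner = interpolate_colour(corner_colours, [1, 0, 0]);

// In the middle of the triangle all three corners contribute equally.
var at_centre = interpolate_colour(corner_colours, [1 / 3, 1 / 3, 1 / 3]);
```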


Easier JavaScript Frameworks

The code and complexity of the example just to draw a simple coloured triangle is huge. That's a problem for many reasons - the barriers to entry are high, the code is error prone, even seasoned coders will just prefer not to use WebGL.

Carl explained that today, there exist several abstraction layers over WebGL to reduce the code and complexity for the most common rendering tasks. He lists three.js and babylon.js as examples. Both of their websites link to interesting examples.

The babylon.js playground and tutorial look really well thought out:



Editors and Tools

We've seen above that writing GLSL shader code as a javascript string and then juggling that to compile, link and run the code is very clunky. Carl recommended online editors which make developing shaders much easier by handling all that machinery behind the scenes, leaving you to the creative task of creating shaders.

He used the editor at The Book of Shaders as a good example:


Despite excellent compatibility across many different browsers and devices, there can be some small differences. The well used Can I Use website is also great for comparing WebGL capabilities.

Carl also listed web tools which show the capabilities of your browser, with a lot of detail, for example showing how many vertices can be created, or the highest level of floating point precision.


Skull Model

Carl demonstrated that you could import 3d objects created elsewhere, and use javascript libraries to convert those models into vertex data for WebGL. He also demonstrated techniques for animating a skull model by using the vertex and fragment shaders to do things like transform the skull into a sphere, or to apply a time-varying texture.



More Resources

The following are hand selected resources and tutorials which I think are good for beginners:

Thursday, 21 September 2017

Art Hackathon - "Future Pangs"

Yesterday we held our first mini art hackathon. It was an experiment to see if we liked the format, and to learn how we might do it better.



Future Pangs

Normally we have a talk or a tutorial, led by a speaker or teacher. Some members suggested that we should turn this upside down and have a less passive meet up - where the main thing we're doing is creating art - not listening or following someone else.

Hackathons are common in some communities - where people get together to work on something, individually or in groups. They're not so formal, but they're very productive and enjoyable.

I had some trepidation about this as I didn't have that much experience organising hackathons. The idea sounds lovely and idyllic - but I imagined all sorts of things going wrong - people being stuck with software installs, being uninspired to create any art, not getting on with their team or partners, ...

Especially for creative events or processes, getting the constraints right is important. The most powerful art is a result of constraints, often self imposed. Constraints like colour palette, media, technique or narrative.

For this hackathon, we had the following constraints:

  • Our art must be created at openprocessing.org, which uses the web version of processing called p5js. We did a beginner's tutorial previously, with tutorial slides and video on the meetup blog. This constraint ensures there is enough freedom, but also enough in common across the hackathon. Openprocessing also makes it trivially easy to share our work.
  • We must allow our code and work to be publicly viewable and freely copyable and reusable (CC-SA).
  • We have 60 minutes to create the art from scratch.
  • The work must be inspired by the theme 'Future Pangs', interpreted as we wish.



I was asked where the theme Future Pangs came from. It was actually a misremembered phrase. When I was young, I used to read the 2000AD comic, and a common phrase was Future Shocks. I misremembered it as Future Pangs. That worked out even better as there isn't a direct semantic connection between the two words, which encourages us to more freely interpret the theme.


The Session Itself

There was a mix of coding expertise, and a mix of artists and technologists - but you wouldn't have thought so, given how everyone dived straight into working.


I was really pleased that a few of the more regular experts were happy to help others. This creates a nice supportive vibe.


Myself, I was most pleased that artists and art-students had come along to try using technology to create art.



Sharing & Learning

Participants were encouraged to show their work at the end, and talk through their interpretation of the theme, and how they created their work.

Sharing our challenges and difficulties is also a great way to learn together, as well as help each other as a group. One of the teams used noise, rather than purely random numbers as part of their work. As a group, we discussed the usefulness of Perlin noise over random numbers, something we also touched on when we covered ray-tracing previously.
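
The difference is easy to see in numbers. The sketch below is not Perlin noise itself, and it doesn't use the noise() function that p5js provides. It is a much simpler stand-in, often called value noise, with made-up function names, but it shows the key property: neighbouring values are close to each other, where purely random numbers jump about.

```javascript
// A repeatable pseudo-random value between 0 and 1 for each whole number.
function lattice_value(i) {
    var s = Math.sin(i * 127.1 + 311.7) * 43758.5453;
    return s - Math.floor(s);
}

// Smoothly blend between the values at the whole numbers either side of x.
function smooth_noise(x) {
    var i = Math.floor(x);
    var f = x - i;               // how far we are between the two
    var u = f * f * (3 - 2 * f); // ease in and out of each value
    return lattice_value(i) * (1 - u) + lattice_value(i + 1) * u;
}

// The largest jump between neighbouring samples of a function.
function largest_step(fn, samples, spacing) {
    var largest = 0;
    for (var k = 0; k < samples; k++) {
        var jump = Math.abs(fn((k + 1) * spacing) - fn(k * spacing));
        largest = Math.max(largest, jump);
    }
    return largest;
}

// Neighbouring noise samples only ever differ by a small amount.
var noise_step = largest_step(smooth_noise, 1000, 0.01);

// Neighbouring random samples can differ by almost the full range.
var random_step = largest_step(function () { return Math.random(); }, 1000, 0.01);
```

Used to drive a position, size or colour, the smooth version gives movement that drifts and wanders, which often feels more natural than jitter.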



Show n Tell

The following is a selection of some of the art created in the meet up, and presented at the end with a short talk about the artist's interpretation of Future Pangs and how they went about creating their work.

These works have an element of animation or evolution, or even interactivity, so click the images to open the work in a new tab.




Tom has created a work that makes key use of recursive forms which grow and continue to emerge. It's a work that captures you and keeps you engaged. There is a strong sense of mechanical regularity, but also birth and rebirth of these future forms.




Jun has created an interactive piece. Clicking on the canvas moves a circle which grows ever larger as it consumes the smaller living circles. It suggests the future will be dominated by an emergent aggressive entity!




Peter had partnered with a newcomer to use mathematical functions to model the fluctuating behaviour of bees. For me this strongly suggests the diminishing fortunes of species that are essential to the fragile ecosystem today.




Simon was inspired by the work of another artist (example work). His work makes strong use of objects-within-objects, challenging our sense of reality and dream, the difference between the overseer and the observed, boundaries that are being made fuzzy as we live increasingly digital lives.





James used open data from quandl to visualise our economic health through history up to today, and then used models to predict the future - all of which foretell a doomed future!




Matt created a very dynamic work which makes very good use of movement plotting lines, the colour of which is taken progressively from a colour palette. It gives the impression of velocity, diversity but also of a cycle of renewal and supersession.




Neil has used simple elements to create a powerful work. By using a carefully chosen colour palette, and columns of shapes - rectangles and ellipses - the work grows and evolves, in a busy congested way, evoking busy overcrowded cities like Hong Kong, New York and London. Despite the busyness and congestion, the pace and colour give the impression of optimism and a future happily occupied.




Raihan explained how he was inspired by science fiction films, futuristic and high-tech, and yet with scenes and equipment made of very old low-tech. Green cathode-ray tube displays, beeping panels and chunky keyboards! The Matrix, Blade Runner and Alien are just some classics that make rich use of this techno-dystopia.




Carl created a sublime piece evoking the gentle falling of rain onto a surface, where the drops ripple and spread. The colour scheme and pace of growth and fading, to me, suggest the growth and decline of diverse communities in a global ecosystem. Viewing this work for a few moments, shows a nice balance between large gentle pastel pieces, and the odd more starkly coloured circle, adding just the right amount of spice to the mix!





Borg Druid created a few works, and this one is a very interesting take on colour palettes you get from paint manufacturers. Instead of colour names, we have more emotional terms, which really do match the colours. And all those themes predict our future world ... yikes!

These works are so good that the idea of an annual exhibition makes a lot of sense!




Success and Lessons Learned

Overall the group seemed to like the atmosphere and the chance to use technology and code to create something just for fun. Projecting a nice video of nature, accompanied by gentle piano music seemed to help provide just enough opportunity for escape and inspiration from the corporate meeting room, without being overly intrusive.

I was really pleased that the more experienced members were helping others, and being asked for help too.

What really surprised me was the speed and ease with which the group dived into working, with almost no blocking issues. One artist, who doesn't have a huge experience with coding, was successfully creating interactive 3-dimensional scenes!


A discussion of what could be done better in future raised some good points:

  • Repeating the gentle introduction to Processing and p5js would be really valued. So we'll try to schedule this for December or January. 
  • Some people want to work on their own, some with others. Some have lots of knowledge and experience, others less so. Some know what they want to do, others need inspiration. Next time, we should try to organise the groups so people can join the right team if they want to.
  • The group felt 1 hour was too short. This was actually extended from the original 40 minutes! We'll try to have an extended session next time.


Thanks everyone for taking part, making it a fun success, and helping us learn how to do an even better art hackathon next time!

Sunday, 20 August 2017

Ray-Tracing for Realistic Scenes

We just had an interesting meetup session introducing ray-tracing, a method which aims to create realistic computer generated scenes.

Here's an example of a ray-traced scene which is so realistic it's hard to believe that it was crafted only from mathematics. Nothing in this image exists in real life as a physical object for us to photograph.


Here are the slides, video, and example code on GitHub.


The Challenge - Render a Realistic Scene

Ever since computers were able to create coloured marks on a screen, we've set ourselves the challenge of getting them to render scenes which are realistic. Scenes realistic enough to look like they had been filmed or photographed, rather than obviously looking like they had been generated by a computer program.

This is a great challenge for many reasons. Practically, we can save on the costs and effort of building physical objects and scenes. Creatively, we are free to imagine all kinds of objects, things, and make them appear real. That in itself, is a big enough reason to take up this challenge.

Over the last few decades, the sophistication of our image making methods has grown massively, helped by a similar explosion of computing power to make these dreams feasible. Even today there is furious competitive activity trying new ideas to make computer generated scenes more realistic. And it's not just for standalone art, there is a huge hunger for better, more realistic, and more efficient rendering in film effects and gaming.

The following image, amazingly, is not a photograph. It is entirely generated from a mathematical model of the pebbles. It was created a few years ago, using free open source software. Leading visual effects companies will be using even more sophisticated tools today.

Pebbles - http://hof.povray.org/pebbles.html


Ray-Tracing - Inspired by Physics

So, how do we render a scene, when the objects in the scene are only imagined? There are several approaches we could take. We could just use a painting program and lay down digital brush strokes, to build up an image, just like we would a traditional painting.

Another approach is to think about how nature itself lights a scene, and to try to replicate that in a computer program. That means we're thinking about how light starts at a light source, like the sun, and arrives at a scene, how it falls onto an object, and is perhaps reflected around the scene, or perhaps diffused or absorbed by some materials, before some of it finally enters our eyes for us to perceive the scene.

Trying to follow rays of light around a scene, which includes light sources, objects and ourselves as an observer, is called ray-tracing.

Throughout this tutorial we'll try to think about this challenge in plain English first. It's a great way of understanding ideas first, making sure they're sensible, way before attempting to encode them as mathematics or computer code.


Basic Elements of A Scene

Let's be clear about what elements make a scene. We've already said we have objects in the scene - they're the things we're most interested in portraying. They might be balls, boxes, or more complex objects. We know we need light to be able to see anything. Without light, the scene would be pitch black. So we need a light source, or maybe more than one.

It's tempting to stop there, but we need to also think about ourselves as an observer in the scene. What we see depends on where we're located, and the direction in which we're looking. If you're a photographer, you'll be very aware of this!

The following summarises these key elements.


We've included one more element in the scene, a viewport. The idea is to cut a rectangle out of everything around us to frame a scene. You'll have seen film directors do the same.


It's a nice coincidence that the rectangular framed view is analogous to a computer display (like a laptop screen) and the rectangular format of computer image files.


Follow The Light

Ok - we now have a simple example scene - with the sun as a light source, a red sphere, and ourselves as observers. What next?

We know that we can only see the sphere because light falls on it. And that light must emerge from the light source. That all sounds obvious, but hang in there.. there's a reason for being so explicit about this.

So to simulate this working of nature, we need to draw rays of light from the light source and see where they land. The following shows some rays emerging from the sun.


You can see how some rays never really go anywhere near the sphere, and that's what we expect. Light from the sun shoots out in all directions. Some light does travel in the vicinity of the red sphere but just misses it, and carries on past our area of interest.

Some light rays do hit the sphere. Finally! Let's not get too excited just yet .. some of those rays of light do hit the ball but then bounce out of the scene away from our observing eye. That means those rays of light don't carry any information to our eyes about the scene. In some sense, they're useless to us for rendering the scene. There are some rays that, thankfully, do hit the ball and bounce straight into our eyes, carrying with them information about the colour of the ball.

Great! We've already found a way of building up a scene by following rays from a light source, and selecting only those that get bounced into our eyes. You can see why the technique is called ray-tracing - because we're tracing rays of light through the scene.

That is an achievement - especially if you've never ever considered how to computationally render a scene before.

There is one problem with that initial (good) idea though. It's extremely inefficient. If you think back to that light source, we know light shoots out in all directions. And the vast vast majority won't go anywhere near the objects of interest. And of the ones that do, a tiny fraction will make it through the viewport and into the observing eye. We'd be wasting so many computer calculations and so much time following rays which ended up elsewhere.

What's the answer? It's actually really simple and elegant. We work backwards!


Backwards Ray-Tracing

We know we're only interested in light that ends up in the eye, so why not start there and work backwards? This is in fact how most ray-tracers work - by tracing rays, from the observer, backwards through the scene. The following shows this:


That diagram also starts to hint at how we might choose rays to follow. You will know that computer images are made up of pixels, little squares of colour. Computer displays are made up of pixels, and computer image files (like jpegs, or pngs) contain information about the pixels in an image.

We need to know what the colour of every pixel in that rectangular frame should be. That means we need to cast a ray from the eye, through each pixel in the viewport. If we did any less, we'd have missing pixels in the computer image we're trying to render.
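
As a rough sketch of casting one ray per pixel, the following generates a ray through the centre of every pixel. The function name `primary_rays`, the eye position, and the placement of the viewport (an axis-aligned rectangle at a fixed `z`) are all assumptions made for illustration, not the layout used in the talk.

```python
import math

def primary_rays(eye, width, height, viewport_z=1.0, viewport_width=2.0):
    """Yield (px, py, origin, direction) for one ray through each pixel centre.

    Assumption: the viewport is a rectangle centred on the z axis at
    z = viewport_z, and the eye sits behind it looking along +z.
    """
    viewport_height = viewport_width * height / width
    for py in range(height):
        for px in range(width):
            # Pixel centre in scene coordinates (row 0 is the top of the image).
            x = (px + 0.5) / width * viewport_width - viewport_width / 2
            y = viewport_height / 2 - (py + 0.5) / height * viewport_height
            d = (x - eye[0], y - eye[1], viewport_z - eye[2])
            length = math.sqrt(sum(c * c for c in d))
            yield px, py, eye, tuple(c / length for c in d)

# A tiny 4 x 2 pixel image, so 8 rays in total.
rays = list(primary_rays(eye=(0.0, 0.0, -1.0), width=4, height=2))
```

Every pixel gets exactly one ray, so no pixel is left without a colour.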

Next we need to think about how we actually work out whether a ray touches an object or not.


Rays Hitting Objects - Intersection

This section will get a little bit mathematical ... but we should always keep in mind what we're trying to do. In plain English - we're trying to test whether a ray hits an object or not. Simply that.

How do we even get started with this? Well - we need a mathematical way of describing both the ray and the spherical ball.

  • The ray (straight line) is easy. It has a starting point, and a direction. Both can be described using vectors, which many will have learnt about at school. Vectors have a direction, not just a size, and this is really useful for us when ray-tracing.
  • The sphere (ball) is also easy. To define a sphere, we need to know where its centre is, and how big the sphere is .. which its radius tells us neatly.

The following summarises how we define a ray and a sphere mathematically.


That's the basic idea but we need to be able to write these ideas in precise (mathematical) form.

A ray line is described by the starting point $\mathbf{C}$ and points along a direction $\mathbf{D}$. How far along that direction is controlled by a variable parameter, let's call it $t$:

$$ \mathbf{R} =  \mathbf{C} + t \cdot \mathbf{D}$$

For small $t$ the point is close to the start, and for larger $t$ the point is further away along the ray.

A sphere is described by remembering that every point on its surface is always the same distance from the centre, that distance being the radius $r$. If that weren't true, the object might be a lumpy blob! For any point $\mathbf{P}$ on a sphere centred at $\mathbf{S}$ we have

$$ | \mathbf{P} - \mathbf{S} | = r $$

If we square both sides, the logic is the same, but the algebra is easier later:

$$ | \mathbf{P} - \mathbf{S} |^2  = r^2 $$

We've created mathematically precise descriptions of two objects - that's an achievement!

Now we need to work out whether the ray actually hits the sphere or not. This may not seem obvious, but the way to do this is to equate the two mathematical descriptions, and see what falls out of the algebra. By equating the two expressions, we're saying that a point is on the line and it is also on the sphere. What should drop out is the point (or points) at which this is true. If the ray doesn't touch the sphere, the algebra should tell us somehow, perhaps by collapsing to an impossibility like $t^2 = -4$ which doesn't have any real solutions for $t$.

The following illustrates the equating of the line and sphere expressions, and the algebra that emerges. It looks complicated but it's just expanding out brackets which many will have done at school.
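
As a sketch of that algebra, we put the ray point $\mathbf{C} + t \cdot \mathbf{D}$ in place of $\mathbf{P}$ in the squared sphere equation:

$$ | \mathbf{C} + t \cdot \mathbf{D} - \mathbf{S} |^2 = r^2 $$

Expanding the squared length as a dot product of the bracket with itself, and collecting powers of $t$, gives

$$ (\mathbf{D} \cdot \mathbf{D}) t^2 + 2 \mathbf{D} \cdot (\mathbf{C} - \mathbf{S}) t + | \mathbf{C} - \mathbf{S} |^2 - r^2 = 0 $$

Comparing with $at^2 + bt + c = 0$, we can read off $a = \mathbf{D} \cdot \mathbf{D}$, $b = 2 \mathbf{D} \cdot (\mathbf{C} - \mathbf{S})$ and $c = | \mathbf{C} - \mathbf{S} |^2 - r^2$.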


Expanding out the terms results in a quadratic equation in $t$. Yes it looks complicated, but it is still a quadratic equation, the same kind that many solved at school. The general formula for solving quadratic equations $ax^2 + bx + c = 0$ is

$$ x = \frac{-b \pm \sqrt {b^2 - 4ac} } {2a} $$

We don't need to calculate the full solution for $t$ to know whether the ray touches the sphere. The reason is that if we look at the generic solution, there is a bit $b^2 - 4ac$ which tells us whether there are 2 solutions, 1 solution (2 repeating), or no real solutions .. remember that quadratic equations have 2 solutions. Because that bit is so useful it has its own name, the discriminant, $\Delta$. For us, with $a = \mathbf{D} \cdot \mathbf{D}$, $b = 2 \mathbf{D} \cdot (\mathbf{C} - \mathbf{S})$ and $c = | \mathbf{C} - \mathbf{S} |^2 - r^2$, that discriminant is

$$ \Delta = 4 (\mathbf{D} \cdot (\mathbf{C} - \mathbf{S}))^2 - 4 (\mathbf{D} \cdot \mathbf{D}) ( | \mathbf{C} - \mathbf{S} |^2 - r^2) $$

The following shows what it means for this discriminant to be more than, less than or equal to zero:


It's nice to see that a quadratic equation emerges, and that it has 2 solutions .. because a ray can intersect a sphere at 2 points. And it's nice that the maths neatly captures the scenario when the ray misses the sphere.

Ok - enough of the maths symbols. Back to ray tracing. Our test for whether the ray touches the ball is simply $\Delta \geq 0$. That's it!

Let's try it .. the results are:

sphere_1.png

Our very first ray traced scene! That's a big first step. We've managed to describe a scene mathematically, with a light source, an observer, a viewport, and an object .. and we've used the mathematical descriptions to follow rays back from our eye to see if they hit the sphere. You can find example Python code to do all this on github.
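
The intersection test described above can be sketched in a few lines of plain Python. This is not the code from the github repository. The helper names `dot`, `sub`, `delta` and `ray_touches_sphere` are made up for this illustration, and vectors are simple tuples.

```python
def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Component-wise a - b."""
    return tuple(x - y for x, y in zip(a, b))

def delta(C, D, S, r):
    """b^2 - 4ac for the quadratic in t that comes from |C + tD - S|^2 = r^2."""
    CS = sub(C, S)
    a = dot(D, D)
    b = 2.0 * dot(D, CS)
    c = dot(CS, CS) - r * r
    return b * b - 4.0 * a * c

def ray_touches_sphere(C, D, S, r):
    """True if the line through C along D meets the sphere (centre S, radius r)."""
    return delta(C, D, S, r) >= 0

eye = (0.0, 0.0, 0.0)
sphere_centre, radius = (0.0, 0.0, 5.0), 1.0
hit = ray_touches_sphere(eye, (0.0, 0.0, 1.0), sphere_centre, radius)   # True
miss = ray_touches_sphere(eye, (0.0, 1.0, 0.0), sphere_centre, radius)  # False
```

Note that this tests the whole infinite line, so a sphere directly behind the eye would also pass. A fuller version would also solve for $t$ and keep only positive values.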

Take a well-deserved break before we continue.


Shading

That sphere we rendered above is great, but it doesn't look three dimensional. The thing that gives the impression of three dimensions, of solidity, is how light is different at different points on an object.

Let's think about a real sphere .. like a snooker ball. There will be light and dark bits. The light bits are the ones that are facing the light source most directly. And the dark bits are the ones that are pointing away from the light source. The following diagram shows this.


We've described that in plain English and it makes sense. How do we translate that idea into maths?

Luckily the answer is easy. We can consider the angle between the normal at each point of the sphere and a vector to the light source. A normal is just a vector pointing directly out from a surface, so it meets the surface at right angles. The following shows these angles:


The smaller the angle, the more directly that bit of surface is being illuminated. The larger the angle, the less directly.

Maths is kind to us here. The cosine function nicely indicates this alignment, with the cosine of smaller angles being closer to one, and as the angle grows, the cosine gets smaller ... and negative once the vectors point away from each other. Even better, we don't need to calculate the angle itself, because a simple dot product of the two vectors gives us the cosine directly, since $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}| |\mathbf{b}| \cos(\theta)$. We just need to normalise the vectors (divide by their lengths) so that longer vectors don't bias the result.
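
Here is a minimal sketch of that idea for a sphere. The function name `illumination` is hypothetical, and clamping negative values to zero (so surfaces facing away from the light are simply dark) is an assumption rather than something stated in the talk.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def normalise(v):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def illumination(point, sphere_centre, light_pos):
    """Cosine of the angle between the surface normal and the direction to the light."""
    N = normalise(sub(point, sphere_centre))  # normal points out from the centre
    L = normalise(sub(light_pos, point))      # direction from the surface to the light
    return max(0.0, dot(N, L))                # assumption: no negative light

top = (0.0, 1.0, 0.0)  # a point on a unit sphere centred at the origin
facing = illumination(top, (0.0, 0.0, 0.0), (0.0, 10.0, 0.0))    # light directly above
grazing = illumination(top, (0.0, 0.0, 0.0), (10.0, 1.0, 0.0))   # light off to the side
behind = illumination(top, (0.0, 0.0, 0.0), (0.0, -10.0, 0.0))   # light underneath
```

The returned value can be used to scale the base colour of the pixel.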

Let's try colouring the pixels according to how aligned the intersected points on the sphere are to the light source. The result is ...

sphere_2a.png

Much better .. It's starting to look three dimensional and solid.

This simple approach to shading a surface, based on how directly it is illuminated, is good enough for many applications. But we want to develop the realism further.

The next improvement is to realise that what we've done is to only consider how well a point on an object is illuminated. That's not the same as considering how much light is reflected to the observer's eye. The two things are distinct but related. First a surface is illuminated by a light source, then some of that light is reflected away, perhaps towards the observer, but most likely not.

The following diagram shows why only considering illumination is not enough. The two points shown have the same illumination, but one reflects light to the observer more directly than the other, which actually reflects it away.


We can reuse our idea of using angles to see how directly any reflected light travels to the observer. A small angle means the light is reflected more directly into our eye.

How do we work out what the reflected ray is? This diagram explains best the link between the vector from a point on the surface to the light source, $\mathbf{L}$, and the reflected ray, $\mathbf{R}$.


The two rays $\mathbf{L}$ and $\mathbf{R}$ are like mirror reflections about the normal $\mathbf{N}$. If we add them together we should get a result that's straight up the normal, but perhaps a different length. That symmetry is what will help us work out $\mathbf{R}$.

$$ \mathbf{L} + \mathbf{R} = 2 (\mathbf{L} \cdot \mathbf{N}) \mathbf{N} $$

That's just saying the sum is twice the projection of $\mathbf{L}$ onto $\mathbf{N}$, assuming $\mathbf{N}$ has unit length. It's super easy to re-arrange to

$$ \mathbf{R} = 2 (\mathbf{L} \cdot \mathbf{N}) \mathbf{N} - \mathbf{L}  $$
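
That formula translates almost directly into code. The sketch below assumes the normal has already been scaled to unit length, and the function name `reflect` is just an illustrative choice.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(L, N):
    """R = 2 (L . N) N - L, where N is a unit-length normal."""
    k = 2.0 * dot(L, N)
    return tuple(k * n - l for n, l in zip(N, L))

N = (0.0, 1.0, 0.0)       # normal pointing straight up
L = (1.0, 1.0, 0.0)       # towards a light up and to the right
R = reflect(L, N)         # (-1.0, 1.0, 0.0), up and to the left
```

Adding `L` and `R` gives a vector straight along the normal, which is the symmetry we started from.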

The following shows the two effects we're taking into account now - illumination and reflection to the observer.


Combining these two effects to modify the base red colour of the sphere gives us the following result.

phong_both_factors.png

That's much much more realistic. We can see a highlighted area of the surface, and a darker area too, which is just like real spheres that we see.


Phong Specular Highlights and Matt Surfaces

We considered how we might model the specular highlights, often seen in more shiny or glossy surfaces. This photograph of a snooker ball shows these high-intensity highlights clearly, and we recognise them as characteristic of smooth, glossy shiny objects.

shad2-plasticball.png

A simple approach is to squeeze the function that maps the angle between the normal and the light source. That way, the increase in light is focussed on a smaller bit of the surface. That worked but actually the light intensity wasn't increased, so the highlight area just became smaller. A lesson learned there is to increase the intensity of the contributed light as that function is squeezed. This squeezing is called Phong highlighting.
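
One common way to do this squeezing is to raise the cosine of the angle to a power, which is the form used in the classic Phong model. Whether the talk's code used exactly this form is an assumption, and the names `specular`, `shininess` and `strength` are hypothetical.

```python
def specular(cos_angle, shininess=20, strength=1.0):
    """Squeeze a cosine into a tight highlight.

    A larger shininess gives a smaller highlight, and strength lets us
    boost the intensity to compensate, as described above.
    """
    return strength * max(0.0, cos_angle) ** shininess

broad = specular(0.9, shininess=5)     # gentle squeeze, wide highlight
tight = specular(0.9, shininess=20)    # strong squeeze, small highlight
boosted = specular(0.9, shininess=20, strength=3.0)
```

A point slightly off the ideal angle contributes far less light as `shininess` grows, which is why the highlight shrinks unless `strength` is raised too.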

The opposite of a shiny, glossy surface is a matt surface that diffuses light that falls on it. If we think about how light is reflected from such a surface, we realise that the surface is very irregular at a small scale, and this causes light to bounce in all sorts of random directions, including into nooks and crannies so that it never emerges. The following illustrates this.


How do we describe this behaviour in mathematics? There are several ways, including randomising the normal so the reflected ray is bounced in different directions. Another approach is to keep the normal as it is, but add a small random vector to the reflected ray. That's a milder adjustment but does seem to work, as the following comparison between a shiny and a matt ball shows.
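
The second approach, adding a small random vector to the reflected ray, might look something like this. The function name `jitter_reflection`, the `roughness` parameter and the use of a uniform random offset are all assumptions for the sake of a sketch.

```python
import math
import random

def jitter_reflection(R, roughness, rng=random):
    """Add a small random vector to R, then rescale the result to unit length.

    roughness = 0 leaves the direction unchanged; larger values scatter it more.
    """
    jittered = tuple(c + rng.uniform(-roughness, roughness) for c in R)
    length = math.sqrt(sum(c * c for c in jittered))
    return tuple(c / length for c in jittered)

rng = random.Random(0)                 # fixed seed so the example is repeatable
R = (0.0, 1.0, 0.0)
shiny = jitter_reflection(R, 0.0, rng)
matt = jitter_reflection(R, 0.1, rng)
```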


The example code is on github.


Reflections

Being able to show realistic reflections is one of the key things that attract people to ray-tracing. Everything we've done up to this point can be approximated well enough without the very involved calculations required by ray-tracing.

In the 1980s ray-traced images like the following were state of the art, used to show off the cleverness of the software creator and the power of computers.

600_459974122.jpeg

How do we include reflections into our ray-tracer?

Let's think again in plain English before we dive into any maths. We already have the idea of a reflected ray worked out and being used. A reflection is simply us seeing another object on the surface of the object we're looking at. That means the light has bounced from another object onto the one we're looking directly at, before arriving at our eyes. The following image shows this more clearly.


Following rays backwards, you can see that one of the paths taken by the ray (shown in orange) hits the first object, is reflected into a second object and then a third object before it finally goes back to the light source.

We have some new thinking to do here .. so take a break, a coffee or a breath before proceeding!


Reflection Depth

How many reflections do we want to model? This is a good question because reflections could go on forever, and we don't want to get stuck in a computational rabbit-hole!

We don't want to write a program that is specific to a number of reflections. We want a program where the depth, or number, of reflections is configurable.

A good way to do this is to make the ray a recursive function. A recursive function is one that calls itself. That might be a bit mind boggling, but have a look around the internet for simple examples.

The reason this powerful idea suits ray-tracing reflections is that each ray creates a new ray when it hits an object, that is, the reflected ray .. and that ray in turn can create another reflected ray when that hits an object .. and so on .. until we reach a depth at which we want to stop.

The following diagram shows the ray function spawning another ray function ... and it also shows what information is passed back up the chain - the colour information at each object  ... which we want to accumulate. This is right because when we see reflections we're looking at the accumulation of colour information from all subsequent reflections.


To make a recursive function work, we need to think really carefully about what information it takes as input, and what information it accumulates and returns. For us, this isn't so difficult. A ray function needs to know

  • the start position of the ray,
  • the direction, and
  • the current depth

and it returns

  • whether the ray intersected an object or not
  • the accumulated colour (which may have been added to on intersection)

You can see from the diagram that for a maximum depth of 3, it is possible for a ray to have collected information from 3 onward reflections and intersections.
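
The inputs and outputs listed above can be sketched as a recursive function. This is a simplified illustration, not the code from the repository: it ignores shading, it represents spheres as plain dictionaries, and the `reflectivity` factor that scales down the onward colour is an assumption. All names are hypothetical.

```python
import math

def dot(a, b): return sum(x * y for x, y in zip(a, b))
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(v, k): return tuple(x * k for x in v)
def normalise(v): return scale(v, 1.0 / math.sqrt(dot(v, v)))

def nearest_hit(origin, direction, spheres):
    """Return (t, sphere) for the closest hit in front of the ray, or None."""
    best = None
    for s in spheres:
        cs = sub(origin, s["centre"])
        a = dot(direction, direction)
        b = 2.0 * dot(direction, cs)
        c = dot(cs, cs) - s["radius"] ** 2
        disc = b * b - 4.0 * a * c
        if disc < 0:
            continue                      # ray misses this sphere
        t = (-b - math.sqrt(disc)) / (2.0 * a)
        if t > 1e-6 and (best is None or t < best[0]):
            best = (t, s)
    return best

def trace(origin, direction, spheres, depth=0, max_depth=3, reflectivity=0.5):
    """Return (hit, colour), following reflections until max_depth is reached."""
    black = (0.0, 0.0, 0.0)
    if depth >= max_depth:
        return False, black
    hit = nearest_hit(origin, direction, spheres)
    if hit is None:
        return False, black
    t, s = hit
    point = add(origin, scale(direction, t))
    normal = normalise(sub(point, s["centre"]))
    # Same reflection formula as before, with L = -direction.
    reflected = sub(direction, scale(normal, 2.0 * dot(direction, normal)))
    # The function calls itself for the onward ray, one level deeper.
    _, onward = trace(point, reflected, spheres, depth + 1, max_depth, reflectivity)
    return True, add(s["colour"], scale(onward, reflectivity))

spheres = [
    {"centre": (0.0, 0.0, 5.0), "radius": 1.0, "colour": (1.0, 0.0, 0.0)},
    {"centre": (0.0, 0.0, -5.0), "radius": 1.0, "colour": (0.0, 1.0, 0.0)},
]
hit, colour = trace((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), spheres)
```

Here the ray bounces between the red and green spheres three times, so the final colour is red plus contributions from both later bounces. The red component ends up above 1.0, which shows how accumulated colour can overshoot the normal range.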

Let's see some results:

test.png

That's pretty amazing! We can see the green ball reflected in the yellow ball. In fact we can also see the red ball reflected in that yellow ball too.

We have reflections working! Example code is at github.


Random Spheres Art

Here's a nice triptych of random spheres, with reflections a key feature. It's created using only the ideas we've worked through above.

testr2.png  testr1.png  testr3.png


Objects in Ray-Tracing

Before we go on to define another kind of object, in addition to the sphere we have worked with up to now, it is worth thinking about the minimum set of things any such object definition must provide.

There are only 3 things:

  • being able to test whether a ray intersects the object or not
  • being able to provide a normal vector (pointing out) at any point on the object
  • a material colour at any point

Any kind of object, simple like a plane, or complicated like a torus, must be able to provide those three bits of information when needed. Let's look at a plane next.
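
One way to capture those three requirements in Python is an abstract base class that every object type must fill in. The class and method names below (`SceneObject`, `intersect`, `normal_at`, `colour_at`) are hypothetical and not taken from the example code.

```python
import math
from abc import ABC, abstractmethod

class SceneObject(ABC):
    """The minimum any ray-traceable object must provide."""

    @abstractmethod
    def intersect(self, origin, direction):
        """Return the distance t along the ray to the nearest hit, or None."""

    @abstractmethod
    def normal_at(self, point):
        """Return the outward-pointing unit normal at a point on the surface."""

    @abstractmethod
    def colour_at(self, point):
        """Return the material colour at a point on the surface."""

class Sphere(SceneObject):
    def __init__(self, centre, radius, colour):
        self.centre, self.radius, self.colour = centre, radius, colour

    def intersect(self, origin, direction):
        cs = tuple(o - c for o, c in zip(origin, self.centre))
        a = sum(d * d for d in direction)
        b = 2.0 * sum(d * k for d, k in zip(direction, cs))
        c = sum(k * k for k in cs) - self.radius ** 2
        disc = b * b - 4.0 * a * c
        if disc < 0:
            return None
        t = (-b - math.sqrt(disc)) / (2.0 * a)
        return t if t > 0 else None

    def normal_at(self, point):
        return tuple((p - c) / self.radius for p, c in zip(point, self.centre))

    def colour_at(self, point):
        return self.colour  # uniform colour everywhere on the surface

ball = Sphere((0.0, 0.0, 5.0), 1.0, (1.0, 0.0, 0.0))
t = ball.intersect((0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

With this shape, the ray function never needs to know what kind of object it has hit.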


Flat Plane

How do we define a flat plane mathematically? It may not be obvious but it is true that we only need a point on the plane and its normal vector. That is, a point, any point, through which the plane passes. This pins it to a point in space. But the plane could have any orientation through that point, which is why we need the normal to tell us which way the plane is facing. The following diagram illustrates this.


We now need to work out how to test for intersection. That's actually easy, because a ray will always hit an infinite flat plane unless it is parallel to it. But we need to know where a ray hits a plane, so that we can then work out things like reflections and illumination from a light source.

The key to this is to realise that a vector from that defining point, let's call it $\mathbf{X}$, to any point $\mathbf{P}$ on that plane, is always perpendicular to the normal, $\mathbf{N}$. The following illustrates this:


Let's write out what we just said, in mathematical form:

$$ (\mathbf{P} - \mathbf{X}) \cdot \mathbf{N}= 0 $$

Substituting that point $\mathbf{P}$ with the definition of a ray,

$$ (\mathbf{C} + t \cdot \mathbf{D} - \mathbf{X}) \cdot \mathbf{N}= 0 $$

It's really simple to re-arrange that so we have $t$ on one side:

$$ t = \frac{(\mathbf{X} - \mathbf{C}) \cdot \mathbf{N}} {\mathbf{D} \cdot \mathbf{N}} $$

That bottom part of the fraction, $\mathbf{D} \cdot \mathbf{N}$, acts like the $\Delta$ test we saw before for the sphere. If it is zero, we can't divide by it, and so there is no intersection. That's when $\mathbf{D}$ and $\mathbf{N}$ are perpendicular (ray parallel to plane).
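
The formula for $t$ can be sketched as follows. The function name `intersect_plane` and the small tolerance `eps` are illustrative choices, and rejecting negative $t$ (hits behind the start of the ray) is an assumption added here.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def intersect_plane(C, D, X, N, eps=1e-9):
    """Return t where the ray C + tD meets the plane through X with normal N, or None."""
    denom = dot(D, N)
    if abs(denom) < eps:
        return None                      # ray is parallel to the plane
    t = dot(sub(X, C), N) / denom
    return t if t > 0 else None          # assumption: ignore hits behind the ray

floor_point, floor_normal = (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
down = intersect_plane((0.0, 1.0, 0.0), (0.0, -1.0, 0.0), floor_point, floor_normal)
sideways = intersect_plane((0.0, 1.0, 0.0), (1.0, 0.0, 0.0), floor_point, floor_normal)
```

A ray starting one unit above the floor and pointing straight down hits it at `t = 1`, while a ray travelling sideways never hits it.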

The results do work well ... but only after we refine the accumulation of colour by diminishing how much is accumulated in proportion to the depth - otherwise an unrealistic amount of light is accumulated as the number of reflections increases. See the slides for more detail on this.

test_d4.png


Texture

In real life, most objects aren't uniformly coloured. They have variations of colour in, often recognisable, patterns. Wood and marble stone patterns are easily recognisable, for example.

HT1WQ7hFF8eXXagOFbXW.jpg

How do we add texture to our objects? At the moment we're using a simple base colour, like red for the sphere we started with. We know we need to vary the colour that's returned when a ray intersects an object .. but how do we vary it?

We know a pattern is spatial - the variation in colour depends on the location we're looking at. This is the key. We need to be able to connect, link a location on an object with a part of a known pattern. As we vary the location on the object, we move around the pattern. We can do this in two ways - we can have a pattern that is a bitmap image texture, or we can use a mathematical expression to define a pattern.


Here's an example of a texture defined by a mathematical expression where the red element of the object's colour is $\sin(3x) + \sin(4y) + \sin(4z)$.

texture_4.png

Here's one where the colour is defined only by the vertical $y$ component of the position, where the red element is $\sin(y^2)$.

texture_2.png
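
The two mathematical textures above could be written as small functions of position. The sine expressions come from the text, but rescaling the result into the 0 to 1 range is an assumption, since the text doesn't say how out-of-range values were handled. The function names are hypothetical.

```python
import math

def wavy_red(x, y, z):
    """sin(3x) + sin(4y) + sin(4z), rescaled from [-3, 3] to [0, 1]."""
    raw = math.sin(3 * x) + math.sin(4 * y) + math.sin(4 * z)
    return (raw + 3.0) / 6.0

def banded_red(y):
    """sin(y^2), rescaled from [-1, 1] to [0, 1]."""
    return (math.sin(y * y) + 1.0) / 2.0

# The red element of the colour at a point where a ray hit the object.
red = wavy_red(0.3, 1.2, -0.7)
```

The ray function would call one of these with the intersection point instead of returning a fixed base colour.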

Here's one where the texture is defined by a bitmap of a marble texture.

bm_texture_1.png

The texture looks oddly stretched because it is not trivial to map a flat texture to a spherical surface .. the same problem as projecting the Earth's surface to a flat map.

We can even use a function for random noise to define a texture. Purely random noise is often not very useful in computer graphics. Instead, a smoother noise is more realistic. Often Perlin noise, or OpenSimplex noise, is used .. you can see the difference from this comparison. With this kind of noise, successive values are close to previous values, making the transition similar to real world phenomena like mountains or clouds.

ezAat.png

The following shows sphere textures based on OpenSimplex noise using the $(x,y,z)$ components of the surface normal.

opensimplex_2.png

These are really nice, and because we use the surface normal, we avoid the problems of mapping spherical surfaces to 2-dimensional textures.

Mathematically defined textures are really fun to experiment with, and the possibilities are endless!


Light Fall-Off

A nice lighting effect which we see in photography and in paintings is light falling off very rapidly with distance from the light source.

_DSC8365 janet during power cut.jpg

This should be easy to implement. We simply need to calculate the distance from an intersection point to the light source and apply a function which forces a rapid fall off.

In physics, we know the light should fall off as an inverse square of the distance. In practice, we can use sharper functions to exaggerate the fall-off for artistic effect. Here's a set of functions based on $\tanh(x)$.
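
The exact functions used in the talk aren't given in this write-up, so the following is only a sketch of the general idea. The function `tanh_falloff` and its `reach` and `sharpness` parameters are assumptions chosen for illustration.

```python
import math

def inverse_square(distance):
    """Physically motivated fall-off: intensity drops as 1 / distance^2."""
    return 1.0 / (distance * distance)

def tanh_falloff(distance, reach=5.0, sharpness=1.0):
    """Stay close to 1 near the light, then drop towards 0 around `reach`.

    A larger sharpness makes the drop more sudden.
    """
    return 0.5 * (1.0 - math.tanh(sharpness * (distance - reach)))

near = tanh_falloff(0.0)    # very close to 1
edge = tanh_falloff(5.0)    # exactly half way down
far = tanh_falloff(10.0)    # very close to 0
```

The returned factor multiplies the light reaching an intersection point, using the distance from that point to the light source.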


The effect is rather pleasing:

falloff_1.png

Using the effect on textured objects also works well:

opensimplex_3noise.png


Fog

The last effect we looked at was fog. This is different to everything we've looked at before because it is an atmospheric effect, not an effect on the surface of objects.

68736036-gorgeous-fog-wallpapers.jpg

There are many methods for modelling atmospheric effects, and we'll look at two simple ones.

The first is one inspired by how many textbooks describe atmospheric effects. They think of a fog as a volume through which light passes, and with which it has a chance of interacting. This makes sense. Fog is made of particles (just as smoke is), and light can pass through it, or hit a particle, which is why fogs obscure a scene.


The deeper the fog, the larger the probability that a light ray has hit a fog particle. For rays that don't hit an object, but carry on out of the scene, we can set an artificial large distance. The results look like this:

fog_fog_2a.png

Well, we do see a diminishing of colour for distant objects, but the overall effect isn't very pleasing. The image is too grainy, and the background is unrealistic.

Let's try again, and think in plain English for ourselves. We want to have a smoother diminishing of object colour with distance, towards a fog colour (white, but could be black or brown smog). We can try using a smooth gradient and not the speckling effect we get from the random probability method above. We also want to have some variation in our fog, a lumpy fog. We can use a noise function to create this. Here's a summary of the idea:
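
A smooth blend towards a fog colour can be sketched like this. The exponential curve, the `density` parameter and the function names are assumptions. The `lumpiness` argument stands in for a value from a noise function such as OpenSimplex, which is left out here to keep the example self-contained.

```python
import math

def fog_amount(distance, density=0.1, lumpiness=0.0):
    """Fraction of fog colour to mix in: 0 is clear, 1 is fully fogged."""
    base = 1.0 - math.exp(-density * distance)    # grows smoothly with distance
    return min(1.0, max(0.0, base + lumpiness))   # keep the result in 0..1

def apply_fog(colour, fog_colour, amount):
    """Blend an object colour towards the fog colour."""
    return tuple((1.0 - amount) * c + amount * f for c, f in zip(colour, fog_colour))

red = (1.0, 0.0, 0.0)
white_fog = (1.0, 1.0, 1.0)
close_up = apply_fog(red, white_fog, fog_amount(1.0))
far_away = apply_fog(red, white_fog, fog_amount(50.0))
```

Nearby objects keep most of their own colour, and distant ones fade smoothly into the fog without any speckling.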


For testing we'll use a more visible green fog, to see more clearly the effects of our ideas. Here's an image of just the lumpy noise applied:

fog_fog_3.png

Here's the result with the lumpy fog texture augmented with the distance based intensity.

fog_fog3b.png

Looking at one of the closer spheres, you can indeed see the lumpiness of the fog.


POV-Ray

Finally, we looked at a real ray-tracer. There are several expensive and very sophisticated software renderers, some proprietary to visual effects companies, but we looked at POV-Ray.

POV-Ray is free and open source, and was started around 1986. It was very popular in the 1990s and early 2000s as being the leading, and accessible, ray-tracing software. I myself spent many hours exploring POV-Ray, using a book, before the internet was as rich and available as it is now.


I would encourage you to explore POV-Ray because you describe a scene in simple code, and this forces you to understand more closely the effects and methods being applied.

Here's a simple scene created in POV-Ray:

a.png

The foreground is a height field created from an image of a Julia fractal. The sphere moon has a texture where the colours are also based on the same fractal image.

We can add fog to the image to create a more realistic image:

f.png

We can improve this even further by applying a focal plane effect where objects nearer or further from that plane are blurred, much like a real camera with a large aperture. The landscape has also had a texture applied to it, giving it the appearance of stratified rocks.

i.png

Not a bad result for such a short time experimenting with POV-Ray!

All the code is available on github.


Interesting Questions

There were some great interesting questions from members of the audience during and after the meetup:

Q. Real objects absorb and reflect certain parts of the light spectrum. How does this ray-tracer reflect that? 

A. It doesn't. Our simple ray-tracer has a very simple model of light, reflection and accumulating object colour. Today's most sophisticated ray-tracers do model light as a continuum of frequencies and more correctly model the selective absorption of certain frequencies by different materials.


Q. Accumulating colour with larger depths can cause the values to overshoot the normal colour range (0-255 for RGB values). How do you handle this?

A. You're right and the slides do show this overshoot happening. I switched from integers (uint8) to floating point (float64) numbers for the colour components. This way I don't need to worry about overshooting. Before the image is rendered, a squishing function based on $\tanh()$ again is used to bring all the values back into the 0-255 range. This also has the benefit of handling high and over saturation realistically.
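
As a rough illustration of that squishing step, here is one possible form. The scaling inside the $\tanh()$ is an assumption, since the exact function isn't given here, and the name `squish` is hypothetical.

```python
import math

def squish(value, scale=255.0):
    """Map any non-negative colour value smoothly into the 0-255 range.

    Small values pass through almost linearly, and large values level off
    towards 255 instead of wrapping around or being cut off abruptly.
    """
    return int(round(255.0 * math.tanh(max(0.0, value) / scale)))

pixel = tuple(squish(v) for v in (100.0, 300.0, 1000000.0))
```

However large the accumulated value gets, the result never exceeds 255.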


Q. How long does it take?

A. Ray-tracing has traditionally been seen as very time consuming. This is still true today. Our own very simple code only takes seconds for moderately sized images. The largest factor affecting ray-tracing time is the number of objects or the number of rays being spawned. A scene with 80 objects would take many minutes. Most of the simple scenes took a few seconds. This is very fast compared to computing in the 1980s which could take many hours or days. Our own code is in Python, a friendly easy to learn language, but not one that is very fast. POV-Ray is written in C/C++ and is much faster, with single-sphere scenes being rendered almost instantaneously.


Q. Why don't you spawn multiple rays at each intersection, rather than just one?

A. That's a great idea for further exploration :)