Kukuku wrote:EdgeDamodred wrote:The voxels themselves cannot be rearranged but the fact that we don't use voxel displays(yet, think holograms), allows them to be eventually subject to traditional rasterization techniques. The voxels have to blah blah blah, wordy words.
None of what you wrote made sense to me. And you didn't even need an accent. Congratulations.
MASSIVE WALL OF TEXT AHEAD WITH NO TL;DR VERSION EXCEPT PRETTY FORMULAS MAKE PRETTY PICTURES
Yeah, I myself hate it when someone explains something with nothing but terminology to make what they're saying shorter (hence the reason mathematical shorthand drives me nuts, even though I can actually understand the mathematics themselves most of the time; remembering all those symbols, and the fact that it's not written linearly, drives me bonkers. Which is why I like programming, because you have to write it linearly).
First off, you understand that your screen is just a bunch of rows and columns making up a lot of tiny, evenly sized, evenly spaced dots. Usually this is expressed as a resolution such as 1920 x 1080. In this case I have 1,920 dots going across and 1,080 going down. Each dot is a single color made up of a combination of intensities of the three primary colors of light: red, green and blue. Each dot is called a pixel, which is short for Picture Element.
A voxel is essentially the same concept except it uses a third axis. So instead of only the x axis and y axis used to tell where an individual pixel lies on the screen, we have a z axis as well to state how far "away" from the screen an element is, measured from whatever we designate as the origin. The name stands for Volumetric Pixel, or Volumetric Picture Element.
However, our screens only work in pixels, as a screen is a two dimensional display and not a three dimensional display, which are not commonplace. (Side note: the current "3D" TVs/monitors are not true 3D displays, as they rely on a trick called stereoscopic 3D that causes our brains to interpret a "3D" effect. Honestly it looks more like a pop-up book and is rather annoying in most cases, and it has been around for decades, yet they feel the need to bring it back every 25 years or so.)
Now, as a way to actually display something on a 2D screen, voxels are pretty useless in and of themselves. However, the concept of a voxel can still be used to define a large volume "efficiently", or at least easily. In the case of Minecraft, and apparently CraftStudio, the world is essentially divided into evenly sized cubes called voxels, and a player can tell each voxel what "color" it is in the game. A specific color determines the properties of the voxel, such as whether it's solid, whether it can be mined, what type of mining surface it is, etc. So the world can be efficiently stored because it is just a 3D grid of numbers with a fixed maximum length along each axis. Each number takes up a relatively small amount of memory, and the size of the overall data never changes, just the values.
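To make the "3D grid of numbers" idea concrete, here is a minimal sketch in Python. The block IDs (AIR, STONE, DIRT), the class name and the one-byte-per-voxel layout are my own assumptions for illustration; I have no idea how Minecraft or CraftStudio actually lay out their data.

```python
# Hypothetical block IDs; a real game defines its own table.
AIR, STONE, DIRT = 0, 1, 2


class VoxelGrid:
    """A fixed-size 3D grid of small integers stored in one flat bytearray."""

    def __init__(self, size_x, size_y, size_z):
        self.size = (size_x, size_y, size_z)
        # One byte per voxel, all initialised to AIR (0).
        self.cells = bytearray(size_x * size_y * size_z)

    def _index(self, x, y, z):
        sx, sy, sz = self.size
        if not (0 <= x < sx and 0 <= y < sy and 0 <= z < sz):
            raise IndexError("voxel coordinate outside the grid")
        # Flatten (x, y, z) into a single offset into the byte array.
        return x + sx * (y + sy * z)

    def get(self, x, y, z):
        return self.cells[self._index(x, y, z)]

    def set(self, x, y, z, block_id):
        self.cells[self._index(x, y, z)] = block_id


grid = VoxelGrid(16, 16, 16)
grid.set(3, 4, 5, STONE)   # change a value; the storage size stays the same
```

Note that setting a voxel only overwrites one number. The grid never grows or shrinks, which is the "size never changes, just the values" part.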
As I said, we can't display voxels on the screen, so we need some way of converting them into pixels. This is where the standard graphics pipeline comes into play. Since each voxel represents a cube, it can only be seen on 3 sides at most at any one time. Go ahead, find a block somewhere in your house and try to see more than 3 sides of it at once without some method of reflection (a mirror, a recently waxed car, a really shiny kitchen floor, etc.). So now, to display a single voxel, we just need three squares that are positioned and angled relative to our camera's view.
Unfortunately our monitor only understands pixels. While pixels are square themselves, they must always be positioned and oriented relative to the surface of the screen. So now we need to use some mathematics and logic to get this to work. First up is the idea of Transforms. A transform is simply something that describes an object's position (Translation), orientation (Rotation) or which way it is facing, and scale (how stretched out or squished it is along a particular axis). The transform is stored in a matrix, which is nothing more than a 2D grid of numbers. However, the positions in the grid do have meaning, as they correspond to terms in some equation(s), in this particular case the set of equations that translate (move), rotate (reorient), and scale (stretch or shrink) a point or set of points that make up an object. What this will allow us to do later on is apply a complex formula to a transform simply by multiplying the transform matrix by another matrix.
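For the curious, here is a rough sketch of what those transform matrices can look like. I'm assuming the common 4x4 layout where a point is treated as a column (x, y, z, 1); the function names are mine, and real graphics libraries differ in row/column conventions, so treat this as an illustration rather than any particular API.

```python
import math


def translation(tx, ty, tz):
    """Matrix that moves a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


def scaling(sx, sy, sz):
    """Matrix that stretches or squishes along each axis."""
    return [[sx, 0, 0, 0],
            [0, sy, 0, 0],
            [0, 0, sz, 0],
            [0, 0, 0, 1]]


def rotation_y(angle):
    """Matrix that rotates around the y axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0, s, 0],
            [0, 1, 0, 0],
            [-s, 0, c, 0],
            [0, 0, 0, 1]]


def transform_point(m, p):
    """Multiply matrix m by the point p = (x, y, z), treated as (x, y, z, 1)."""
    v = (p[0], p[1], p[2], 1.0)
    out = [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]
    return tuple(out[:3])


moved = transform_point(translation(5, 0, 0), (1, 2, 3))   # (6.0, 2.0, 3.0)
stretched = transform_point(scaling(2, 1, 1), (1, 2, 3))   # (2.0, 2.0, 3.0)
```

Each position in the grid feeds one term of the equation for x, y or z, which is all "the positions have meaning" really amounts to.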
Now we need to convert each square surface into a set of triangles, in this case 2. The reason for using triangles is that a triangle is the simplest polygon there is, and all other polygons can be made up of just triangles. That, and some other reasons, such as a triangle always being convex (if you draw a line from any point within the shape to any other point within the shape, the line will be contained completely within the bounds of the shape), which a lot of other algorithms depend on. Polygons in general are used over other shapes such as circles or lines or points or some of those other geometric shapes that are bizarre, because they can define areas fairly precisely with fairly low memory cost and are pretty easy to work with.
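Splitting a square into two triangles is about as simple as it sounds. A minimal sketch, assuming the four corners are given in order around the square (the function name and the choice of diagonal are mine):

```python
def quad_to_triangles(a, b, c, d):
    """Split the square a-b-c-d into two triangles that share the diagonal a-c."""
    return [(a, b, c), (a, c, d)]


# One face of a unit cube, corners listed in order around the square.
face = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
triangles = quad_to_triangles(*face)
# triangles == [((0,0,0), (1,0,0), (1,1,0)), ((0,0,0), (1,1,0), (0,1,0))]
```

That gives six points in total, but only four distinct corners, because the two corners on the diagonal appear in both triangles.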
Next up is the concept of relative spaces. This one's a bit trickier to conceptualize without actually seeing it. But basically every possible transformation is relative to a particular origin. When I say "5 meters to the left", you have no frame of reference, but if I say "5 meters to the left of your desk" you know where I'm talking about. In that case, all transformations are relative to the origin, which is your desk. If I say "100 kilometers north from Atlanta, Georgia's center", we're talking about a different relative space. The cool thing is we can actually convert from one space to another. I'm not going to go into the math, but basically it involves multiplying one transform by another matrix.
We need to take an object and convert it through several different spaces. Since our square is made up of 2 triangles, it is made up of six points, with 2 of them shared between both triangles (so four distinct corners). Those points are relative to the voxel's origin (along with the points that make up the other 5 sides). We need to convert those points into World Space, so they become relative to the world's origin.
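Here is a small sketch of that Object Space to World Space step for one face of one voxel. I'm assuming the voxel's origin is its minimum corner, that voxels are axis-aligned and never rotated, and that a voxel's world position is just its grid coordinate times the voxel size. With those assumptions the conversion is a plain translation, which does the same job as multiplying by a translation matrix.

```python
# Corners of the +z face of a unit cube, relative to the voxel's own origin
# (assumed here to be the cube's minimum corner).
FRONT_FACE = [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]


def object_to_world(local_point, voxel_position, voxel_size=1):
    """Convert a point relative to a voxel into a point relative to the world."""
    return tuple(voxel_size * (g + l)
                 for g, l in zip(voxel_position, local_point))


# The voxel sitting at grid coordinate (3, 4, 5).
world_face = [object_to_world(p, (3, 4, 5)) for p in FRONT_FACE]
# world_face == [(3, 4, 6), (4, 4, 6), (4, 5, 6), (3, 5, 6)]
```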
Here's another tricky concept to understand: Camera, View, or Screen Space. When it comes to actually drawing stuff on screen, the screen does not move; you have to move the world in front of the screen, because it only draws a certain range from its own origin (-1 to 1 or 0 to 1 along an axis, depending on what you're using to draw, either OpenGL or Direct3D).
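The "move the world, not the screen" idea can be sketched in a couple of lines. I'm assuming a camera that only slides around and never rotates (a real view transform also undoes the camera's rotation), and the function name is made up:

```python
def world_to_view(point, camera_position):
    """Express a world-space point relative to the camera.

    Moving the camera one way is the same as moving the whole world the
    opposite way, so we subtract the camera's position.
    """
    return tuple(p - c for p, c in zip(point, camera_position))


tree = (10, 0, 20)
before = world_to_view(tree, (0, 0, 0))   # (10, 0, 20)
after = world_to_view(tree, (2, 0, 0))    # camera stepped right: (8, 0, 20)
```

The tree never moved in the world; it only ended up 2 units further left relative to the camera.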
While we're done moving points into different spaces, we're not done with multiplying them by matrices. The next one up is applying a projection to the points to give a sense of depth. We have two types of projections we can apply, Orthographic and Perspective.
Even though Orthographic may sound like the more involved one, as you may or may not have ever heard the word, it's actually the simplest. It essentially just removes the depth or z value of a point and creates a rather flat look where objects will be shown at the same size regardless of how far back they are from the camera. This kind of projection is common in CAD (Computer-Aided Design) and 3D modeling/level editing packages, as it is useful for easily comparing two objects size-wise without navigating to each of them and busting out a measuring tape.
Perspective Projection gives us that sense of depth, but it is mathematically harder. The nice thing is we can wrap the equation into a matrix that we can apply to our points simply by multiplying the point by the matrix. If you've ever taken an art class in school, you may have done perspective drawing, usually involving drawing a bunch of lines all converging on a center point to give a sense of depth to a building, so the back end looks smaller the "further back" it goes into the picture. The nice thing is that this can be done mathematically and has been done by artists for centuries. In Europe you can find certain historical homes where entire rooms appear to be filled with bookcases and furniture but are in fact just 6 flat walls.
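To show the difference between the two projections, here is a small sketch. I'm assuming a camera sitting at the origin and looking down the +z axis, with the picture plane at distance d in front of it; real APIs use more elaborate projection matrices (field of view, near/far planes), so this is only the bare idea. The perspective version puts the equation in a matrix and then divides by the fourth value that comes out, usually called w.

```python
def project_orthographic(point):
    """Orthographic: simply throw the depth (z) away."""
    x, y, _z = point
    return (x, y)


def perspective_matrix(d):
    """Bare-bones perspective matrix for a picture plane at distance d."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 1 / d, 0]]


def project_perspective(point, d=1.0):
    """Perspective: multiply by the matrix, then divide by w (needs z > 0)."""
    v = (point[0], point[1], point[2], 1.0)
    m = perspective_matrix(d)
    xh, yh, _zh, w = (sum(m[r][c] * v[c] for c in range(4)) for r in range(4))
    return (xh / w, yh / w)


near = (1.0, 1.0, 2.0)
far = (1.0, 1.0, 4.0)   # same x and y, twice as far away
```

With these two points, the orthographic projection gives (1.0, 1.0) for both, while the perspective projection gives (0.5, 0.5) for the near one and (0.25, 0.25) for the far one, so the far point is pulled toward the center just like the converging lines in a perspective drawing.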
Note: The nice thing about using matrix math to do the space conversions is we can combine all the matrices(the ones that convert from Object Space to World Space, to Screen Space, and apply a projection) by multiplying them in the reverse order into a single matrix and then just multiply each point by that single matrix.
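A quick sketch of that combining trick, with made-up matrices: a translation for Object to World, another translation standing in for World to Camera, and a bare-bones perspective matrix. I'm assuming points are columns (x, y, z, 1), which is why the first matrix to be applied ends up rightmost in the product.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def apply(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def translation(tx, ty, tz):
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]


model = translation(3, 4, 5)      # Object Space -> World Space
view = translation(0, 0, 10)      # World Space -> Camera Space (hypothetical camera)
projection = [[1, 0, 0, 0],       # bare-bones perspective, picture plane at d = 1
              [0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 1, 0]]

point = [1, 1, 0, 1]

# One matrix at a time, in the order the steps happen...
step_by_step = apply(projection, apply(view, apply(model, point)))

# ...or multiply the matrices together once (reverse order), then reuse the result.
combined = matmul(projection, matmul(view, model))
all_at_once = apply(combined, point)
```

Both routes give [4, 5, 15, 15] for this point. The win is that `combined` is computed once and then reused for every point in the object.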
Okay great! We now have cubes defined as voxels converted from squares to triangles to points that make up the triangle and have converted and projected them so they hopefully show up on screen...we still haven't even drawn the damn things yet!
Now we enter the fun world of Rasterization! Rasterization is converting a Vector Graphic into a Raster Graphic! "Oh noes! More termeseses Precious!" Time to break these down.
A Raster Graphic is pretty easy, as it is what your screen is and what any photograph you store digitally is: an image defined by a set of pixels. A Vector Graphic is an image defined by mathematical equations (which a triangle is). The cool thing about Vector Graphics is that they do not suffer from distortion due to pixelation the closer you look in on them; they will always appear smooth, even on curved edges. For various reasons we no longer use Vector Graphic displays, but if you can find some old arcade machines (especially the Star Wars arcade game that featured the trench run) you can see some examples of vector displays. While the edges appear smooth, the areas defined by them are either empty, creating a wireframe look, or a single solid color, which isn't terribly useful, especially if two or more vector images are next to each other and are the same color. Plus, filled areas take significantly more processing power. So we need to convert them into Raster Graphics.
Sadly this can't be done by simple matrix multiplication and basically requires a brute force algorithm. There are basically 2 well known algorithms to do this. One finds the slope of each edge of the triangle and then goes up each pixel along the edge of the triangle and fills in that row to a certain point, based on another formula that I believe involves the center of the triangle. So basically you fill in 3 smaller triangles for each triangle. The second algorithm is much more brute force but is faster and is the standard one used by most graphics hardware today. It simply defines the smallest box around the triangle and then tests whether or not each pixel resides within the bounds of the triangle's edges.
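Here is a minimal sketch of that second, bounding-box approach. The "is this pixel inside?" test I'm using is the usual signed-area (edge function) check, sampled at each pixel's center; that detail, the function names and the choice to accept either winding order are my assumptions, and real hardware adds things like tie-breaking rules for pixels that sit exactly on a shared edge.

```python
import math


def edge(a, b, p):
    """Signed area test: which side of the line a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize_triangle(v0, v1, v2, width, height):
    """Return the (x, y) pixels whose centers fall inside the triangle."""
    # Smallest box around the triangle, clamped to the screen.
    min_x = max(int(math.floor(min(v0[0], v1[0], v2[0]))), 0)
    max_x = min(int(math.ceil(max(v0[0], v1[0], v2[0]))), width - 1)
    min_y = max(int(math.floor(min(v0[1], v1[1], v2[1]))), 0)
    max_y = min(int(math.ceil(max(v0[1], v1[1], v2[1]))), height - 1)

    covered = []
    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            p = (x + 0.5, y + 0.5)          # test the pixel's center
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            # Inside if the point is on the same side of all three edges
            # (either all >= 0 or all <= 0, so winding order does not matter).
            if (w0 >= 0 and w1 >= 0 and w2 >= 0) or \
               (w0 <= 0 and w1 <= 0 and w2 <= 0):
                covered.append((x, y))
    return covered


pixels = rasterize_triangle((0, 0), (4, 0), (0, 4), width=8, height=8)
```

Every pixel test is independent of every other one, which is a big part of why this style suits graphics hardware: lots of pixels can be tested at the same time.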
There's quite a bit more to it, but we've gotten to the point where we've converted worlds defined by voxels into pixel positions. There's still the matter of textures and lighting and the other fun stuff. I will post more if anyone is interested, or you can google "The Rendering Pipeline", though you're more likely to come up with more technical explanations.