Tile-map rendering in Tristeon

In 2D level design, tilesets and tilemaps are commonly used to accelerate the process of creating scenes or images. However, rendering these elements can become inefficient as the level size increases, resulting in lower performance.

To address this issue, I designed a new technique that uses a full-screen fragment shader to sample tiles on a per-pixel basis. Unlike traditional rendering techniques that render tiles one by one, this approach only samples tiles whose pixels are actually visible, so the rendering cost stays constant, O(1) in the number of tiles, regardless of level size. In this post, I will discuss this new technique in detail, including its benefits and limitations.

Definitions

A tileset is a grid-like object that can hold multiple 2D graphics at once, all within a single texture. Each cell in the tileset grid is called a tile. Although tiles often take the shape of squares due to the grid structure, they can have any shape.

A tilemap is a composition of tiles used to create a scene or image. Tilesets and tilemaps are common practices in 2D level design as they accelerate the process by allowing artists to reuse and adjust existing art, and specialized tools can be created for creating and manipulating them.

The Problem

When tiles are rendered linearly as individual objects, the rendering cost quickly becomes unmanageable, as demonstrated in the table below:

Tile count | Render time
1          | 0.030ms
10         | 0.177ms
100        | 1.512ms
1.000      | 3.780ms
10.000     | 37.230ms
100.000    | 379.803ms
1.000.000  | 3915.341ms

There are existing approaches to mitigating this issue, such as culling, or building a single vertex buffer that represents the tiles to reduce the number of draw calls. Each of these methods has its own limitations, however: culling only helps at limited camera zoom levels, and dynamically rebuilding vertex buffers carries a cost of its own. To overcome these limitations, I designed a new approach to rendering large tilemaps.

Technique

Rendering a full-screen shader
A full-screen triangle is rendered using a technique described by Randall Rauwendaal in https://rauwendaal.net/2014/06/14/rendering-a-screen-covering-triangle-in-opengl/. This approach simplifies rendering a full-screen quad by eliminating the need for buffers.
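Rauwendaal’s trick generates the triangle’s three corners directly from gl_VertexID, so no vertex buffers or attributes are required. A minimal sketch of such a vertex shader (the exact code in Tristeon may differ):

```glsl
#version 330 core

out vec2 texCoord;

void main()
{
    //Derive the corners (-1,-1), (3,-1) and (-1,3) from the vertex ID;
    //this single oversized triangle covers the entire clip-space square.
    float x = -1.0 + float((gl_VertexID & 1) << 2);
    float y = -1.0 + float((gl_VertexID & 2) << 1);

    //Map clip space [-1, 1] to texture coordinates [0, 1]
    texCoord = vec2(x, y) * 0.5 + 0.5;

    gl_Position = vec4(x, y, 0.0, 1.0);
}
```

Drawing it is then a single glDrawArrays(GL_TRIANGLES, 0, 3) call with an empty vertex array object bound.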

Fragment shader
The next step is to utilize the fragment shader of the triangle to map each pixel to its corresponding tile. In the main function of the fragment shader, the screen UVs are scaled to identify the tile beneath the pixel and determine the appropriate UVs to pass to the getTileUV() function.

Center UVs
Texture coordinates typically define the (0, 0) point as the bottom-left corner, but for us, this point is set at the center of the screen. Therefore, the texture coordinates require some slight adjustments:

vec2 coords = texCoord;
coords.x -= 0.5f;
coords.y -= 0.5f;

Camera position & zoom
Subsequently, we need to adjust the current coordinates to align with the camera’s position. By performing this step early on, we can then utilize the coordinates to determine the position on the grid in world space.

coords.x += float(camera.posX) / (camera.pixelsX / camera.zoom);
coords.y += float(camera.posY) / (camera.pixelsY / camera.zoom);

The camera’s position is measured in pixels, so before we can use it we need to convert it into normalized screen coordinates. These are usually values between 0 and 1, where {x / width, y / height} represents a pixel’s position on the screen.

To achieve this, we divide the camera’s position by the camera’s pixel count. Zooming can then be achieved by adjusting the number of pixels the camera displays, which is done by dividing the pixel count by the camera’s zoom value. For example, with a 1920-pixel-wide camera at zoom 1, a camera position of 960 pixels shifts the coordinates by 960 / 1920 = 0.5, i.e. half a screen width.

Determine tile size
By utilizing the camera’s pixel measurements and zoom value, we can ascertain the range of texture coordinates of a tile displayed on the screen.

float normalizedTileWidth = float(level.tileRenderWidth) / (camera.pixelsX / camera.zoom); 
float normalizedTileHeight = float(level.tileRenderHeight) / (camera.pixelsY / camera.zoom);

Using the texture coordinate range, we can create a grid of tiles on the screen, and any pixels that represent tiles outside of the level’s dimensions can be discarded at this point.

float tileX = (coords.x / normalizedTileWidth);
float tileY = (coords.y / normalizedTileHeight);
if (tileX >= level.width || tileY >= level.height || tileX < 0 || tileY < 0) 
    discard; //Discard all out of map tiles

Calculate tile index and UV
We can now use the previously calculated tileX and tileY variables to determine both the index of the tile and the UV coordinates within the tile:

//Calculate tile index by taking the integer part of tileX and tileY
uint dataX = uint(floor(tileX));
uint dataY = uint(floor(tileY));
uint dataIndex = dataY * level.width + dataX;

//Calculate tile UVs by taking the decimal part of tileX and tileY
float tileU = mod(tileX, 1.0f);
float tileV = mod(tileY, 1.0f);

Read tile data
Now that we have calculated the tile’s index, we can use it to look up which tile is placed at that position in the tile-map. For this purpose, we use the texelFetch() function, which reads the buffer texture by taking an integer texel index, accessing texels directly without any filtering. It works similarly to the texture() function, but takes integer coordinates instead of normalized ones.

//Read the tile's tile-set first and discard if it's not the one currently being rendered
int tileSetValue = texelFetch(level.data, int(dataIndex) * 2 + 1).r;
if (tileSetValue != int(tileSet.id))
    discard;

//Read the tile's value
int dataValue = texelFetch(level.data, int(dataIndex) * 2).r;
if (dataValue == -1)
    discard; //Discard empty tiles

//Convert the tile's value to x,y coordinates on the tile-set
ivec2 tileIndex = ivec2(dataValue % int(tileSet.cols), dataValue / int(tileSet.cols));
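The snippets above assume the level data is exposed to the shader roughly as follows. This is an illustrative sketch whose field names simply mirror the snippets in this post; the actual layout can be found in Tristeon’s TileShader.frag, linked under Results:

```glsl
struct Level
{
    isamplerBuffer data;   //Integer buffer texture, 2 ints per tile: {tile value, tile-set ID}
    uint width;            //Level width, in tiles
    uint height;           //Level height, in tiles
    uint tileRenderWidth;  //On-screen tile width, in pixels
    uint tileRenderHeight; //On-screen tile height, in pixels
};
uniform Level level;
```

An empty cell stores -1 as its tile value, which is why fragments whose dataValue equals -1 are discarded.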

getTileUV()
The function below computes the tile-set UV based on UV coordinates and the tile’s x and y coordinates on the texture.

vec2 getTileUV(vec2 uv, uint tileX, uint tileY)
{
    //Coords beyond our tileset would mess up spacing, so we wrap them
    tileX = tileX % tileSet.cols;
    tileY = tileSet.rows - uint(1) - (tileY % tileSet.rows);
    ivec2 texSize = textureSize(tileSet.texture, 0);

    //Determine the amount of pixels per tile
    uint tilePixelsX = (uint(texSize.x) - ((tileSet.spacingLeft + tileSet.spacingRight) + ((tileSet.cols - uint(1)) * tileSet.horizontalSpacing))) / tileSet.cols;
    uint tilePixelsY = (uint(texSize.y) - ((tileSet.spacingTop + tileSet.spacingBottom) + ((tileSet.rows - uint(1)) * tileSet.verticalSpacing))) / tileSet.rows;

    //Determine the start point of the tile depending on spacing
    uint startX = tileSet.spacingLeft + (tileX * tilePixelsX) + (tileX * tileSet.horizontalSpacing);
    uint startY = tileSet.spacingBottom + (tileY * tilePixelsY) + (tileY * tileSet.verticalSpacing);

    //Scale UV to tile coords, then normalize into texture coords
    float x = ((uv.x * tilePixelsX) / texSize.x);
    float y = ((uv.y * tilePixelsY) / texSize.y);

    //Add start pixels, also scaled into normalized texture coords
    float u = x + (startX / float(texSize.x));
    float v = y + (startY / float(texSize.y));
    
    return vec2(u, v);
}

Sample Texture
The last step is to convert the tile’s tile-set coordinates and its local UV into a UV on the tile-set texture. We can use the getTileUV() function described above to calculate this UV by passing it the tile UV values we calculated earlier. The result can then be used to determine the fragment color.

vec2 tileSetUV = getTileUV(vec2(tileU, tileV), uint(tileIndex.x), uint(tileIndex.y));
FragColor = texture(tileSet.texture, tileSetUV);

Results

After implementing the complete tile-map shader (available at https://github.com/Tristeon/Tristeon/blob/master/bin/Internal/Shaders/TileShader.frag), we can now take a look at its results:

Tile-map size | Tile count | Render time
1×1           | 1          | 0.034ms
10×10         | 100        | 0.032ms
100×100       | 10.000     | 0.035ms
1.000×1.000   | 1.000.000  | 0.050ms

By eliminating the need to iterate over each tile and render them separately, we’ve achieved near-O(1) speeds. I was unfortunately unable to test tilemaps larger than 1 million tiles due to OpenGL buffer texture limits.

1 million tiles rendered in 0.050ms

Limitations

Shape limitation
The current renderer only functions with tiles that have a square shape, and accommodating other shapes, like hexagonal grids, would entail a considerable amount of extra effort.

Driver limitations
Although buffer textures can be utilized on any device that runs OpenGL 3.1 or newer, the OpenGL specification only requires drivers to support buffer textures of at least 65,536 texels; anything beyond that minimum is up to the GPU’s OpenGL drivers.

The algorithm employs 2 integers per tile, and each integer occupies a distinct texel. If the graphics driver only supports the minimum required by the standard, the algorithm can therefore address at most 32,768 tiles. This constraint may restrict the developer to smaller tile-maps, such as 300×100 (which consists of 30,000 tiles).

Future improvements

Easing size limits
The buffer texture could be substituted with a Shader Storage Buffer Object (SSBO), which has a substantially larger minimum size of 128MB by default (as per https://www.khronos.org/opengl/wiki/Shader_Storage_Buffer_Object). However, utilizing SSBOs involves a trade-off since some older GPUs may not support OpenGL 4.3, which is required for SSBO compatibility.
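As a sketch, the buffer texture and its texelFetch() lookups could then be replaced with an SSBO and plain array indexing (names here are illustrative, not Tristeon’s actual code):

```glsl
#version 430 core //SSBOs require OpenGL 4.3

//Replaces the integer buffer texture; still 2 ints per tile
layout(std430, binding = 0) readonly buffer LevelData
{
    int tiles[]; //{tile value, tile-set ID} pairs
};

//The lookups then become plain array accesses, e.g.:
//int tileSetValue = tiles[int(dataIndex) * 2 + 1];
//int dataValue    = tiles[int(dataIndex) * 2];
```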

Mip-mapping
The technique enables the renderer to render a considerable amount of tiles without compromising on performance. Nevertheless, the visual quality of the tile-map renderer diminishes at lower zoom values since the tiles tend to blend together at lower mip-map levels. To address this issue, the mipmapping behavior could be modified, or alternatively, the tile-set could be separated into distinct tile textures during its creation.
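For the first option, one possible sketch is to clamp the mip level when sampling, so that heavily minified tiles stop blending into their neighbours in the tile-set. Here maxTileLod is an assumed uniform, not part of the current shader:

```glsl
//Query the mip level the hardware would normally select (GLSL 4.00+),
//then clamp it before sampling with an explicit level of detail
float lod = textureQueryLod(tileSet.texture, tileSetUV).x;
FragColor = textureLod(tileSet.texture, tileSetUV, min(lod, maxTileLod));
```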
