yes, this is andy

Shader Optimization Techniques

#rendering #shaders #optimization

Fragment shaders run for every covered pixel, every frame, so even small per-pixel savings add up quickly.

The Golden Rule

Profile first, optimize second. Modern compilers are smart. Don’t waste time optimizing code that isn’t slow.

Common Bottlenecks

Texture Fetches

Often the most expensive operation in a shader, because each fetch can stall on memory latency:

// BAD - Two fetches of the same texel
vec3 color = texture(albedoMap, uv).rgb;
float alpha = texture(albedoMap, uv).a;

// GOOD - Fetch once, reuse
vec4 albedo = texture(albedoMap, uv);
vec3 color = albedo.rgb;
float alpha = albedo.a;

Dependent Texture Reads

Reads where the UV depends on the result of a previous texture fetch, so the second fetch cannot start until the first one returns:

// SLOW - Dependent read
vec2 offset = texture(noiseMap, uv).rg;
vec3 color = texture(colorMap, uv + offset).rgb;

// FASTER - Pre-compute in vertex shader when possible
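
When the offset does not itself come from a texture (for example, a scrolling offset driven by a uniform), the second UV can be computed once per vertex and interpolated. A minimal sketch of that case, assuming hypothetical names `mvp`, `scrollOffset`, `texCoord`, `baseUV`, and `scrolledUV` that are not from the original:

```glsl
#version 330 core
// Vertex shader sketch: every name below is an illustrative assumption.
layout(location = 0) in vec3 position;
layout(location = 1) in vec2 texCoord;

uniform mat4 mvp;           // combined model-view-projection matrix
uniform vec2 scrollOffset;  // per-frame UV offset supplied by the application

out vec2 baseUV;            // unmodified UV
out vec2 scrolledUV;        // offset UV, computed once per vertex

void main() {
    baseUV = texCoord;
    scrolledUV = texCoord + scrollOffset;
    gl_Position = mvp * vec4(position, 1.0);
}
```

The fragment shader can then call `texture(colorMap, scrolledUV)` directly. An offset that really is read from `noiseMap` per pixel cannot be moved this way, so the dependent read stays.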

Math Operations

Some operations are more expensive:

Fast: add, multiply, mad (multiply-add)
Medium: reciprocal, rsqrt
Slow: division, sqrt, pow, sin, cos, tan

// BAD
float result = value / constant;

// GOOD - Pre-compute the reciprocal once, outside per-pixel work (e.g. on the CPU, passed in as a uniform)
float invConstant = 1.0 / constant;
float result = value * invConstant;
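
The reciprocal only saves work if it is computed less often than once per pixel. One way to sketch that is to have the application upload `1.0 / scale` as a uniform; the names `invScale`, `heightMap`, `uv`, and `fragColor` below are assumptions for illustration:

```glsl
#version 330 core
// Fragment shader sketch: `invScale` is a hypothetical uniform that the
// application sets to 1.0 / scale once, instead of dividing per fragment.
uniform sampler2D heightMap;
uniform float invScale;

in vec2 uv;
out vec4 fragColor;

void main() {
    float value = texture(heightMap, uv).r;
    float result = value * invScale;  // multiply replaces the per-fragment divide
    fragColor = vec4(vec3(result), 1.0);
}
```

If the divisor is a compile-time constant, the compiler will usually fold the reciprocal on its own, so this matters mainly for values that change at runtime.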

Optimization Patterns

Move to Vertex Shader

Calculations that don’t need per-pixel precision:

// Vertex shader
out vec3 viewDir;
void main() {
    vec3 worldPos = (model * vec4(position, 1.0)).xyz;
    viewDir = normalize(cameraPos - worldPos);
    // ...
}

// Fragment shader - viewDir arrives already calculated, but interpolated: renormalize it if you need unit length
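
A matching fragment shader might look like the sketch below. Interpolating unit vectors across a triangle shortens them slightly, so this version renormalizes before use; `normal` and `fragColor` are assumed names that do not appear in the original:

```glsl
#version 330 core
// Fragment shader sketch paired with the vertex shader above.
// `normal` and `fragColor` are assumed names, not from the original.
in vec3 viewDir;
in vec3 normal;
out vec4 fragColor;

void main() {
    // Interpolation between vertices shortens unit vectors, so renormalize.
    vec3 V = normalize(viewDir);
    vec3 N = normalize(normal);
    float NdotV = max(dot(N, V), 0.0);
    fragColor = vec4(vec3(NdotV), 1.0);
}
```

The saving is the per-pixel matrix multiply and subtraction. The renormalization is the price for getting correct results on large triangles.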

Use Lower Precision

Mobile GPUs benefit most; desktop GLSL accepts precision qualifiers but generally ignores them:

// Use mediump where possible
mediump vec3 normal;
mediump float roughness;

// Only use highp when necessary
highp vec3 worldPosition;
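
Put together, a GLSL ES 3.0 fragment shader might mix precisions as sketched here. The uniforms `albedoMap` and `lightPos` and the inputs `uv` and `normal` are assumptions for illustration:

```glsl
#version 300 es
// Fragment shader sketch for OpenGL ES 3.0; variable names are assumptions.
precision mediump float;        // default precision for floats in this shader

uniform mediump sampler2D albedoMap;
uniform highp vec3 lightPos;

in highp vec3 worldPosition;    // world-space positions can exceed mediump range
in mediump vec3 normal;
in mediump vec2 uv;

out mediump vec4 fragColor;

void main() {
    // Keep the position subtraction in highp, then drop to mediump.
    highp vec3 toLight = lightPos - worldPosition;
    mediump vec3 L = normalize(toLight);
    mediump float diffuse = max(dot(normalize(normal), L), 0.0);
    fragColor = vec4(texture(albedoMap, uv).rgb * diffuse, 1.0);
}
```

Colors, normals, and material parameters usually tolerate `mediump`; positions and large UV ranges are the usual candidates for `highp`.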

Vectorize Operations

Operate on whole vectors, and avoid repeating the same fetch once per channel:

// BAD - Scalar operations
float r = texture(tex, uv).r * factor;
float g = texture(tex, uv).g * factor;
float b = texture(tex, uv).b * factor;

// GOOD - Vector operation
vec3 color = texture(tex, uv).rgb * factor;

Branching

Avoid divergent dynamic branches, where neighboring pixels take different paths and the GPU ends up executing both sides:

// BAD - Dynamic branch
if (useTexture) {
    color = texture(tex, uv).rgb;
} else {
    color = baseColor;
}

// GOOD - Use mix/lerp (this always pays for the texture fetch, so profile both versions)
float useTex = float(useTexture);
color = mix(baseColor, texture(tex, uv).rgb, useTex);
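
The same idea applies when the condition varies per pixel, which is where divergence tends to hurt. A minimal sketch using `step()` to build the selector, with `maskMap`, `threshold`, `colorBelow`, and `colorAbove` as hypothetical names:

```glsl
#version 330 core
// Fragment shader sketch; `maskMap`, `threshold`, and the colors are assumptions.
uniform sampler2D maskMap;
uniform float threshold;
uniform vec3 colorBelow;
uniform vec3 colorAbove;

in vec2 uv;
out vec4 fragColor;

void main() {
    float height = texture(maskMap, uv).r;
    // step() returns 0.0 when height < threshold and 1.0 otherwise,
    // so mix() picks one color without a branch.
    float above = step(threshold, height);
    fragColor = vec4(mix(colorBelow, colorAbove, above), 1.0);
}
```

This works well when both sides are cheap to evaluate. When one side is expensive, a real branch may still win, so it is worth measuring.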

Advanced Techniques

Approximations

Sometimes exact isn’t necessary:

// pow(x, 5.0) is expensive; three multiplies give the same result
float pow5(float x) {
    float x2 = x * x;
    return x2 * x2 * x;
}

// Normalize with one inversesqrt instead of sqrt plus divide (undefined for a zero-length v)
vec3 fastNormalize(vec3 v) {
    return v * inversesqrt(dot(v, v));
}
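
A common place for `pow5` is Schlick's Fresnel approximation, F = F0 + (1 - F0)(1 - N·V)^5. The sketch below assumes a uniform `F0` (reflectance at normal incidence) and inputs `normal` and `viewDir`, none of which are named this way in the original:

```glsl
#version 330 core
// Fragment shader sketch: Schlick's Fresnel approximation using pow5().
// `F0`, `normal`, `viewDir`, and `fragColor` are assumed names.
uniform vec3 F0;  // reflectance at normal incidence

in vec3 normal;
in vec3 viewDir;
out vec4 fragColor;

float pow5(float x) {
    float x2 = x * x;
    return x2 * x2 * x;
}

void main() {
    float NdotV = clamp(dot(normalize(normal), normalize(viewDir)), 0.0, 1.0);
    // F = F0 + (1 - F0) * (1 - NdotV)^5
    vec3 fresnel = F0 + (1.0 - F0) * pow5(1.0 - NdotV);
    fragColor = vec4(fresnel, 1.0);
}
```

Here the approximation is Schlick's formula itself; the `pow5` helper inside it is exact.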

Lookup Tables (LUTs)

Pre-compute expensive functions:

// Pre-computed in texture
float fresnel = texture(fresnelLUT, vec2(NdotV, roughness)).r;

Pack Data

Use all texture channels:

// Pack multiple values in one texture
// R: Metallic
// G: Roughness
// B: Ambient Occlusion
// A: Height
vec4 surfaceData = texture(packedMap, uv);
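
Unpacking is then just a swizzle per value. A minimal sketch following the channel layout above, with `uv`, `fragColor`, and the debug-style output as assumptions:

```glsl
#version 330 core
// Fragment shader sketch; the channel layout follows the comment above,
// while `uv`, `fragColor`, and the debug output are assumptions.
uniform sampler2D packedMap;

in vec2 uv;
out vec4 fragColor;

void main() {
    vec4 surfaceData = texture(packedMap, uv);  // one fetch for four values
    float metallic  = surfaceData.r;
    float roughness = surfaceData.g;
    float ao        = surfaceData.b;
    float height    = surfaceData.a;
    // Write the unpacked values straight out, as a debug visualization.
    fragColor = vec4(metallic, roughness, ao, height);
}
```

Packed maps hold data rather than color, so they should be sampled from a linear (non-sRGB) texture format.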

Measuring Performance

Use GPU profilers:

Look for:

The Reality

Most shaders are fine. Focus optimization on:

Readability > premature optimization.

[!note] A shader that’s 10% slower but maintainable is usually the right choice.