The renderer at the end of the previous chapter knows direct lighting well. Directional sun, spot lights, point lights, all shadowed correctly with whichever filter the scene calls for. What it doesn’t know is the sky. Step outside on an overcast day: there is no sun, but the world is lit. Indoors, every wall is reflecting some fraction of the room’s light back. None of that exists in the renderer yet.
This chapter replaces analytic point lights with an environment: an HDR panorama wrapped around the scene as a continuous emitter. The path tracer in the previous series did this honestly, building a CDF over the panorama for importance sampling and casting tens of thousands of rays per frame. That’s not viable in real time. The trick is to pre-filter the environment once, at load, into representations that can be evaluated in a few texture lookups per pixel.
Two pre-filters cover the two halves of the BRDF. Diffuse — low-frequency, smoothly varying — projects beautifully onto spherical harmonics: nine coefficients that summarize the entire diffuse irradiance from any direction at O(1) shader cost. Specular — sharp, view-dependent, roughness-dependent — needs sample-by-sample work, but a small number of importance-sampled GGX samples with the right mip-level selection gets there.
The environment as light source
The HDR panorama is loaded as a single equirectangular texture with linear floating-point RGB. The shader maps a world-space direction to a UV coordinate via standard latitude/longitude unwrapping:
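// Assumes dir is normalized, so dir.y stays inside acos's [-1, 1] domain.
vec2 uvOfW(vec3 dir) {
    return vec2(
        0.5 - atan(dir.x, dir.z) / (2.0 * PI), // longitude -> u
        acos(dir.y) / PI                       // polar angle from +Y -> v
    );
}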
Every direction in 4π steradians is now a single texture lookup away. The panorama’s pixel values are radiance: they describe how much light arrives from each direction. The integral of that radiance over the hemisphere above the surface, weighted by the BRDF and the cosine of the incident angle, is the surface’s outgoing radiance. The renderer’s job is to compute that integral cheaply.
Eight HDRIs are loaded at startup, spanning warm sunset, moonless night, urban cobblestone with shop lights, suburban garden noon, coastal balcony, warm interior lounge, sunrise, and a studio lit by saturated colored lights. The shader doesn’t care which is active: the SH coefficients re-project per map at load time, the panorama re-binds, and the pipeline runs identically for any of them.
Diffuse, projected to nine numbers
The diffuse term integrates the environment’s radiance against cos(θ) over the upper hemisphere relative to each surface normal. The integrand is low-frequency — the cosine smooths everything out — which means it can be very accurately approximated by a low-order spherical harmonic projection.
A second-order SH expansion uses nine basis functions Y_lm(direction) for l ∈ {0, 1, 2}. At load time, every pixel of the panorama is projected onto each basis function and summed, producing nine RGB coefficients:
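// SH projection: L_lm = ∫∫ L(ω) · Y_lm(ω) · sinθ dθ dφ
// Body of the loop over panorama pixels: L is the pixel's RGB radiance,
// (x, y, z) the unit direction that pixel represents.
float Y[9];
evalBasis(x, y, z, Y);
float weight = sinTheta * dTheta * dPhi; // solid angle covered by this pixel
for (int k = 0; k < 9; ++k)
    Llm[k] += L * Y[k] * weight;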
One trap: the (x, y, z) direction passed to evalBasis must be the exact inverse of the uvOfW mapping the shader will use at runtime. If the load-time projection and the shader evaluation disagree about which direction a pixel represents, the SH coefficients are computed against one orientation and sampled against another. The result is a smoothly wrong irradiance with no obvious visual signature: the function is still continuous, just rotated against itself.
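One way to guard against that mismatch is to port the shader mapping to the CPU and check that the load-time direction function round-trips through it. The sketch below is illustrative only: `dirOfUV` and `maxRoundTripError` are hypothetical names, not functions from the renderer, and pixel centres at (i + 0.5) / w are an assumption about how the panorama is sampled.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { double x, y, z; };
struct Vec2 { double u, v; };

const double kPi = 3.14159265358979323846;

// CPU port of the shader's uvOfW. GLSL's two-argument atan(a, b)
// corresponds to std::atan2(a, b).
Vec2 uvOfW(Vec3 d) {
    return { 0.5 - std::atan2(d.x, d.z) / (2.0 * kPi),
             std::acos(d.y) / kPi };
}

// Hypothetical inverse used by the load-time projection: recover the
// longitude and polar angle from (u, v), then build the unit direction.
Vec3 dirOfUV(Vec2 uv) {
    double phi   = (0.5 - uv.u) * 2.0 * kPi; // longitude in (-pi, pi]
    double theta = uv.v * kPi;               // polar angle from +Y
    double s     = std::sin(theta);
    return { s * std::sin(phi), std::cos(theta), s * std::cos(phi) };
}

// Largest uv -> direction -> uv error over all pixel centres of a
// w x h panorama. A value near machine precision means the two
// mappings agree about which direction each pixel represents.
double maxRoundTripError(int w, int h) {
    double worst = 0.0;
    for (int j = 0; j < h; ++j) {
        for (int i = 0; i < w; ++i) {
            Vec2 uv   = { (i + 0.5) / w, (j + 0.5) / h };
            Vec2 back = uvOfW(dirOfUV(uv));
            worst = std::max(worst, std::fabs(back.u - uv.u));
            worst = std::max(worst, std::fabs(back.v - uv.v));
        }
    }
    return worst;
}
```

Running this check once at startup costs little, and a flipped sign or swapped axis shows up as a large error rather than as a subtly rotated irradiance field.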
A neat optimization: the cosine-weighted hemisphere convolution that turns radiance into irradiance can be folded into the coefficients at this stage, because the convolution kernel is also expressible in low-order SH and acts diagonally on each band. The Ahat factors per band are constants (l=0: π, l=1: (2/3)π, l=2: (1/4)π). Multiplying them in once at the end means the shader never needs to apply them at runtime:
const float Ahat[9] = {
    M_PI,                                                // l = 0
    (2.0/3.0)*M_PI, (2.0/3.0)*M_PI, (2.0/3.0)*M_PI,      // l = 1
    (1.0/4.0)*M_PI, (1.0/4.0)*M_PI, (1.0/4.0)*M_PI,      // l = 2
    (1.0/4.0)*M_PI, (1.0/4.0)*M_PI,
};
// Fold the cosine-lobe convolution into the radiance coefficients once.
for (int k = 0; k < 9; ++k) coeffs[k] = Llm[k] * Ahat[k];
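The projection and the band scaling can be sanity-checked together against environments whose irradiance is known in closed form. The sketch below is a single-channel CPU version under stated assumptions: the radiance comes from a callback instead of a texture, integration is a midpoint rule over a w x h latitude/longitude grid, and `projectIrradianceSH` is a hypothetical name. A constant environment of radiance 1 should give an irradiance of π for every normal.

```cpp
#include <algorithm>
#include <cmath>
#include <functional>

const double kPi = 3.14159265358979323846;

// Real SH basis up to l = 2, in the same order the shader uses.
void evalBasis(double x, double y, double z, double Y[9]) {
    Y[0] = 0.5  * std::sqrt(1.0  / kPi);
    Y[1] = 0.5  * std::sqrt(3.0  / kPi) * y;
    Y[2] = 0.5  * std::sqrt(3.0  / kPi) * z;
    Y[3] = 0.5  * std::sqrt(3.0  / kPi) * x;
    Y[4] = 0.5  * std::sqrt(15.0 / kPi) * x * y;
    Y[5] = 0.5  * std::sqrt(15.0 / kPi) * y * z;
    Y[6] = 0.25 * std::sqrt(5.0  / kPi) * (3.0 * z * z - 1.0);
    Y[7] = 0.5  * std::sqrt(15.0 / kPi) * x * z;
    Y[8] = 0.25 * std::sqrt(15.0 / kPi) * (x * x - y * y);
}

using RadianceFn = std::function<double(double, double, double)>;

// Project radiance onto nine coefficients, then fold in the per-band
// cosine-lobe factors so the result evaluates directly to irradiance.
void projectIrradianceSH(const RadianceFn& radiance, int w, int h,
                         double coeffs[9]) {
    const double Ahat[9] = { kPi,
        2.0 / 3.0 * kPi, 2.0 / 3.0 * kPi, 2.0 / 3.0 * kPi,
        0.25 * kPi, 0.25 * kPi, 0.25 * kPi, 0.25 * kPi, 0.25 * kPi };
    double Llm[9] = { 0.0 };
    const double dTheta = kPi / h;
    const double dPhi   = 2.0 * kPi / w;
    for (int j = 0; j < h; ++j) {
        double theta    = (j + 0.5) * dTheta;
        double sinTheta = std::sin(theta);
        for (int i = 0; i < w; ++i) {
            // Same direction convention as the inverse of uvOfW.
            double phi = (0.5 - (i + 0.5) / w) * 2.0 * kPi;
            double x = sinTheta * std::sin(phi);
            double y = std::cos(theta);
            double z = sinTheta * std::cos(phi);
            double Y[9];
            evalBasis(x, y, z, Y);
            double weight = sinTheta * dTheta * dPhi; // pixel solid angle
            double L = radiance(x, y, z);
            for (int k = 0; k < 9; ++k) Llm[k] += L * Y[k] * weight;
        }
    }
    for (int k = 0; k < 9; ++k) coeffs[k] = Llm[k] * Ahat[k];
}

// CPU mirror of the shader-side evaluation for a unit normal.
double evalSHIrradiance(const double coeffs[9],
                        double nx, double ny, double nz) {
    double Y[9];
    evalBasis(nx, ny, nz, Y);
    double e = 0.0;
    for (int k = 0; k < 9; ++k) e += coeffs[k] * Y[k];
    return std::max(e, 0.0);
}
```

A second useful probe is the radiance 1 + y, whose irradiance is π + (2/3)π·n.y: it exercises the l = 1 band and catches a wrong Ahat factor or a mis-ordered basis.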
What the shader sees is nine pre-baked RGB values that, when dotted against the SH basis evaluated at any normal N, produce the diffuse irradiance for that direction, exact up to the second-order truncation:
vec3 evalSHIrradiance(vec3 n) {
    float x = n.x, y = n.y, z = n.z;
    float Y[9];
    Y[0] = 0.5 * sqrt(1.0 / PI);
    Y[1] = 0.5 * sqrt(3.0 / PI) * y;
    Y[2] = 0.5 * sqrt(3.0 / PI) * z;
    Y[3] = 0.5 * sqrt(3.0 / PI) * x;
    Y[4] = 0.5 * sqrt(15.0 / PI) * x * y;
    Y[5] = 0.5 * sqrt(15.0 / PI) * y * z;
    Y[6] = 0.25 * sqrt(5.0 / PI) * (3.0 * z*z - 1.0);
    Y[7] = 0.5 * sqrt(15.0 / PI) * x * z;
    Y[8] = 0.25 * sqrt(15.0 / PI) * (x*x - y*y);
    vec3 irradiance = vec3(0.0);
    for (int k = 0; k < 9; k++) irradiance += shCoeffs[k] * Y[k];
    return max(irradiance, vec3(0.0));
}
vec3 diffuse = (albedo / PI) * evalSHIrradiance(normalize(N));
Nine multiply-adds. The full diffuse integral, evaluated in the time it takes to read nine constant uniforms.
The studio map is the most diagnostic of the eight. With multiple colored lights arranged around the scene, the SH coefficients capture each one’s dominant direction; rotating the sphere through the resulting irradiance field steps cleanly through the colors instead of muddying them into a single average tint. The same coefficients are doing the same nine multiply-adds; what changes is the input panorama, and the SH machinery handles it without any per-environment tuning.
The accuracy is limited by the SH order. A second-order projection captures the directional flow of light correctly but smooths over high-frequency features (sharp sun edges, narrow window slits). For a low-frequency diffuse response, that smoothing is a feature: it matches the physical convolution against cos(θ). For specular, where high-frequency detail is the entire point, SH won’t work.
Specular, importance-sampled
Specular reflectance varies sharply with view direction and roughness: a polished sphere shows a tight reflection of the sun, a rougher one smears it across half the sky. The GGX BRDF concentrates its energy in a lobe whose width depends on roughness. Sampling that lobe correctly is the work of importance sampling.
For each fragment, the shader draws N quasi-random sample pairs (e1, e2) from the Hammersley sequence: a low-discrepancy sequence whose two coordinates come from different sources. The first is a stratified i / N step that walks the unit interval in 1/N increments. The second is the Van der Corput radical inverse of i, which is the integer’s binary representation reflected about the binary point (the bits b₁ b₂ b₃ ..., read from the least significant upward, reinterpreted as the fraction 0.b₁ b₂ b₃ ...):
float x = float(i) / float(numSamples); // stratified
float y = radicalInverse_VdC(i); // bit-reversed
The pairs are packed into vec4 for std140 alignment and uploaded once to a UBO; changing N at runtime rebuilds and reallocates the buffer rather than recomputing the samples per frame. Hammersley is preferred over uniform random because the stratified-plus-radical-inverse construction covers the unit square evenly for the chosen N. The first coordinate depends on N, so a prefix of the set is not itself evenly distributed, which is why the table is rebuilt whenever N changes. Even a small N covers the GGX lobe with much lower variance than the same number of plain random samples would.
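The table construction can be sketched on the CPU as follows. The bit-twiddling radical inverse is the widely used 32-bit reversal; the `Sample` struct and the `hammersley` function name are assumptions for illustration, and the vec4 packing for std140 is left out.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Van der Corput radical inverse in base 2: reverse the 32 bits of the
// index, then read the result as a binary fraction (scale by 2^-32).
double radicalInverse_VdC(std::uint32_t bits) {
    bits = (bits << 16u) | (bits >> 16u);
    bits = ((bits & 0x55555555u) << 1u) | ((bits & 0xAAAAAAAAu) >> 1u);
    bits = ((bits & 0x33333333u) << 2u) | ((bits & 0xCCCCCCCCu) >> 2u);
    bits = ((bits & 0x0F0F0F0Fu) << 4u) | ((bits & 0xF0F0F0F0u) >> 4u);
    bits = ((bits & 0x00FF00FFu) << 8u) | ((bits & 0xFF00FF00u) >> 8u);
    return std::ldexp(static_cast<double>(bits), -32);
}

struct Sample { double e1, e2; };

// Build the N-point Hammersley set: e1 is the stratified i / N step,
// e2 the radical inverse of i. The whole set depends on N.
std::vector<Sample> hammersley(std::uint32_t numSamples) {
    std::vector<Sample> pts;
    pts.reserve(numSamples);
    for (std::uint32_t i = 0; i < numSamples; ++i) {
        pts.push_back({ static_cast<double>(i) / numSamples,
                        radicalInverse_VdC(i) });
    }
    return pts;
}
```

For N = 4 this yields (0, 0), (0.25, 0.5), (0.5, 0.25), (0.75, 0.75): each row and each column of a 4 x 4 grid over the unit square holds exactly one point.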
// (e1, e2) drawn from Hammersley, warped into the GGX lobe
float theta = atan(roughness * sqrt(e2) / sqrt(1.0 - e2)); // GGX polar angle
vec3 D = vectorOf(e1, theta / PI); // sample direction in the lobe's local frame (not the NDF value D)
vec3 L = tangent * D.x + R * D.y + bitangent * D.z; // rotate so the lobe axis is R: world-space sample dir
The trick is that the sample’s probability density is itself an output of the GGX distribution function D(m). A direction with high D (smooth surface) means the sample fell in a tight, concentrated lobe, so the environment at that direction should be sampled sharply. A direction with low D (rough surface) means the sample is spread across a wide lobe, so a blurred environment value is the correct one to sample.
Equirectangular textures with mipmaps are the natural fit. The shader picks a mip level that rises with the logarithm of the inverse PDF, so low-density samples read from blurrier mips:
ivec2 size = textureSize(envMap, 0);
float level = 0.5 * log2(float(size.x * size.y) / float(N))
- 0.5 * log2(max(D, 0.0001) / 4.0);
level = clamp(level, 0.0, maxLod);
vec3 Li = textureLod(envMap, uv, level).rgb;
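To get a feel for what the formula selects, the same expression can be evaluated on the CPU. This is a minimal sketch: `envMipLevel` is a hypothetical helper name, and `maxLod` is taken as a plain parameter because the source does not show how it is derived.

```cpp
#include <algorithm>
#include <cmath>

// Mip level for one sample. The first term is the level at which one
// texel covers the average solid angle per sample; the second term
// sharpens (high D) or blurs (low D) relative to that baseline.
double envMipLevel(int width, int height, int numSamples,
                   double D, double maxLod) {
    double texelsPerSample =
        static_cast<double>(width) * height / numSamples;
    double level = 0.5 * std::log2(texelsPerSample)
                 - 0.5 * std::log2(std::max(D, 0.0001) / 4.0);
    return std::min(std::max(level, 0.0), maxLod);
}
```

With a 64 x 64 map, one sample, and D = 4, the second term vanishes and the level is 0.5 · log2(4096) = 6. Raising D lowers the level toward 0, and a D near zero is held at `maxLod` by the clamp.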
Sample-by-sample, the shader fetches a mip-correct radiance value, plugs it through the Cook-Torrance terms, and accumulates:
float D = DistributionGGX(N, H, roughness);
float G = GeometrySmith(N, V, L, roughness);
vec3 F = FresnelSchlick(HdotV, F0);
specColor += Li * NdotL * (G * F) / (4.0 * NdotV * NdotL);
The accumulator is normalized by the number of accepted samples (those with dot(N, L) > 0) rather than by the total N, which prevents fragments near the silhouette from darkening when up to half of their sampled lobe falls below the surface plane.
A single sample produces a single sharp reflection of the environment; ten samples produce ten superimposed reflections that visibly blend; forty samples produce a smooth, well-converged specular term that already looks correct on most materials at most roughnesses. Above forty, the gains are diminishing and the cost is linear.
With the specular machinery in place, the most diagnostic test is to fix a brushed-metal surface (metallic = 1, roughness ≈ 0.5) and swap environments to see which sky regions the lobe actually picks up.
Two observations land here. The coastal balcony’s strong sun produces the expected concentrated highlight, but a fainter secondary lobe also appears tracking sunlight bouncing off the nearby walls. The specular integral is picking up indirect bounce light that the HDRI panorama already captured, no extra work required. The indoor lounge has two distinct sources, a window sun and a ceiling light, and the two highlights they produce are visibly different intensities, directly reflecting the radiance difference between them in the source map. Sharp lobe, soft lobe, different sources: the BRDF doesn’t make any of this up; it integrates whatever the panorama already encoded.
Counting reflections to validate sampling
The cleanest way to verify the importance-sampling direction logic is to set the mip level to 0, make the surface fully metallic at medium roughness, and step N up from 1.
At N = 1, the sphere shows a single clean reflection of the environment. N = 2 produces two superimposed reflections; N = 3, three. Each additional sample contributes a distinct reflected image, and the fact that you can count them is direct visual proof that the per-sample directions are being computed correctly. A bug in the GGX warp or the tangent-space rotation would produce overlapping garbage or zero variation, not a clean count.
Holding N fixed and varying roughness produces the complementary verification. At roughness 0, every sample collapses to the same direction and the images merge into one sharp reflection. As roughness climbs, the samples spread apart in a way that traces the GGX lobe’s widening, which is also exactly what the shader is computing in theta = atan(roughness * sqrt(e2) / sqrt(1 - e2)). Two different parameters, two different visual signatures, both consistent with the math.
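The roughness dependence of the warp is easy to check in isolation. The sketch below evaluates only the polar-angle formula quoted above; `ggxTheta` is a hypothetical name, and e2 is assumed to lie in [0, 1), which holds for radical-inverse values.

```cpp
#include <cmath>

// Polar angle of a GGX-warped sample, measured from the lobe axis.
// roughness = 0 gives 0 for every e2 in [0, 1), so all samples
// coincide; larger roughness pushes samples away from the axis.
double ggxTheta(double roughness, double e2) {
    return std::atan(roughness * std::sqrt(e2) / std::sqrt(1.0 - e2));
}
```

At e2 = 0.5 the two square roots cancel and the angle reduces to atan(roughness), so roughness 1 places that sample 45 degrees off the axis while roughness 0 leaves it on the axis.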
The material parameter space
With both halves of the IBL working, the renderer covers the full (metallic, roughness) parameter space. Sweeping each axis independently makes the contribution of each term legible.
Energy conservation enforces the trade. As metallic rises, kD = (1 - F) · (1 - metallic) drives the diffuse contribution toward zero: metals are physically diffuse-free. As roughness rises, the GGX NDF broadens: the specular energy is spread over more of the sky and the highlight loses contrast. Both terms remain in unit balance: the surface never reflects more light than it received.
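The diffuse weight can be written out as a small CPU sketch. The Schlick form of the Fresnel term is an assumption based on the shader's FresnelSchlick call; the `RGB` struct and the `diffuseWeight` name are hypothetical.

```cpp
#include <cmath>

struct RGB { double r, g, b; };

// Schlick's approximation: F0 at normal incidence, rising to 1 at
// grazing angles.
RGB fresnelSchlick(double HdotV, RGB F0) {
    double w = std::pow(1.0 - HdotV, 5.0);
    return { F0.r + (1.0 - F0.r) * w,
             F0.g + (1.0 - F0.g) * w,
             F0.b + (1.0 - F0.b) * w };
}

// kD = (1 - F) * (1 - metallic): whatever Fresnel reflects is not
// available for diffuse, and metals get no diffuse term at all.
RGB diffuseWeight(RGB F, double metallic) {
    double k = 1.0 - metallic;
    return { (1.0 - F.r) * k, (1.0 - F.g) * k, (1.0 - F.b) * k };
}
```

Because kD + F = 1 - metallic · (1 - F), the two weights never sum to more than one per channel, which is the unit balance described above.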
Real models bring real complexity: color, normal, metallic, roughness all sampled per-UV from texture maps so a single object can have varied material response across its surface.
The aged-bronze lion is the cleanest demonstration that material is spatially varied per-UV. The sheen doesn’t only track the sun’s reflection direction; it tracks the metallic channel of the texture map, brighter on the smoother ridges of the mane and duller on the patches of worn-down metal that read as lower-metallic. The texture map is doing real work; the shader is reading it and responding correctly per fragment.
The rough marble against the colored-studio HDRI is the diffuse counterpart. With no metallic suppression of the diffuse term, the surface picks up the studio’s saturated lights cleanly — transitioning through yellow, green, blue, and back as the scene rotates, with each tint corresponding to the dominant SH-projected direction of the room’s lights. The metals show what kS does when the environment is sharp; the marble shows what kD does when the environment is colored.
The katana is the cleanest geometric demo in the set. With the blade laid horizontally against a single sun, the specular highlight sweeps from one end of the metal to the other as the scene rotates: a continuous travelling reflection that’s harder to fake than a fixed highlight. The back face of the blade, oriented away from the sun, stays dark with only the faint diffuse metallic sheen. The contrast between the front (bright sweeping highlight) and back (flat dark) is the specular integral’s directional sensitivity made visible.
The vintage camera is a mixed-material model: a low-metallic resin body with embedded high-metallic components (lens rings, side plates). Both responses appear in the same frame: the lens surfaces produce a soft spread highlight consistent with a higher roughness, while the metal trim produces a sharper, brighter reflection consistent with metallic = 1. The shader runs identically for every fragment; the visual separation between the resin and metal is entirely driven by the per-UV material maps. Same loop, different parameters per pixel: exactly the deferred pipeline’s payoff.
Tone mapping HDR back to LDR
The IBL pipeline produces values that can be arbitrarily bright: the panorama’s sun is several orders of magnitude brighter than its sky. Display devices, however, are 8-bit per channel and clamp at (1, 1, 1). A naïve clamp renders any pixel with even modestly high radiance as pure white, losing all the detail in the highlight.
Tone mapping is the curve that compresses the HDR range back to LDR while preserving as much perceptual detail as possible. Several operators are common:
| Operator | Curve | Character |
|---|---|---|
| Reinhard | x / (1 + x) | Simple, desaturates highlights |
| ACES | (x(ax + b)) / (x(cx + d) + e) | Industry-standard filmic look |
| Uncharted 2 (Hable) | Explicit shoulder/toe parameters | Tunable, rich shadows |
| PBR Neutral | Khronos-spec, desaturation-preserving | Faithful to specular highlights |
All operators run after exposure has been applied (a linear multiplier on radiance) and are followed by gamma correction (pow(color, vec3(1.0 / 2.2))), which approximates the sRGB transfer curve and encodes the linear output for display. ACES is the default here: its filmic shoulder reads as “cinematic” without being too aggressive about color shift.
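The exposure, curve, and gamma ordering can be sketched per channel as below. The ACES constants are the commonly used fit by Krzysztof Narkowicz (a = 2.51, b = 0.03, c = 2.43, d = 0.59, e = 0.14); the source does not say which constants its shader uses, so treat them as an assumption, and `acesFit` and `toDisplay` as hypothetical names.

```cpp
#include <algorithm>
#include <cmath>

// Reinhard: maps [0, inf) into [0, 1), never quite reaching white.
double reinhard(double x) {
    return x / (1.0 + x);
}

// Rational ACES approximation (x(ax + b)) / (x(cx + d) + e), clamped
// to [0, 1] because the raw ratio exceeds 1 for very bright inputs.
double acesFit(double x) {
    const double a = 2.51, b = 0.03, c = 2.43, d = 0.59, e = 0.14;
    double y = (x * (a * x + b)) / (x * (c * x + d) + e);
    return std::min(std::max(y, 0.0), 1.0);
}

// Full chain for one channel: exposure, tone curve, then gamma 2.2 as
// an approximation of the sRGB encoding.
double toDisplay(double radiance, double exposure) {
    double mapped = acesFit(radiance * exposure);
    return std::pow(mapped, 1.0 / 2.2);
}
```

Applying the curve per channel, as here, is the simplest option and is also what shifts hue slightly in saturated highlights; operators that work on luminance instead make a different trade.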
What’s still flat
The renderer at the end of this chapter is a competent real-time PBR engine. It has correctly-shadowed direct lighting, an HDR environment as the ambient and specular source, energy-conserving BRDFs, and tone-mapped output that holds up on a normal display. Walked through, the scene looks pretty good.
But every fragment receives the full environment irradiance for its normal direction. There’s no concept of some directions of light being blocked by nearby geometry. A statue’s eye socket is lit by exactly the same SH irradiance as a flat patch of floor next to it. The algorithm doesn’t know the eye socket is a recessed cavity surrounded on all sides by skull. Corners, crevices, contact points between objects — places that physically receive less ambient light because the surrounding geometry blocks part of the hemisphere — all read as fully exposed.
The next chapter is the screen-space approximation that fixes that. For each pixel, sample a small disc of nearby fragments, ask how many of them sit above the current surface (and would therefore block ambient light from above), and use the result to attenuate the IBL term. It’s an old trick — ambient occlusion — but the version that runs on a G-buffer in screen space is fast enough to apply to every visible pixel every frame.