In the article, Natalya Tatarchuk describes a way to shade such dynamic terrain by computing the normals on the fly.
This is very interesting since it could allow one to dynamically modify the height map and have the normals adjust automatically.
After spending a few hours debugging the normals, I finally got something visually pleasing, although a few artifacts remained.
Further testing showed that computing the normals on the GPU introduces some error compared to the reference algorithm on the CPU.
The first image is the reference: its normals were generated on the CPU using the central difference algorithm.
In order to visualize the normals, they are moved into the [0,1] range by doing Normal * 0.5 + 0.5:
Reference normals generated on the CPU
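For readers who want to follow along, here is a minimal sketch of what a CPU central difference pass and the [0,1] remapping might look like. It is written in Python for brevity and is not the code that produced these images; the function names, the Y-up convention, the `spacing` parameter and the clamping at the borders are all assumptions of mine.

```python
import math

def central_difference_normal(height, x, z, spacing=1.0):
    """Unit normal at grid cell (x, z) of a height map stored as height[z][x].

    Assumes Y is up and that neighbours are clamped at the borders
    (both are assumptions, not something the article specifies).
    """
    rows, cols = len(height), len(height[0])
    # Neighbour indices, clamped so border cells fall back to a one-sided difference.
    xl, xr = max(x - 1, 0), min(x + 1, cols - 1)
    zl, zr = max(z - 1, 0), min(z + 1, rows - 1)
    # Slope along each axis: height difference divided by the distance between samples.
    dhdx = (height[z][xr] - height[z][xl]) / ((xr - xl) * spacing) if xr > xl else 0.0
    dhdz = (height[zr][x] - height[zl][x]) / ((zr - zl) * spacing) if zr > zl else 0.0
    # The surface y = h(x, z) has the (unnormalized) normal (-dh/dx, 1, -dh/dz).
    nx, ny, nz = -dhdx, 1.0, -dhdz
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def normal_to_color(normal):
    """Move each component from [-1, 1] into [0, 1] for display."""
    return tuple(c * 0.5 + 0.5 for c in normal)
```

On a flat plane every normal is (0, 1, 0), which shows up as the colour (0.5, 1.0, 0.5).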
The second picture is ATI's algorithm running in the pixel shader.
I had to tweak some of their variables, since the algorithm seemed to rely on "magic" (but documented) values:
ATI's algorithm
The following picture shows the error between the two sets of normals, computed as GPUNormal - CPUNormal:
Divergence from CPU normals. Grey areas mean error = 0 (since 0 * 0.5 + 0.5 = 0.5).
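The error visualisation itself is simple enough to sketch in a few lines. The snippet below is only an illustration in Python (the function name is hypothetical, and the clamp to [0,1] is my assumption about what an 8-bit render target would do with out-of-range values). Since each component of the difference between two unit normals can range over [-2, 2], large errors saturate rather than wrap.

```python
def error_color(gpu_normal, cpu_normal):
    """Map the per-component difference GPUNormal - CPUNormal to a display colour."""
    diff = [g - c for g, c in zip(gpu_normal, cpu_normal)]
    # Same remapping as for the normals; zero error lands on 0.5 (grey).
    # The clamp mimics an assumed [0, 1] render target.
    return tuple(min(max(d * 0.5 + 0.5, 0.0), 1.0) for d in diff)
```

Identical normals give (0.5, 0.5, 0.5), which is the grey seen in the well-behaved areas.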
The third picture is the central difference algorithm ported to the GPU and running in the pixel shader.
The normals generated by the central difference algorithm in the pixel shader
The resulting error
My implementation of the central difference algorithm on the GPU must be pretty bad considering the divergence. I also tried the two algorithms in the vertex shader, but sadly the error was just too great (except on the flat plane).
Now the question is: is it worth generating the normals on the GPU?
I would say that if your terrain isn't dynamic, you're better off computing the normals on the CPU, since that gives much better visual quality and won't steal one of those precious sampler slots that are in such short supply on SM3 hardware.
One advantage of generating the normals on the fly could be a smaller vertex buffer (taking normal + tangent out of the vertex declaration removes up to 6 floats per vertex), but that has to be weighed against the size of the height map.
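To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. The helper names, the 513x513 grid, the one-texel-per-vertex layout and the 4-byte texel format are all hypothetical values picked for illustration, not measurements from my terrain.

```python
def vertex_bytes_saved(num_vertices, floats_removed=6, bytes_per_float=4):
    """Bytes saved by dropping normal + tangent (up to 6 floats) from each vertex."""
    return num_vertices * floats_removed * bytes_per_float

def height_map_bytes(width, height, bytes_per_texel=4):
    """Size of an uncompressed height map without mipmaps (assumed texel format)."""
    return width * height * bytes_per_texel

# Hypothetical 513x513 terrain with one height map texel per vertex.
saved = vertex_bytes_saved(513 * 513)
cost = height_map_bytes(513, 513)
print(saved, cost)
```

With these assumed numbers the vertex buffer shrinks by about 6.3 MB while the height map costs about 1.05 MB, but the balance shifts with the texel format and with whether the height map is needed on the GPU anyway.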