When GANs Lose: Why KNN Beat GAN/GAIN for Image-Log Gap Imputation

There is a failure mode that catches good machine-learning engineers more often than they admit: reaching for the most capable model on the menu before asking what invariant the problem actually demands. A generative adversarial network is more expressive than a five-nearest-neighbour average by any reasonable measure of capacity. So when the task is "fill the missing strips in a borehole image log," the GAN looks like the grown-up answer and KNN looks like a placeholder you will replace once you have time. In a roughly twenty-month engagement with a mid-sized Middle East carbonate operator, we built both, benchmarked them on the metric that actually mattered downstream, and the placeholder won outright. The GAN/GAIN imputer — GAIN, the generative adversarial imputation network — was, in our own Phase 1 wording, "not very good" for this signal. KNN with n_neighbors = 5 became the default imputer for the entire classical pipeline. This piece is the engineering postmortem on why the more powerful model lost, because the reason generalises far beyond image logs.

The signal, and the one invariant it cannot lose

A high-resolution borehole image log, produced here by two different microresistivity imaging tools, reads micro-resistivity off pad-mounted electrodes pressed against the borehole wall. The pads do not wrap the full circumference, so the unrolled image covers only up to roughly 80% of the wall; the rest is never measured. Those unsampled columns arrive in the image data as a sentinel value, coded −9999, which has to be converted to NaN when the array is loaded so that it is never mistaken for a resistivity reading. Every image log in the program shipped with these vertical null bands running its full length.
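The sentinel handling is simple enough to sketch. The snippet below is a minimal illustration, assuming the image is already in memory as a 2D array of depth rows by azimuth columns; the function names and the toy values are hypothetical, and only the −9999 code comes from the text above.

```python
import numpy as np

NULL_SENTINEL = -9999.0  # null code quoted in the article


def mask_nulls(raw, sentinel=NULL_SENTINEL):
    """Return a float copy of `raw` with sentinel-coded pixels set to NaN."""
    img = np.asarray(raw, dtype=float).copy()
    img[img == sentinel] = np.nan
    return img


def coverage(img):
    """Fraction of pixels that were actually measured (not NaN)."""
    return 1.0 - np.isnan(img).mean()


# Toy 2 x 5 image with one unsampled column (values are made up).
raw = np.array([[0.8, 1.1, -9999, 1.3, 0.9],
                [0.7, 1.0, -9999, 1.2, 1.0]])
img = mask_nulls(raw)
print(f"coverage = {coverage(img):.0%}")
```

Masking on a copy keeps the raw array untouched, which is convenient when several imputers are later compared on the same input.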

What makes this an unusual imputation problem is the downstream consumer. The geological features we care about in a fractured carbonate play — bedding planes, open and healed fractures — project to sinusoids in the unrolled image: a planar surface cutting a cylinder traces a sine wave when you flatten the cylinder, with amplitude encoding dip and phase encoding azimuth. A single fracture is one continuous sinusoid sweeping the full image width. A pad gap punches a vertical hole straight through that curve. So whatever you write into the gap is not cosmetic — it becomes part of the sine wave a detector will trace and part of the dip and azimuth a geologist will eventually read off.
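The geometry in the paragraph above can be written down directly. This is a minimal sketch under stated assumptions: depth is positive downward, azimuth is measured in degrees around the borehole, and the plane is deepest in its dip direction. Sign and reference conventions differ between tools and software, and the function name, borehole radius and example values are illustrative rather than taken from the project.

```python
import numpy as np


def plane_trace(azimuth_deg, dip_deg, dip_azimuth_deg, z0_m, radius_m):
    """Depth (m, positive down) at which a plane crosses the borehole wall.

    z(theta) = z0 + r * tan(dip) * cos(theta - dip_azimuth)
    Amplitude carries the dip, phase carries the azimuth.
    """
    theta = np.radians(azimuth_deg)
    amplitude = radius_m * np.tan(np.radians(dip_deg))
    return z0_m + amplitude * np.cos(theta - np.radians(dip_azimuth_deg))


# Illustrative plane: 45 degree dip towards 120 degrees, in a 0.1 m radius hole.
theta = np.arange(0.0, 360.0, 5.0)
z = plane_trace(theta, dip_deg=45.0, dip_azimuth_deg=120.0, z0_m=100.0, radius_m=0.1)
print(f"deepest at azimuth {theta[np.argmax(z)]:.0f} deg, "
      f"peak-to-trough {z.max() - z.min():.3f} m")
```

A pad gap removes a contiguous range of `theta` from this curve, which is why the fill has to respect both amplitude and phase to leave the fitted plane unchanged.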

That collapses the imputation question to a single, sharply-posed criterion. We are not asking "what is the most plausible pixel value in this band." We are asking "does the fill keep a sine wave a sine wave across the cut." Call it sinusoid continuity. It is a global, geometric property of the whole image row — and it is the invariant the imputer is not allowed to lose.

Four candidates, and the one that should have won on paper

We benchmarked four families on the dynamic image channel — and, for KNN, the static channel as well — across the early wells of the 14-well vertical dataset logged with two different microresistivity imaging tools:

1D linear interpolation — fill each gap row by linearly interpolating between the last valid pixel on the left and the first on the right. No training, embarrassingly parallel.
KNN imputation — scikit-learn's KNNImputer with n_neighbors = 5: each missing pixel is the mean of its five nearest neighbours in feature space, where neighbours are rows whose observed pixels are most similar.
Iterative imputation — scikit-learn's IterativeImputer: model each feature with missing values as a regression on the others, cycling until convergence.
GAN / GAIN inpainting — a generative adversarial imputer that learns to hallucinate plausible conductivity into the masked band, the way image-inpainting networks reconstruct scratched photographs.
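The three classical candidates in the list above can be lined up in a few lines of scikit-learn and NumPy. This is a sketch rather than the project's pipeline. The synthetic image, the drifting gap (a stand-in for tool rotation, so that every column is observed in at least some rows), the helper names and the `max_iter` setting are all assumptions; only `KNNImputer` with `n_neighbors=5` and `IterativeImputer` are named in the article. The GAN/GAIN candidate is omitted because it needs a trained model.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (must precede the IterativeImputer import)
from sklearn.impute import IterativeImputer, KNNImputer


def make_gappy_image(n_rows=60, n_cols=36, gap_width=6, seed=0):
    """Synthetic stand-in image with a null band whose position drifts with depth."""
    rng = np.random.default_rng(seed)
    cols = np.arange(n_cols)
    rows = np.arange(n_rows)[:, None]
    truth = np.sin(2 * np.pi * cols / n_cols + rows / 10.0)
    truth = truth + 0.05 * rng.standard_normal((n_rows, n_cols))
    start = 10 + rows // 6  # gap shifts one column every six rows
    in_gap = (cols >= start) & (cols < start + gap_width)
    gappy = truth.copy()
    gappy[in_gap] = np.nan
    return truth, gappy


def fill_linear_rows(img):
    """Fill NaNs row by row, interpolating linearly between valid neighbours."""
    out = img.copy()
    cols = np.arange(img.shape[1])
    for i, row in enumerate(img):
        missing = np.isnan(row)
        if missing.any() and not missing.all():
            out[i, missing] = np.interp(cols[missing], cols[~missing], row[~missing])
    return out


truth, gappy = make_gappy_image()
fills = {
    "linear_1d": fill_linear_rows(gappy),
    "knn_5": KNNImputer(n_neighbors=5).fit_transform(gappy),
    "iterative": IterativeImputer(max_iter=5, random_state=0).fit_transform(gappy),
}

gap = np.isnan(gappy)
for name, filled in fills.items():
    rmse = np.sqrt(np.mean((filled[gap] - truth[gap]) ** 2))
    print(f"{name}: gap RMSE = {rmse:.3f}")
```

Pixel RMSE on a synthetic image is only a smoke test that the fills run and leave observed pixels alone. It says nothing about which method preserves a sinusoid on real logs; that ranking has to come from the downstream dip and azimuth.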

On capacity alone the GAN is the obvious favourite. Inpainting is a near-solved-looking problem in computer vision, and a generator that has seen enough carbonate fabric ought to paint convincing rock into any hole. That confidence is exactly where the experiment turned instructive.

A pad gap in a high-resolution borehole image log, the dead strip between adjacent pads of the imaging tool, cuts a vertical null band through the unrolled borehole image, and whatever fills it becomes part of the sinusoid a detector traces. The imputation question is therefore well-posed: does the fill keep a sine wave a sine wave across the cut? Each method redraws the recovered curve across the gap differently. KNN imputation (n_neighbors=5) interpolates along the curve and stays continuous (teal). The GAN inpaints locally realistic texture that breaks the curve and exits at the wrong phase (the orange discontinuity is the argument). 1D linear interpolation flattens the curve to a chord and leaves vertical-line artifacts. The iterative imputer stays continuous but is too slow for per-well runs. KNN won. The method ranking and compute markers (1D ~0.115 s vs KNN ~2.625 s on a 4 m interval; 1D ~11 s whole-well; KNN never finished a whole-well pass) are the article's own; the borehole image texture and the recovered-sinusoid curves are schematic.

Why the GAN lost: a loss-objective mismatch, not a tuning bug

The GAN produced the most realistic-looking fills and the least usable curves. Inside the null band it generated locally convincing texture — the speckle and contrast of carbonate — but where a sinusoid entered the gap on the left at one phase and should have exited on the right at the continuing phase, the generator had no notion that those two stubs belonged to the same curve. It solved each gap as an independent texture-completion problem. The sine wave went in; generic rock came out; the curve was broken on the far side.

The temptation is to treat this as a hyperparameter problem — more training, a deeper generator, a continuity penalty bolted on. It is not. It is a structural mismatch between the training objective and the task invariant, and naming it precisely is the whole lesson.

A GAIN-style adversarial objective rewards local realism. The generator is optimised to fool a discriminator that judges whether a patch looks like real, observed data. That objective is satisfied perfectly by a fill that is statistically indistinguishable from rock at the patch scale. Nothing in it ties the left edge of the gap to the right edge. Sinusoid continuity, by contrast, is a global geometric constraint that spans the entire image width: the phase of the curve leaving the gap is fixed, by the physics of a planar feature, by the phase entering it. The GAIN loss never encodes "the thing crossing this band is a single surface whose azimuth pins its phase." So the model optimised exactly what we asked it to — local plausibility — and that is precisely the wrong thing to optimise for a continuity-critical signal. The more expressive the generator, the more convincingly it filled the gap with the wrong curve.
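For reference, the GAIN objective is commonly written along the following lines. The notation is an assumption borrowed from the usual presentation of GAIN, not the exact loss used in this project: $\tilde{X}$ is the observed image with nulls, $M$ the mask (1 where observed), $Z$ noise, $H$ the hint passed to the discriminator, and $\alpha$ a weighting hyperparameter. The hint-dependent restriction on which components enter the sums is omitted for brevity.

```latex
\bar{X} = G(\tilde{X}, M, Z), \qquad
\hat{X} = M \odot \tilde{X} + (1 - M) \odot \bar{X}

\mathcal{L}_D = -\,\mathbb{E}\Big[ \sum_{i} m_i \log D_i(\hat{X}, H)
              + (1 - m_i) \log\big(1 - D_i(\hat{X}, H)\big) \Big]

\mathcal{L}_G = -\,\mathbb{E}\Big[ \sum_{i} (1 - m_i) \log D_i(\hat{X}, H) \Big]
              + \alpha\, \mathbb{E}\Big[ \sum_{i} m_i \big(\tilde{x}_i - \bar{x}_i\big)^2 \Big]
```

Every term is a sum over individual pixels $i$: the adversarial term asks whether each imputed pixel passes as observed, and the reconstruction term only touches pixels that were observed anyway. No term couples a pixel on the left edge of the gap to one on the right edge, which is the structural mismatch described above.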

The realism trap

Local realism and downstream usefulness are different objectives. A fill can be visually indistinguishable from real rock and still destroy the global structure — the sinusoid — that every downstream algorithm depends on. Benchmark an imputer on the feature you are about to detect, not on how convincing its pixels look. A discriminator that cannot see the whole curve cannot defend it.

This is why we say the GAN lost on inductive bias, not on capacity. Its bias — "make this patch look real" — is a faithful match for inpainting a damaged photograph and a faithful mismatch for preserving a planar-feature projection. Capacity does not rescue a wrong bias; it amplifies it.

Why KNN won, and where it nearly didn't

KNN imputation preserved continuity for an almost embarrassing reason: it has no generative ambition. By filling each missing pixel from the mean of its five nearest neighbours — rows that already carry the local trend of the curve — KNNImputer interpolates along existing structure instead of inventing new structure. Where a sinusoid passes through, the neighbours on either side encode where the curve is heading, and the imputed pixels land on that trajectory. The sine wave stays a sine wave. The same property that makes KNN a "boring" model — it can only reproduce combinations of what it has already seen — is exactly the property that protects a global geometric invariant. It cannot hallucinate a discontinuity because it cannot hallucinate at all.

KNN was not the fastest method, and that nuance matters for any engineer choosing a default. 1D linear interpolation was dramatically cheaper: on a four-metre interval it averaged 0.115 s against KNN's 2.625 s, and it imputed a whole well in roughly 11 s — where KNN never finished a whole-well pass at all. But speed bought artifacts. Linear fills stretch and flatten the curve to a chord through wide gaps and leave a vertical-line signature that downstream edge detectors mistake for real geology. The iterative imputer preserved continuity acceptably but cycles regressions to convergence, which on image logs of this size is a non-starter for a per-well pipeline. KNN sat at the only viable corner of the trade-off: continuity-preserving, less compute-intensive than the iterative imputer, and good enough on throughput to be operational. That is why it became the default.

The quantitative tell arrived when we pushed each imputed image through the dip-and-azimuth fitting pipeline and compared against the operator's ground-truth interpretation. At one fracture at 2697.17 m, the alternative fills produced dip/azimuth estimates of −19.75° / 331.16° (1D linear) and −37.17° / 260.81° (iterative). Both dips are negative, which is physically meaningless and a direct fingerprint of a fill bending the curve off its true geometry. The KNN-imputed image was the one that kept the estimate (51.1° / 119.34° at the same depth) inside the realm of the physically possible. When the fitted estimate is a value that cannot exist in the physical world, the imputer, not the detector, is usually the culprit.
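A plausibility gate of this kind is cheap to automate. The sketch below is not the project's fitting pipeline. It assumes the trace model z = z0 + r·tan(dip)·cos(θ − azimuth) with depth positive downward, fits it by linear least squares to picks taken outside a gap, and range-checks the result. The function names, borehole radius and synthetic picks are hypothetical; the dip/azimuth pairs passed to the check are the ones quoted above.

```python
import numpy as np


def fit_plane(azimuth_deg, depth_m, radius_m):
    """Least-squares fit of z = z0 + a*cos(theta) + b*sin(theta).

    Returns (dip, dip_azimuth) in degrees, using a = A*cos(az), b = A*sin(az)
    and tan(dip) = A / radius.
    """
    theta = np.radians(azimuth_deg)
    design = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    (z0, a, b), *_ = np.linalg.lstsq(design, depth_m, rcond=None)
    amplitude = np.hypot(a, b)
    dip = np.degrees(np.arctan2(amplitude, radius_m))
    dip_azimuth = np.degrees(np.arctan2(b, a)) % 360.0
    return dip, dip_azimuth


def is_physical(dip_deg, azimuth_deg):
    """True when dip lies in [0, 90] and azimuth in [0, 360)."""
    return 0.0 <= dip_deg <= 90.0 and 0.0 <= azimuth_deg < 360.0


# Synthetic picks from a 45 degree plane dipping towards 120 degrees,
# with the crest region (100 to 140 degrees) lost to a pad gap.
theta = np.arange(0.0, 360.0, 5.0)
z = 100.0 + 0.1 * np.cos(np.radians(theta - 120.0))
observed = (theta < 100.0) | (theta > 140.0)
dip, dip_azimuth = fit_plane(theta[observed], z[observed], radius_m=0.1)
print(f"dip = {dip:.2f} deg, azimuth = {dip_azimuth:.2f} deg")

# Estimates quoted in the article for the fracture at 2697.17 m.
for label, estimate in {"1D linear": (-19.75, 331.16),
                        "iterative": (-37.17, 260.81),
                        "KNN": (51.1, 119.34)}.items():
    print(label, "plausible" if is_physical(*estimate) else "rejected")
```

This parametrisation returns non-negative dips by construction, so the explicit range check matters mostly for estimates coming out of other fitting routines, such as the values quoted above.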

The rule worth keeping

Generalise past image logs and the postmortem hardens into a model-selection rule. Choose the model whose inductive bias preserves the invariant your downstream signal cannot lose — then benchmark on that invariant, not on a proxy for it. For these pad gaps the invariant was sinusoid continuity, a global geometric constraint; the model whose bias respected it was the least expressive one on the bench. The GAN's adversarial objective optimised local realism, a proxy that looked like the goal and was not. We would have caught this faster if we had scored every method on predicted dip/azimuth from day one instead of eyeballing how real the fills looked — the realism trap is seductive precisely because the wrong model produces the prettiest pictures.

Two engineering corollaries fall out of this for any ML team porting a generative method into a measurement pipeline. First, the right fill is a function of the consumer, not a property of the image. Once we moved from the classical sinusoid-fitting pipeline to a supervised transformer detector, the better input was no fill at all — nulls set to a consistent zero sentinel, which the network learns to read as "absent" rather than mistaking a fabricated fill for rock. For vug detection we filled the nulls with the local median colour instead of interpolating, specifically to avoid imputation conjuring false vugs at the gap edges. There is no universal imputer; there is only the right fill for the next stage. Second, a generative model earns its keep only against the baseline it claims to beat — and across operators we have worked with, from the Middle East to the United States, the baseline a fancy imputer has to beat is rarely the one engineers expect. Here it was not the GAN. It was KNN with five neighbours.
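The two consumer-specific fills mentioned above are short enough to sketch. The article does not define "local median" precisely, so the version below is an assumption: the median of all valid pixels within a few depth rows of the gap. The function names, the window size and the toy image are hypothetical.

```python
import numpy as np


def fill_zero_sentinel(img):
    """Replace nulls with a constant zero the network can learn to read as 'absent'."""
    return np.where(np.isnan(img), 0.0, img)


def fill_local_median(img, half_window=2):
    """Replace nulls in each row with the median of valid pixels in nearby rows."""
    out = img.copy()
    n_rows = img.shape[0]
    for i in range(n_rows):
        missing = np.isnan(img[i])
        if not missing.any():
            continue
        window = img[max(0, i - half_window):min(n_rows, i + half_window + 1)]
        if np.isnan(window).all():
            continue  # nothing valid nearby, leave the nulls in place
        out[i, missing] = np.nanmedian(window)
    return out


# Toy image: three depth rows, four azimuth columns (values are made up).
img = np.array([[1.0, 2.0, np.nan, 4.0],
                [1.0, np.nan, np.nan, 5.0],
                [2.0, 2.0, np.nan, 6.0]])
zero_filled = fill_zero_sentinel(img)
median_filled = fill_local_median(img, half_window=1)
print(zero_filled)
print(median_filled)
```

A flat median fill deliberately adds no structure of its own, which matches the stated goal of not conjuring false vugs at the gap edges.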

Key takeaways

The GAN lost on inductive bias, not capacity. GAIN's adversarial objective rewards local realism — a patch that looks like real rock — but sinusoid continuity is a global geometric constraint spanning the whole image width that the loss never encodes. More generator capacity amplifies the wrong bias rather than fixing it.
KNN imputation (n_neighbors=5) won precisely because it has no generative ambition: filling each pixel from its five nearest neighbours interpolates along the existing curve, so the sine wave survives the pad gap. The "boring" model protects the invariant that the expressive one destroys.
Benchmark on the downstream feature, not on pixel realism. The realism trap is that the wrong model produces the most convincing-looking fills. Scoring on predicted dip/azimuth exposes it: at a 2697.17 m fracture, 1D linear (−19.75°) and iterative (−37.17°) fills gave physically impossible negative dips; KNN stayed believable.
KNN was not the fastest — 1D linear ran ~0.115 s vs KNN's ~2.625 s on a 4 m interval and finished a whole well in ~11 s where KNN never finished one — but speed bought artifacts. KNN sat at the only viable corner: continuity-preserving, cheaper than the iterative imputer, operational.
The right fill depends on the consumer. The classical pipeline needed KNN; the supervised transformer detector did better with a zero sentinel ('no data') than any fill; vug detection used a local-median fill to avoid false vugs. There is no universal imputer — and a generative model only earns its keep against the real baseline, which here was KNN, not the GAN.
