Research · 8 min read · 13 Jul 2022

VeerNet — a transformer-residual hybrid for raster well-log digitisation

VeerNet, published in MDPI Journal of Imaging, is the deep-learning architecture EarthScan built specifically for the raster-log digitisation problem: turning scanned paper well-logs into queryable digital curves. It combines a CNN-residual encoder, a transformer bottleneck, and an upsampling decoder, and scores F1 = 35% and IoU = 30% on a 10K-image benchmark.

by Tannistha Maiti · Senior AI Researcher

Tags: veernet, raster-log-digitisation, well-logs, encoder-decoder, transformer, residual-blocks, loss-functions, lovasz-loss, mdpi, legacy-import

“Decades of legacy raster archives, many from wells still producing, sit trapped behind manual digitisation. VeerNet automates the unsexy bottleneck no one wants to fix.”

Why raster log digitisation is the unsexy bottleneck no one wants to fix

Well-logging is foundational to oil and gas extraction — it's how operators determine formation depth, lithology, and ultimately oil reserves in place. Modern logs are recorded digitally in formats like LAS (Log ASCII Standard). But decades of historical logs exist only as raster images — scanned copies of paper logs saved as PNGs or PDFs.

VeerNet on the 10K-image benchmark

10,000: paired raster + LAS training examples in the benchmark dataset
F1 = 35%: overall classification + digitisation score with the best loss function (Lovász)
IoU = 30%: Intersection-over-Union on held-out test wells
~50×: throughput improvement over manual click-based digitisation

Figure (interactive schematic): VeerNet's mechanism, not its metrics. A scanned multi-curve log track flows through a residual encoder, a self-attention transformer bottleneck, and an upsampling decoder that emits a per-pixel curve mask. A CNN's local receptive field cannot link a fracture in the upper third of a track to the trace continuing through the lower third; the self-attention span bridges exactly that gap. Every shape is schematic: the article gives no kernel sizes or span lengths, so the figure carries no benchmark numbers.

Raster archives are everywhere:

The 1970s–1990s drilling boom in mature basins (North Sea, Gulf of Mexico, North Slope) produced thousands of paper logs that were never digitised.
Acquisition data for older fields lives in operator and joint-venture archives as boxed paper or low-resolution scans.
Well-logging companies have roomfuls of paper from before the first digital tape drives.

These archives are valuable — many of those wells are still producing, or sit adjacent to wells under reactivation review — but they're trapped behind manual digitisation. Tracing curves by hand from a scanned image is the kind of work senior petrophysicists hate (rightly) and junior staff burn out on (also rightly).

VeerNet — published in MDPI Journal of Imaging, 9(7), 136 — is the architecture we built specifically to automate this.

Previous work — and why it didn't scale

The literature on well-log digitisation is sparser than you'd expect. Two main approaches have been tried:

Commercial software with manual seed-points. Tools like NeuraLog, GeoGraphix LogMaker, and similar commercial offerings give an interpreter a digitisation UI: the interpreter clicks on the curve trace at regular depth intervals, and the software interpolates between clicks. Time per log: 1–4 hours depending on log quality. Inter-interpreter consistency: weak.
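To make the click-and-interpolate workflow concrete, here is a minimal sketch in plain Python. It is not the algorithm of any of the tools named above: the function name, the sample clicks, and the choice of linear interpolation are all assumptions made for illustration.

```python
from bisect import bisect_right


def interpolate_clicks(clicks, depths):
    """Linearly interpolate curve values at `depths` from (depth, value) clicks.

    `clicks` needs at least two points with distinct depths. Depths outside
    the clicked range are clamped to the nearest clicked value.
    """
    pts = sorted(clicks)
    xs = [d for d, _ in pts]
    ys = [v for _, v in pts]
    out = []
    for d in depths:
        if d <= xs[0]:
            out.append(ys[0])
        elif d >= xs[-1]:
            out.append(ys[-1])
        else:
            i = bisect_right(xs, d)  # xs[i - 1] <= d < xs[i]
            t = (d - xs[i - 1]) / (xs[i] - xs[i - 1])
            out.append(ys[i - 1] + t * (ys[i] - ys[i - 1]))
    return out


# Hypothetical seed points: (depth in m, gamma-ray value in API units).
clicks = [(1000.0, 40.0), (1010.0, 80.0), (1020.0, 60.0)]
curve = interpolate_clicks(clicks, [1000.0, 1005.0, 1010.0, 1015.0, 1020.0])
print(curve)  # [40.0, 60.0, 80.0, 70.0, 60.0]
```

Everything between two clicks is a straight line, so any wiggle the interpreter did not click on is lost. That is one plausible reason two interpreters can produce different curves from the same scan.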

Unsupervised computer-vision techniques. Edge detection, Hough transforms, traditional clustering on the curve pixels. Works when the log is clean (high-contrast scan, no overlapping curves). Fails when the log has multiple curves on the same track (almost always), faded ink, or hand-annotated grid markings.

Neither scales. Both require manual intervention. Both are slow.

The need: an end-to-end approach that takes a scanned log image as input and produces digitised curves as output, with no interpreter in the loop.

VeerNet architecture — encoder-decoder with a transformer middle

VeerNet uses an encoder-decoder architecture optimised for this specific problem:

Encoder — a residual-block stack that extracts hierarchical features from the input log image. Same residual-connection design as ResNet/U-Net: deep without losing gradient signal.
Transformer middle — a self-attention block at the bottleneck refines the internal representation, capturing long-range dependencies between distant regions of the log image. This is where VeerNet earns its name and its accuracy gains over pure CNN baselines: a fracture in the upper third of a track and the curve trace continuing through the lower third are related, but a CNN's local receptive field can't see that. Self-attention can.
Decoder — upsampling and convolution operations generate per-pixel spatial masks indicating curve presence. Output: a probability mask the same size as the input, where each pixel scores "is this part of a curve?"

The architecture balances two competing needs: preserving the key signal (the curve traces) and reducing dimensionality. A curve is a sparse, 1-pixel-wide feature in a wide log image, so it cannot be naively flattened without being lost.
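The long-range argument for the transformer middle can be illustrated with a toy single-head self-attention in plain Python. This is a sketch, not VeerNet's block: the token layout is invented, and queries, keys and values are the raw tokens themselves (identity projections), where a trained model would use learned projection matrices.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections.

    Returns (outputs, attention_weights); weights[i][j] is how much
    position i attends to position j.
    """
    d = len(tokens[0])
    weights = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Toy bottleneck sequence: 6 positions down the track, 2 features each.
# Positions 0 and 5 carry the same "curve fragment" feature; 1-4 are background.
tokens = [[4.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [4.0, 0.0]]
outputs, weights = self_attention(tokens)
print([round(w, 3) for w in weights[0]])  # [0.5, 0.0, 0.0, 0.0, 0.0, 0.5]
```

Position 0 puts half of its attention on position 5 even though four background positions lie between them. A small convolution kernel centred on position 0 would never see position 5.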

Experimental setup

The training pipeline:

Dataset: 10,000 well-log images, sourced from a mix of LAS files (digitised, used as ground truth) and raster files (the scanned-image inputs). Curves are extracted from the LAS data and rasterised at the same scale as the corresponding raster image to create paired training examples.
Train/val split: standard 80/20.
Loss-function ablation: five candidate losses tested in turn:
- Dice loss — popular for unbalanced segmentation
- Tversky loss — generalisation of Dice with explicit precision/recall trade-off
- Lovász loss — direct optimisation of IoU
- Focal loss — down-weights easy examples
- Sparse Cross-Entropy (SCE) — the standard baseline
Compute: multiple workers on AMD CPU cores and NVIDIA A100-SXM-80GB GPUs.
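The pairing step in the dataset description, rasterising a LAS curve at the scale of its scan, might look like the sketch below. The function name, the linear scales, and the one-pixel-per-sample rule are assumptions; the article does not describe line joining, anti-aliasing, or track geometry.

```python
import math


def rasterise_curve(depths, values, height, width, depth_range, value_range):
    """Mark one pixel per (depth, value) sample in a height x width binary mask.

    Rows index depth (top = depth_range[0]) and columns index the curve value
    (left = value_range[0]). Both scales are assumed linear.
    """
    d0, d1 = depth_range
    v0, v1 = value_range
    mask = [[0] * width for _ in range(height)]
    for depth, value in zip(depths, values):
        row = math.floor((depth - d0) / (d1 - d0) * (height - 1) + 0.5)
        col = math.floor((value - v0) / (v1 - v0) * (width - 1) + 0.5)
        if 0 <= row < height and 0 <= col < width:  # drop off-track samples
            mask[row][col] = 1
    return mask


# Hypothetical GR samples from a LAS file: depth in metres, value in API units.
depths = [1000.0, 1001.0, 1002.0, 1003.0]
gr = [0.0, 50.0, 100.0, 150.0]
mask = rasterise_curve(depths, gr, height=4, width=7,
                       depth_range=(1000.0, 1003.0), value_range=(0.0, 150.0))
for row in mask:
    print("".join(str(px) for px in row))
# 1000000
# 0010000
# 0000100
# 0000001
```

Only 4 of the 28 pixels are foreground. That sparsity is why the loss-function ablation matters.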

Results

VeerNet achieves an overall F1 score of 35% and Intersection-over-Union of 30% in classifying and digitising well-log curves on the held-out test set.
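For readers who want to reproduce this kind of score on their own masks, here is a minimal pixel-wise version of both metrics. The article does not say how the reported figures are aggregated (per pixel, per curve, or per image), so treating them as pixel counts over a single mask pair is an assumption.

```python
def f1_and_iou(pred, truth):
    """Pixel-wise F1 and IoU for two binary masks given as nested lists of 0/1."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # curve pixel found
            elif p:
                fp += 1      # false alarm
            elif t:
                fn += 1      # missed curve pixel
    denom = tp + fp + fn
    if denom == 0:           # both masks empty: count as a perfect match
        return 1.0, 1.0
    return 2 * tp / (2 * tp + fp + fn), tp / denom


# Toy example: a vertical trace where the prediction drifts off in two rows.
truth = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
pred = [[0, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
f1, iou = f1_and_iou(pred, truth)
print(f"F1={f1:.3f} IoU={iou:.3f}")  # F1=0.500 IoU=0.333
```

A one-pixel drift counts as both a miss and a false alarm, so thin-curve segmentation is scored harshly by both metrics.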

Two losses lead the ablation:

Lovász loss — direct IoU optimisation produces the highest F1.
SCE — close second; preferred in production for training stability.
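To show what "direct IoU optimisation" means in practice, here is a sketch of the binary Lovász hinge from Berman et al. (2018) in plain Python. Whether VeerNet trained with this hinge form or the softmax variant is not stated in the article, so take this as an illustration of the loss family and not the exact training code.

```python
def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss.

    `gt_sorted` holds the 0/1 labels ordered by decreasing prediction error.
    """
    total_pos = sum(gt_sorted)
    grad, prev_jaccard = [], 0.0
    cum_pos = cum_neg = 0
    for g in gt_sorted:
        cum_pos += g
        cum_neg += 1 - g
        intersection = total_pos - cum_pos
        union = total_pos + cum_neg
        jaccard = 1.0 - intersection / union
        grad.append(jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return grad


def lovasz_hinge(logits, labels):
    """Binary Lovász hinge loss over flat lists of logits and 0/1 labels."""
    signs = [2 * y - 1 for y in labels]                    # 0/1 -> -1/+1
    errors = [1.0 - z * s for z, s in zip(logits, signs)]  # hinge margins
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    grad = lovasz_grad([labels[i] for i in order])
    return sum(max(errors[i], 0.0) * g for i, g in zip(order, grad))


labels = [1, 0, 0, 1]
confident_right = lovasz_hinge([3.0, -3.0, -3.0, 3.0], labels)
confident_wrong = lovasz_hinge([-3.0, 3.0, 3.0, -3.0], labels)
print(confident_right, confident_wrong)
```

Correct predictions with a margin above 1 contribute nothing. Errors are weighted by how much each pixel changes the Jaccard index, not counted equally as in cross-entropy.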

We compared ground-truth and predicted values for two specific curve types:

Gamma Ray (GR) log — strong correlation between predicted and native LAS data. The smoothness of the GR signal helps the segmentation model lock onto the trace.
Caliper (CALI) log — weaker correlation. CALI traces are sharper and more discontinuous; we hypothesise the dataset is too small for the network to generalise on this curve type.

Statistical analysis (Pearson correlation on overlapping depth intervals) confirms the visual impression: GR is solidly recovered; CALI needs more training data.
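The correlation check can be sketched as follows. The depth-keyed dictionaries, the sample values, and the exact-match rule for "overlapping depth intervals" are assumptions; a real comparison would probably resample both curves onto a common depth grid first.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def overlap(curve_a, curve_b):
    """Keep only depths present in both {depth: value} curves, in depth order."""
    shared = sorted(set(curve_a) & set(curve_b))
    return [curve_a[d] for d in shared], [curve_b[d] for d in shared]


# Hypothetical GR samples (API units) keyed by depth in metres.
las_gr = {1000.0: 40.0, 1000.5: 55.0, 1001.0: 70.0, 1001.5: 65.0}
pred_gr = {1000.5: 57.0, 1001.0: 68.0, 1001.5: 66.0, 1002.0: 50.0}
r = pearson_r(*overlap(las_gr, pred_gr))
print(f"Pearson r on the shared interval: {r:.3f}")
```

Pearson r measures whether the two curves move together, not whether they agree in absolute value. A digitised curve with a constant offset would still score close to 1.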

Limitations and what's next

Two honest limitations:

Low signal-to-noise on the source rasters. Many of the legacy logs in our training set are scans of photocopies of original paper, with multiple generations of degradation. The network does as well as the data allows; collecting higher-quality scans would push F1 up materially.

Caliper accuracy needs more data. The CALI log result is a function of training-set size, not architecture. We're collecting more CALI examples for the v2 dataset.

Three planned enhancements:

OCR head for scale reading. Currently the depth/value scales of each track are read by hand. Adding an OCR sub-network to read the printed scales would close the loop on fully end-to-end digitisation.
Reservoir-specific fine-tuning. Each basin has its own log conventions and quirks; fine-tuning VeerNet on operator-specific archives produces noticeably better accuracy than the generic model.
Joint multi-curve training. Currently we train one model per curve type. A multi-task variant should learn shared low-level features across curves and improve sample efficiency.
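To show why the scale-reading step matters, here is a sketch of how a per-pixel probability mask could be turned into depth/value samples once a track's scales are known. The function, the per-row argmax rule, the threshold, and the linear scales are all assumptions made for illustration; resistivity tracks, for instance, are typically plotted on a logarithmic scale and would need a different mapping.

```python
def mask_to_curve(prob_mask, depth_range, value_range, threshold=0.5):
    """Convert a per-pixel probability mask into (depth, value) samples.

    For every pixel row the most probable column is taken; rows whose best
    score is below `threshold` are skipped as "no curve here". Scales are
    linear: row 0 maps to depth_range[0], column 0 to value_range[0].
    The mask needs at least two rows and two columns.
    """
    height, width = len(prob_mask), len(prob_mask[0])
    d0, d1 = depth_range
    v0, v1 = value_range
    samples = []
    for row, scores in enumerate(prob_mask):
        col = max(range(width), key=scores.__getitem__)
        if scores[col] < threshold:
            continue
        depth = d0 + (d1 - d0) * row / (height - 1)
        value = v0 + (v1 - v0) * col / (width - 1)
        samples.append((depth, value))
    return samples


# Toy 3 x 5 mask; the scales below stand in for values read off the printed log.
prob_mask = [
    [0.1, 0.9, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.1, 0.0, 0.0],  # no confident curve pixel in this row
    [0.0, 0.0, 0.0, 0.8, 0.1],
]
samples = mask_to_curve(prob_mask, depth_range=(1000.0, 1002.0),
                        value_range=(0.0, 200.0))
print(samples)  # [(1000.0, 50.0), (1002.0, 150.0)]
```

Without `depth_range` and `value_range` the mask is only a picture of a curve, so whoever supplies those two tuples, a person or an OCR head, is part of the digitisation loop.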

In production

VeerNet is the model that powers the ES Raster Digitizer product — operators load their raster archive, VeerNet runs the digitisation, and the interpreter validates and corrects rather than tracing from scratch. Throughput: ~50× the previous interpreter-only workflow.

Full paper on MDPI Journal of Imaging.

Key takeaways

VeerNet is the architecture under ES Raster Digitizer — a transformer-residual hybrid that turns scanned legacy logs into queryable LAS files at ~50× the throughput of manual click-tracing.
Lovász loss won the F1/IoU ablation; SCE was the production-stability runner-up. The pair are the right defaults for segmentation when class balance is moderate.
Self-attention at the bottleneck buys you long-range context that pure CNNs miss — exactly the property a multi-curve track needs to disentangle overlapping traces.
Caliper is harder than GR because the trace is sharper and more discontinuous — this is a training-data problem, not an architecture problem. v2 dataset addresses it.

Glossary

Caliper (CALI): A wireline log measuring borehole diameter at fine depth resolution. Sharper, more discontinuous traces than smooth lithology indicators — which is why CNN-based digitisation finds CALI harder than GR.
Dice loss: A region-overlap loss popular for class-imbalanced segmentation. Based on the Dice coefficient 2|A∩B| / (|A|+|B|), with the loss taken as one minus that coefficient; it penalises missed positives and false alarms symmetrically. The default choice when foreground is sparse.
F1 score: The harmonic mean of precision and recall — 2·(P·R)/(P+R). The standard summary metric for class-imbalanced classification, where raw accuracy is misleading.
Focal loss: A modified cross-entropy that downweights well-classified examples and focuses learning on hard, misclassified ones. Originally proposed for dense object detection (Lin et al., 2017); standard remedy for class imbalance.
IoU: Intersection-over-Union — the area where prediction and ground truth overlap divided by the area covered by either. Range 0..1. The Pascal-VOC and COCO standard metric for segmentation tasks.
LAS: Log ASCII Standard, the industry text format for well-log curves. Each curve is a depth/value pair series with a header. Modern logs are recorded directly to LAS; raster archives are what VeerNet exists to convert into LAS.
Lovász loss: A surrogate loss that directly optimises Intersection-over-Union (IoU) — making it ideal when IoU is the evaluation metric (which it usually is for segmentation). Tends to win F1/IoU ablations against Dice and cross-entropy when class balance is moderate.
Self-attention: The mechanism at the heart of the Transformer architecture (Vaswani et al., 2017). Each output element is computed as a weighted sum of all input elements — capturing long-range dependencies that local convolutional kernels cannot see.
Tversky loss: A generalisation of Dice loss with explicit α/β weights for false positives vs. false negatives. Lets you pick a precision/recall operating point at training time. Popular in medical-imaging segmentation.
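The three overlap- and focus-based losses in this glossary can be written compactly in plain Python. This is a sketch for intuition only: the default α, β and γ values are illustrative assumptions, not the settings used in the VeerNet ablation, and a training implementation would operate on tensors.

```python
import math


def tversky_index(probs, labels, alpha=0.5, beta=0.5, eps=1e-7):
    """Soft Tversky index; alpha weights false positives, beta false negatives."""
    tp = sum(p * y for p, y in zip(probs, labels))
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))
    fn = sum((1 - p) * y for p, y in zip(probs, labels))
    return (tp + eps) / (tp + alpha * fp + beta * fn + eps)


def dice_loss(probs, labels):
    """Dice loss = 1 - Dice coefficient (Tversky with alpha = beta = 0.5)."""
    return 1.0 - tversky_index(probs, labels, 0.5, 0.5)


def tversky_loss(probs, labels, alpha=0.3, beta=0.7):
    """Tversky loss; beta > alpha punishes missed curve pixels more."""
    return 1.0 - tversky_index(probs, labels, alpha, beta)


def focal_loss(probs, labels, gamma=2.0, eps=1e-7):
    """Mean binary focal loss (Lin et al., 2017), without class weighting."""
    total = 0.0
    for p, y in zip(probs, labels):
        pt = p if y == 1 else 1.0 - p  # probability given to the true class
        pt = min(max(pt, eps), 1.0 - eps)
        total += -((1.0 - pt) ** gamma) * math.log(pt)
    return total / len(probs)


probs = [0.9, 0.2, 0.1, 0.6]  # predicted curve probabilities
labels = [1, 1, 0, 0]         # ground-truth curve pixels
print(round(dice_loss(probs, labels), 3))
print(round(tversky_loss(probs, labels), 3))
print(round(focal_loss(probs, labels), 3))
```

Setting `gamma=0.0` reduces the focal loss to ordinary binary cross-entropy. Setting `alpha = beta = 0.5` reduces Tversky to Dice.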


