“Decades of legacy raster archives, many from wells still producing, sit trapped behind manual digitisation. VeerNet automates the unsexy bottleneck no one wants to fix.”
Why raster log digitisation is the unsexy bottleneck no one wants to fix
Well-logging is foundational to oil and gas extraction: it's how operators determine formation depth, lithology, and ultimately oil reserves in place. Modern logs are recorded digitally in formats like LAS (Log ASCII Standard). But decades of historical logs exist only as raster images: scanned copies of paper logs saved as PNGs or PDFs.
VeerNet on the 10K-image benchmark:
- 10,000 paired raster + LAS training examples in the benchmark dataset
- 35% overall classification + digitisation F1 score with the best loss function (Lovász)
- 30% Intersection-over-Union on held-out test wells
- ~50× throughput improvement over manual click-based digitisation
Raster archives are everywhere:
- The 1970s–1990s drilling boom in mature basins (North Sea, Gulf of Mexico, North Slope) produced thousands of paper logs that were never digitised.
- Acquisition data for older fields lives in operator and joint-venture archives as boxed paper or low-resolution scans.
- Well-logging companies have roomfuls of paper from before the first digital tape drives.
These archives are valuable — many of those wells are still producing, or sit adjacent to wells under reactivation review — but they're trapped behind manual digitisation. Tracing curves by hand from a scanned image is the kind of work senior petrophysicists hate (rightly) and junior staff burn out on (also rightly).
VeerNet — published in MDPI Journal of Imaging, 9(7), 136 — is the architecture we built specifically to automate this.
Previous work — and why it didn't scale
The literature on well-log digitisation is sparser than you'd expect. Two main approaches have been tried:
Commercial software with manual seed-points. Tools like NeuraLog, GeoGraphix LogMaker, and similar commercial offerings give an interpreter a digitisation UI: click on the curve trace at regular depth intervals, the software interpolates between clicks. Time per log: 1–4 hours depending on log quality. Inter-interpreter consistency: weak.
Unsupervised computer-vision techniques. Edge detection, Hough transforms, traditional clustering on the curve pixels. Works when the log is clean (high-contrast scan, no overlapping curves). Fails when the log has multiple curves on the same track (almost always), faded ink, or hand-annotated grid markings.
Neither scales. Both require manual intervention. Both are slow.
The need: an end-to-end approach that takes a scanned log image as input and produces digitised curves as output, with no interpreter in the loop.
VeerNet architecture — encoder-decoder with a transformer middle
VeerNet uses an encoder-decoder architecture optimised for this specific problem:
- Encoder: a residual-block stack that extracts hierarchical features from the input log image. Same residual-connection design as ResNet/U-Net: deep without losing gradient signal.
- Transformer middle: a self-attention block at the bottleneck refines the internal representation, capturing long-range dependencies between distant regions of the log image. This is where VeerNet earns its name and its accuracy gains over pure CNN baselines: a fracture in the upper third of a track and the curve trace continuing through the lower third are related, but a CNN's local receptive field can't see that. Self-attention can.
- Decoder: upsampling and convolution operations generate per-pixel spatial masks indicating curve presence. Output: a probability mask the same size as the input, where each pixel scores "is this part of a curve?"
The architecture balances preserving the key signal (the curve traces) with reducing dimensionality (a curve is a sparse 1-pixel-wide feature in a wide log image; you can't naively flatten without losing it).
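To make the bottleneck step concrete, here is a minimal NumPy sketch of single-head self-attention applied to a flattened feature map. It is an illustration under assumptions, not VeerNet's actual layer: the function name `bottleneck_self_attention`, the projection matrices, and the 8×4×16 feature-map size are all hypothetical, and residual connections, layer normalisation, and multiple heads are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bottleneck_self_attention(feat, w_q, w_k, w_v):
    """Single-head self-attention over a (H, W, C) feature map.

    w_q, w_k, w_v are (C, D) projection matrices (hypothetical, random here).
    """
    h, w, c = feat.shape
    tokens = feat.reshape(h * w, c)            # one token per spatial position
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (HW, HW): every position vs every other
    attn = softmax(scores, axis=-1)            # each row sums to 1
    out = attn @ v                             # weighted sum over all positions
    return out.reshape(h, w, -1), attn

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 16))             # assumed bottleneck size, for illustration
w_q, w_k, w_v = (rng.normal(size=(16, 16)) * 0.1 for _ in range(3))
refined, attn = bottleneck_self_attention(feat, w_q, w_k, w_v)
```

Every row of `attn` has a non-zero weight for every other position, so a token from the top of the track can draw on features from the bottom. A stack of small convolution kernels only reaches that far after many layers.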
Experimental setup
The training pipeline:
- Dataset: 10,000 well-log images, sourced from a mix of LAS files (digitised, used as ground truth) and raster files (the scanned-image inputs). Curves are extracted from the LAS data and rasterised at the same scale as the corresponding raster image to create paired training examples.
- Train/val split: standard 80/20.
- Loss-function ablation: five candidate losses tested in turn:
  - Dice loss: popular for unbalanced segmentation
  - Tversky loss: generalisation of Dice with explicit precision/recall trade-off
  - Lovász loss: direct optimisation of IoU
  - Focal loss: down-weights easy examples
  - Sparse Cross-Entropy (SCE): the standard baseline
- Compute: multiple workers on AMD CPU cores and NVIDIA A100-SXM-80GB GPUs.
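Three of the candidate losses are short enough to sketch directly. The NumPy versions below are textbook binary-mask formulations, not the training code used for VeerNet. The function names, the `EPS` smoothing constant, the default `alpha`, `beta`, and `gamma` values, and the synthetic mask are all assumptions for illustration.

```python
import numpy as np

EPS = 1e-7  # smoothing constant (assumed) to avoid division by zero and log(0)

def dice_loss(prob, target):
    # 1 - 2|A∩B| / (|A| + |B|), with soft (probabilistic) overlap.
    inter = np.sum(prob * target)
    return 1.0 - (2.0 * inter + EPS) / (np.sum(prob) + np.sum(target) + EPS)

def tversky_loss(prob, target, alpha=0.5, beta=0.5):
    # alpha weights false positives, beta weights false negatives.
    tp = np.sum(prob * target)
    fp = np.sum(prob * (1.0 - target))
    fn = np.sum((1.0 - prob) * target)
    return 1.0 - (tp + EPS) / (tp + alpha * fp + beta * fn + EPS)

def cross_entropy(prob, target):
    # Probability assigned to the true class of each pixel.
    p_t = np.clip(np.where(target == 1, prob, 1.0 - prob), EPS, 1.0)
    return float(np.mean(-np.log(p_t)))

def focal_loss(prob, target, gamma=2.0):
    # Cross-entropy scaled by (1 - p_t)^gamma, so easy pixels contribute less.
    p_t = np.clip(np.where(target == 1, prob, 1.0 - prob), EPS, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))

# Synthetic sparse mask: one curve pixel per depth row, as in a thin trace.
rng = np.random.default_rng(0)
target = np.zeros((64, 32))
target[np.arange(64), rng.integers(0, 32, size=64)] = 1.0
prob = np.clip(0.8 * target + 0.1 + 0.05 * rng.random(target.shape), 0.0, 1.0)

losses = {
    "dice": dice_loss(prob, target),
    "tversky": tversky_loss(prob, target, alpha=0.3, beta=0.7),
    "focal": focal_loss(prob, target),
    "ce": cross_entropy(prob, target),
}
```

Two sanity checks follow from the definitions: Tversky with `alpha = beta = 0.5` reduces to Dice, and focal loss with `gamma = 0` reduces to plain cross-entropy.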
Results
VeerNet achieves an overall F1 score of 35% and Intersection-over-Union of 30% in classifying and digitising well-log curves on the held-out test set.
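For readers who want the two metrics spelled out, the sketch below computes F1 and IoU for a single pair of binary masks. It is a generic per-mask calculation with a hypothetical function name and toy masks. How the reported figures were aggregated across curves and wells is not stated here, so this is not a reproduction of the evaluation.

```python
import numpy as np

def f1_and_iou(pred_mask, true_mask):
    """F1 and IoU for one pair of binary masks of the same shape."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    tp = np.sum(pred & true)     # curve pixels found
    fp = np.sum(pred & ~true)    # false alarms
    fn = np.sum(~pred & true)    # missed curve pixels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return float(f1), float(iou)

# Toy 4x4 track: the true curve runs down column 1.
true_mask = np.zeros((4, 4), dtype=int)
true_mask[:, 1] = 1

# The prediction follows the curve for two rows, then drifts one pixel right.
pred_mask = np.zeros((4, 4), dtype=int)
pred_mask[0:2, 1] = 1
pred_mask[2:4, 2] = 1

f1, iou = f1_and_iou(pred_mask, true_mask)
```

A one-pixel drift over half the rows already halves F1 in this toy case. Overlap metrics are unforgiving on thin traces.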
Two losses win the ablation:
- Lovász loss — direct IoU optimisation produces the highest F1.
- SCE — close second; preferred in production for training stability.
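As an illustration of what "direct IoU optimisation" means, here is a NumPy sketch of the binary Lovász hinge. Which Lovász variant VeerNet was trained with is not stated here, so the choice of the hinge form is an assumption, as are the function names and the toy inputs.

```python
import numpy as np

def lovasz_grad(gt_sorted):
    """Gradient of the Lovász extension of the Jaccard loss.

    gt_sorted: ground-truth labels (0/1) ordered by decreasing prediction error.
    """
    gts = gt_sorted.sum()
    intersection = gts - np.cumsum(gt_sorted)
    union = gts + np.cumsum(1.0 - gt_sorted)
    jaccard = 1.0 - intersection / union
    jaccard[1:] = jaccard[1:] - jaccard[:-1]   # discrete differences
    return jaccard

def lovasz_hinge(logits, labels):
    """Binary Lovász hinge on raw logits (not probabilities)."""
    logits = np.asarray(logits, dtype=float).ravel()
    labels = np.asarray(labels, dtype=float).ravel()
    signs = 2.0 * labels - 1.0
    errors = 1.0 - logits * signs              # hinge margin violations
    order = np.argsort(-errors)                # largest error first
    grad = lovasz_grad(labels[order])
    return float(np.dot(np.maximum(errors[order], 0.0), grad))

labels = np.array([1, 1, 0, 0])
confident_right = np.array([5.0, 5.0, -5.0, -5.0])
loss_right = lovasz_hinge(confident_right, labels)    # all margins satisfied
loss_wrong = lovasz_hinge(-confident_right, labels)   # every pixel misclassified
```

The sort is the key step: each pixel's error is weighted by how much the Jaccard loss grows as errors are added from largest to smallest, so the loss tracks IoU rather than per-pixel accuracy.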
We compared ground-truth and predicted values for two specific curve types:
- Gamma Ray (GR) log: strong correlation between predicted and native LAS data. The smoothness of the GR signal helps the segmentation model lock onto the trace.
- Caliper (CALI) log: weaker correlation. CALI traces are sharper and more discontinuous; we hypothesise the dataset is too small for the network to generalise on this curve type.
Statistical analysis (Pearson correlation on overlapping depth intervals) confirms the visual impression: GR is solidly recovered; CALI needs more training data.
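The comparison can be sketched as follows, assuming both curves are given as increasing depth arrays with matching value arrays. The function name, the 0.5-unit resampling step, and the synthetic GR-like signal are hypothetical; the real analysis may have aligned the curves differently.

```python
import numpy as np

def pearson_on_overlap(depth_a, val_a, depth_b, val_b, step=0.5):
    """Pearson r between two curves over the depth interval they share.

    Both depth arrays must be increasing. The curves are resampled onto a
    common grid by linear interpolation before correlating.
    """
    lo = max(depth_a.min(), depth_b.min())
    hi = min(depth_a.max(), depth_b.max())
    if hi <= lo:
        raise ValueError("curves do not overlap in depth")
    grid = np.arange(lo, hi + step / 2, step)
    a = np.interp(grid, depth_a, val_a)
    b = np.interp(grid, depth_b, val_b)
    return float(np.corrcoef(a, b)[0, 1])

def synthetic_gr(depth):
    # Smooth stand-in for a gamma-ray response (illustrative only).
    return 60.0 + 30.0 * np.sin(depth / 7.0)

# Ground truth from LAS: 1000 to 1100 at 0.5 spacing.
las_depth = np.arange(1000.0, 1100.5, 0.5)
las_gr = synthetic_gr(las_depth)

# Digitised curve: different interval and spacing, with a scale/offset error.
pred_depth = np.arange(1020.0, 1121.0, 1.0)
pred_gr = 0.9 * synthetic_gr(pred_depth) + 5.0

r = pearson_on_overlap(las_depth, las_gr, pred_depth, pred_gr)
```

Pearson correlation ignores constant scale and offset errors, so a high `r` shows that the curve shape was recovered. It does not show that the absolute values are right.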
Limitations and what's next
Two honest limitations:
Low signal-to-noise on the source rasters. Many of the legacy logs in our training set are scans of photocopies of original paper, with multiple generations of degradation. The network does as well as the data allows; collecting higher-quality scans would push F1 up materially.
Caliper accuracy needs more data. The CALI log result is a function of training-set size, not architecture. We're collecting more CALI examples for the v2 dataset.
Three planned enhancements:
- OCR head for scale reading. Currently the depth/value scales of each track are read by hand. Adding an OCR sub-network to read the printed scales would close the loop on fully end-to-end digitisation.
- Reservoir-specific fine-tuning. Each basin has its own log conventions and quirks; fine-tuning VeerNet on operator-specific archives produces noticeably better accuracy than the generic model.
- Joint multi-curve training. Currently we train one model per curve type. A multi-task variant should learn shared low-level features across curves and improve sample efficiency.
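The OCR item above concerns the last step of the pipeline: turning a predicted mask into depth/value pairs using the track's scales. A minimal sketch of that conversion follows, assuming a single curve per mask and a linear value scale. The function name `mask_to_curve`, the 0.5 threshold, and the toy numbers are hypothetical, and logarithmic tracks would need a different mapping.

```python
import numpy as np

def mask_to_curve(prob_mask, depth_top, depth_bottom,
                  value_left, value_right, threshold=0.5):
    """Convert a per-pixel curve probability mask into (depth, value) arrays.

    Rows map linearly to depth, columns map linearly to curve value.
    Rows with no pixel above the threshold yield NaN.
    """
    n_rows, n_cols = prob_mask.shape
    cols = np.arange(n_cols)
    weights = np.where(prob_mask >= threshold, prob_mask, 0.0)
    mass = weights.sum(axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Probability-weighted column centre of the trace in each row.
        col_centre = (weights * cols).sum(axis=1) / mass
    depth = np.linspace(depth_top, depth_bottom, n_rows)
    value = value_left + col_centre / (n_cols - 1) * (value_right - value_left)
    return depth, value

# Toy track: 5 depth rows, 11 pixel columns, GR scale 0 to 150 read by hand.
mask = np.zeros((5, 11))
mask[0, 0] = 0.9
mask[1, 5] = 0.8
mask[2, 10] = 0.95
# Row 3 is left empty: no curve pixel detected there.
mask[4, 5] = 0.7

depth, gr = mask_to_curve(mask, 1000.0, 1002.0, 0.0, 150.0)
```

The four scale arguments are the values an interpreter currently reads off the printed track. An OCR head would supply them automatically.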
In production
VeerNet is the model that powers the ES Raster Digitizer product — operators load their raster archive, VeerNet runs the digitisation, and the interpreter validates and corrects rather than tracing from scratch. Throughput: ~50× the previous interpreter-only workflow.
The full paper is published in MDPI Journal of Imaging, 9(7), 136.
Key takeaways
- VeerNet is the architecture under ES Raster Digitizer — a transformer-residual hybrid that turns scanned legacy logs into queryable LAS files at ~50× the throughput of manual click-tracing.
- Lovász loss won the F1/IoU ablation; SCE was the production-stability runner-up. The pair are the right defaults for segmentation when class balance is moderate.
- Self-attention at the bottleneck buys you long-range context that pure CNNs miss — exactly the property a multi-curve track needs to disentangle overlapping traces.
- Caliper is harder than GR because the trace is sharper and more discontinuous — this is a training-data problem, not an architecture problem. v2 dataset addresses it.
Glossary
- Caliper (CALI): A wireline log measuring borehole diameter at fine depth resolution. Its traces are sharper and more discontinuous than smooth lithology indicators, which is why CNN-based digitisation finds CALI harder than GR.
- Dice loss: A region-overlap loss popular for class-imbalanced segmentation. It is one minus the Dice coefficient 2|A∩B| / (|A|+|B|), and penalises both missed positives and false alarms symmetrically. The default choice when foreground is sparse.
- F1 score: The harmonic mean of precision and recall, 2·(P·R)/(P+R). The standard summary metric for class-imbalanced classification, where raw accuracy is misleading.
- Focal loss: A modified cross-entropy that downweights well-classified examples and focuses learning on hard, misclassified ones. Originally proposed for dense object detection (Lin et al., 2017); standard remedy for class imbalance.
- IoU: Intersection-over-Union, the area where prediction and ground truth overlap divided by the area covered by either. Range 0 to 1. The Pascal-VOC and COCO standard metric for segmentation tasks.
- LAS: Log ASCII Standard, the industry text format for well-log curves. Each curve is a depth/value pair series with a header. Modern logs are recorded directly to LAS; raster archives are what VeerNet exists to convert *into* LAS.
- Lovász loss: A surrogate loss that directly optimises Intersection-over-Union (IoU), making it ideal when IoU is the evaluation metric (which it usually is for segmentation). Tends to win F1/IoU ablations against Dice and cross-entropy when class balance is moderate.
- Self-attention: The mechanism at the heart of the Transformer architecture (Vaswani et al., 2017). Each output element is computed as a weighted sum of all input elements, capturing long-range dependencies that local convolutional kernels cannot see.
- Tversky loss: A generalisation of Dice loss with explicit α/β weights for false positives vs. false negatives. Lets you pick a precision/recall operating point at training time. Popular in medical-imaging segmentation.