Ask a reservoir manager where the bottleneck in their subsurface programme really sits and they will rarely point at the model, the seismic, or the rig. They point at people — specifically, at the handful of senior geoscientists who can read a borehole image log and turn it into a fracture set, a bedding framework, and a vug-porosity estimate the asset team will actually bet money on. That expertise does not scale. It takes years to build, it walks out of the building at retirement, and on any given week it is rationed across more wells than it can reach. The interesting question about interpretation automation is therefore not "how accurate is the model" — it is "what does the freed time buy."
We have spent a long engagement on exactly this question with a mid-sized Middle East carbonate operator, and the honest answer reframes the whole business case. Collapsing manual interpretation from days-to-weeks per well down to minutes does not let you cut the team. It lets the team you already have do the work you were quietly deferring.
Where the weeks actually go
Three picking tasks dominate the manual workflow on fractured, vuggy carbonate, and each is slow for its own reason.
Fractures and bedding. A planar feature cutting a borehole traces a sinusoid on the unrolled image. A geologist picks these by eye, one trace at a time, across kilometres of imagery from two different microresistivity imaging tools, recording depth, dip, and azimuth for each. It is meticulous, fatiguing, and proportional to footage — there is no shortcut that does not cost accuracy.
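The sinusoid geometry behind each pick can be sketched in a few lines. This is a minimal illustration, assuming a vertical borehole of circular cross-section with depth increasing downward, so the dip is apparent dip relative to the borehole axis. The function names and the 0.108 m radius are hypothetical and are not taken from the operator's tools.

```python
import math

def sinusoid_depth(theta_deg, center_depth_m, dip_deg, dip_azimuth_deg, radius_m):
    """Depth at which a planar feature meets the borehole wall at azimuth theta.

    The trace is deepest at the dip azimuth and shallowest 180 degrees away.
    """
    amplitude_m = radius_m * math.tan(math.radians(dip_deg))
    return center_depth_m + amplitude_m * math.cos(
        math.radians(theta_deg - dip_azimuth_deg)
    )

def dip_from_trace(deepest_m, shallowest_m, radius_m):
    """Recover dip from the peak-to-trough height of a picked sinusoid."""
    height_m = abs(deepest_m - shallowest_m)
    return math.degrees(math.atan(height_m / (2.0 * radius_m)))

# Illustrative values only: a 60-degree feature dipping toward azimuth 135.
radius_m = 0.108
deepest = sinusoid_depth(135.0, 2500.0, 60.0, 135.0, radius_m)
shallowest = sinusoid_depth(315.0, 2500.0, 60.0, 135.0, radius_m)
recovered_dip = dip_from_trace(deepest, shallowest, radius_m)
```

Each manual pick fits these three parameters (centre depth, dip, azimuth) to a visible trace, which is why the effort scales with the number of features and therefore with footage.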
Vugs. Carbonate porosity hides in vugs — irregular pore voids scattered across the image. Quantifying vug percentage by hand means an expert tracing and tallying voids interval by interval. In this operator's own reckoning, that traditional manual identification runs to hours per well.
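The interval-by-interval tally can be expressed as a small computation once candidate voids have been outlined. The sketch below assumes vug percentage means accepted void area divided by image area in each depth interval, and it screens candidates by area and circularity. The `Region` type, the thresholds, and the pixel counts are hypothetical placeholders, not the operator's definitions.

```python
import math
from dataclasses import dataclass

@dataclass
class Region:
    depth_m: float       # centroid depth of the candidate void
    area_px: float       # enclosed area in pixels
    perimeter_px: float  # boundary length in pixels

def circularity(region):
    """4*pi*A / P^2: about 1.0 for a circle, near 0 for thin streaks."""
    if region.perimeter_px <= 0:
        return 0.0
    return 4.0 * math.pi * region.area_px / region.perimeter_px ** 2

def vug_percent_by_interval(regions, top_m, base_m, step_m, pixels_per_interval,
                            min_area_px=20.0, min_circularity=0.4):
    """Percentage of image area occupied by accepted voids, per depth interval."""
    n_bins = math.ceil((base_m - top_m) / step_m)
    vug_area = [0.0] * n_bins
    for r in regions:
        if not (top_m <= r.depth_m < base_m):
            continue  # outside the logged window
        if r.area_px < min_area_px or circularity(r) < min_circularity:
            continue  # too small, or too elongated to be a vug
        idx = min(int((r.depth_m - top_m) // step_m), n_bins - 1)
        vug_area[idx] += r.area_px
    return [100.0 * a / pixels_per_interval for a in vug_area]
```

The same inputs always give the same tally, which is the property a manual count cannot guarantee from one interpreter to the next.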
Correlation. Once a single well is interpreted, the value is locked inside that one borehole until someone stitches it laterally to its neighbours — bedding density, fracture density, strike — to say anything about the reservoir between wells. That is the analysis everyone wants and almost nobody has time to reach.
Stack those across a field, and "interpretation" becomes a queue. Senior hours are spent producing the raw picks, never on the reasoning the asset actually needs.
The engineering that compresses the clock
Compressing weeks to minutes is not a spreadsheet assumption — it is a specific stack of model and software engineering, and it is worth being precise about what each layer does.
The fracture and bedding picker is a Detection Transformer (DETR). Rather than tiling the image with anchor boxes and post-processing with non-maximum suppression (machinery built for boxes, not sine waves), the model emits a set of geological objects directly, each query resolving to one sinusoid with its own regressed depth, dip, and azimuth. This is set prediction: the model outputs a fixed-size, unordered set of object candidates and is scored against the ground-truth set as a whole using a Hungarian bipartite matching loss, rather than predicting a grid of anchors. It is the right inductive bias because fractures are unordered, variable in count, and prone to genuine overlap, as with crossing fractures. Making that converge on a small-data problem was itself an engineering programme: a heavy regime of data augmentation to manufacture training variety from a handful of interpreted wells, dynamic-image normalisation to preserve sinusoid contrast, and a deliberately lean backbone to avoid overfitting. The vug detector is a different discipline entirely: a classical computer-vision pipeline (local-variation enhancement, adaptive thresholding, contour extraction, and a cascade of area, circularity, and vicinity filters) that turns a manual tally into a deterministic, per-interval computation.
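The matching step at the heart of set prediction can be illustrated on a toy scale. The sketch below is not the production loss: it finds the minimum-cost one-to-one assignment by brute force over permutations, which gives the same optimum as the Hungarian algorithm on small sets, and it uses a plain weighted L1 cost on depth, dip, and azimuth. The weights, dictionary keys, and function names are assumptions for illustration. A real DETR cost also includes a class-probability term.

```python
import itertools

def azimuth_diff_deg(a, b):
    """Smallest angular difference, so 350 and 10 degrees are 20 apart."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def pair_cost(pred, truth, w_depth=1.0, w_dip=0.05, w_azimuth=0.01):
    """Weighted L1 distance between one predicted and one labelled sinusoid."""
    return (w_depth * abs(pred["depth_m"] - truth["depth_m"])
            + w_dip * abs(pred["dip_deg"] - truth["dip_deg"])
            + w_azimuth * azimuth_diff_deg(pred["azimuth_deg"], truth["azimuth_deg"]))

def match_sets(preds, truths):
    """Return (total_cost, [(pred_index, truth_index), ...]) of minimum cost.

    Requires len(preds) >= len(truths). Unmatched predictions are the
    queries that would be trained toward a "no object" label.
    """
    best_cost, best_pairs = float("inf"), []
    for pred_idxs in itertools.permutations(range(len(preds)), len(truths)):
        cost = sum(pair_cost(preds[p], truths[t]) for t, p in enumerate(pred_idxs))
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, t) for t, p in enumerate(pred_idxs)]
    return best_cost, best_pairs
```

Because the score is computed on the matched set as a whole, two crossing fractures at nearly the same depth are each assigned their own query instead of being suppressed as duplicates. Brute force grows factorially, so at realistic query counts a polynomial-time solver such as `scipy.optimize.linear_sum_assignment` would be used.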
Both were then wrapped as production tools — internally AutoFrac and AutoVug — with the correlation layer (well-to-well, "W2W") on top performing kriging across the interpreted logs. That MLOps and software-engineering layer is what converts a research metric into a clock. Automated picking runs roughly five times faster than the manual workflow; the vug tally that took hours per well returns in minutes; and the W2W tool turns single-well interpretation into a field view on demand.
The dividend is capacity, not headcount
Here is the part that executives consistently get wrong on the first pass. They model an interpretation-automation project as a cost line: if a task that took three weeks now takes two hours, the saving must be the labour you no longer pay for. That framing quietly assumes the backlog was zero — that every well your team could interpret, it already did.
No carbonate asset we have worked with is in that state. The queue is always full. So the freed hours do not show up as a smaller payroll; they show up as work that finally gets done. In this engagement, the operator's own measure of the productized stack was not a reduction in staff but a 60% lift in interpretation productivity and a 75% improvement in interpretation accuracy: the same people, reaching more wells and reaching them better, with the well-to-well correlation hitting 95% target-location precision and around 90% stratigraphic success in identifying productive zones for infill and directional drilling. The target-location figure is the tell: it is not an interpretation output at all. It is a reservoir decision, exactly the higher-value work the team had no hours for when it was hand-picking sinusoids.
“The model does not replace the geoscientist. It moves the geoscientist off rote tracing and onto the judgment calls that actually move recovery — and keeps them on the hook for anomaly review and final sign-off.”
It matters that this is framed as redeployment, not removal. Across the operators we partner with — in the Middle East and the United States — the binding constraint is never a surplus of expert interpreters waiting to be made redundant; it is a deficit of expert hours against a growing pile of unread logs. The engineering layer is valuable precisely because it lets that scarce judgment compound: the senior who used to spend three weeks producing one well's picks now spends two hours validating the machine's picks across several wells and the rest of that time deciding where the next well should go.
How to put a number on it
For a reservoir team trying to size the prize, the arithmetic is simpler than the technology. The freed capacity per well is the manual interpretation time minus the automated time, and on this kind of carbonate that gap is days or weeks against minutes. The honest move is not to bank that as a cost cut. It is to ask: what is one well's worth of senior-geoscientist time worth once it is redeployed?
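Written per well, that arithmetic could take the following form. The symbols t_manual and t_auto are labels introduced here for the manual and automated interpretation times; D, v_expert, and c_review follow the surrounding prose, with c_review read as a time in the same units.

```latex
D = v_{\text{expert}} \left( t_{\text{manual}} - t_{\text{auto}} - c_{\text{review}} \right)
```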
Here the dividend D is the reclaimed hours valued at the redeployed worth of expert time v_expert, net of the review time c_review you deliberately keep in the loop. Set v_expert to the value of the decisions those hours unlock (earlier infill calls, better landing points, fewer unread wells) rather than a loaded hourly rate, and the business case stops being a thin labour saving and becomes what it actually is: a step-change in how much of your subsurface a fixed team can reason about.
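A worked version of the sizing makes the inputs explicit. The numbers are placeholders, not results from the engagement: "three weeks" is taken as 15 working days of 8 hours, the automated run as a quarter of an hour, review as two hours per well, and v_expert as 1.0 so the dividend reads in expert-hour equivalents. The function name and the time parameters are hypothetical.

```python
def capacity_dividend(t_manual_h, t_auto_h, c_review_h, v_expert_per_h):
    """Reclaimed expert hours per well, and their value once redeployed."""
    reclaimed_h = t_manual_h - t_auto_h - c_review_h
    return reclaimed_h, reclaimed_h * v_expert_per_h

# Assumed inputs: 15 days x 8 h manual, 0.25 h automated, 2 h human review.
reclaimed_h, dividend = capacity_dividend(
    t_manual_h=120.0, t_auto_h=0.25, c_review_h=2.0, v_expert_per_h=1.0
)
```

Replacing the unit value with an estimate of what one redeployed expert hour unlocks is the step that turns this from a labour saving into a capacity case.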
That is what two hours per well instead of three weeks buys. Not a smaller team. A team that finally gets to the reservoir questions it was always too busy to answer.
Key takeaways
- The bottleneck in carbonate interpretation is scarce senior-geoscientist hours, not model accuracy. Manual fracture/bedding picking is proportional to footage, and manual vug quantification runs to hours per well — so interpretation becomes a permanent queue.
- Collapsing that to minutes is a specific engineering stack: a Detection Transformer using set prediction for fractures/bedding, a classical computer-vision pipeline for vugs, and an MLOps/serving layer (AutoFrac, AutoVug, W2W) that converts research metrics into a production clock — roughly 5x faster picking.
- The dividend is capacity, not headcount. The freed hours land on a full backlog, so they show up as more wells reached and higher-value work done, not a smaller payroll. In this Middle East engagement the productized stack delivered +60% interpretation productivity and +75% accuracy with the same team.
- The real prize is reservoir decisions: well-to-well correlation reached 95% target-location precision and ~90% stratigraphic success for infill/directional drilling — judgment work the team had no hours for while hand-picking sinusoids.
- Size the case as a capacity dividend: reclaimed hours valued at the worth of the decisions they unlock, net of the human review you deliberately retain for anomalies and sign-off — not as a labour cost cut.