Abstract
Wireline logging often leaves a hole: a sonic or a density curve that was never recorded because the tool was not run, failed, or was dropped to save rig time, even though the formation it would have described is the one a petrophysicist most wants to evaluate. Over roughly the past six years a body of machine-learning work has tried to fill those holes by predicting the missing curve from the channels that were recorded, and this survey credits that work and reads its reported accuracies side by side. We group the published estimators into three families, feed-forward networks, recurrent depth-aware networks, and synthetic-curve generators, and place them on a single scoreboard alongside the open facies-classification contest that established the reproducible template the field followed. Two facts organise the survey. The first is that most of these methods are trained and tested on a small set of open corpora, with the FORCE and Xeek tutorial slice of 118 Norwegian Sea wells and 22 measurement columns the most common, and the density-family curves of Track 3, neutron porosity and bulk density, among the canonical prediction targets. The second is that every one of these estimators assumes its inputs already exist as vector curves, which means the survey points back at an upstream prerequisite the prediction literature rarely discusses: turning a scanned raster log into vectors in the first place. That digitisation step is the one piece of original work we claim here, and it is what our VeerNet system addresses.
Background and related work
The idea of predicting one log from others predates machine learning by decades. Petrophysicists have long used empirical transforms and local regressions, fitting a relationship between, say, sonic transit time and bulk density in wells where both were measured, then applying that relationship in wells where one was missing. What changed in the latter half of the 2010s was not the goal but the method and, crucially, the data culture around it. The pivotal moment for reproducibility was an open tutorial and contest published in The Leading Edge that walked through a complete statistical-learning workflow on real well logs and then invited the community to beat it [1]. It used nine wells from the Hugoton field of Kansas, kept two wells blind, and asked entrants to classify rock facies from the wireline suite. The best public score landed a little over 63 percent, and the exact figure matters less than the fact that it was a public, falsifiable number that anyone could reproduce from the released data and code. That contest set the template the rest of the field followed: an open corpus, a fixed target, and a number you could check.
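The empirical transform described at the start of this section can be sketched in a few lines. This is a minimal illustration, assuming a straight-line relationship between bulk density and sonic transit time; the function names and the sample values are hypothetical and are not taken from any cited study.

```python
# Illustrative only: a straight-line transform between two logs.
# Function names and sample numbers are hypothetical, not from any cited study.

def fit_linear_transform(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least two samples")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance; cannot fit a slope")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_transform(x, slope, intercept):
    """Apply a fitted transform to a curve from another well."""
    return [slope * xi + intercept for xi in x]


# Calibration well: both curves were logged (made-up values).
rhob_calibration = [2.20, 2.30, 2.40, 2.50, 2.60]   # bulk density, g/cm3
dt_calibration = [100.0, 92.0, 84.0, 76.0, 68.0]    # sonic transit time, us/ft

slope, intercept = fit_linear_transform(rhob_calibration, dt_calibration)

# Target well: density was logged, sonic was not.
rhob_target = [2.25, 2.55]
dt_estimated = apply_transform(rhob_target, slope, intercept)
```

A transform like this is only as good as the assumption that the calibration wells and the target well share the same lithology and compaction trend, which is part of why the learned estimators discussed next draw on more input channels.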
From there the work that is directly relevant to synthetic sonic and density logs branched into a few recognisable lines. The first and simplest is the feed-forward neural network that treats log prediction as a per-depth regression. A representative example built a three-layer perceptron that takes gamma ray and bulk density as inputs and estimates both compressional and shear transit time, validated on an offshore shaly-sandstone reservoir, with the explicit motivation that knowing the shear sonic lets an engineer assess sanding potential without a dedicated tool run [2]. A closely related effort from the same group focused specifically on shear transit time, again for formation evaluation where no shear sonic was acquired [5]. These models are attractive because they are small, fast, and interpretable in the limited sense that you can see which input channels carry the signal, and they remain a sensible baseline.
The second line recognises that a well log is not a bag of independent depth samples but a sequence with strong vertical structure, and reaches for recurrent architectures to exploit it. The clearest example in our window is a convolutional long short-term memory network, run bidirectionally so it sees the log from both above and below a given depth, cascaded with fully connected layers and trained to predict sonic logs from gamma ray, density, and neutron porosity [4]. Two design choices in that work are worth crediting specifically. It uses the convolutional recurrent structure to capture both the local shape of a curve and its broader depth trend, and it adds dropout at inference with Monte Carlo sampling to produce an uncertainty band rather than a single point estimate, which is exactly the property a petrophysicist needs in order to decide whether to trust a synthetic curve in a given interval.
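The uncertainty idea can be illustrated without the full architecture. The sketch below is not the ConvLSTM of [4]: it is a hypothetical three-input, four-unit network with made-up weights, used only to show the mechanism of leaving dropout switched on at inference and summarising many stochastic passes. The band of mean plus or minus two standard deviations is likewise an assumption chosen for illustration.

```python
import random
import statistics

# Hypothetical tiny network: 3 inputs -> 4 hidden units -> 1 output.
# Weights are made-up; this is NOT the ConvLSTM described in [4].
W_HIDDEN = [
    [0.5, -0.2, 0.1],
    [-0.3, 0.8, 0.4],
    [0.2, 0.1, -0.6],
    [0.7, -0.5, 0.3],
]
W_OUT = [0.6, -0.4, 0.9, 0.2]


def forward(features, rng, drop_rate):
    """One forward pass with dropout applied to the hidden units."""
    keep = 1.0 - drop_rate
    output = 0.0
    for weights, w_out in zip(W_HIDDEN, W_OUT):
        pre = sum(w * f for w, f in zip(weights, features))
        hidden = max(0.0, pre)  # ReLU activation
        if drop_rate > 0.0:
            if rng.random() < drop_rate:
                hidden = 0.0        # unit dropped on this pass
            else:
                hidden /= keep      # inverted-dropout rescaling
        output += w_out * hidden
    return output


def mc_dropout_band(features, n_samples=200, drop_rate=0.2, seed=0):
    """Monte Carlo dropout: repeat stochastic passes, return (mean, lower, upper)."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    rng = random.Random(seed)
    samples = [forward(features, rng, drop_rate) for _ in range(n_samples)]
    mean = statistics.fmean(samples)
    spread = statistics.pstdev(samples)
    # Band width of two standard deviations is an illustrative assumption.
    return mean, mean - 2.0 * spread, mean + 2.0 * spread


# Normalised gamma ray, density, neutron porosity at one depth (made-up values).
mean, lower, upper = mc_dropout_band([0.4, 0.7, 0.2])
```

Running the same function at every depth sample gives a band along the well; intervals where the band widens are the ones where a synthetic curve deserves the least trust.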
The third line is less about predicting a single named curve and more about completing a suite. Here the framing is synthetic-log generation: given a partly populated set of wireline measurements, generate the curves that are absent. A characteristic example built machine-learning workflows to synthesise photoelectric and rock-strength curves for unconventional wells where those measurements were missing [3]. The distinction from the per-curve regressors is partly one of emphasis, but it matters in practice, because suite-completion methods have to respect the joint relationships among curves rather than optimising one target in isolation.
Underpinning all three lines is the data culture the facies contest started. The corpus that most of this later work gravitated to is the FORCE 2020 release, the open labelled set of Norwegian continental shelf wells assembled for a lithology-prediction competition [6]. Its shape is easiest to see through a widely used teaching slice: 118 wells from the Norwegian Sea, each described by 22 measurement columns, in vector LAS form [7]. That slice is the de facto common ground for petrophysical ML, and it is what we use to anchor the scoreboard below.
Method
This is a survey rather than a new experiment, so our method is one of organisation and honest comparison rather than fresh measurement. We collected the published estimators above into a single scoreboard with three deliberate constraints, each chosen to keep the comparison fair.
First, scope. We restricted the board to methods whose target is a sonic curve, a density-family curve, or a directly adjacent synthetic curve, and which were public on or before the survey quarter. We deliberately included the open facies contest as a row even though its target is a lithofacies label rather than a curve, because it is the methodological ancestor of everything else on the board and because its public accuracy figure is the one number every reader of this literature already knows.
Second, the accuracy reading. Each row carries the accuracy figure that its own paper reported, mapped onto a common zero-to-one reading scale so that correlation coefficients, coefficients of determination, and contest accuracies can be drawn on one axis. We want to be emphatic about what this is and is not. It is a faithful transcription of what each study reported on its own validation data. It is not a re-measurement of all methods on one shared split. The wells differ, the basins differ, the input channels differ, and the train and test partitions differ, so the bars are not directly comparable in the way a single benchmark leaderboard would be. We chose to surface that caveat on the instrument itself rather than bury it, because the most common way this kind of comparison misleads is by implying a head-to-head race that never actually happened.
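One way to picture the common reading scale is as a small mapping function. The rules below are assumptions made for illustration (percent accuracies divided by 100, correlation taken by magnitude, everything clipped to the unit interval); the survey text does not specify the exact mapping, and the metric labels are hypothetical.

```python
def to_reading_scale(kind, value):
    """Map a reported metric onto [0, 1].

    The mapping rules are illustrative assumptions, not the survey's
    documented procedure.
    """
    if kind == "accuracy_percent":
        reading = value / 100.0
    elif kind in ("accuracy_fraction", "r_squared"):
        reading = value
    elif kind == "correlation":
        reading = abs(value)  # assumption: only the strength of correlation is drawn
    else:
        raise ValueError(f"unknown metric kind: {kind!r}")
    # Clip so that, for example, a negative R-squared does not fall off the axis.
    return min(1.0, max(0.0, reading))


# The one figure stated in the text: the facies contest at about 63 percent.
facies_reading = to_reading_scale("accuracy_percent", 63.0)
```

Keeping the original metric kind alongside each reading preserves the caveat that two equal bar heights may come from different metrics on different wells.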
Third, the anchor. We grounded the board on the FORCE and Xeek tutorial corpus, 118 wells and 22 columns, and named the Track 3 density-family curves, neutron porosity and bulk density, as the canonical estimation targets [6][7]. This gives the reader a concrete sense of the kind of data these reported accuracies were earned on, even where an individual study used a different field.
Results
The scoreboard is below. Read it as a credited map of the literature, not as a contest we ran.
A few patterns survive the no-shared-split caveat and are worth stating. The recurrent, depth-aware estimator sits at the top of the board because it both reported the strongest accuracy and added something the others did not, an uncertainty band rather than a bare prediction [4]. The feed-forward sonic and shear models cluster just below it, which is consistent with their simpler hypothesis class and their reliance on a small set of physically motivated inputs [2][5]. The synthetic-suite generator sits a little lower, which is unsurprising given that completing several curves jointly is a harder target than regressing one [3]. And the open facies contest anchors the bottom of the board at about 63 percent, a reminder of how far the field travelled in a few years. The earliest open number was also a label-classification task, not a curve regression, so its lower figure reflects a different and harder kind of output as much as an older method [1].
The other thing the board makes visible is how few of these published estimators clear a demanding acceptance floor. Drag the threshold on the instrument toward the level an operator would actually require before letting a synthetic curve substitute for a logged one, and the set of methods that qualify thins quickly. That is not a criticism of the work; it is an accurate picture of where a young field stood. Predicting a sonic or density curve well enough for screening was largely solved in the literature; predicting one well enough to use without a human in the loop, in a basin the model had not seen, was still open.
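The thinning effect of an acceptance floor is easy to reproduce. In the sketch below only the facies-contest reading comes from the text; the other rows are placeholders with invented readings, not figures from any cited paper, and the function name is hypothetical.

```python
def methods_clearing(rows, threshold):
    """Return the names of rows whose reading meets the threshold, best first."""
    passing = [(reading, name) for name, reading in rows if reading >= threshold]
    passing.sort(reverse=True)
    return [name for reading, name in passing]


# Placeholder readings for illustration only; not reported accuracies.
rows = [
    ("placeholder estimator A", 0.95),
    ("placeholder estimator B", 0.90),
    ("placeholder estimator C", 0.85),
    ("open facies contest [1]", 0.63),  # about 63 percent, as stated in the text
]

# Raising the floor shrinks the qualifying set.
for threshold in (0.60, 0.80, 0.92):
    print(threshold, methods_clearing(rows, threshold))
```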
Discussion
Set against this map, our own contribution sits deliberately to one side of it, and naming where is the honest thing to do. We did not build a new sonic or density estimator, and nothing on the scoreboard is ours. What we observed, repeatedly, is that every method on the board shares an unstated precondition: it consumes vector logs. The gamma ray, density, and neutron channels these networks read are assumed to arrive as clean depth-indexed numeric series. For the open corpora that is a safe assumption, because FORCE, Xeek, and Volve ship vector LAS. For the world's larger reservoir of legacy data it is not. A vast amount of historical log information exists only as scanned paper, raster images of curves drawn on a grid, with no numeric series behind the ink at all. The Texas regulatory archive is one concrete example of that image-first reality [8].
This is where our work attaches to the survey. Before any of the estimators above can run on a scanned well, the curves have to be lifted off the image and turned into vectors, and that raster-to-vector digitisation is the problem our VeerNet system was built to solve. We are not claiming it as part of the prediction literature; we are claiming it as the upstream station that feeds it. The right mental model is a pipeline with two distinct stages. The first stage recovers numeric curves from a scan. The second stage, the literature this survey credits, predicts the curve that the scan never contained. The prediction work has been studied intensively and is well represented on the board. The digitisation work that has to precede it on legacy data has been studied far less, and that asymmetry is the single most useful thing this map reveals.
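To make the first stage concrete, here is a deliberately naive sketch of lifting a curve off a raster. It is not VeerNet and makes no claim about how VeerNet works: it assumes a clean binary image of a single track with one curve, no grid lines, and a linear value scale, and every name in it is hypothetical.

```python
def trace_curve(raster, value_left, value_right):
    """Naive raster-to-vector tracing for a single clean curve.

    For each depth row, take the centroid column of the ink pixels and map it
    linearly onto the track's value scale. Rows with no ink yield None.
    Assumes a binary image, one curve, and no grid lines.
    """
    n_cols = len(raster[0])
    if n_cols < 2:
        raise ValueError("track must be at least two pixels wide")
    curve = []
    for row in raster:
        ink = [col for col, pixel in enumerate(row) if pixel]
        if not ink:
            curve.append(None)  # gap in the scan: nothing to recover
            continue
        centroid = sum(ink) / len(ink)
        fraction = centroid / (n_cols - 1)
        curve.append(value_left + fraction * (value_right - value_left))
    return curve


# Four depth rows of an 11-pixel-wide track (1 = ink). Made-up example.
raster = [
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
]

# Track scaled from 0 to 150 API units, left edge to right edge (assumed).
gamma_ray = trace_curve(raster, 0.0, 150.0)
```

Real scans break every one of these assumptions (grid lines, overlapping curves, wrap-around scales, skew), which is why the digitisation stage is a research problem in its own right rather than a preprocessing footnote.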
It also reframes what a practitioner should do. If your logs already live as vectors, the survey is a buying guide: a recurrent depth-aware estimator with an uncertainty band is the strongest credited option for sonic prediction [4], with the feed-forward models as fast, interpretable baselines [2][5] and the suite generators for the harder job of completing several missing curves at once [3]. If your logs live as scans, the survey tells you the estimators are waiting for you, but you are not yet at the stage where they apply; the digitisation step comes first, and only once it succeeds does the scoreboard become relevant to your data.
Limitations
This survey carries the limitations of its form, and it is better to name them than to let the scoreboard imply more than it can support. The accuracy figures are reported-by-source, drawn from each paper's own validation on its own wells, basins, and splits, so the relative bar heights encode a real ranking only loosely and must not be read as a controlled head-to-head; a method placed lower may simply have been evaluated on harder wells. The board is also a sample of the literature, not a census: we restricted it to estimators of sonic, density-family, and closely adjacent curves that were public on or before the survey quarter, which excludes both later work and earlier non-learning transforms that remain in active use. The reading scale flattens three distinct metrics (correlation, coefficient of determination, and classification accuracy) onto one axis for legibility, and that flattening loses information that each original metric carried. Finally, our own claim is narrow and should be read narrowly: we credit none of the prediction methods as ours, and we assert authorship only of the upstream raster-to-vector digitisation step, whose accuracy is not what this board measures.
What the survey establishes
- Machine-learned log prediction fills missing sonic and density curves by estimating them from the rest of the wireline suite. We credit three published families: feed-forward per-depth regressors, recurrent depth-aware networks, and synthetic-suite generators, plus the open SEG facies contest that set the reproducible template at about 63 percent.
- Most of this work trains on the same open corpora. The scoreboard is anchored on the FORCE and Xeek tutorial slice of 118 Norwegian Sea wells with 22 measurement columns, with the Track 3 density-family curves, neutron porosity and bulk density, as canonical targets.
- The strongest credited option in our window is a bidirectional convolutional LSTM that predicts sonic logs and ships a dropout-based uncertainty band; feed-forward sonic and shear models are fast interpretable baselines, and suite generators handle the harder job of completing several curves at once.
- The accuracy bars are reported-by-source, not measured on one shared split, so they are not a controlled head-to-head; different wells, basins, channels, and partitions make the rows only loosely comparable, which the instrument states plainly.
- Every estimator on the board assumes vector-curve inputs. On legacy scanned logs that assumption fails, so the survey points to an under-studied upstream prerequisite, recovering numeric curves from raster images, which is the digitisation problem our VeerNet system addresses.
References
[1] Hall, B. Facies classification using machine learning. The Leading Edge, 35(10), 906-909 (2016). The open SEG tutorial and contest on nine Hugoton wells that set the reproducible template for predicting a target log from a wireline suite; the best public contest score was about 63 percent. https://library.seg.org/doi/10.1190/tle35100906.1
[2] Onalo, D., Adedigba, S., Khan, F., James, L. A., and Butt, S. Data driven model for sonic well log prediction. Journal of Petroleum Science and Engineering, 170, 1022-1037 (2018). A three-layer feed-forward neural network that estimates compressional and shear transit time from gamma ray and bulk density logs. https://doi.org/10.1016/j.petrol.2018.06.072
[3] Akinnikawe, O., Lyne, S., and Roberts, J. Synthetic Well Log Generation Using Machine Learning Techniques. SPE/AAPG/SEG Unconventional Resources Technology Conference, URTeC 2877021 (2018). A workflow that generates synthetic photoelectric and rock-strength curves to complete incomplete wireline suites. https://doi.org/10.15530/URTEC-2018-2877021
[4] Pham, N., Wu, X., and Zabihi Naeini, E. Missing well log prediction using convolutional long short-term memory network. Geophysics, 85(4), WA159-WA171 (2020). A bidirectional ConvLSTM cascaded with fully connected layers that predicts sonic logs from gamma ray, density, and neutron porosity, with dropout-based uncertainty at inference. https://doi.org/10.1190/geo2019-0282.1
[5] Onalo, D., Oloruntobi, O., Adedigba, S., Khan, F., James, L., and Butt, S. Data-driven model for shear wave transit time prediction for formation evaluation. Journal of Petroleum Exploration and Production Technology, 10, 1429-1447 (2020). A neural estimator of shear transit time for sanding analysis where no shear sonic tool was run. https://doi.org/10.1007/s13202-020-00843-2
[6] Bormann, P., Aursand, P., Dilib, F., Manral, S., and Dischington, P. FORCE 2020 well log and lithofacies dataset for machine learning competition. FORCE / GitHub (2020). The open labelled Norwegian continental shelf corpus most synthetic-log work trains and tests on. https://github.com/bolgebrygg/Force-2020-Machine-Learning-competition
[7] McDonald, A. Using the missingno Python library to identify and visualise missing data prior to machine learning. Towards Data Science (2021). A tutorial on the FORCE 2020 and Xeek dataset slice of 118 Norwegian Sea wells with 22 measurement columns, the corpus shape the scoreboard is grounded on. https://towardsdatascience.com/using-the-missingno-python-library-to-identify-and-visualise-missing-data-prior-to-machine-learning-34c8c5b5f009
[8] Railroad Commission of Texas. Well log and digital records, public well data. Texas RRC (accessed 2022). The state regulatory archive of scanned raster well logs, an example of the image-first source material that must be digitised before any estimator on the scoreboard can run. https://www.rrc.texas.gov/resource-center/research/data-sets-available-for-download/