Research

Hamiltonian Monte Carlo for reservoir characterisation

A webinar recap of Dr. Shib Sankar Ganguli's Bayesian-multivariate Hamiltonian Monte Carlo approach for estimating total organic carbon in shale reservoirs — and why uncertainty quantification matters more than point estimates when characterising unconventional plays.

By Tannistha Maiti · 7 min read

Uncertainty is a feature, not a bug. A posterior distribution lets the engineer say ‘90% confidence TOC is between 3.2% and 4.8%’ instead of ‘TOC = 4.0%, trust me.’

In this talk, Dr. Shib Sankar Ganguli (CSIR–National Geophysical Research Institute) presents his recent paper "A Bayesian multivariate model using Hamiltonian Monte Carlo inference to estimate total organic carbon content in shale" — a method that flips the usual reservoir-characterisation problem on its head: instead of producing a single best-guess TOC value, it produces a posterior probability distribution that tells you how much you should trust that value.

For unconventional plays where TOC data is sparse and expensive to acquire, that uncertainty matters more than the point estimate.

The Bayesian alternative — at a glance

  • 1 → ∞: single-number TOC estimates become a posterior distribution
  • 90% CI: credible interval reported per depth, not a point estimate
  • Duvernay: Western Canadian shale formation used as benchmark
  • PyMC / Stan: off-the-shelf HMC samplers; no commercial petrophysics suite ships this

Classical vs Bayesian · Same logs, different output

Why uncertainty quantification matters

Same well, same logs, two interpretations. Classical methods give you three thin lines that often disagree. The HMC posterior gives you one mean curve plus a 90% credible band, so “TOC > 4% across a kilometre lateral” becomes a probability statement, not a point claim.

[Figure: two-panel comparison on synthetic, Duvernay-calibrated data (illustrative). Left panel, "Classical · point estimates only": empirical (ΔlogR), physical (bulk-density), and sparse core-measurement curves, with TOC = 4% marked. Right panel, "HMC · posterior + credible band": posterior mean and 90% credible band, with TOC = 4% marked.]
The takeaway: the engineer who has the right-panel output can say “90% confidence TOC is between 3.2% and 4.8% here” — and the drilling decision becomes a probability calculation, not a single-number bet.
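Once you have posterior draws of TOC at a depth, both the credible interval and the threshold probability are a few lines of arithmetic. The sketch below uses simulated draws as a stand-in for sampler output; the mean of 4.0 and spread of 0.5 are made-up illustration values, and `credible_interval` and `prob_exceeds` are hypothetical helper names, not functions from the paper or from any library.

```python
import random

def credible_interval(draws, mass=0.90):
    """Equal-tailed interval that contains `mass` of the posterior draws."""
    ordered = sorted(draws)
    tail = (1.0 - mass) / 2.0
    lo_idx = round(tail * (len(ordered) - 1))
    hi_idx = round((1.0 - tail) * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]

def prob_exceeds(draws, threshold):
    """Fraction of posterior draws above the threshold."""
    return sum(d > threshold for d in draws) / len(draws)

random.seed(1)
# Stand-in for posterior TOC draws (wt%) at one depth; a real workflow
# would take these from the sampler output instead of simulating them.
toc_draws = [random.gauss(4.0, 0.5) for _ in range(4000)]

lower, upper = credible_interval(toc_draws, mass=0.90)
p_above_4 = prob_exceeds(toc_draws, 4.0)
print(f"90% credible interval: {lower:.2f} to {upper:.2f} wt%")
print(f"P(TOC > 4%) = {p_above_4:.2f}")
```

The same two functions applied depth by depth give the credible band and a per-depth exceedance probability, which is the quantity a threshold-based drilling decision actually needs.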

Why this problem is hard

Total organic carbon content drives the hydrocarbon potential of a shale reservoir. The standard ways to get TOC are:

  • Direct — pyrolysis on core samples. Accurate, expensive, sparse.
  • Empirical — methods like the ΔlogR overlay (Passey et al., 1990) using sonic + resistivity logs. Cheap and ubiquitous, but the calibration constants drift across formations.
  • Physical-law-based — bulk-density inversion using a known kerogen-density assumption. Sensitive to lithology heterogeneity.

All three give you a single number per depth. None give you a confidence interval.

When the operator's drilling decision depends on whether TOC > 4% across a kilometre of lateral, "single number" isn't good enough.
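To make the "single number per depth" point concrete, here is a minimal sketch of the ΔlogR calculation in its commonly quoted sonic/resistivity form. The input readings, the baselines, and the maturity value (`lom`) are hypothetical illustration values, and the function names are ours; in practice the baselines are picked in an organic-lean interval and the constants are exactly the calibration that drifts between formations.

```python
import math

def delta_log_r(resistivity, sonic, r_baseline, dt_baseline):
    """Log separation: resistivity in ohm-m, sonic transit time in us/ft."""
    return math.log10(resistivity / r_baseline) + 0.02 * (sonic - dt_baseline)

def toc_from_delta_log_r(resistivity, sonic, r_baseline, dt_baseline, lom):
    """Point estimate of TOC (wt%) from the log separation and maturity (LOM)."""
    separation = delta_log_r(resistivity, sonic, r_baseline, dt_baseline)
    return separation * 10 ** (2.297 - 0.1688 * lom)

# Hypothetical readings at a single depth.
toc = toc_from_delta_log_r(
    resistivity=40.0, sonic=95.0, r_baseline=4.0, dt_baseline=80.0, lom=10.5
)
print(f"TOC = {toc:.2f} wt%")
```

Note what comes back: one value, with nothing attached to say how far it might be off if the baselines or maturity were picked slightly differently.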

The Bayesian alternative

The proposed approach combines geophysical well logs with Hamiltonian Monte Carlo (HMC) sampling of the posterior:

  1. Likelihood model — a multivariate regression linking gamma-ray, density, sonic, and resistivity to TOC, with a measurement-noise term.
  2. Priors — weakly informative priors on the regression coefficients, allowing the data to dominate where it's plentiful and the prior to dominate where it isn't.
  3. HMC sampler — exploits the geometry of the log-posterior to take long, high-acceptance steps. Mixes orders of magnitude faster than vanilla Metropolis–Hastings on this kind of correlated parameter space.
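One way to write down a model of this shape is sketched below. This is an assumed, generic specification for illustration (Gaussian noise, Gaussian priors with scale \(\tau\), a half-normal prior on the noise scale with scale \(s\)); the paper's exact likelihood and prior choices may differ. Here \(x_{i1},\dots,x_{i4}\) stand for the gamma-ray, density, sonic, and resistivity readings at depth \(i\).

```latex
\begin{aligned}
\mathrm{TOC}_i &= \beta_0 + \sum_{j=1}^{4} \beta_j\, x_{ij} + \varepsilon_i,
  \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \\
\beta_j &\sim \mathcal{N}(0, \tau^2), \qquad \sigma \sim \mathrm{HalfNormal}(s) \\
p(\beta, \sigma \mid \mathrm{TOC}, x) &\propto
  \prod_{i} \mathcal{N}\!\Big(\mathrm{TOC}_i \,\Big|\, \beta_0 + \sum_{j=1}^{4} \beta_j x_{ij},\ \sigma^2\Big)\;
  \prod_{j=0}^{4} p(\beta_j)\; p(\sigma)
\end{aligned}
```

The last line is the quantity HMC samples from: it only needs to be known up to a constant, but its gradient with respect to \(\beta\) and \(\sigma\) must be computable.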

The output isn't a TOC curve — it's a distribution of TOC curves, one per posterior sample. The mean is your best guess; the spread is your uncertainty.
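For readers who want to see the mechanics, here is a minimal, dependency-free sketch of HMC on a toy Bayesian linear regression. Everything in it is an assumption made for illustration: the data are synthetic, there are two standardised predictors rather than four logs, the noise level is treated as known, and the step size and trajectory length are hand-tuned for this toy problem. It is not the paper's implementation, and production samplers such as the ones in Stan or PyMC tune these settings automatically.

```python
import math
import random

random.seed(0)

# Synthetic, standardised "logs" and TOC values; not field data.
TRUE_BETA = [4.0, 1.0, -0.5]   # intercept, predictor 1, predictor 2
NOISE_SD = 0.3                 # measurement noise, assumed known here
PRIOR_SD = 10.0                # weakly informative Normal(0, 10^2) prior
X = [[1.0, random.gauss(0, 1), random.gauss(0, 1)] for _ in range(60)]
y = [sum(b * x for b, x in zip(TRUE_BETA, row)) + random.gauss(0, NOISE_SD)
     for row in X]

def log_post(beta):
    """Unnormalised log-posterior: Gaussian likelihood plus Gaussian prior."""
    sse = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    return -0.5 * sse / NOISE_SD**2 - 0.5 * sum(b * b for b in beta) / PRIOR_SD**2

def grad_log_post(beta):
    """Gradient of log_post with respect to each coefficient."""
    grad = [-b / PRIOR_SD**2 for b in beta]
    for row, yi in zip(X, y):
        resid = yi - sum(b * x for b, x in zip(beta, row))
        for j, x in enumerate(row):
            grad[j] += resid * x / NOISE_SD**2
    return grad

def hmc_step(beta, step_size=0.01, n_leapfrog=6):
    """One HMC transition: draw momentum, run leapfrog, accept or reject."""
    p0 = [random.gauss(0, 1) for _ in beta]
    q, p = list(beta), list(p0)
    g = grad_log_post(q)
    for _ in range(n_leapfrog):
        p = [pj + 0.5 * step_size * gj for pj, gj in zip(p, g)]  # half step in momentum
        q = [qj + step_size * pj for qj, pj in zip(q, p)]        # full step in position
        g = grad_log_post(q)
        p = [pj + 0.5 * step_size * gj for pj, gj in zip(p, g)]  # half step in momentum
    # Total energy = negative log-posterior + kinetic energy.
    h_start = -log_post(beta) + 0.5 * sum(pj * pj for pj in p0)
    h_end = -log_post(q) + 0.5 * sum(pj * pj for pj in p)
    accept = random.random() < math.exp(min(0.0, h_start - h_end))
    return (q, True) if accept else (beta, False)

beta, samples, n_accept = [0.0, 0.0, 0.0], [], 0
for i in range(1200):
    beta, accepted = hmc_step(beta)
    if i >= 200:               # discard the first 200 draws as burn-in
        samples.append(beta)
        n_accept += accepted

post_mean = [sum(s[j] for s in samples) / len(samples) for j in range(3)]
print("acceptance rate:", round(n_accept / len(samples), 2))
print("posterior mean :", [round(m, 2) for m in post_mean])
```

Each entry in `samples` is one plausible set of coefficients. Pushing every one of them through the regression for a given well gives one TOC curve per sample, which is the "distribution of TOC curves" described above.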

Performance on the Duvernay benchmark

Using benchmark data from the Devonian Duvernay formation in Western Canada, the Bayesian approach outperformed both empirical and physical-law-based baselines:

  • Lower mean absolute error
  • Lower root-mean-square error
  • Higher correlation coefficient against measured core TOC

Crucially, the uncertainty bands correctly bracketed the true TOC values most of the time — i.e. the model knew where it was unsure.
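The three accuracy metrics and the "did the band bracket the truth" check are easy to compute once you have predictions and core measurements side by side. The sketch below uses made-up numbers for five core depths, not values from the paper, and the function names are ours. For a well-calibrated 90% band you would expect the coverage to sit near 0.9 on a large enough validation set.

```python
import math

def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)

def rmse(pred, obs):
    """Root-mean-square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def pearson_r(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    mean_p, mean_o = sum(pred) / len(pred), sum(obs) / len(obs)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)

def coverage(lower, upper, obs):
    """Fraction of observations that fall inside their [lower, upper] band."""
    inside = sum(lo <= o <= hi for lo, hi, o in zip(lower, upper, obs))
    return inside / len(obs)

# Made-up TOC values (wt%) at five core depths, for illustration only.
core_toc   = [3.1, 4.4, 2.7, 5.0, 3.8]
post_mean  = [3.3, 4.1, 2.9, 4.6, 3.9]
band_lower = [2.8, 3.5, 2.3, 4.1, 3.3]
band_upper = [3.8, 4.7, 3.5, 4.9, 4.5]

print("MAE     :", round(mae(post_mean, core_toc), 3))
print("RMSE    :", round(rmse(post_mean, core_toc), 3))
print("Pearson :", round(pearson_r(post_mean, core_toc), 3))
print("Coverage:", coverage(band_lower, band_upper, core_toc))
```

Coverage is the metric the point-estimate baselines cannot report at all, since they have no band to check.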

Real-field validation and 3D extension

The approach was further validated on real field data from Silurian sources. The posterior distributions matched reference TOC observations closely, supporting the calibration choices.

The natural extension is to combine the well-log-driven posterior with seismic data for 3D spatial estimation of TOC away from wellbores — turning a 1D log analysis into a volumetric play-screening tool.

The author also noted obvious extensions:

  • Replace the linear regression with a non-linear regression (gradient-boosted trees or a small MLP), at the cost of needing more training data.
  • Couple the HMC inference to a deep generative prior over rock-property fields for stronger spatial regularisation.

Why this matters for production reservoir teams

Two things to take away.

First — uncertainty is a feature, not a bug. A posterior distribution lets a reservoir engineer say "with 90% confidence, TOC at this depth is between 3.2% and 4.8%" instead of "TOC = 4.0%, trust me." Drilling decisions get better when uncertainty is explicit.

Second — the methods exist; the integration doesn't. HMC samplers are available off-the-shelf in Stan, PyMC, and NumPyro. The blocker isn't the algorithm — it's that no commercial petrophysics suite ships with a Bayesian inversion path. Until that changes, this stays a research-paper technique.

"We need to integrate advanced methods, such as the proposed Bayesian approach, in commercial software packages to enhance accessibility and applicability in the oil and gas industry."

— Dr. Ganguli, on the closing slide

Watch the full webinar

Bayesian Reservoir Characterisation — Dr. Shib Sankar Ganguli, full webinar on YouTube

Reference

Ganguli, S. S., Kadri, M. M., Debnath, A. (2022). A Bayesian multivariate model using Hamiltonian Monte Carlo inference to estimate total organic carbon content in shale. Geophysics, 87(5), M163–M177. DOI: 10.1190/geo2021-0644.1

Key takeaways

  1. Uncertainty is a feature, not a bug. “90% confidence TOC is between 3.2% and 4.8%” beats “TOC = 4.0%, trust me” for any drilling decision that depends on the threshold.
  2. HMC mixes orders of magnitude faster than vanilla Metropolis-Hastings on correlated parameter spaces — the gradient information is what unlocks long, high-acceptance steps.
  3. The Bayesian approach beats both empirical and physical-law baselines on Duvernay across MAE, RMSE, and Pearson correlation. Crucially, the uncertainty bands correctly bracket the true TOC most of the time.
  4. The methods exist in PyMC + Stan + NumPyro. The blocker is integration into commercial petrophysics suites — until that ships, this stays a research-paper technique.

Glossary

Credible interval
Bayesian analogue of a confidence interval — a range that contains the parameter value with a stated probability (e.g. 90%). Distinct from a frequentist confidence interval: it's a direct statement about the parameter, not about repeated sampling.
Duvernay
A Devonian shale formation in Western Canada — one of the largest unconventional plays in North America. Used here as the benchmark dataset because it has both abundant well logs and core-TOC measurements for ground-truth validation.
HMC
Hamiltonian Monte Carlo (Duane et al., 1987; Neal, 2011) — a Markov-chain Monte Carlo sampler that exploits gradient information from the log-posterior to take long, high-acceptance steps. Mixes orders of magnitude faster than vanilla Metropolis–Hastings on correlated parameter spaces.
Likelihood
The probability of observing the data given a parameter setting — the data's vote on different parameter values. In Bayesian inference: combined with the prior to produce the posterior.
Metropolis–Hastings
The original MCMC algorithm (Metropolis et al., 1953; Hastings, 1970): it proposes a random step from the current parameter value, then accepts or rejects that step using a posterior-ratio rule. Simple but slow on correlated parameter spaces; in its random-walk form it is the usual baseline that HMC is compared against.
Posterior distribution
In Bayesian inference: the probability distribution over parameters after observing data — proportional to likelihood × prior. The output of HMC sampling: not a point estimate, but a sampled distribution you can compute means, variances, and credible intervals from.
Prior
The probability distribution over parameters before seeing data — encodes domain knowledge or weakly-informative defaults. 'Weakly-informative' priors let the data dominate where it's plentiful, and the prior dominate where it isn't.
TOC
Total Organic Carbon: the mass of organic carbon in a rock as a share of the rock's total mass, expressed as a weight percentage (wt%). Drives the hydrocarbon potential of a shale reservoir; the headline number for play-screening unconventionals.
ΔlogR
An empirical TOC-estimation method (Passey et al., 1990) that overlays sonic and resistivity logs and reads TOC from their separation. Cheap and ubiquitous — but the calibration constants drift across formations, and there's no uncertainty quantification.

© 2026 Copyright. Earthscan