Topology-Aware Losses and Connectivity Metrics for Thin-Structure Segmentation

Abstract

A digitised well-log curve is only useful if it is connected: a petrophysicist reads a single continuous trace from top to bottom, and a curve that arrives in a dozen disconnected fragments is not a partial answer but a different, unusable object. Yet the metric the segmentation field reaches for by default, intersection over union, cannot tell those two outcomes apart, because a handful of missing pixels at a gap barely move an overlap ratio computed over a long thin structure. This survey asks why that blindness exists and what the literature has done about it. We trace the development of topology-aware objectives for thin and tubular structures from the early move beyond pixel-wise delineation losses through skeleton-aware centreline objectives and on to persistent-homology and Betti-matching losses, and we pair that with the parallel development of connectivity-sensitive metrics that score the number and arrangement of connected components rather than raw pixel overlap. We credit each contribution to its originating work and locate the objectives on the axis the field has organised itself around: whether a loss penalises only where mask and label disagree pixel by pixel, or whether it also penalises a topological break that costs almost no pixels. We then read the synthesis against a real raster-log curve-extraction task, where a binary curve model reached a peak intersection over union of 0.51 and a peak F1 of 0.55 and a Dice-trained multiclass model reached per-curve intersection over union of 0.26 and 0.21, numbers that look like ordinary overlap scores yet say nothing about whether the recovered curve is one piece or many. The survey's central finding is that for one-pixel structures the metric and the objective must change together: reporting connectivity does not by itself improve it, and a loss that optimises overlap alone has almost no gradient that rewards closing a gap, because the gap costs it only a few pixels.

Why overlap is silent about a break

The default training objective for dense segmentation evolved to solve a class-imbalance problem, not a connectivity one. Overlap-aware losses such as the Dice objective optimise a normalised overlap between the predicted and the ground-truth mask, for Dice twice the intersection divided by the sum of the two mask sizes ^[8], which makes them largely indifferent to a large easy background and is exactly why they became the default choice for imbalanced foregrounds. That same property is what makes them blind to topology. Intersection over union, like Dice, is a set-overlap statistic; it has no term for whether the foreground set is connected. A prediction that captures more than ninety percent of the true curve pixels but drops a few at five points along its length scores almost identically to one that also fills those five gaps and forms a single continuous run, because the few dropped pixels are a negligible fraction of the union. The two predictions are nearly identical to the metric and worlds apart to anyone who has to trace the curve.
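The comparison can be made concrete with a minimal sketch. The curve geometry, its 1000-pixel length, and the five gap positions below are hypothetical values chosen for illustration, not data from the reference task; the helper names are likewise illustrative.

```python
from collections import deque


def iou(pred, truth):
    """Intersection over union of two pixel sets."""
    return len(pred & truth) / len(pred | truth)


def dice(pred, truth):
    """Dice coefficient: twice the intersection over the summed set sizes."""
    return 2 * len(pred & truth) / (len(pred) + len(truth))


def count_components(pixels):
    """Count 8-connected components of a set of (row, col) pixels."""
    unseen = set(pixels)
    components = 0
    while unseen:
        components += 1
        queue = deque([unseen.pop()])
        while queue:
            r, c = queue.popleft()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    neighbour = (r + dr, c + dc)
                    if neighbour in unseen:
                        unseen.remove(neighbour)
                        queue.append(neighbour)
    return components


# Hypothetical ground truth: a one-pixel-wide curve, 1000 pixels long,
# drifting one column to the right every 100 rows.
truth = {(r, 50 + r // 100) for r in range(1000)}

# "whole" recovers every pixel; "broken" drops a single pixel at five rows.
gap_rows = {150, 350, 550, 750, 950}
whole = set(truth)
broken = {p for p in truth if p[0] not in gap_rows}

for name, pred in (("whole", whole), ("broken", broken)):
    print(name, round(iou(pred, truth), 4), round(dice(pred, truth), 4),
          count_components(pred))
```

In this toy setting the broken prediction loses five pixels out of a thousand, so both overlap scores stay above 0.99 while the component count rises from one to six.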

The literature noticed this first in the domains where thin connected structures are the whole point: roads in aerial imagery, vessels in retinal and angiographic scans, neurons in microscopy. Mosinska and colleagues argued explicitly that a pixel-wise loss is the wrong objective for delineation, and proposed training a network so that its predictions match the ground truth not only pixel by pixel but in the higher-level features a pretrained network extracts, features that respond to connectivity and to the presence of thin linear structure rather than to isolated pixels ^[1]. This is the founding move of the topology-aware line: stop scoring only where mask and label disagree pixel by pixel, and start scoring whether the predicted structure has the right shape.

A more direct formalisation followed by reaching for algebraic topology. Hu and colleagues introduced a topology-preserving segmentation loss that compares the persistent-homology summaries of the prediction and the label, so the network is penalised when it produces the wrong number of connected components or the wrong number of holes, even where the per-pixel agreement is high ^[2]. Clough and colleagues developed a closely related persistent-homology loss that lets the practitioner specify the desired topology as a prior and drives the prediction toward it ^[3]. These objectives put the Betti numbers, the counts of connected components and loops that summarise a shape's topology, directly into the gradient, which is precisely the quantity an overlap loss omits.
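The idea of putting component counts into the objective can be sketched on a deliberately reduced problem. The code below is a one-dimensional toy, not the loss from either paper: it assumes the predicted foreground likelihood has been sampled along the arc length of a curve (the `severed` and `connected` profiles are hypothetical), computes the 0-dimensional persistence of its superlevel sets with a union-find pass, and applies a simplified penalty in the spirit of a topological prior that asks for one long-lived component. The published losses work on 2D or 3D likelihood maps and also handle loops.

```python
def superlevel_persistence(profile):
    """0-dimensional persistence of the superlevel sets of a 1D profile.

    Returns (birth, death) pairs with birth > death, most persistent first.
    The component born at the global maximum never merges into an older
    one; it is assigned death 0.0 by convention. Assumes a non-empty profile.
    """
    order = sorted(range(len(profile)), key=lambda i: -profile[i])
    parent, birth, pairs = {}, {}, []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in order:  # lower the threshold one sample at a time
        parent[i], birth[i] = i, profile[i]
        for j in (i - 1, i + 1):
            if j not in parent:
                continue
            a, b = find(i), find(j)
            if a == b:
                continue
            if birth[a] < birth[b]:
                a, b = b, a  # elder rule: the older component (a) survives
            if birth[b] > profile[i]:  # skip zero-persistence pairs
                pairs.append((birth[b], profile[i]))
            parent[b] = a
    pairs.append((birth[find(order[0])], 0.0))
    return sorted(pairs, key=lambda p: p[1] - p[0])


def topology_penalty(profile, wanted_components=1):
    """Simplified prior-style penalty: reward the wanted components for
    being long-lived, penalise the persistence of every extra component."""
    pairs = superlevel_persistence(profile)
    keep, extra = pairs[:wanted_components], pairs[wanted_components:]
    return (sum(1 - (b - d) ** 2 for b, d in keep)
            + sum((b - d) ** 2 for b, d in extra))


# Hypothetical likelihoods along a curve: one confident dip versus one deep gap.
connected = [0.9, 0.92, 0.95, 0.9, 0.88, 0.9, 0.91]
severed = [0.9, 0.92, 0.95, 0.1, 0.88, 0.9, 0.91]

for name, profile in (("connected", connected), ("severed", severed)):
    print(name, superlevel_persistence(profile), topology_penalty(profile))
```

The two profiles differ in a single sample, which is nearly invisible to an overlap score, yet the deep dip creates a second component with large persistence and the penalty rises accordingly.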

The most widely adopted member of the family took a more tractable route. Shit and colleagues introduced clDice, a connectivity-preserving loss for tubular structures that computes a Dice-style overlap not on the masks but on their morphological skeletons, the one-pixel-wide centrelines of the predicted and true structures ^[4]. Because a break in a tube severs its skeleton, the skeletal overlap collapses where the volumetric overlap barely flinches, so clDice gives the network a strong gradient to keep the structure connected while remaining cheap enough to train at scale. Later work pushed the topological faithfulness further: a discrete-Morse-theory formulation that reasons about the structure's skeleton and critical points ^[5], and an induced-matching objective that aligns the persistence barcodes of prediction and label so that the topological features are matched rather than merely counted ^[6]. Running alongside this whole line is the metric-surrogate tradition, the Lovasz-Softmax loss that optimises the intersection-over-union measure directly ^[7], which is worth holding in view precisely because it shows that optimising overlap better still does not optimise connectivity: a faithful surrogate for a blind metric inherits the blindness.
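For reference, the centreline-Dice quantity can be written out as follows. This is our reading of the definition in ^[4], with the symbols chosen here as assumptions: V_L and V_P are the label and predicted masks, and S_L and S_P are their skeletons.

```latex
\[
\mathrm{Tprec}(S_P, V_L) = \frac{|S_P \cap V_L|}{|S_P|},
\qquad
\mathrm{Tsens}(S_L, V_P) = \frac{|S_L \cap V_P|}{|S_L|},
\]
\[
\mathrm{clDice}(V_P, V_L)
  = \frac{2 \,\mathrm{Tprec}(S_P, V_L)\,\mathrm{Tsens}(S_L, V_P)}
         {\mathrm{Tprec}(S_P, V_L) + \mathrm{Tsens}(S_L, V_P)} .
\]
```

Topology precision asks how much of the predicted skeleton lies inside the true mask, and topology sensitivity asks how much of the true skeleton lies inside the predicted mask; clDice is their harmonic mean, which is why it has the same form as Dice.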

How this reading was assembled

This is a structured reading of the published field, not a new benchmark, and it is worth being exact about how it was assembled. We took the objectives and metrics that the topology-aware segmentation literature treats as canonical for thin and tubular structures, traced each back to its originating paper to recover the precise quantity it scores, and sorted them along a single organising axis: whether the quantity is computed from raw pixel overlap, which is topology-blind, or from a representation in which a connectivity break is expensive, namely a skeleton, a connected-component count, or a persistent-homology summary. For each objective we recorded what its authors claimed it fixes relative to a plain overlap loss, and for each metric we recorded whether two predictions that differ only in connectivity receive different scores under it.

To keep the synthesis anchored to a real task rather than floating among method papers, we read it against a concrete reference point from a raster-log curve-extraction problem: digitising the thin curves drawn on scanned paper well logs into vector traces. The curves there are one to three pixels wide against a near-empty background, which is the regime in which the overlap-versus-topology gap is at its widest, because a single missing pixel can sever a curve while costing the overlap ratio almost nothing. The reference numbers we quote, a peak intersection over union of 0.51 and a peak F1 of 0.55 on a binary curve model, and per-curve intersection over union of 0.26 and 0.21 under a multiclass Dice baseline, are real and sourced from the engagement archive. They serve here not as a contribution but as a measured illustration of the survey's point: those are respectable-looking overlap scores that nonetheless certify nothing about whether the recovered curve is one connected trace. The interactive exhibit below is built on the same footing, with real anchor numbers and an illustrative geometry of the blindness, not a freshly measured response curve.

Results: an overlap score that certifies nothing about shape

The synthesised picture has a clear shape, and it matches what the topology-aware literature has argued since 2018: on a thin structure an overlap metric and a connectivity metric can diverge almost completely, and a loss that optimises only the former has little reason to improve the latter.

The reference task makes the disagreement concrete. The binary curve model's peak intersection over union of 0.51 and peak F1 of 0.55 are perfectly ordinary segmentation numbers, the kind a practitioner would read as a moderate result and move on. Under the multiclass setting the per-curve overlap is lower still, 0.26 on the first curve and 0.21 on the second under a Dice loss. None of these four numbers contains any information about connectivity. A curve recovered as one continuous trace and the same curve recovered as eight fragments separated by one-pixel gaps would post nearly identical overlap, because the missing gap pixels are a vanishing fraction of the union over a long curve. The overlap score is doing its job, which is to measure overlap; it simply is not measuring the thing that decides whether the output is usable.
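The size of the effect follows from a short calculation. As an idealised assumption, let the true curve L contain N pixels and let the prediction P recover all of them except g one-pixel gaps, with no false positives; the values N = 1000 and g = 7 (eight fragments) are illustrative and are not measurements from the reference task.

```latex
\[
\mathrm{IoU} = \frac{|P \cap L|}{|P \cup L|} = \frac{N - g}{N} = 1 - \frac{g}{N},
\qquad
\mathrm{Dice} = \frac{2\,|P \cap L|}{|P| + |L|} = \frac{2(N - g)}{2N - g}.
\]
\[
N = 1000,\; g = 7:\qquad
\mathrm{IoU} = 0.993,
\qquad
\mathrm{Dice} = \tfrac{1986}{1993} \approx 0.9965 .
\]
```

Both scores move by less than one percent while the number of connected components goes from one to eight, and the shift shrinks further as N grows.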

Exhibit: a thin-curve prediction broken into six runs by five gaps, scored two ways at once. Dragging the lever along the bottom track stitches the open gaps closed one at a time, from a shattered curve on the left to a single whole one on the right; the orange crosses mark the gaps still open. The ledger on the right reports the same prediction under two scores live: a teal bar for pixel overlap (intersection over union against the ground-truth curve) and an orange bar for connectivity (the share of the curve that forms one connected component, a Betti-style reading). Closing a gap adds back only a sliver of pixels, so the overlap bar barely climbs, while the connectivity bar swings across its full range. The lower-right ledger anchors the exhibit in the real field-reported numbers from the raster-log work: a peak intersection over union of 0.51 and a peak F1 of 0.55 on the binary curve task, and a per-curve multiclass intersection over union of 0.26 and 0.21 on curve 1 and curve 2 under a Dice loss. Those figures are sourced from the engagement archive; the pixel-versus-connectivity divergence the lever traces is an illustrative geometry of the blindness those scores share, not a freshly measured response curve.

The exhibit makes that divergence something you can drive. As the lever stitches the broken runs of a predicted curve back together, the pixel-overlap bar barely climbs, because each closed gap returns only a sliver of pixels, while the connectivity bar swings across its full range, because a curve in many pieces and a curve in one piece are entirely different objects to anything that traces them. This is the gap the topology-aware losses were built to close, and reading the family against it explains why each one works the way it does. clDice closes the gap by scoring overlap on the skeleton rather than the mask, so that a break, which severs the skeleton, is expensive even though it is cheap in pixels ^[4]. The persistent-homology losses close it by putting the Betti numbers into the gradient, so that producing the wrong number of connected components is penalised directly ^[2] ^[3]. The discrete-Morse and induced-matching formulations close it by reasoning about, and matching, the structure's topological features rather than counting them ^[5] ^[6]. And the metric-surrogate route, Lovasz-Softmax, deliberately does not close it: it optimises the overlap measure faithfully ^[7], which is useful when overlap is the goal and beside the point when connectivity is.
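The lever sweep can be imitated in a few lines. This is a sketch under stated assumptions rather than the exhibit's own code: the curve is flattened to a list of booleans along its arc length, the 600-pixel length and gap positions are hypothetical, the prediction has no false positives, and "connectivity" is taken to mean the share of the true curve covered by the largest connected run.

```python
def runs(present):
    """Lengths of the maximal runs of True in a list of booleans."""
    lengths, current = [], 0
    for flag in present:
        if flag:
            current += 1
        elif current:
            lengths.append(current)
            current = 0
    if current:
        lengths.append(current)
    return lengths


def scores(present):
    """Return (overlap, connectivity, pieces) for a prediction that lies
    entirely on the true curve, so IoU is recovered pixels over true pixels."""
    total = len(present)
    lengths = runs(present)
    overlap = sum(lengths) / total
    connectivity = max(lengths) / total if lengths else 0.0
    return overlap, connectivity, len(lengths)


CURVE_LENGTH = 600                  # hypothetical arc length in pixels
GAPS = [100, 200, 300, 400, 500]    # hypothetical one-pixel gap positions

# Close the gaps one at a time, left to right, as the lever does.
for closed in range(len(GAPS) + 1):
    open_gaps = set(GAPS[closed:])
    present = [i not in open_gaps for i in range(CURVE_LENGTH)]
    overlap, connectivity, pieces = scores(present)
    print(closed, pieces, round(overlap, 3), round(connectivity, 3))
```

Across the sweep the overlap column moves only in its third decimal place, while the connectivity column climbs from about one sixth of the curve to all of it.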

The matching half of the picture is the metric side. The same literature that built connectivity-preserving losses also reports connectivity-sensitive metrics, the Betti-error that counts how far the predicted topology is from the truth and the centreline-Dice score that measures skeletal overlap ^[4] ^[6], precisely because optimising a topological objective while still reporting only intersection over union would hide whether the objective worked. The survey's reading is that the two halves are not separable. A topology-aware loss trained against a topology-blind metric cannot be told whether it succeeded, and a connectivity metric reported alongside a pure-overlap loss measures a property the optimiser was never asked to produce.
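A connectivity-sensitive metric of this kind is cheap to compute on a binary mask. The sketch below is one possible whole-image Betti error, written under assumptions: foreground pixels are 8-connected, background pixels are 4-connected, and the error is the summed absolute difference in component and hole counts. Papers in this literature often compute the error over patches and average it, so the exact protocol here should be treated as illustrative, as should the function names and the toy masks.

```python
from collections import deque

EIGHT = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def _components(cells, neighbours):
    """Count connected components of a set of (row, col) cells."""
    unseen, count = set(cells), 0
    while unseen:
        count += 1
        queue = deque([unseen.pop()])
        while queue:
            r, c = queue.popleft()
            for dr, dc in neighbours:
                cell = (r + dr, c + dc)
                if cell in unseen:
                    unseen.remove(cell)
                    queue.append(cell)
    return count


def betti_numbers(mask):
    """Return (b0, b1) of a binary mask given as a list of equal-length rows.

    b0 counts foreground components; b1 counts holes, i.e. background
    components other than the single outer one.
    """
    rows, cols = len(mask), len(mask[0])
    foreground = {(r, c) for r in range(rows) for c in range(cols) if mask[r][c]}
    # Pad by one cell so all outer background joins into one component.
    padded = {(r, c) for r in range(-1, rows + 1) for c in range(-1, cols + 1)}
    background = padded - foreground
    return _components(foreground, EIGHT), _components(background, FOUR) - 1


def betti_error(pred, truth):
    """Summed absolute difference in component and hole counts."""
    b0p, b1p = betti_numbers(pred)
    b0t, b1t = betti_numbers(truth)
    return abs(b0p - b0t) + abs(b1p - b1t)


ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
broken_ring = [row[:] for row in ring]
broken_ring[1][2] = 0  # remove one pixel: the loop opens

line = [[1, 1, 1, 1, 1, 1]]
split_line = [[1, 1, 1, 0, 1, 1]]  # remove one pixel: the curve splits

print(betti_numbers(ring), betti_numbers(broken_ring), betti_error(broken_ring, ring))
print(betti_numbers(line), betti_numbers(split_line), betti_error(split_line, line))
```

In both toy cases a single removed pixel changes a Betti number, so the error registers a difference that an overlap score over a long structure would barely reflect.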

Discussion: the metric and the loss are one decision

Laid out on the overlap-versus-topology axis, the objectives stop looking like a list of tricks and start looking like answers to one question: where do you spend the gradient when a structure breaks. A plain overlap loss spends almost nothing there, because the break is nearly free in its currency. The skeleton-based answer, clDice, spends it on the centreline, where the break is catastrophic ^[4]. The homology-based answers spend it on the Betti numbers, where the break shows up as a miscount of components ^[2] ^[3] ^[5] ^[6]. The early feature-matching answer spends it on the responses of a network that already reacts to thin connected structure ^[1]. Each is a different representation in which a topological break is no longer cheap, and that, more than any architectural detail, is what they share.

Where our own work sits in this landscape is worth stating plainly, because it marks the boundary between this survey and our build. This review is a reading of the public field. VeerNet, the architecture we designed and built for raster well-log digitisation, is the system the reference numbers come from, and it lives downstream of this map: the curve-extraction task it solves is exactly the thin-structure regime the topology-aware literature is about, and the overlap scores it reports are exactly the kind that the survey argues must be paired with a connectivity reading to be interpretable. The survey explains why those overlap numbers, read alone, understate what is at stake in the task, and it locates the levers, skeletal and homological objectives, that the field offers for the part of the problem an overlap loss cannot reach.

The practical reading for a practitioner is not to memorise a ranking of losses but to recognise that the metric and the objective are a single decision. If the downstream consumer needs a connected trace, then connectivity has to appear both in what is reported and in what is optimised, or the system is being graded and trained on a property nobody is steering. Reporting a Betti-error or a centreline-Dice alongside intersection over union is the cheap half; choosing a loss whose gradient actually flows toward closing a gap is the half that changes the model.

Limitations

This is a survey, and it carries a survey's limits. It synthesises what the published papers report; it does not re-implement the topology-aware losses under a common protocol, and where it quotes measured numbers, those come from a single raster-log reference task and a single architecture rather than from a fresh multi-method benchmark on a shared dataset. The overlap-versus-connectivity divergence rendered in the exhibit is an illustrative geometry of the blindness the metrics share, built on the real anchor numbers, not a measured response curve for any particular loss. The reference task is also narrow by design: one-to-three-pixel curves on scanned logs against a near-empty background, which is the extreme thin-structure regime where the gap is widest, so the size of the overlap-connectivity disagreement observed there will compress on tasks with thicker or more space-filling foregrounds. The survey scopes itself to the connectivity-preserving objective families the topology-aware literature treats as canonical for thin and tubular structures; the broader space of boundary-aware, distance-transform, and active-contour losses, and the many hybrid combinations the field has explored, sits outside its frame, and the persistent-homology objectives in particular carry computational costs at training time that this reading notes but does not quantify. A reader should treat the synthesis as a map of why overlap and topology diverge and where the field has placed its levers, not as a substitute for running the connectivity-aware comparison on their own structures and their own metric.

Key takeaways

Intersection over union is a set-overlap statistic with no term for connectivity: a thin curve recovered as one continuous trace and the same curve recovered as many fragments separated by one-pixel gaps post nearly identical overlap, because the missing gap pixels are a vanishing fraction of the union.
The topology-aware loss literature is best read along one axis, whether the objective scores raw pixel overlap (topology-blind) or a representation where a break is expensive: feature-matching delineation losses, skeleton-based clDice, and persistent-homology and Betti-matching losses each make a connectivity break costly in a different currency.
The real raster-log reference shows why overlap numbers mislead in isolation: a peak intersection over union of 0.51 and peak F1 of 0.55 on the binary curve task, and per-curve multiclass IoU of 0.26 and 0.21 under a Dice loss, are ordinary-looking scores that say nothing about whether the digitised curve is one piece or many.
clDice closes the gap by scoring Dice on the skeleton, where a break severs the centreline; persistent-homology losses put the Betti numbers directly into the gradient; the Lovasz-Softmax surrogate deliberately does not close it, because optimising the overlap measure faithfully inherits its blindness to topology.
The metric and the objective are one decision. Connectivity has to appear both in what is reported (a Betti-error or centreline-Dice alongside IoU) and in what is optimised, or the model is being graded and trained on a property no one is steering. This survey maps the public field; VeerNet, our raster-log architecture, is the downstream system whose thin-curve task lives in exactly this regime.

References

[1] Mosinska, A., Marquez-Neila, P., Kozinski, M., and Fua, P. Beyond the Pixel-Wise Loss for Topology-Aware Delineation. CVPR (2018). The founding move: match predictions to ground truth in higher-level features that respond to thin connected structure, not pixel by pixel. https://arxiv.org/abs/1712.02190

[2] Hu, X., Li, F., Samaras, D., and Chen, C. Topology-Preserving Deep Image Segmentation. NeurIPS (2019). A persistent-homology loss that penalises the wrong number of connected components or holes even where per-pixel agreement is high. https://arxiv.org/abs/1906.05404

[3] Clough, J. R., Byrne, N., Oksuz, I., Zimmer, V. A., Schnabel, J. A., and King, A. P. A Topological Loss Function for Deep-Learning Based Image Segmentation Using Persistent Homology. IEEE TPAMI (2020). A persistent-homology loss that lets the desired topology be specified as a prior and drives the prediction toward it. https://arxiv.org/abs/1910.01877

[4] Shit, S., Paetzold, J. C., Sekuboyina, A., Ezhov, I., Unger, A., Zhylka, A., Pluim, J. P. W., Bauer, U., and Menze, B. H. clDice - A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation. CVPR (2021). Scores Dice on the morphological skeleton, so a break that severs the centreline is expensive even when it is cheap in pixels. https://arxiv.org/abs/2003.07311

[5] Hu, X., Wang, Y., Fuxin, L., Samaras, D., and Chen, C. Topology-Aware Segmentation Using Discrete Morse Theory. ICLR (2021). Reasons about the structure's skeleton and critical points via discrete Morse theory to enforce correct topology. https://arxiv.org/abs/2103.09992

[6] Stucki, N., Paetzold, J. C., Shit, S., Menze, B., and Bauer, U. Topologically Faithful Image Segmentation via Induced Matching of Persistence Barcodes. ICML (2023). Aligns the persistence barcodes of prediction and label so topological features are matched rather than merely counted. https://arxiv.org/abs/2211.15272

[7] Berman, M., Triki, A. R., and Blaschko, M. B. The Lovasz-Softmax loss: a tractable surrogate for the optimization of the intersection-over-union measure in neural networks. CVPR (2018). The metric-surrogate counterpoint: optimising overlap faithfully still inherits its blindness to connectivity. https://arxiv.org/abs/1705.08790

[8] Milletari, F., Navab, N., and Ahmadi, S.-A. V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. 3DV (2016). Introduces the Dice loss in its standard differentiable form, the overlap-aware objective whose topology-blindness this survey is about. https://arxiv.org/abs/1606.04797
