The task has not changed in fifty years. Somewhere in an image there is a line, and a program has to say where. What has changed, repeatedly, is the method, and the interesting thing about the method is not that it got better in the abstract. It is that each generation solved the same problem by pushing work off the engineer's desk and onto data. The first line finders were rules a person wrote down. The latest ones are functions a person fits. Everything in between is the gradient between those two, and the well-log net EarthScan built to read curves off scanned paper is a point far along it, not a new curve entirely. This note is the heritage: the arc from edge operators to segmentation nets, and why the newest step belongs on the same line as the oldest.
We should be plain about what this is not. It is not the VeerNet whitepaper, which is about a product and what it does for a well-log digitisation pipeline. This is the primer that sits under it: the classical-methods lineage a computer-vision engineer would recognise, applied to the one case we know intimately. The numbers from our own runs appear only to place our net on the arc, not to sell it.
The first line finders were rules a person wrote
The earliest practical way to find a line was to find its edges, and the earliest practical way to find an edge was to notice that an edge is where image intensity changes fast. Sobel and Feldman's operator made that noticing mechanical: convolve the image with a fixed 3x3 gradient stencil, and the response is large where the intensity is steep [1]. It is a beautiful piece of engineering precisely because it is so small. There is no training, no dataset, nothing learned. There is a kernel someone chose and a threshold someone sets, and that is the whole method.
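To make "a kernel someone chose and a threshold someone sets" concrete, here is a minimal sketch of a Sobel-style edge map in NumPy. The stencils are the standard 3x3 Sobel pair; the toy image, the helper names, and the threshold value are our own illustrative assumptions, not anything taken from a production pipeline.

```python
import numpy as np

# Standard 3x3 Sobel stencils: horizontal and vertical intensity change.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def filter3x3(image, kernel):
    """Slide a 3x3 kernel over the image (cross-correlation, interior pixels only).

    True convolution would flip the kernel; for Sobel that only changes the
    sign of the response, which the gradient magnitude ignores.
    """
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * image[dy:dy + h - 2, dx:dx + w - 2]
    return out

def sobel_edges(image, threshold):
    """Return a boolean edge map: gradient magnitude above a hand-set threshold."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return np.hypot(gx, gy) > threshold

# Toy image (assumption): dark left half, bright right half, so one vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
edges = sobel_edges(image, threshold=2.0)  # threshold chosen by hand for this image
```

On this toy image the response is nonzero only in the two interior columns that straddle the step, and the threshold decides whether they count. Nothing in the function knows anything about logs, curves, or scans, which is both its charm and its limit.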
Canny turned the same idea into a design problem with more moving parts: smooth first, take gradients, thin the response to single-pixel ridges, then apply two hysteresis thresholds to decide which ridges are real edges and which are noise [2]. A ridge above the high threshold is kept outright; a ridge above only the low threshold is kept if it connects to one that is. The two thresholds are the point. They are dials a person turns by hand, per image, and getting them wrong is how a Canny edge map either dissolves into speckle or drops the faint line you cared about. This is the defining property of the first generation: it is fast, it needs no data, and it is brittle when it fails, because every decision it makes was set in advance by a person who could not see the specific image it would run on. On a clean scan the hand-set knobs are fine. On a faded log with coffee rings and a curve that grazes a grid line, the same knobs that found the strong lines drown the weak one.
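The two-threshold step is the part of Canny that is easiest to get wrong by hand, so a small sketch may help. This covers only the hysteresis stage, applied to an already-computed gradient-magnitude array; smoothing and ridge thinning are omitted. The function name, the 8-connected neighbourhood, and the toy values are assumptions made for illustration.

```python
from collections import deque
import numpy as np

def hysteresis(magnitude, low, high):
    """Keep pixels >= high, plus pixels >= low that connect to a kept pixel."""
    strong = magnitude >= high
    candidate = magnitude >= low
    keep = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))  # start the search from strong pixels
    h, w = magnitude.shape
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and candidate[ny, nx] and not keep[ny, nx]:
                    keep[ny, nx] = True       # weak pixel rescued by a neighbour
                    queue.append((ny, nx))
    return keep

# Toy ridge (assumption): one strong pixel, two faint ones attached to it,
# a gap, then an isolated faint pixel.
magnitude = np.array([[0.9, 0.5, 0.5, 0.0, 0.5]])
kept = hysteresis(magnitude, low=0.4, high=0.8)         # faint run survives
kept_strict = hysteresis(magnitude, low=0.6, high=0.8)  # faint run is dropped
```

With the looser low threshold the faint pixels attached to the strong one are kept and the isolated one is not; nudging the low threshold up to 0.6 drops the faint run entirely. That is the brittleness in miniature: the same dial that suppresses speckle also deletes the weak curve.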
Voting turned edges into whole lines
Edge operators find edge pixels. They do not find lines, in the sense of a single geometric object you can name and measure. A curve on a log is not a set of bright pixels; it is a trajectory, and the gap between those two is where the Hough transform lives. Duda and Hart's formulation, generalising Hough's earlier patent, reframed the problem as voting [3]. Every edge pixel casts votes for all the lines that could pass through it, tallied in an accumulator indexed by line parameters, and the cells that collect the most votes are the lines. It is the first generation on this arc where the answer is a structure rather than a mask.
But look at what the engineer still has to set. The resolution of the accumulator bins is a choice. The vote threshold that separates a real line from an accidental pile-up of votes is a choice. The parameterisation that decides which shapes are even expressible is a choice. Hough moved the decision up a level, from pixels to geometry, and it moved the same fragility with it: the whole thing is a set of hand-tuned rules, now about how votes are counted rather than how gradients are thresholded. Two generations in, the pattern is already visible. The method changes; the knobs do not go away, they move.
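The voting scheme and its three hand-set choices can be sketched in a few lines. This is a minimal illustration using the rho-theta line parameterisation from Duda and Hart; the function name, the default bin sizes, and the toy edge mask are assumptions, and a practical implementation would also suppress near-duplicate peaks.

```python
import numpy as np

def hough_lines(edge_mask, n_theta=180, rho_step=1.0, vote_threshold=10):
    """Vote for lines rho = x*cos(theta) + y*sin(theta) through each edge pixel."""
    ys, xs = np.nonzero(edge_mask)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)  # knob: angle bins
    diag = np.hypot(*edge_mask.shape)                          # largest possible |rho|
    n_rho = int(np.ceil(2 * diag / rho_step)) + 1              # knob: distance bins
    accumulator = np.zeros((n_rho, n_theta), dtype=int)
    theta_idx = np.arange(n_theta)
    for x, y in zip(xs, ys):
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        rho_idx = np.round((rhos + diag) / rho_step).astype(int)
        accumulator[rho_idx, theta_idx] += 1                   # one vote per angle
    peaks = np.argwhere(accumulator >= vote_threshold)         # knob: vote count
    lines = [(r * rho_step - diag, thetas[t]) for r, t in peaks]
    return lines, accumulator

# Toy edge mask (assumption): a single horizontal line at row 5.
edge_mask = np.zeros((20, 20), dtype=bool)
edge_mask[5, :] = True
lines, accumulator = hough_lines(edge_mask, vote_threshold=20)
```

The returned list contains a line near rho = 5, theta = pi/2, which is the horizontal line at row 5. Depending on the bin sizes, neighbouring accumulator cells can clear the same threshold, so the list may hold near-duplicates of the same line; deciding what to do about them is yet another rule someone has to write.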
Learning the filters instead of choosing them
The third step is the one that broke the habit of choosing the filter. Instead of a person picking a gradient kernel, you let the data pick the filters, by defining a bank of learnable convolutions and fitting them to labelled examples of what a line looks like. The gradient stencil that Sobel chose by hand becomes one filter among many that gradient descent is free to discover, discard, or reshape. This is the hinge of the whole arc. Up to here, every knob was set by a human before the method ever saw the image. From here on, the knobs are parameters, and the thing that sets them is data.
This did not make the earlier generations obsolete so much as absorb them. A learned first-layer filter often looks like an edge detector, because an edge detector is a good thing to be in the first layer. The difference is that nobody wrote it down. The engineer's job shifts from tuning the operator to curating the data and the loss, a different discipline with its own failure modes, but recognisably the same task in new clothes: find the line, now by learning what a line is instead of asserting it.
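The shift from choosing a filter to fitting one can be shown with a deliberately tiny example: a single 3x3 filter, a squared-error loss, and plain gradient descent. The labels here are synthesised with a Sobel-style stencil purely so the result can be checked; real labels would come from annotated images. All names, the learning rate, and the step count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def patches3x3(image):
    """Return every interior 3x3 neighbourhood as a row of 9 values."""
    h, w = image.shape
    cols = [image[dy:dy + h - 2, dx:dx + w - 2].ravel()
            for dy in range(3) for dx in range(3)]
    return np.stack(cols, axis=1)

# Hypothetical labelled data: the "right answer" at each pixel is what a
# Sobel-style stencil would have produced. The fit never sees this kernel,
# only the inputs and the labels.
true_kernel = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)
image = rng.normal(size=(32, 32))
X = patches3x3(image)            # shape (900, 9): one row per interior pixel
y = X @ true_kernel.ravel()      # labels

weights = np.zeros(9)            # the filter starts as nothing at all
learning_rate = 0.1
for step in range(500):
    residual = X @ weights - y
    grad = 2.0 * X.T @ residual / len(y)   # gradient of mean squared error
    weights -= learning_rate * grad

learned_kernel = weights.reshape(3, 3)
```

After fitting, `learned_kernel` matches the stencil that generated the labels to within numerical tolerance, even though nobody typed it into the model. A real network learns many such filters at once, against labels that say "line" or "not line" rather than "gradient value", but the mechanism is the same: the kernel is an output of the fit, not an input to it.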
The net that reads a curve, pixel by pixel
The last step on the arc is to stop finding edges or voting for lines and instead label every pixel as line or not, then read the geometry back out of the mask. That is semantic segmentation, and the shape that made it work on the small, oddly sized datasets we actually have is the encoder-decoder with skip connections that Ronneberger, Fischer, and Brox introduced [4]. The encoder compresses the image down to a coarse, meaning-rich summary; the decoder expands it back to full resolution; the skips hand the decoder the fine detail the encoder threw away. The output is a per-pixel verdict, which for a well log means: is this pixel part of the gamma-ray curve, the resistivity curve, or the paper behind them.
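The compress, expand, and skip pattern is easier to see as shapes than as prose. The sketch below traces only the data flow of an encoder-decoder with skip connections: average pooling and nearest-neighbour upsampling stand in for the learned stages, and there are no convolutions or weights at all. The function names and the depth used in the example are assumptions for illustration, not a description of any particular network.

```python
import numpy as np

def downsample(x):
    """Halve height and width by 2x2 average pooling (stand-in for an encoder stage)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """Double height and width by pixel repetition (stand-in for a decoder stage)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def encoder_decoder_shapes(image, depth):
    """Run the skeleton and report the bottleneck and output shapes.

    Height and width must be divisible by 2**depth.
    """
    skips = []
    x = image
    for _ in range(depth):
        skips.append(x)              # keep the fine detail before pooling discards it
        x = downsample(x)
    bottleneck_shape = x.shape
    for skip in reversed(skips):
        x = upsample(x)
        x = np.concatenate([x, skip], axis=-1)  # skip connection: detail rejoins context
    return bottleneck_shape, x.shape

image = np.zeros((64, 64, 1))        # one-channel toy scan (assumption)
bottleneck_shape, output_shape = encoder_decoder_shapes(image, depth=3)
```

The output comes back at the input's height and width, with the extra channels contributed by the skips. In a trained network, convolutions at each stage would mix those channels, and a final layer would map them to one score per class per pixel.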
This is where our net sits, and its dimensions are the modern echo of every knob the earlier generations set by hand. It is a 5-stage encoder paired with a 5-stage decoder, with 2 attention layers on the bottleneck that let it weigh context down the length of a log column rather than only within a local window, borrowing the attention mechanism from Vaswani and colleagues [5]. None of its behaviour on a line is written down. It is fitted from 15,000 synthetic training instances, generated to stand in for exactly the heuristics an earlier generation would have hand-tuned. The Sobel kernel, the Canny thresholds, the Hough bins: their descendants are all in there, as learned weights, set by the data instead of by us. That is the argument in one sentence, and the exhibit below is that sentence drawn out.
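For readers who have not met the attention mechanism cited above, a minimal single-head sketch of scaled dot-product self-attention follows [5]. It illustrates the mechanism only. How the bottleneck is flattened into a sequence, how many heads are used, and the feature sizes are not specified in this note, so every shape and name below is an assumption.

```python
import numpy as np

def self_attention(tokens, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of feature vectors."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])        # similarity of every pair of positions
    scores -= scores.max(axis=-1, keepdims=True)   # stabilise the softmax numerically
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # each row becomes a distribution
    return weights @ v, weights

rng = np.random.default_rng(0)
# Hypothetical bottleneck: 16 positions down a log column, 8 features each.
tokens = rng.normal(size=(16, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))  # learned in practice
attended, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `weights` says how much one position draws on every other position, near or far. That is the sense in which attention weighs context along the whole column rather than inside a fixed local window.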
The payoff for making that trade shows up in the numbers the net reaches: peak IoU of 0.51 on the mask, peak recall of 0.97 on the curve pixels, and peak R-squared of 0.9891 once the mask is turned back into a depth-indexed curve and compared against ground truth. We report these as coordinates, not as a scoreboard. A recall of 0.97 means the net rarely misses a curve pixel, which is what you buy when you stop asking a human to pre-set a threshold that has to work on every scan and instead let the model learn the decision per pixel. The IoU of 0.51 is the honest cost of a thin-structure segmentation problem where a curve is a few pixels wide against a large background, so that a prediction only a pixel or two too wide gives up a large share of the overlap. It is also a number a Canny map was never designed to report, because Canny marks boundaries and was never in the business of claiming a region. Different generation, different metric, same job.
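The three metrics behave quite differently on thin structures, and a toy case shows why. The sketch below computes IoU and recall on a mask, reads a curve back out of the mask, and scores that curve with R-squared. This note does not say how the mask is turned into a curve, so the row-wise centroid used here is an assumption, as are the function names and the toy data.

```python
import numpy as np

def iou(pred, truth):
    """Intersection over union of two boolean masks."""
    return np.logical_and(pred, truth).sum() / np.logical_or(pred, truth).sum()

def recall(pred, truth):
    """Fraction of true curve pixels that the prediction covers."""
    return np.logical_and(pred, truth).sum() / truth.sum()

def mask_to_curve(mask):
    """Per row (depth step), the mean column of mask pixels; NaN where the row is empty."""
    cols = np.arange(mask.shape[1])
    counts = mask.sum(axis=1)
    sums = (mask * cols).sum(axis=1)
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

def r_squared(pred, truth):
    """Coefficient of determination over rows where a prediction exists."""
    ok = ~np.isnan(pred)
    ss_res = np.sum((truth[ok] - pred[ok]) ** 2)
    ss_tot = np.sum((truth[ok] - truth[ok].mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy case (assumption): the true curve is 1 pixel wide; the predicted mask is
# 3 pixels wide and centred on it.
truth_cols = np.array([5, 6, 7, 8, 9, 10, 9, 8, 7, 6])
truth_mask = np.zeros((10, 16), dtype=bool)
pred_mask = np.zeros((10, 16), dtype=bool)
for row, col in enumerate(truth_cols):
    truth_mask[row, col] = True
    pred_mask[row, col - 1:col + 2] = True

mask_iou = iou(pred_mask, truth_mask)
mask_recall = recall(pred_mask, truth_mask)
curve_r2 = r_squared(mask_to_curve(pred_mask), truth_cols.astype(float))
```

Here the prediction covers every true pixel and its centroid traces the true curve exactly, so recall and R-squared are both 1.0, yet IoU is only one third because the predicted mask is three times as wide as the truth. A modest IoU alongside high recall and high R-squared is what a slightly generous mask on a thin curve looks like.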
Why the newest step is a continuation, not a rupture
It is tempting to tell this history as a series of replacements, each method killing the last. That is the wrong shape. The truer shape is a single trade, run four times: rules the engineer sets, traded for structure the data provides. Sobel set a kernel. Canny set thresholds. Hough set bins and a vote count. The learned-feature generation stopped setting the filter and started fitting it. Our net stops setting anything about a line and fits all of it, from 15,000 instances, across an encoder-decoder with a pair of attention layers. Each generation kept the same task and moved one more decision from the engineer's hand into the data. The knobs never disappeared. They became parameters.
Reading the well-log net this way changes how we treat it. It is not a box that made classical vision irrelevant; it is the current end of a fifty-year habit of turning hand-set rules into learned ones, and it inherits the obligations of everything before it. It still needs the geometry read back out of its mask, the same way Hough needed you to interpret its accumulator. It still has knobs; they are just fitted now, which means the discipline moved from tuning them to feeding them. Placed on the arc, the net stops looking like a break and starts looking like what it is: the latest, and not the last, way to teach a computer to find a line.
Limitations
This is a heritage account, and it compresses fifty years of computer vision into four representative steps, which means it skips a great deal. There were important line finders that do not fit the clean rules-to-data narrative, and the real history is branchier than a single arc. The generation ordering is didactic rather than strictly chronological; learned features and segmentation nets overlap in time and lineage far more than a left-to-right axis suggests, and the exhibit labels that axis as documented heritage rather than a measured quantity for exactly that reason. The three metrics we cite, peak IoU 0.51, peak recall 0.97, and peak R-squared 0.9891, are real archive figures from our own runs, but they are peaks on our synthetic-heavy validation setting, not a benchmark against other methods on a shared dataset, so they place our net on the arc without ranking it against anyone else's. The claim that earlier knobs reappear as learned parameters is an interpretive one; it is a useful lens, not a theorem, and a learned filter resembling an edge detector is an observation about first layers, not a guarantee about what any particular weight is doing. Finally, whether a net that scores well on held-out masks produces a curve a petrophysicist would trust is a question this primer does not answer; that lives in the product work, deliberately kept separate here.
References
[1] Sobel, I., and Feldman, G. A 3x3 Isotropic Gradient Operator for Image Processing. Stanford Artificial Intelligence Project (SAIL), 1968. The fixed-kernel gradient edge operator, trained on nothing, tuned by a threshold. https://en.wikipedia.org/wiki/Sobel_operator
[2] Canny, J. A Computational Approach to Edge Detection. IEEE TPAMI, PAMI-8(6), 1986, pp. 679-698. The multi-stage detector with two hand-set hysteresis thresholds, still the reference hand-tuned edge finder. https://ieeexplore.ieee.org/document/4767851
[3] Duda, R. O., and Hart, P. E. Use of the Hough Transformation to Detect Lines and Curves in Pictures. Communications of the ACM, 15(1), 1972, pp. 11-15. Line finding reframed as voting in a parameter accumulator, with bins and a vote threshold the engineer sets. https://dl.acm.org/doi/10.1145/361237.361242
[4] Ronneberger, O., Fischer, P., and Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. MICCAI 2015, LNCS 9351, pp. 234-241. The encoder-decoder with skip connections that made dense per-pixel labelling work on small datasets, and the shape our curve net inherits. https://arxiv.org/abs/1505.04597
[5] Vaswani, A., Shazeer, N., Parmar, N., et al. Attention Is All You Need. NeurIPS 2017. The attention mechanism behind the two refinement layers on our bottleneck, weighing long-range context down a log column. https://arxiv.org/abs/1706.03762