(This is a work in progress; currently quite rough)
- Problem statement
- Integrating top-down and bottom-up processes
Bottom-Up Progress - Stem detection and reconstruction
We're continuing to develop an approach for reconstructing 3D stems directly from 2D image features. This will provide a rich set of stem hypotheses to be refined and expanded during top-down inference.
The approach can be summarized in two steps:
- Detection of high-quality stem fragments in 2D and
- Reconstruction of 3D stem curves from 2D curve fragments in several views
Goal: Detect stem regions in images
Typical stem features:
- Strong, parallel edge pairs at borders
- Textureless interior
These characteristics are shared by images of text (letters, numbers, etc.)
Idea: use state-of-the-art text detection as the first step of the stem-detection algorithm.
Step 1: Stroke-width transform
The algorithm extracts "strokes" from an image, i.e. regions between two parallel edges with relatively constant width. Pixels with similar stroke widths are grouped into several disjoint regions. It was designed for text detection, but works equally well for plant stems.
Step 2: Extract medial axis from stroke regions
* Convert stroke regions to points on stroke's medial axis
* Infer a linear ordering on the points and prune outliers using Euclidean graph analysis
* Use a cubic interpolating spline to smooth out quantization error and detector noise
Result: high-quality stem fragments
<Image here: 1. original image, 2. detected edges, 3. stroke regions, 4. medial axis curves>
Some fragments can be merged using heuristics, others will be joined using top-down inference.
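The ordering, pruning, and spline-fitting parts of Step 2 could be sketched roughly as follows. This is only an illustration under assumptions: the notes do not say which graph analysis is used, so the sketch swaps in a Euclidean minimum spanning tree and keeps its longest path as the ordered chain, treating side branches as outliers. The function names `order_and_prune` and `fit_spline` and the toy data are hypothetical.

```python
import numpy as np
from scipy.interpolate import splev, splprep
from scipy.sparse.csgraph import dijkstra, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform


def order_and_prune(points):
    """Order 2D points along a curve and drop points off the main chain.

    Assumption: the longest path of the Euclidean minimum spanning tree
    is the stem's medial axis; anything hanging off it is an outlier.
    Assumes no duplicate points (zero-length edges count as missing).
    """
    pts = np.asarray(points, dtype=float)
    mst = minimum_spanning_tree(squareform(pdist(pts)))
    # Two sweeps over the tree find the endpoints of its longest path.
    d0 = np.ravel(dijkstra(mst, directed=False, indices=0))
    start = int(np.argmax(d0))
    d1, pred = dijkstra(mst, directed=False, indices=start,
                        return_predecessors=True)
    d1, pred = np.ravel(d1), np.ravel(pred)
    end = int(np.argmax(d1))
    # Walk predecessors back from one endpoint to the other.
    path = [end]
    while path[-1] != start:
        path.append(int(pred[path[-1]]))
    return pts[path[::-1]]


def fit_spline(ordered, n_samples=50):
    """Cubic interpolating spline (s=0) through the ordered points."""
    tck, _ = splprep([ordered[:, 0], ordered[:, 1]], k=3, s=0)
    u = np.linspace(0.0, 1.0, n_samples)
    x, y = splev(u, tck)
    return np.column_stack([x, y])


# Toy medial-axis points on a gentle curve, plus one stray detection.
x = np.arange(10, dtype=float)
pts = np.column_stack([x, 0.05 * x ** 2])
pts = np.vstack([pts, [4.5, 3.0]])
pts = pts[np.random.default_rng(0).permutation(len(pts))]

ordered = order_and_prune(pts)
curve = fit_spline(ordered)
print(len(pts), "input points ->", len(ordered), "ordered points")
```

With `s=0` the spline passes through every retained point; a positive `s` in `splprep` would trade exact interpolation for extra smoothing, if detector noise calls for it.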
Top-Down Progress - Edge Distance
General idea - compare edges in two images.
why? - need to evaluate hypothesized structure. Color/patch-based methods fail at surface boundaries, and our boundary-to-surface ratio is too high.
criteria - fast/parallel; encourages good fit; close fits aren't ruled out
Edge Distance #1: Chamfer distance
For each point in one edge set, find its nearest neighbor in the other. The distance is the sum of the squared nearest-neighbor distances.
Additional terms to penalize missed correspondences.
- Fast: Implemented in CUDA; simple operations
- Asymmetric: d(a,b) != d(b,a)
- Fails in "Double-edge" scenario.
- Discontinuities when correspondence switches
- Requires several approximations to achieve parallelism
- Difficult to represent as a probability distribution -> unintuitive parameters require hand-tuning
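As a point of reference, the basic nearest-neighbor term of the Chamfer distance can be sketched on the CPU as below. This is a minimal illustration, not the CUDA implementation: it assumes edges are given as arrays of 2D points, uses SciPy's `cKDTree` for the nearest-neighbor search, and leaves out the additional missed-correspondence terms, whose form is not given here. The name `chamfer_distance` and the sample points are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree


def chamfer_distance(a, b):
    """Directed Chamfer distance: for each point in a, the squared
    distance to its nearest neighbor in b, summed over a."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    dists, _ = cKDTree(b).query(a)  # nearest-neighbor distance per point of a
    return float(np.sum(dists ** 2))


a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 1.0], [1.0, 1.0], [5.0, 0.0]])

d_ab = chamfer_distance(a, b)  # 1 + 1 = 2
d_ba = chamfer_distance(b, a)  # 1 + 1 + 16 = 18
print(d_ab, d_ba)
```

The two directions differ because the stray point (5, 0) in `b` has no nearby counterpart in `a`, which shows the asymmetry listed above.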
Edge Distance #2: Gaussian Mixture distance
Each point in B is "generated" by a point in A plus some Gaussian noise. The model also includes a uniform noise component.
Correspondences are unknown (average over all possible correspondences).
- Fast: Implemented in CUDA
- Elegant probabilistic model
- Function is smooth - no correspondence switches
- Slower than Chamfer distance due to blurring operation.
- Fails to enforce one-to-one correspondence
- Precise correspondences are unknown.
- Fails to penalize "missing data"
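The generative model behind the Gaussian Mixture distance can be written as a log-likelihood, sketched below. This is an illustration under assumptions, not the CUDA code: it assumes isotropic 2D Gaussians with a shared standard deviation, equal mixture weights over the points of A, and a uniform component spread over the image area. The names `gmm_edge_log_likelihood`, `sigma`, `outlier_weight`, and `image_area`, along with their default values, are hypothetical. It evaluates all point pairs directly rather than using a blurring operation.

```python
import numpy as np
from scipy.special import logsumexp


def gmm_edge_log_likelihood(a, b, sigma=2.0, outlier_weight=0.1,
                            image_area=1.0e4):
    """Log-likelihood of points b under a Gaussian mixture centered on a.

    Each point of b comes either from a Gaussian around some point of a
    (correspondence unknown, so all are averaged) or from uniform noise.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared distances, shape (len(b), len(a)).
    d2 = np.sum((b[:, None, :] - a[None, :, :]) ** 2, axis=2)
    # Log density of an isotropic 2D Gaussian at each pairwise offset.
    log_gauss = -d2 / (2.0 * sigma ** 2) - np.log(2.0 * np.pi * sigma ** 2)
    # Average over all possible correspondences with equal weights.
    log_inlier = (np.log1p(-outlier_weight) - np.log(len(a))
                  + logsumexp(log_gauss, axis=1))
    log_outlier = np.log(outlier_weight) - np.log(image_area)
    return float(np.sum(np.logaddexp(log_inlier, log_outlier)))


a = np.array([[0.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
ll_close = gmm_edge_log_likelihood(a, a + np.array([0.5, 0.5]))
ll_far = gmm_edge_log_likelihood(a, a + np.array([0.0, 50.0]))
print(ll_close, ll_far)
```

The negated log-likelihood serves as the distance. The sum runs only over points of B, so points of A with nothing nearby in B cost nothing, which matches the "missing data" weakness listed above.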
Edge Distance #3: Blurred-Difference distance
- Fast: Implemented in CUDA
- Penalizes noise and missing data.
- Semi-learned from data.
- Blurring radius feels arbitrary
- Implausible generative model
- Faulty assumption of conditional independence of pixels
- Can't adjust noise penalty vs. missing penalty
- Troubles with old Edge-based likelihood
- New Edge-based likelihood
- Stroke-width transform for stem-detection
- 3D reconstruction of curves from stem fragments