Connected Component Labeling and Its Applications
Connected Component Labeling is commonly used to refer to the task of
grouping the connected pixels in an image. (Note that there are
different ways to define connectedness.) The basic approach is to scan
the image and assign labels to each pixel until the labels for the
pixels no longer change. This basic approach is slow because the labels
propagates one layer in an iteration. There are two common strategies
to speed up this process. The most common strategy leads to a twopass
algorithm. This algorithm uses a data structure to record label
equivalence information. It scans the image once to assign provisional
labels and discover the label equivalence information, and scans the
image a second time to assign the final labels. Another very successful
strategy is the onepass algorithms, that find all connected pixel in
one shot, for example through recursively visit all the connected
neighbors. One of the most successful approach in this category is the
Contour Tracing algorithm by Chang et al. (CVIU 93:206220, 2004). They
also distribute
an implementation
of their algorithm online.
The most commonly used data structure for recording the label
equivalence information in a twopass algorithm is the unionfind data
structure. Researchers have recognized the possiblity of implementing
this data structure implicitly using an array instead of pointers. We
designed a set of algorithms that enabled us to operate on this array
efficiently, and prove that a twopass labeling algorithm with this
implicit unionfind data structure is theoretically as good as the best
known labeling algorithms. We further show through experimental
measurements that it in fact is faster than the best known onepass
algorithms.
Selected Publications

Kesheng Wu, Ekow Otoo and Kenji Suzuki. Two Strategies to Speed up Connected Component Labeling Algorithms. LBNL Tech Report
LBNL59102. 2005.
[Abstract]
[Full Text]
[PAA
v 12 p 117135]
 This paper reports our formal analysis of the underlying algorithms
we used in a number of applications. An early version containing only
empirical study of the algorithms was reported at SPIE
Medical Imaging Conference 2005 [Preprint
LBNL56864]. The key element of our algorithms is a simple unionfind
data structure and a set of algorithm to work on it. This data
structure allows the connected component labeling to be performed in
linear time, which is considered theoretically optimal (cf. CVIU
93:206220 ICIAP'99
and ICPR
2000).
In timing measurements against the fastest known labeling
algorithm for 2D images (distributed by the original authors of the Contour
Tracing algorithm), our optimized labeling algorithm can be up to 10
times faster on large random images. This is primarily because our new
algorithm requires less memory than the Contour Tracing algorithm and
accesses memory in more regular patterns as well.
In 2007, Yapa and
Koichi published an independent evaluation of our algorithm and
found it to be very hard to beat as well.

Kurt Stockinger, John Shalf, Kesheng Wu, Wes Bethel. QueryDriven
Visualization of Large Data Sets, IEEE Visualization,
Minneapolis, Minnesota, USA, October 2005, IEEE Computer Society Press.
[Abstract]
[preprint as
LBNL57511]
 This paper describes our work on the
dextrous data
explorer (DEX). It marks the beginning of very fruitfuly line of
work on
querydriven
visualization, where the query capability of FastBit is used to
efficiently and effectively limit the amount of data deliever to the
visualization pipeline. This combination not only improve the
responsiveness of the viaulization system, but also reduces the visual
clutter and focuses the human perception system in a more productive way.

Kesheng Wu, Wendy Koegler, Jacqueline Chen and Arie Shoshani, Using
Bitmap Index for Interactive Exploration of Large
Datasets.
In Proceedings of SSDBM 2003.
[Abstract]
[Preprint (LBNL52535)]

This paper describes the first application that motivated our work on
connected component labeling. We were trying to meet the need of
identifying and tracking ignition
kernels in a combustion simulation. The procedure we use to track
ignition kernel can be divided into three steps: (1) searching for
points with user specified characteristics of burning, (2) connecting the
points into regions in space, and (3) computing overlaps among regions
from adjacent time steps to track their evolution. The FastBit search technology can
accomplish step (1) very efficiently. The underlying data structure
also allows us to compute the overlaps among regions efficiently too.
Therefore, we have solutions to two of the three steps. What was
missing was an efficient technology to accomplish step (2), which can be
treated as a connected component labeling problem.
Here is a more extensive list of
publications on connected component labeling and its applications.
This work forms the basis of many fruitful collaborations with the
Visualization group at LBNL, including
DEX.
 Related sites:
 A
description of connected component labeling for image analysis as it
relates to HIPR2.
 A
free implementation of image analysis tool that uses connected component
labeling idea.
 Connected
Components of a Graph
 More research work by John
Wu
 Bitmap Index
 Connected Component
Labeling
 Eigenvalue
Computation
 Information available elsewhere on the web
 CiteSeer
 DBLP
 Google
Scholar
Contact us
Disclaimers