Optical Motion Capture:
Theory and Implementation
Tutorial XVIII Brazilian Symposium on
Computer Graphics and Image Processing
Gutemberg Guerra-Filho
Computer Vision Laboratory
Center for Automation Research
University of Maryland
College Park, MD 20742-3275
guerra@cs.umd.edu
Abstract
Motion capture is the process of
recording the real-life movement of a subject as sequences of Cartesian coordinates in 3D space. Optical motion capture (OMC) uses cameras to reconstruct the body posture of the performer. One approach employs a set of synchronized cameras to capture markers placed at strategic locations on the body. A motion capture system has applications in computer graphics for character animation, in virtual reality as a human control interface, and in video games
for realistic simulation of human motion. In this tutorial,
we discuss the theoretical and empirical aspects of
an optical motion capture system. Basically, the resources required to implement a motion capture system consist of a number of synchronized cameras, an image acquisition system, a capture area, and a special
suit with markers. The locations of the markers on the
suit are designed such that the required body parts
(e.g. joints) are covered. We present our motion capture
system using a framework that identifies different sub-problems
to be solved in a modular way. Therefore, we propose a Matlab toolbox for Optical Motion Capture in which different versions of each module may be implemented to address different constraints. The sub-problems
involved in OMC are initialization, marker detection,
spatial correspondence, temporal correspondence, and
post-processing. In this tutorial, we discuss the theory
involved in each sub-problem and the corresponding novel
techniques used in the current implementation. The initialization consists of setting up an anthropomorphic human model and computing the intrinsic and extrinsic camera calibration parameters.
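As an illustration, the calibration output feeds directly into the pinhole projection model; the sketch below (with made-up values for the intrinsic matrix K and the extrinsic pose R, t) shows how a calibrated camera maps a 3D marker position to 2D pixel coordinates.

    % Pinhole projection of a 3D marker X (world coordinates) into pixel
    % coordinates, given the intrinsic matrix K and the extrinsic pose [R t]
    % of one camera. Illustrative sketch only; the values are made up.
    K = [800 0 320; 0 800 240; 0 0 1];   % focal lengths and principal point
    R = eye(3);                          % rotation: world -> camera
    t = [0; 0; 3];                       % translation: world -> camera
    X = [0.1; 0.2; 0.5];                 % 3D marker position in the world
    x_hom = K * (R * X + t);             % homogeneous image coordinates
    x_pix = x_hom(1:2) / x_hom(3);       % perspective division -> pixel (u, v)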
Marker detection involves finding the 2D pixel coordinates of the markers in the images.
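When bright retro-reflective markers are used, detection can be sketched as thresholding followed by connected-component centroids; the threshold, minimum blob area, and file name below are arbitrary illustrative choices rather than the settings of our implementation.

    % Detect bright markers in one frame by thresholding and taking the
    % centroids of the connected components. Sketch only.
    I = im2double(imread('frame001.png'));    % hypothetical input frame
    if size(I, 3) > 1, I = rgb2gray(I); end   % work on intensities
    bw = I > 0.8;                             % bright retro-reflective blobs
    bw = bwareaopen(bw, 5);                   % discard tiny noise blobs
    s  = regionprops(bw, 'Centroid');         % one centroid per marker blob
    markers2d = cat(1, s.Centroid);           % N-by-2 list of (x, y) pixels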
The spatial correspondence problem consists of finding pairs of detected markers in different images, captured at the same time from different viewpoints, such that each pair corresponds to projections of the same scene point.
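One standard way to constrain this matching (sketched below for two calibrated views; the numeric values are illustrative only) is the epipolar geometry induced by the calibration: a candidate pair is accepted only if the detection in the second view lies close to the epipolar line of the detection in the first view.

    % Epipolar test for a candidate pair of detections x1 (view 1) and x2
    % (view 2), both in homogeneous pixel coordinates. Sketch only.
    K1 = [800 0 320; 0 800 240; 0 0 1];  K2 = K1;     % intrinsics
    R  = eye(3);  t = [1; 0; 0];                      % pose of view 2 w.r.t. view 1
    tx = [0 -t(3) t(2); t(3) 0 -t(1); -t(2) t(1) 0];  % skew-symmetric [t]_x
    F  = inv(K2)' * (tx * R) * inv(K1);               % fundamental matrix
    x1 = [150; 200; 1];  x2 = [310; 201; 1];          % candidate detections
    l2 = F * x1;                                      % epipolar line in view 2
    d  = abs(l2' * x2) / norm(l2(1:2));               % point-to-line distance (pixels)
    isMatch = d < 2;                                  % accept within a small tolerance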
Given the camera calibration and the spatial matching, the 3D reconstruction of the markers (translational data) is achieved by triangulating the multiple camera views.
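A minimal sketch of linear (DLT) triangulation for one marker seen in two views is given below; P1 and P2 denote the 3-by-4 projection matrices K[R t] of the two cameras, and the same construction extends to more views by stacking additional rows in A.

    % Linear (DLT) triangulation of one marker from two calibrated views.
    % x1 and x2 are the matched pixel coordinates of the marker. Sketch only.
    function X = triangulate_dlt(P1, P2, x1, x2)
        A = [x1(1) * P1(3,:) - P1(1,:);
             x1(2) * P1(3,:) - P1(2,:);
             x2(1) * P2(3,:) - P2(1,:);
             x2(2) * P2(3,:) - P2(2,:)];
        [~, ~, V] = svd(A);          % least-squares solution of A * Xh = 0
        Xh = V(:, end);              % homogeneous 3D point
        X  = Xh(1:3) / Xh(4);        % Euclidean marker position
    end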
The temporal correspondence problem (tracking) involves matching two clouds of 3D points that represent the detected markers at two consecutive frames. The temporal correspondence module builds a track for each marker in which the marker's 3D coordinates are concatenated over time.
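As a simple illustration of temporal matching (not necessarily the algorithm of our implementation), each marker at frame t can be greedily assigned to its nearest unclaimed neighbor at frame t+1 within a gating distance.

    % Greedy nearest-neighbor assignment between the 3D marker cloud at
    % frame t (rows of Pt) and at frame t+1 (rows of Pn). Sketch only; the
    % gating distance is an arbitrary parameter.
    function idx = match_frames(Pt, Pn, gate)
        idx  = zeros(size(Pt, 1), 1);     % idx(i) = matched row of Pn, 0 if none
        free = true(size(Pn, 1), 1);      % candidates not yet assigned
        for i = 1:size(Pt, 1)
            d = sqrt(sum((Pn - Pt(i, :)).^2, 2));   % distances to all candidates
            d(~free) = inf;                         % skip already-assigned candidates
            [dmin, j] = min(d);
            if dmin <= gate
                idx(i) = j;                         % extend track i with candidate j
                free(j) = false;
            end
        end
    end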
Post-processing consists of labeling each track with a marker code, filling track gaps caused by occlusions, correcting possible gross errors, filtering or smoothing noise, and interpolating the data over time.
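For instance, a short occlusion gap in one coordinate of a track can be filled by interpolating over time; in the sketch below, missing frames are marked with NaN and filled with a spline (the sample values are made up).

    % Fill a short occlusion gap in one coordinate of a track by spline
    % interpolation over time. Sketch only.
    track  = [0.10 0.12 NaN NaN 0.21 0.25 0.28];    % e.g. x-coordinate per frame
    t      = 1:numel(track);
    known  = ~isnan(track);
    filled = track;
    filled(~known) = interp1(t(known), track(known), t(~known), 'spline');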
Other important techniques used to improve consistency
in the motion data are volumetric reconstruction, inverse
kinematics, and inverse dynamics. Once the translational data has been processed, a hierarchical human model may be used to compute the rotational data (joint angles).
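As one illustration of how rotational data can be derived, the flexion angle at a joint such as the knee follows from the reconstructed positions of the adjacent joints (the coordinates below are made up).

    % Knee flexion angle from the reconstructed 3D positions of the hip,
    % knee, and ankle. Sketch only; the positions are made up.
    hip   = [0.0; 0.9; 0.0];
    knee  = [0.0; 0.5; 0.05];
    ankle = [0.0; 0.1; 0.0];
    u = hip - knee;                                % thigh segment
    v = ankle - knee;                              % shank segment
    theta_deg = atan2(norm(cross(u, v)), dot(u, v)) * 180 / pi;   % joint angle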
We consider the standard data formats available for motion capture data (e.g., BVH, Acclaim) and cover topics related to the editing and manipulation of motion data.
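For reference, a BVH file stores the skeleton hierarchy (per-joint offsets and rotation channels) followed by one line of channel values per frame; a minimal, hypothetical example is shown below.

    HIERARCHY
    ROOT Hips
    {
      OFFSET 0.00 0.00 0.00
      CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
      JOINT LeftUpLeg
      {
        OFFSET 0.10 0.00 0.00
        CHANNELS 3 Zrotation Xrotation Yrotation
        End Site
        {
          OFFSET 0.00 -0.40 0.00
        }
      }
    }
    MOTION
    Frames: 2
    Frame Time: 0.033333
    0.00 0.90 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    0.00 0.90 0.00 0.00 0.00 0.00 5.00 0.00 0.00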
Further information: http://www.cs.umd.edu/~guerra/OptMoCap.html