We are developing a real-time system for Structure from
Motion that allows for virtual object insertion on the fly!
(See here for more information on
real-time motion estimation. Code available!) While virtual
object insertion has been demonstrated before, and is even part
of commercial video editing products nowadays (see for
instance 2d3), current
systems require several batch steps, including feature
tracking, outlier rejection, epipolar bundle adjustment, some
of which involve human intervention. What our demo shows is
that results of similar quality can be obtained in a
completely automatic fashion (no human intervention
whatsoever), while processing the sequence causally. This
enables a whole new range of applications for real-time
interaction with mixed reality (real as well as virtual
objects), all on commercial off-the-shelf hardware while the
camera is undergoing arbitrary motion, including hand-held.
The system consists of off-the-shelf hardware (a camera connected to a
Pentium PC) and software to
- automatically select and track region features despite
changes in illumination
- estimate three-dimensional position and orientation of
surface patches relative to an inertial reference frame
despite individual point-features appearing and
disappearing
- insert a texture-mapped virtual object into the scene so
that it appears to be part of the scene and moves with
it.
This is all done in real-time! Additional graphic
features, such as cast shadows, inter-reflections, and occlusions,
can be handled off-line.
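
As a rough illustration of the insertion step, the sketch below projects the vertices of a virtual object into the image using an estimated camera pose. It is a minimal example under stated assumptions, not the code used in our system: it assumes an ideal pinhole camera, a pose (R, T) that maps reference-frame coordinates into camera coordinates, and made-up values for the focal length, principal point, and object geometry. The names `rotation_about_y` and `project_point` are hypothetical.

```python
import math

def rotation_about_y(angle_rad):
    """3x3 rotation matrix about the y axis (stand-in for an estimated rotation)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def project_point(X, R, T, focal, cx, cy):
    """Project a reference-frame point X to pixel coordinates (pinhole model).

    Assumes camera coordinates are Xc = R * X + T. Returns None when the
    point lies behind the camera and therefore cannot be drawn.
    """
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + T[i] for i in range(3)]
    if Xc[2] <= 0.0:
        return None
    u = focal * Xc[0] / Xc[2] + cx
    v = focal * Xc[1] / Xc[2] + cy
    return (u, v)

# Hypothetical virtual object: the four corners of a small square that is
# fixed in the reference frame of the scene.
virtual_vertices = [(-0.1, -0.1, 0.0), (0.1, -0.1, 0.0),
                    (0.1, 0.1, 0.0), (-0.1, 0.1, 0.0)]

# Hypothetical pose for one frame; in a causal system a new (R, T) would be
# available at every frame and the projection repeated.
R = rotation_about_y(math.radians(10.0))
T = [0.0, 0.0, 2.0]

pixels = [project_point(X, R, T, focal=800.0, cx=320.0, cy=240.0)
          for X in virtual_vertices]
```

Because the vertices stay fixed in the reference frame and only the pose changes from frame to frame, the projected object appears to move with the scene.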
In the following links, we show a little demo of our
system. The original footage is taken with a stationary camera
(Canon XL1) in front of which an object rotates on a
turntable. We generate three sequences with a texture-mapped
virtual object inserted. In Color Vase the sequence goes back
and forth showing the frames where the virtual object is never
occluded. In Gray Vase the object undergoes a complete turn to
show the accuracy of motion estimation. (If you watch very
carefully, you may notice a small jump when the sequence goes
back to the beginning). Such an error is a well-known but
unavoidable problem in Structure from Motion, due to
the change of scale factor (in a complete turn any point in
the scene will disappear at least once). However, our
estimated overall accumulated error is reasonably low. In
Bouncing Ball an animated demo is shown to stimulate your
imagination on how many special effects are possible using
this system. We want to make two remarks. First, the
processing is completely automatic (we did not manually adjust
any data at any stage). Second, in this demo
we didn't deal with the relative occlusion of the synthetic
and real objects, but such a task is indeed possible! What is
necessary is only to estimate the structure of the scene. (See
here for more information on how to
estimate the structure.)
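
To give a feel for why the scale factor drifts over a complete turn, here is a toy calculation; it is only an illustration and not a measurement from our sequences. It assumes that every time the set of tracked features is replaced, the scale is handed over with a small multiplicative error, so the errors compound. The function names and the 0.2% per-handoff error are hypothetical.

```python
def accumulated_scale(handoff_ratios):
    """Compound the scale ratio introduced at each feature handoff."""
    scale = 1.0
    for ratio in handoff_ratios:
        scale *= ratio
    return scale

def scale_drift_percent(handoff_ratios):
    """Deviation of the accumulated scale from the initial scale, in percent."""
    return 100.0 * abs(accumulated_scale(handoff_ratios) - 1.0)

# Hypothetical numbers: ten handoffs during one turn, each off by 0.2%.
ratios = [1.002] * 10
drift = scale_drift_percent(ratios)  # roughly 2 percent
```

A drift of this kind would show up as a mismatch between the first and last frames of a full turn, which is the small jump visible in Gray Vase.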