Monday, March 10, 2014

Document: 3D FACE PROCESSING: Modeling, Analysis and Synthesis

Contents

List of Figures
List of Tables
Preface
Acknowledgments
1. INTRODUCTION
   1  Motivation
   2  Research Topics Overview
      2.1  3D face processing framework overview
      2.2  3D face geometry modeling
      2.3  Geometric-based facial motion modeling, analysis and synthesis
      2.4  Enhanced facial motion analysis and synthesis using flexible appearance model
      2.5  Applications of face processing framework
   3  Book Organization
2. 3D FACE MODELING
   1  State of the Art
      1.1  Face modeling using 3D range scanner
      1.2  Face modeling using 2D images
      1.3  Summary
   2  Face Modeling Tools in iFACE
      2.1  Generic face model
      2.2  Personalized face model
   3  Future Research Direction of 3D Face Modeling
3. LEARNING GEOMETRIC 3D FACIAL MOTION MODEL
   1  Previous Work
      1.1  Facial deformation modeling
      1.2  Facial temporal deformation modeling
      1.3  Machine learning for facial deformation modeling
   2  Motion Capture Database
   3  Learning Holistic Linear Subspace
   4  Learning Parts-based Linear Subspace
   5  Animate Arbitrary Mesh Using MU
   6  Temporal Facial Motion Model
   7  Summary
4. GEOMETRIC MODEL-BASED 3D FACE TRACKING
   1  Previous Work
      1.1  Parameterized geometric models
         1.1.1  B-Spline curves
         1.1.2  Snake model
         1.1.3  Deformable template
         1.1.4  3D parameterized model
      1.2  FACS-based models
      1.3  Statistical models
         1.3.1  Active Shape Model (ASM) and Active Appearance Model (AAM)
         1.3.2  3D model learned from motion capture data
   2  Geometric MU-based 3D Face Tracking
   3  Applications of Geometric 3D Face Tracking
   4  Summary
5. GEOMETRIC FACIAL MOTION SYNTHESIS
   1  Previous Work
      1.1  Performance-driven face animation
      1.2  Text-driven face animation
      1.3  Speech-driven face animation
   2  Facial Motion Trajectory Synthesis
   3  Text-driven Face Animation
   4  Offline Speech-driven Face Animation
   5  Real-time Speech-driven Face Animation
      5.1  Formant features for speech-driven face animation
         5.1.1  Formant analysis
         5.1.2  An efficient real-time speech-driven animation system based on formant analysis
      5.2  ANN-based real-time speech-driven face animation
         5.2.1  Training data and features extraction
         5.2.2  Audio-to-visual mapping
         5.2.3  Animation result
         5.2.4  Human emotion perception study
   6  Summary
6. FLEXIBLE APPEARANCE MODEL
   1  Previous Work
      1.1  Appearance-based facial motion modeling, analysis and synthesis
      1.2  Hybrid facial motion modeling, analysis and synthesis
      1.3  Issues in flexible appearance model
         1.3.1  Illumination effects of face appearance
         1.3.2  Person dependency
         1.3.3  Online appearance model
   2  Flexible Appearance Model
      2.1  Reduce illumination dependency based on illumination modeling
         2.1.1  Radiance environment map (REM)
         2.1.2  Approximating a radiance environment map using spherical harmonics
         2.1.3  Approximating a radiance environment map from a single image
      2.2  Reduce person dependency based on ratio-image
         2.2.1  Ratio image
         2.2.2  Transfer motion details using ratio image
         2.2.3  Transfer illumination using ratio image
   3  Summary
7. FACIAL MOTION ANALYSIS USING FLEXIBLE APPEARANCE MODEL
   1  Model-based 3D Face Motion Analysis Using Both Geometry and Appearance
      1.1  Feature extraction
      1.2  Influences of lighting
      1.3  Exemplar-based texture analysis
      1.4  Online EM-based adaptation
   2  Experimental Results
   3  Summary
8. FACE APPEARANCE SYNTHESIS USING FLEXIBLE APPEARANCE MODEL
   1  Neutral Face Relighting
      1.1  Relighting with radiance environment maps
         1.1.1  Relighting when rotating in the same lighting condition
         1.1.2  Comparison with inverse rendering approach
         1.1.3  Relighting in different lighting conditions
         1.1.4  Interactive face relighting
      1.2  Face relighting from a single image
         1.2.1  Dynamic range of images
      1.3  Implementation
      1.4  Relighting results
   2  Face Relighting for Face Recognition in Varying Lighting
   3  Synthesize Appearance Details of Facial Motion
      3.1  Appearance of mouth interior
      3.2  Linear alpha-blending of texture
   4  Summary
9. APPLICATION EXAMPLES OF THE FACE PROCESSING FRAMEWORK
   1  Model-based Very Low Bit-rate Face Video Coding
      1.1  Introduction
      1.2  Model-based face video coder
      1.3  Results
      1.4  Summary and future work
   2  Integrated Proactive HCI Environments
      2.1  Overview
      2.2  Current status
      2.3  Future work
   3  Summary
10. CONCLUSION AND FUTURE WORK
   1  Conclusion
   2  Future Work
      2.1  Improve geometric face processing
      2.2  Closer correlation between geometry and appearance
      2.3  Human perception evaluation of synthesis
         2.3.1  Previous work
         2.3.2  Our ongoing and future work

Appendices
   Projection of face images in 9-D spherical harmonic space

References

Index
List of Figures

1.1  Research issues and applications of face processing.
1.2  A unified 3D face processing framework.
2.1  The generic face model. (a): Shown as wire-frame model. (b): Shown as shaded model.
2.2  An example of range scanner data. (a): Range map. (b): Texture map.
2.3  Feature points defined on texture map.
2.4  The model editor.
2.5  An example of customized face models.
3.1  An example of marker layout for the MotionAnalysis system.
3.2  The markers of the Microsoft data [Guenter et al., 1998]. (a): The markers are shown as small white dots. (b) and (c): The mesh is shown in two different viewpoints.
3.3  The neutral face and deformed face corresponding to the first four MUs. The top row is the frontal view and the bottom row is the side view.
3.4  (a): NMF learned parts overlayed on the generic face model. (b): The facial muscle distribution. (c): The aligned facial muscle distribution. (d): The parts overlayed on muscle distribution. (e): The final parts decomposition.
3.5  Three lower lip shapes deformed by three of the lower lip parts-based MUs, respectively. The top row is the frontal view and the bottom row is the side view.
3.6  (a): The neutral face side view. (b): The face deformed by one right cheek parts-based MU.
3.7  (a): The generic model in iFACE. (b): A personalized face model based on the Cyberware™ scanner data. (c): The feature points defined on the generic model.
4.1  Typical tracked frames and corresponding animated face models. (a): The input image frames. (b): The tracking results visualized by yellow mesh overlayed on input images. (c): The front views of the face model animated using tracking results. (d): The side views of the face model animated using tracking results. In each row, the first image corresponds to the neutral face.
4.2  (a): The synthesized face motion. (b): The reconstructed video frame with synthesized face motion. (c): The reconstructed video frame using H.26L codec.
5.1  (a): Conventional NURBS interpolation. (b): Statistically weighted NURBS interpolation.
5.2  The architecture of text-driven talking face.
5.3  Four of the key shapes. The top row images are front views and the bottom row images are the side views. The largest components of variances are (a): 0.67; (b): 1.0; (c): 0.18; (d): 0.19.
5.4  The architecture of offline speech-driven talking face.
5.5  The architecture of a real-time speech-driven animation system based on formant analysis.
5.6  "Vowel Triangle" in the system; circles correspond to vowels [Rabiner and Shafer, 1978].
5.7  Comparison of synthetic motions. The left figure is text-driven animation and the right figure is speech-driven animation. The horizontal axis is the number of frames; the vertical axis is the intensity of motion.
5.8  Comparison of the estimated MUPs with the original MUPs. The content of the corresponding speech track is "A bird flew on lighthearted wing."
5.9  Typical frames of the animation sequence of "A bird flew on lighthearted wing." The temporal order is from left to right, and from top to bottom.
6.1  A face albedo map.
7.1  Hybrid 3D face motion analysis system.
7.2  (a): The input video frame. (b): The snapshot of the geometric tracking system. (c): The extracted texture map.
7.3  Selected facial regions for feature extraction.
7.4  Comparison of the proposed approach with the geometric-only method in the person-dependent test.
7.5  Comparison of the proposed appearance feature (ratio) with the non-ratio-image based appearance feature (non-ratio) in the person-independent recognition test.
7.6  Comparison of different algorithms in the person-independent recognition test. (a): Algorithm uses geometric feature only. (b): Algorithm uses both geometric and ratio-image based appearance features. (c): Algorithm applies unconstrained adaptation. (d): Algorithm applies constrained adaptation.
7.7  The results under different 3D poses. For both (a) and (b): Left: cropped input frame. Middle: extracted texture map. Right: recognized expression.
7.8  The results in a different lighting condition. For both (a) and (b): Left: cropped input frame. Middle: extracted texture map. Right: recognized expression.
8.1  Using constrained texture synthesis to reduce artifacts in the low dynamic range regions. (a): input image; (b): blue channel of (a) with very low dynamic range; (c): relighting without synthesis; (d): relighting with constrained texture synthesis.
8.2  (a): The generic mesh. (b): The feature points.
8.3  The user interface of the face relighting software.
8.4  The middle image is the input. The sequence shows synthesized results of a 180° rotation of the lighting environment.
8.5  The comparison of synthesized results and ground truth. The top row is the ground truth. The bottom row is the synthesized result, where the middle image is the input.
8.6  The middle image is the input. The sequence shows a 180° rotation of the lighting environment.
8.7  Interactive lighting editing by modifying the spherical harmonics coefficients of the radiance environment map.
8.8  Relighting under different lighting. For both (a) and (b): Left: face to be relighted. Middle: target face. Right: result.
8.9  Examples of the Yale face database B [Georghiades et al., 2001]. From left to right, they are images from group 1 to group 5.
8.10  Recognition error rate comparison before relighting and after relighting on the Yale face database.
8.11  Mapping visemes of (a) to (b). For (b), the first neutral image is the input; the other images are synthesized.
9.1  (a): The synthesized face motion. (b): The reconstructed video frame with synthesized face motion. (c): The reconstructed video frame using H.26L codec.
9.2  The setting for the Wizard-of-Oz experiments.
9.3  (a): The interface for the student. (b): The interface for the instructor.
List of Tables

5.1  Phoneme and viseme used in face animation.
5.2  Emotion inference based on video without audio track.
5.3  Emotion inference based on audio track.
5.4  Emotion inference based on video with audio track 1.
5.5  Emotion inference based on video with audio track 2.
5.6  Emotion inference based on video with audio track 3.
7.1  Person-dependent confusion matrix using the geometric-feature-only method.
7.2  Person-dependent confusion matrix using both geometric and appearance features.
7.3  Comparison of the proposed approach with the geometric-only method in the person-dependent test.
7.4  Comparison of the proposed appearance feature (ratio) with the non-ratio-image based appearance feature (non-ratio) in the person-independent recognition test.
7.5  Comparison of different algorithms in the person-independent recognition test. (a): Algorithm uses geometric feature only. (b): Algorithm uses both geometric and ratio-image based appearance features. (c): Algorithm applies unconstrained adaptation. (d): Algorithm applies constrained adaptation.
9.1  Performance comparisons between the face video coder and the H.264/JVT coder.
