MediaPipe Face Mesh Demo with Recording
Real-time face mesh detection and visualization using MediaPipe and OpenCV on a live camera feed, plus 3D mesh recording and export.
Features
- 468 3D Face Landmarks - Comprehensive face mesh tracking
- Real-time Processing - Optimized for live camera feed
- Multiple Visualizations:
  - Face tessellation (mesh network)
  - Face contours (outline)
  - Iris tracking (detailed eye landmarks)
- Camera Management:
  - Auto-detection of available cameras
  - Support for multiple cameras (built-in, iPhone, external)
  - Camera switching on-the-fly
- Interactive Controls:
  - Pause/resume functionality
  - Frame capture and saving
  - FPS monitoring
- Recording & Export:
  - Record up to 150 frames (5 seconds at 30 fps)
  - Export 3D face meshes as OBJ files
  - Per-vertex colors based on facial semantics
  - Organized session-based storage
- Reprocessing:
  - Regenerate mesh files from existing recordings
  - Batch-process multiple frames
  - Useful when live recording fails
- Smart Defaults - Automatically uses camera index 1 (typically front-facing)
Usage
Basic Usage
```bash
# Run with default camera (index 1)
python -m algos.character_masks.demo.face_mesh_demo

# Use a specific camera index
python -m algos.character_masks.demo.face_mesh_demo --camera 0
```
Camera Management
```bash
# List all available cameras
python -m algos.character_masks.demo.face_mesh_demo --list
```
Recording Management
```bash
# List all recording sessions
python -m algos.character_masks.demo.face_mesh_demo --list-sessions

# Reprocess an existing recording session
python -m algos.character_masks.demo.face_mesh_demo --reprocess session_20250916_002902

# Reprocess using a full path
python -m algos.character_masks.demo.face_mesh_demo --reprocess /path/to/recording/session
```
Controls
| Key | Action |
|---|---|
| `q` | Quit the demo |
| `s` | Save current frame as image |
| `r` | Reset FPS counter |
| `c` | Switch to next camera |
| `v` | Start/stop recording (video + 3D meshes) |
| `SPACE` | Pause/resume |
Camera Notes
On macOS with iPhone connected via Continuity Camera:
- Camera 0: Usually iPhone rear camera
- Camera 1: Usually iPhone front camera or Mac built-in (default)
The demo automatically defaults to camera 1 for face tracking.
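For reference, available camera indices can be probed directly with OpenCV. A minimal sketch (the helper name and index range are illustrative, not the demo's own detection code):

```python
import cv2

def list_cameras(max_index: int = 5) -> list:
    """Probe camera indices by opening each one and reading a test frame."""
    available = []
    for i in range(max_index):
        cap = cv2.VideoCapture(i)
        if cap.isOpened():
            ok, _ = cap.read()
            if ok:
                available.append(i)
        cap.release()
    return available

print(list_cameras())  # e.g. [0, 1] with an iPhone connected via Continuity Camera
```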
Recording Feature
How Recording Works
When you press `v` during the live demo:
- Frame Capture: Records up to 150 frames (approximately 5 seconds at 30fps)
- Real-time Processing: Extracts face mesh landmarks from each frame
- 3D Mesh Export: Generates OBJ files with vertex colors for each frame
- Organized Storage: Saves everything in timestamped session folders
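A minimal sketch of the session bookkeeping described above, assuming the folder layout shown in the next section (helper names are hypothetical, not the demo's actual API):

```python
import datetime
import os
import cv2

MAX_FRAMES = 150  # ~5 seconds at 30 fps

def start_session(root: str = "algos/character_masks/demo/recording") -> str:
    """Create a timestamped session folder with frames/ and meshes/ subdirs."""
    name = datetime.datetime.now().strftime("session_%Y%m%d_%H%M%S")
    session = os.path.join(root, name)
    os.makedirs(os.path.join(session, "frames"), exist_ok=True)
    os.makedirs(os.path.join(session, "meshes"), exist_ok=True)
    return session

def save_frame(session: str, idx: int, frame) -> None:
    """Persist one recorded frame as frames/frame_NNNN.jpg."""
    cv2.imwrite(os.path.join(session, "frames", f"frame_{idx:04d}.jpg"), frame)
```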
Output Structure
Each recording session creates:
```
algos/character_masks/demo/recording/session_YYYYMMDD_HHMMSS/
├── frames/                          # JPEG images of recorded frames
│   ├── frame_0000.jpg
│   ├── frame_0001.jpg
│   └── ...
├── meshes/                          # 3D mesh files with vertex colors
│   ├── face_mesh_timestamp_001.obj
│   ├── face_mesh_timestamp_002.obj
│   └── ...
└── metadata.txt                     # Session information and statistics
```
Vertex Colors
The 3D meshes include semantic vertex coloring:
- Red: Lips and mouth area
- Blue: Eyes and iris regions
- Brown: Eyebrow areas
- Light skin tone: Face oval and general face area
- Depth shading: Subtle lighting based on relative depth
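One plausible way to build these color groups is from MediaPipe's published connection sets. In this sketch the RGB values and the exact groupings are illustrative assumptions, and the depth shading is omitted:

```python
import numpy as np
import mediapipe as mp

fm = mp.solutions.face_mesh

def landmark_indices(connections):
    """Flatten a set of (start, end) connection pairs into vertex indices."""
    return {i for edge in connections for i in edge}

def vertex_colors(num_vertices: int) -> np.ndarray:
    """Assign an RGB color per vertex based on its facial region."""
    colors = np.tile([0.85, 0.70, 0.60], (num_vertices, 1))  # default: light skin tone
    regions = [
        (fm.FACEMESH_LEFT_EYEBROW | fm.FACEMESH_RIGHT_EYEBROW, [0.40, 0.25, 0.10]),       # brown
        (fm.FACEMESH_LEFT_EYE | fm.FACEMESH_RIGHT_EYE | fm.FACEMESH_IRISES, [0.10, 0.20, 0.90]),  # blue
        (fm.FACEMESH_LIPS, [0.80, 0.10, 0.10]),                                            # red
    ]
    for connections, rgb in regions:
        for i in landmark_indices(connections):
            if i < num_vertices:  # iris indices only exist in 478-point mode
                colors[i] = rgb
    return colors
```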
Reprocessing
If mesh generation fails during live recording (common due to processing load), you can reprocess any session:
1. List available sessions: `--list-sessions`
2. Reprocess a specific session: `--reprocess SESSION_NAME`
3. Check results: mesh files will be regenerated in the `meshes/` folder
This is particularly useful when:
- Live recording couldn't keep up with mesh generation
- You want to regenerate meshes with updated algorithms
- Mesh files got corrupted or deleted
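The reprocessing idea, sketched: re-run the saved JPEGs through Face Mesh in static-image mode. Function names are illustrative, and `export_obj` is the hypothetical writer sketched under "3D Mesh Export" below:

```python
import glob
import os
import cv2
import mediapipe as mp

def reprocess_session(session_dir: str) -> None:
    """Regenerate OBJ meshes from a session's saved JPEG frames."""
    with mp.solutions.face_mesh.FaceMesh(
        static_image_mode=True,  # independent per-image detection
        refine_landmarks=True,
    ) as face_mesh:
        for path in sorted(glob.glob(os.path.join(session_dir, "frames", "*.jpg"))):
            image = cv2.imread(path)
            results = face_mesh.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
            if results.multi_face_landmarks:
                name = os.path.splitext(os.path.basename(path))[0]
                # Hypothetical writer -- see the OBJ export sketch below.
                export_obj(os.path.join(session_dir, "meshes", name + ".obj"),
                           results.multi_face_landmarks[0],
                           image.shape[1], image.shape[0])
```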
Performance
The demo displays real-time performance metrics:
- FPS (Frames Per Second)
- Frame counter
- Face detection status
- Active camera index
- Recording status (when active, shows frame count and REC indicator)
Technical Details
Face Mesh Detection
- Uses MediaPipe's Face Mesh solution
- Provides 468 3D facial landmarks (478 when refined landmarks are enabled)
- Includes refined landmarks for better accuracy
- Configurable detection and tracking confidence thresholds
- Optimized buffer settings for minimal latency
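A live-video detector configuration might look like this (the threshold values here are assumptions, not the demo's exact settings):

```python
import mediapipe as mp

# Tracking mode for live video: detect once, then track across frames.
face_mesh = mp.solutions.face_mesh.FaceMesh(
    static_image_mode=False,
    max_num_faces=1,
    refine_landmarks=True,        # 478 landmarks, including irises
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5,
)
```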
3D Mesh Export
- OBJ format with vertex colors
- Uses MediaPipe's face tessellation for triangular faces
- Per-vertex RGB colors encoding facial semantics
- Handles both 468 (base) and 478 (refined with iris) landmark modes
- Real-time coordinate transformation from normalized to pixel space
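A sketch of the exporter. Here `triangles_from_edges` is one way to recover faces from the tessellation's edge list, and the `v x y z r g b` vertex-color form is a widely supported non-standard OBJ extension (MeshLab, Blender); none of this is the demo's verified code, and the flat color can be swapped for the semantic `vertex_colors` sketched earlier:

```python
import numpy as np
import mediapipe as mp

def triangles_from_edges(connections):
    """Recover triangles from an edge list: any three landmarks that are
    pairwise connected form a face."""
    adj = {}
    for a, b in connections:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    tris = set()
    for a, b in connections:
        for c in adj[a] & adj[b]:
            tris.add(tuple(sorted((a, b, c))))
    return sorted(tris)

FACES = triangles_from_edges(mp.solutions.face_mesh.FACEMESH_TESSELATION)

def export_obj(path, landmarks, width, height):
    """Write landmarks as an OBJ mesh with per-vertex RGB colors."""
    # Normalized -> pixel space; MediaPipe's z is roughly on the same scale as x.
    pts = [(lm.x * width, lm.y * height, lm.z * width)
           for lm in landmarks.landmark]
    colors = np.full((len(pts), 3), 0.8)  # flat gray placeholder
    with open(path, "w") as f:
        for (x, y, z), (r, g, b) in zip(pts, colors):
            f.write(f"v {x:.4f} {y:.4f} {z:.4f} {r:.3f} {g:.3f} {b:.3f}\n")
        for a, b, c in FACES:
            f.write(f"f {a + 1} {b + 1} {c + 1}\n")  # OBJ indices are 1-based
```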
File Organization
- Session-based recording with timestamps
- Git-ignored recording directories (`**/recording/**`)
- Automatic metadata generation and tracking
- Support for batch reprocessing of existing sessions
Troubleshooting
If you encounter issues:
1. No camera detected:
   - Check System Settings > Privacy & Security > Camera
   - Ensure Terminal/IDE has camera permissions
   - Close other apps using the camera
2. Black screen or wrong camera:
   - Use `--list` to see available cameras
   - Try different camera indices with `--camera N`
   - Press `c` while running to switch cameras
3. Poor performance:
   - Ensure good lighting conditions
   - Position your face clearly in the frame
   - Check that no other heavy processes are running
4. Recording issues:
   - Recording stops automatically at 150 frames
   - If mesh generation fails during recording, use `--reprocess`
   - Check that the `demo/recording/` directory has write permissions
   - Ensure sufficient disk space for frame images and mesh files
5. Mesh files missing or corrupted:
   - Use `--list-sessions` to see available sessions
   - Use `--reprocess SESSION_NAME` to regenerate mesh files
   - Check that the face was clearly visible during recording
Dependencies
Requires the Pajama project virtual environment with:
- MediaPipe (face mesh detection)
- OpenCV-Python (camera capture and image processing)
- NumPy (numerical operations)
- PyTorch (tensor operations for landmark processing)
All dependencies are already installed in `/Users/kvenkateshan/code/pajama/venv/`.
Example Workflow
1. Start a recording session:

   ```bash
   python -m algos.character_masks.demo.face_mesh_demo
   # Press 'v' to start recording
   # Move your head around for 5 seconds
   # Press 'v' again to stop, or wait for auto-stop
   ```

2. Check recorded sessions:

   ```bash
   python -m algos.character_masks.demo.face_mesh_demo --list-sessions
   ```

3. Reprocess if needed:

   ```bash
   python -m algos.character_masks.demo.face_mesh_demo --reprocess session_20250916_002902
   ```

4. View results:
   - Frame images: `algos/character_masks/demo/recording/session_*/frames/`
   - 3D meshes: `algos/character_masks/demo/recording/session_*/meshes/`
   - Import the OBJ files into Blender, MeshLab, or other 3D software
Batch Frame Generation Script
Generate landmarks and expression captions for video frames from a subject-based dataset.
Features
- Memory-efficient: Only loads small chunks of images at a time
- Pandas-based: Uses dataframes for efficient data management
- AVIF support: Handles AVIF format images via Pillow
- Filtering: Can ignore specific segments by name or file
- Parallel processing: Uses concurrent threads within each chunk
- Parquet output: Saves results as efficient Parquet files
- Hydra configuration: Easy configuration management via YAML files
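The chunked, threaded loop behind the first bullets might be sketched like this; `process_frame` is an illustrative stand-in for the script's per-frame work, and the output columns are abbreviated from the schema documented below:

```python
from concurrent.futures import ThreadPoolExecutor
import pandas as pd

def process_frame(row: dict) -> dict:
    """Illustrative stand-in: load the image, run landmarks + captioning."""
    return {"subject_index": row["subject_index"],
            "frame_id": row["frame_id"],
            "landmarks_detected": False}  # real work omitted

def run_in_chunks(frames: pd.DataFrame, out_dir: str,
                  chunk_size: int = 5, concurrency: int = 4) -> None:
    """Process frames chunk by chunk so only a few images are in memory."""
    for n, start in enumerate(range(0, len(frames), chunk_size)):
        chunk = frames.iloc[start:start + chunk_size]
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            results = list(pool.map(process_frame, chunk.to_dict("records")))
        pd.DataFrame(results).to_parquet(f"{out_dir}/chunk_{n:04d}.parquet")
```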
Installation
```bash
pip install pandas pillow-heif tqdm hydra-core pyarrow
```
Dataset Structure
```
dataset_root/
├── subject_id_mapping.csv     # Maps subject indices (column: index)
├── 0/                         # Subject folder
│   ├── frame_list.csv         # Lists seg_id and frame_id
│   ├── 000001.avif            # Frame images in AVIF format
│   ├── 000002.avif
│   └── ...
├── 1/
│   └── ...
└── expression_generated/      # Output directory (created by script)
    ├── chunk_0000.parquet
    ├── chunk_0001.parquet
    └── all_results.parquet
```
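A sketch of how a frame index could be assembled from this layout. The column names follow the `frame_list.csv` description above, but the zero-padded file naming and integer `frame_id` are assumptions:

```python
from pathlib import Path
import pandas as pd

def build_frame_index(dataset_root: str, subjects=None) -> pd.DataFrame:
    """Collect (subject, segment, frame, image path) rows for all subjects."""
    root = Path(dataset_root)
    mapping = pd.read_csv(root / "subject_id_mapping.csv")
    rows = []
    for idx in subjects if subjects is not None else mapping["index"]:
        frame_list = pd.read_csv(root / str(idx) / "frame_list.csv")
        for _, row in frame_list.iterrows():
            rows.append({
                "subject_index": idx,
                "segment_name": row["seg_id"],
                "frame_id": row["frame_id"],
                "path": root / str(idx) / f"{row['frame_id']:06d}.avif",
            })
    return pd.DataFrame(rows)
```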
Usage
1. Using Config Files (Recommended)
Create a config file (e.g., `conf/my_experiment.yaml`):

```yaml
dataset_root: /path/to/dataset
subject_mapping_csv: /path/to/subject_id_mapping.csv
subjects: [0, 1, 2]                        # Optional: specific subjects
ignore_segments: ["SEN_bad1", "SEN_bad2"]  # Optional: segments to ignore
concurrency: 8
chunk_size: 5
```
Run with your config:
```bash
python -m algos.character_masks.demo.run_batch_generate_dataset --config-name=my_experiment
```
2. Using Default Config with Overrides
```bash
python -m algos.character_masks.demo.run_batch_generate_dataset \
    dataset_root=/path/to/dataset \
    subject_mapping_csv=/path/to/mapping.csv \
    concurrency=8 \
    chunk_size=5
```
3. Override Specific Parameters
```bash
# Process only specific subjects
python -m algos.character_masks.demo.run_batch_generate_dataset \
    --config-name=my_experiment \
    'subjects=[0,1,2]'

# Ignore specific segments
python -m algos.character_masks.demo.run_batch_generate_dataset \
    --config-name=my_experiment \
    'ignore_segments=["SEN_bad1","SEN_bad2"]'

# Change chunk size for memory management
python -m algos.character_masks.demo.run_batch_generate_dataset \
    --config-name=my_experiment \
    chunk_size=3
```
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `dataset_root` | str | required | Path to the dataset root directory |
| `subject_mapping_csv` | str | required | Path to `subject_id_mapping.csv` |
| `subjects` | list[int] | null | Specific subject indices to process |
| `ignore_segments` | list[str] | null | Segment names to ignore |
| `concurrency` | int | 4 | Concurrent threads per chunk |
| `chunk_size` | int | 5 | Frames per chunk (keep small to limit memory) |
| `show_results` | int | 5 | Number of sample results to display |
Output Format
Results are saved as Parquet files with the following schema:
```
subject_index: int64          # Subject index number
frame_id: int64               # Frame ID
segment_name: str             # Segment name (e.g., "SEN_...")
expression_caption: str       # Generated expression description
landmarks_detected: bool      # Whether a face was detected
landmarks_num: int64          # Number of landmarks (e.g., 478)
landmarks_coordinates: array  # Numpy array of shape (N, 3) for [x, y, z]
```
Reading Results
```python
import pandas as pd

# Read all results
df = pd.read_parquet('dataset_root/expression_generated/all_results.parquet')

# Filter to a specific subject
subject_0 = df[df['subject_index'] == 0]

# Get frames with detected landmarks
detected = df[df['landmarks_detected']]

# Access landmark coordinates (returns a numpy array)
landmarks = df.iloc[0]['landmarks_coordinates']
print(landmarks.shape)  # e.g., (478, 3)
```
Examples
Process all subjects with default settings
```bash
python -m algos.character_masks.demo.run_batch_generate_dataset \
    dataset_root=/data/faces \
    subject_mapping_csv=/data/faces/mapping.csv
```
Process specific subjects with high concurrency
```bash
python -m algos.character_masks.demo.run_batch_generate_dataset \
    dataset_root=/data/faces \
    subject_mapping_csv=/data/faces/mapping.csv \
    'subjects=[0,1,2,5,10]' \
    concurrency=16 \
    chunk_size=10
```
Ignore specific segments
```bash
python -m algos.character_masks.demo.run_batch_generate_dataset \
    dataset_root=/data/faces \
    subject_mapping_csv=/data/faces/mapping.csv \
    'ignore_segments=["SEN_bad1","SEN_bad2","SEN_bad3"]'
```
Troubleshooting
AVIF Loading Issues
If you get errors loading AVIF files, ensure `pillow-heif` is installed:

```bash
pip install pillow-heif
```
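Once installed, AVIF decoding is enabled by registering the opener with Pillow; a quick sanity check (the registration call follows the `pillow-heif` API, but verify against your installed version):

```python
from pillow_heif import register_avif_opener
from PIL import Image

register_avif_opener()  # registers the AVIF plugin with Pillow

img = Image.open("000001.avif").convert("RGB")
print(img.size)
```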
Memory Issues
If you run out of memory:
- Reduce `chunk_size` (try 3 or even 1)
- Reduce `concurrency`
- Process fewer subjects at a time using the `subjects` parameter
Hydra Output Directory
By default, Hydra creates timestamped output directories in `outputs/`. To disable this:
```bash
python -m algos.character_masks.demo.run_batch_generate_dataset \
    hydra.run.dir=. \
    hydra.output_subdir=null \
    dataset_root=/path/to/dataset \
    subject_mapping_csv=/path/to/mapping.csv
```