MediaPipe Face Mesh Demo with Recording

Real-time face mesh detection and visualization on a live camera feed using MediaPipe and OpenCV, plus 3D mesh recording and export.

Features

  • 468 3D Face Landmarks - Comprehensive face mesh tracking
  • Real-time Processing - Optimized for live camera feed
  • Multiple Visualizations:
    • Face tessellation (mesh network)
    • Face contours (outline)
    • Iris tracking (detailed eye landmarks)
  • Camera Management:
    • Auto-detection of available cameras
    • Support for multiple cameras (built-in, iPhone, external)
    • Camera switching on-the-fly
  • Interactive Controls:
    • Pause/resume functionality
    • Frame capture and saving
    • FPS monitoring
  • Recording & Export:
    • Record up to 150 frames (5 seconds at 30fps)
    • Export 3D face meshes as OBJ files
    • Per-vertex colors based on facial semantics
    • Organized session-based storage
  • Reprocessing:
    • Regenerate mesh files from existing recordings
    • Batch process multiple frames
    • Useful when live recording fails
  • Smart Defaults - Automatically uses camera index 1 (typically front-facing)

Usage

Basic Usage

# Run with default camera (index 1)
python -m algos.character_masks.demo.face_mesh_demo

# Use specific camera index
python -m algos.character_masks.demo.face_mesh_demo --camera 0

Camera Management

# List all available cameras
python -m algos.character_masks.demo.face_mesh_demo --list
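Under the hood, camera listing with OpenCV usually amounts to probing indices. A minimal sketch of how --list likely works (the demo's actual probing logic may differ):

import cv2

def list_cameras(max_index=5):
    """Probe camera indices and return those that open and deliver a frame."""
    available = []
    for idx in range(max_index):
        cap = cv2.VideoCapture(idx)
        if cap.isOpened():
            ok, _ = cap.read()
            if ok:
                available.append(idx)
        cap.release()
    return available

print(list_cameras())  # e.g., [0, 1]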

Recording Management

# List all recording sessions
python -m algos.character_masks.demo.face_mesh_demo --list-sessions

# Reprocess an existing recording session
python -m algos.character_masks.demo.face_mesh_demo --reprocess session_20250916_002902

# Reprocess using full path
python -m algos.character_masks.demo.face_mesh_demo --reprocess /path/to/recording/session

Controls

Key      Action
q        Quit the demo
s        Save current frame as image
r        Reset FPS counter
c        Switch to next camera
v        Start/Stop recording (video + 3D meshes)
SPACE    Pause/Resume
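For orientation, these keys map onto a standard OpenCV event loop. The following is an illustrative sketch, not the demo's exact code; the real handlers also drive mesh export and on-screen status:

import cv2

cap = cv2.VideoCapture(1)                  # the demo's default camera index
recording, paused = False, False
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow('face_mesh_demo', frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('q'):                    # quit
        break
    elif key == ord('s'):                  # save current frame
        cv2.imwrite('capture.jpg', frame)
    elif key == ord('v'):                  # toggle recording
        recording = not recording
    elif key == ord(' '):                  # pause/resume (flag only, for brevity)
        paused = not paused
cap.release()
cv2.destroyAllWindows()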

Camera Notes

On macOS with iPhone connected via Continuity Camera:

  • Camera 0: Usually iPhone rear camera
  • Camera 1: Usually iPhone front camera or Mac built-in (default)

The demo automatically defaults to camera 1 for face tracking.

Recording Feature

How Recording Works

When you press v during the live demo:

  • Frame Capture: Records up to 150 frames (approximately 5 seconds at 30fps)
  • Real-time Processing: Extracts face mesh landmarks from each frame
  • 3D Mesh Export: Generates OBJ files with vertex colors for each frame
  • Organized Storage: Saves everything in timestamped session folders, as sketched below
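A minimal sketch of the session setup (paths follow the layout documented below; the demo's actual bookkeeping also tracks metadata and frame counters):

from datetime import datetime
from pathlib import Path

root = Path("algos/character_masks/demo/recording")
session = root / f"session_{datetime.now():%Y%m%d_%H%M%S}"
(session / "frames").mkdir(parents=True, exist_ok=True)   # JPEG frames
(session / "meshes").mkdir(parents=True, exist_ok=True)   # OBJ meshes
(session / "metadata.txt").write_text("session started\n")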

Output Structure

Each recording session creates:

algos/character_masks/demo/recording/session_YYYYMMDD_HHMMSS/
├── frames/                          # JPEG images of recorded frames
│   ├── frame_0000.jpg
│   ├── frame_0001.jpg
│   └── ...
├── meshes/                          # 3D mesh files with vertex colors
│   ├── face_mesh_timestamp_001.obj
│   ├── face_mesh_timestamp_002.obj
│   └── ...
└── metadata.txt                     # Session information and statistics

Vertex Colors

The 3D meshes include semantic vertex coloring (see the sketch after this list):

  • Red: Lips and mouth area
  • Blue: Eyes and iris regions
  • Brown: Eyebrow areas
  • Light skin tone: Face oval and general face area
  • Depth shading: Subtle lighting based on relative depth
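MediaPipe ships landmark connection sets (FACEMESH_LIPS, FACEMESH_LEFT_EYE, and so on) that can drive this kind of coloring. A minimal sketch, with the exact RGB values as assumptions rather than the demo's real palette:

import numpy as np
import mediapipe as mp

mp_fm = mp.solutions.face_mesh

def landmark_ids(connections):
    """Flatten a MediaPipe connection set (index pairs) into a set of vertex ids."""
    return {i for edge in connections for i in edge}

NUM_LANDMARKS = 478                                       # refined mode; 468 without irises
colors = np.tile([0.87, 0.72, 0.63], (NUM_LANDMARKS, 1))  # default: light skin tone

for i in landmark_ids(mp_fm.FACEMESH_LIPS):
    colors[i] = [0.8, 0.1, 0.1]                           # red: lips and mouth
for conns in (mp_fm.FACEMESH_LEFT_EYE, mp_fm.FACEMESH_RIGHT_EYE, mp_fm.FACEMESH_IRISES):
    for i in landmark_ids(conns):
        colors[i] = [0.1, 0.2, 0.8]                       # blue: eyes and irises
for conns in (mp_fm.FACEMESH_LEFT_EYEBROW, mp_fm.FACEMESH_RIGHT_EYEBROW):
    for i in landmark_ids(conns):
        colors[i] = [0.4, 0.25, 0.15]                     # brown: eyebrows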

Reprocessing

If mesh generation fails during live recording (common due to processing load), you can reprocess any session:

  1. List available sessions: --list-sessions
  2. Reprocess specific session: --reprocess SESSION_NAME
  3. Check results: Mesh files will be regenerated in the meshes/ folder

This is particularly useful when:

  • Live recording couldn't keep up with mesh generation
  • You want to regenerate meshes with updated algorithms
  • Mesh files got corrupted or deleted

Performance

The demo displays real-time performance metrics:

  • FPS (Frames Per Second)
  • Frame counter
  • Face detection status
  • Active camera index
  • Recording status (when active, shows frame count and REC indicator)

Technical Details

Face Mesh Detection

  • Uses MediaPipe's Face Mesh solution
  • Provides 468 3D facial landmarks, or 478 with refined landmarks enabled
  • Enables refined landmarks for better accuracy around the eyes and lips
  • Configurable detection and tracking confidence thresholds (see the sketch below)
  • Optimized buffer settings for minimal latency
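A minimal configuration sketch using MediaPipe's public API (the demo's exact thresholds may differ):

import cv2
import mediapipe as mp

face_mesh = mp.solutions.face_mesh.FaceMesh(
    max_num_faces=1,
    refine_landmarks=True,           # 478 landmarks including irises; False gives 468
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5,
)

cap = cv2.VideoCapture(1)
ok, frame = cap.read()
cap.release()
if ok:
    # MediaPipe expects RGB input; OpenCV captures BGR.
    results = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_face_landmarks:
        print(len(results.multi_face_landmarks[0].landmark))  # 478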

3D Mesh Export

  • OBJ format with vertex colors
  • Uses MediaPipe's face tessellation for triangular faces
  • Per-vertex RGB colors encoding facial semantics
  • Handles both 468 (base) and 478 (refined, with irises) landmark modes
  • Real-time coordinate transformation from normalized to pixel space (see the export sketch below)
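The OBJ writer boils down to the widely supported "v x y z r g b" vertex-color extension. A sketch under the assumption that triangle indices have already been derived from the tessellation (that derivation is not shown):

def write_obj(path, vertices, colors, triangles):
    """Write an OBJ with per-vertex RGB colors.

    vertices: (N, 3) floats in pixel space; landmarks arrive normalized in
    [0, 1], so scale first, e.g. verts * [frame_w, frame_h, frame_w].
    colors: (N, 3) floats in [0, 1]; triangles: (M, 3) 0-based indices.
    """
    with open(path, "w") as f:
        for (x, y, z), (r, g, b) in zip(vertices, colors):
            f.write(f"v {x:.6f} {y:.6f} {z:.6f} {r:.4f} {g:.4f} {b:.4f}\n")
        for i, j, k in triangles:
            f.write(f"f {i + 1} {j + 1} {k + 1}\n")  # OBJ indices are 1-based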

File Organization

  • Session-based recording with timestamps
  • Git-ignored recording directories (**/recording/**)
  • Automatic metadata generation and tracking
  • Support for batch reprocessing of existing sessions

Troubleshooting

If you encounter issues:

  1. No camera detected:

    • Check System Settings > Privacy & Security > Camera
    • Ensure Terminal/IDE has camera permissions
    • Close other apps using the camera
  2. Black screen or wrong camera:

    • Use --list to see available cameras
    • Try different camera indices with --camera N
    • Press c while running to switch cameras
  3. Poor performance:

    • Ensure good lighting conditions
    • Position face clearly in frame
    • Check that no other heavy processes are running
  4. Recording issues:

    • Recording stops automatically at 150 frames
    • If mesh generation fails during recording, use --reprocess
    • Check that demo/recording/ directory has write permissions
    • Ensure sufficient disk space for frame images and mesh files
  5. Mesh files missing or corrupted:

    • Use --list-sessions to see available sessions
    • Use --reprocess SESSION_NAME to regenerate mesh files
    • Check that face was clearly visible during recording

Dependencies

Requires the Pajama project virtual environment with:

  • MediaPipe (face mesh detection)
  • OpenCV-Python (camera capture and image processing)
  • NumPy (numerical operations)
  • PyTorch (tensor operations for landmark processing)

All dependencies are already installed in /Users/kvenkateshan/code/pajama/venv/

Example Workflow

  1. Start recording session:

    python -m algos.character_masks.demo.face_mesh_demo
    # Press 'v' to start recording
    # Move your head around for 5 seconds
    # Press 'v' again to stop, or wait for auto-stop
  2. Check recorded sessions:

    python -m algos.character_masks.demo.face_mesh_demo --list-sessions
  3. Reprocess if needed:

    python -m algos.character_masks.demo.face_mesh_demo --reprocess session_20250916_002902
  4. View results:

    • Frame images: algos/character_masks/demo/recording/session_*/frames/
    • 3D meshes: algos/character_masks/demo/recording/session_*/meshes/
    • Import OBJ files into Blender, MeshLab, or other 3D software

Batch Frame Generation Script

Generate landmarks and expression captions for video frames from a subject-based dataset.

Features

  • Memory-efficient: Only loads small chunks of images at a time
  • Pandas-based: Uses dataframes for efficient data management
  • AVIF support: Handles AVIF format images via Pillow
  • Filtering: Can ignore specific segments by name or file
  • Parallel processing: Uses concurrent threads within each chunk (sketched after this list)
  • Parquet output: Saves results as efficient Parquet files
  • Hydra configuration: Easy configuration management via YAML files
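To make the chunking and concurrency concrete, here is a minimal sketch of one chunk's lifecycle; process_frame is a hypothetical stand-in for the script's per-frame work (image load, landmark extraction, caption generation):

from concurrent.futures import ThreadPoolExecutor
import pandas as pd

def process_frame(path):
    # Hypothetical per-frame worker; the real script extracts landmarks
    # and generates an expression caption here.
    return {"frame_path": path, "landmarks_detected": False}

def process_chunk(frame_paths, chunk_id, out_dir, concurrency=4):
    """Run one small chunk in parallel threads and write it to Parquet."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        records = list(pool.map(process_frame, frame_paths))
    df = pd.DataFrame(records)
    df.to_parquet(f"{out_dir}/chunk_{chunk_id:04d}.parquet")  # requires pyarrow
    return df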

Installation

pip install pandas pillow-heif tqdm hydra-core pyarrow

Dataset Structure

dataset_root/
├── subject_id_mapping.csv           # Maps subject indices (column: index)
├── 0/                               # Subject folder
│   ├── frame_list.csv               # Lists seg_id and frame_id
│   ├── 000001.avif                  # Frame images in AVIF format
│   ├── 000002.avif
│   └── ...
├── 1/
│   └── ...
└── expression_generated/            # Output directory (created by script)
    ├── chunk_0000.parquet
    ├── chunk_0001.parquet
    └── all_results.parquet

Usage

1. Using a Config File

Create a config file (e.g., conf/my_experiment.yaml):

dataset_root: /path/to/dataset
subject_mapping_csv: /path/to/subject_id_mapping.csv
subjects: [0, 1, 2] # Optional: specific subjects
ignore_segments: ["SEN_bad1", "SEN_bad2"] # Optional: segments to ignore
concurrency: 8
chunk_size: 5

Run with your config:

python -m algos.character_masks.demo.run_batch_generate_dataset --config-name=my_experiment

2. Using Default Config with Overrides

python -m algos.character_masks.demo.run_batch_generate_dataset \
dataset_root=/path/to/dataset \
subject_mapping_csv=/path/to/mapping.csv \
concurrency=8 \
chunk_size=5

3. Override Specific Parameters

# Process only specific subjects
python -m algos.character_masks.demo.run_batch_generate_dataset \
--config-name=my_experiment \
subjects=[0,1,2]

# Ignore specific segments
python -m algos.character_masks.demo.run_batch_generate_dataset \
--config-name=my_experiment \
ignore_segments=["SEN_bad1","SEN_bad2"]

# Change chunk size for memory management
python -m algos.character_masks.demo.run_batch_generate_dataset \
--config-name=my_experiment \
chunk_size=3

Configuration Parameters

Parameter             Type       Default   Description
dataset_root          str        required  Path to dataset root directory
subject_mapping_csv   str        required  Path to subject_id_mapping.csv
subjects              list[int]  null      Specific subject indices to process
ignore_segments       list[str]  null      Segment names to ignore
concurrency           int        4         Concurrent threads per chunk
chunk_size            int        5         Frames per chunk (keep small for memory)
show_results          int        5         Number of sample results to display

Output Format

Results are saved as Parquet files with the following schema:

subject_index: int64            # Subject index number
frame_id: int64                 # Frame ID
segment_name: str               # Segment name (e.g., "SEN_...")
expression_caption: str         # Generated expression description
landmarks_detected: bool        # Whether face was detected
landmarks_num: int64            # Number of landmarks (e.g., 478)
landmarks_coordinates: array    # NumPy array of shape (N, 3) for [x, y, z]

Reading Results

import pandas as pd

# Read all results
df = pd.read_parquet('dataset_root/expression_generated/all_results.parquet')

# Filter to specific subject
subject_0 = df[df['subject_index'] == 0]

# Get frames with detected landmarks
detected = df[df['landmarks_detected']]

# Access landmark coordinates (returns numpy array)
landmarks = df.iloc[0]['landmarks_coordinates']
print(landmarks.shape) # e.g., (478, 3)

Examples

Process all subjects with default settings

python -m algos.character_masks.demo.run_batch_generate_dataset \
dataset_root=/data/faces \
subject_mapping_csv=/data/faces/mapping.csv

Process specific subjects with high concurrency

python -m algos.character_masks.demo.run_batch_generate_dataset \
dataset_root=/data/faces \
subject_mapping_csv=/data/faces/mapping.csv \
subjects=[0,1,2,5,10] \
concurrency=16 \
chunk_size=10

Ignore specific segments

python -m algos.character_masks.demo.run_batch_generate_dataset \
dataset_root=/data/faces \
subject_mapping_csv=/data/faces/mapping.csv \
'ignore_segments=["SEN_bad1","SEN_bad2","SEN_bad3"]'

Troubleshooting

AVIF Loading Issues

If you get errors loading AVIF files, ensure pillow-heif is installed:

pip install pillow-heif
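If the script does not already register the opener for you, doing it manually looks like this (a minimal sketch; the sample path is illustrative):

from PIL import Image
from pillow_heif import register_avif_opener

register_avif_opener()               # enable AVIF decoding in Pillow
img = Image.open("000001.avif")      # illustrative frame path
print(img.size)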

Memory Issues

If you run out of memory:

  • Reduce chunk_size (try 3 or even 1)
  • Reduce concurrency
  • Process fewer subjects at a time using subjects parameter

Hydra Output Directory

By default, Hydra creates output directories in outputs/ with timestamps. To disable this:

python -m algos.character_masks.demo.run_batch_generate_dataset \
hydra.run.dir=. \
hydra.output_subdir=null \
dataset_root=/path/to/dataset \
subject_mapping_csv=/path/to/mapping.csv