
RTC Calling System Design Document

Overview

The Pajama iOS app implements a real-time calling system that integrates with iOS CallKit for a native call UI while supporting multiple RTC providers (Daily.co, Stream Video). The system follows a protocol-oriented architecture that separates UI coordination, CallKit integration, RTC providers, and backend services.

Getting Started

Running Calls Locally

  1. iOS Simulator Limitations: CallKit doesn't work properly in the iOS Simulator - it automatically hangs up calls. The system uses NoOpCallKitCoordinator in simulator builds to bypass CallKit while maintaining the same interfaces.

  2. Testing on Device: For full CallKit integration testing, you must use a physical iOS device.

  3. Provider Selection: The active RTC provider is selected at compile time in AppServices.swift:

    typealias AppRTCConfiguration = DailyConfiguration  // or StreamVideoConfiguration

Key Terminology

  • CallKit: iOS system framework for native call UI integration
  • CallKitCoordinator: Our business logic layer that orchestrates CallKit integration
  • RTCProvider: Vendor-agnostic protocol for video calling services (Daily, Stream)
  • GenericCallCoordinator: Main UI state manager for calling features

Call States

The system uses the RTCCallState enum to track the call lifecycle:

  • .idle: No active call
  • .incoming(callerId, callerName): Incoming call ringing with caller info
  • .outgoing: Outgoing call dialing (shows "Calling..." UI)
  • .joining: Connecting to call room (typically hidden from user)
  • .connected: Active call with media flowing
  • .reconnecting: Temporary network disruption, attempting to reconnect
  • .disconnecting: User initiated hang up, waiting for cleanup
  • .disconnected: Call ended, returning to idle
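
As a rough illustration, the states above could be modeled as a Swift enum along these lines. The associated value types (String) and the isRinging helper are assumptions made for this sketch; the source only names the cases.

```swift
/// Sketch of the call lifecycle states listed above.
/// Associated value types are assumed; the document only names them.
enum RTCCallState: Equatable {
    case idle
    case incoming(callerId: String, callerName: String)
    case outgoing
    case joining
    case connected
    case reconnecting
    case disconnecting
    case disconnected

    /// Hypothetical helper: true while a call is ringing on either side.
    var isRinging: Bool {
        switch self {
        case .incoming, .outgoing:
            return true
        default:
            return false
        }
    }
}
```

Making the enum Equatable lets UI code and tests compare states directly, for example when asserting a transition sequence.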

System Architecture

High-Level Component Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                            SwiftUI Views                            │
│                   (CallView, ContactsView, etc.)                    │
└─────────────────────────┬───────────────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────────────┐
│                  GenericCallCoordinator<Provider>                   │
│                  (@MainActor, UI State Management)                  │
│  • Published state (isActive, callingState, participants)           │
│  • Call actions (startCall, joinCall, acceptCall, etc.)             │
│  • Audio/Video controls (toggleAudio, toggleVideo)                  │
└─────────────────────────┬───────────────────────────────────────────┘
                          │
    ┌─────────────────────┼─────────────────────┐
    │                     │                     │
    ▼                     ▼                     ▼
┌─────────────┐ ┌───────────────────┐ ┌──────────────────┐
│CallBackend  │ │  CallKitHandling  │ │   RTCProvider    │
│Service      │ │   (Coordinator)   │ │  (Daily/Stream)  │
│             │ │                   │ │                  │
│Backend API  │ │CallKit Integration│ │    Vendor SDK    │
│Firestore    │ │& Business Logic   │ │   Integration    │
└─────────────┘ └─────────┬─────────┘ └──────────────────┘
                          │
            ┌─────────────┼───────────────┐
            │             │               │
            ▼             ▼               ▼
     ┌─────────────┐ ┌──────────────┐ ┌─────────────┐
     │CallKit      │ │CallKit       │ │PushKit      │
     │Event        │ │Event         │ │Service      │
     │Reporter     │ │Receiver      │ │             │
     │(TO CallKit) │ │(FROM CallKit)│ │VoIP Push    │
     └─────────────┘ └──────────────┘ └─────────────┘

Protocol Architecture

The system uses protocol-oriented design with clear separation of concerns:

CallKitHandling (Business Logic Orchestrator)
├── Inherits from CallKitEventReceiving + CallKitEventReporting
├── Coordinates between CallKit and RTC systems
├── Manages call state, caching, and participant tracking
└── Implements both inbound and outbound CallKit operations

CallKitEventReceiving (FROM CallKit TO App)
├── Defines callbacks triggered by iOS CallKit user interactions
├── Handles incoming call notifications from PushKit
└── Uses delegation pattern to forward events to business logic

CallKitEventReporting (FROM App TO CallKit)
├── Defines interface for reporting call states TO iOS CallKit
├── Pure reporting operations with no business logic
└── Async methods for actor compatibility
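
The protocol split above might look roughly like the following sketch. Only onIncomingCall and reportOutgoingCall are names taken from this document; every other method name and all parameter lists are hypothetical. The recording actor is an illustrative test double, not the app's CallKitCoordinator.

```swift
import Foundation

/// FROM CallKit TO app. Signatures are assumptions for illustration.
protocol CallKitEventReceiving {
    func onIncomingCall(callId: String, callerName: String) async
    func onAnswerCall(callKitUUID: UUID) async   // hypothetical name
    func onEndCall(callKitUUID: UUID) async      // hypothetical name
}

/// FROM app TO CallKit. Async so actors can conform without blocking.
protocol CallKitEventReporting {
    func reportOutgoingCall(callKitUUID: UUID, displayName: String) async
    func reportCallEnded(callKitUUID: UUID) async // hypothetical name
}

/// The orchestrator protocol inherits both directions.
protocol CallKitHandling: CallKitEventReceiving, CallKitEventReporting {}

/// Illustrative conformer that only records which entry points were hit.
actor RecordingCallKitHandler: CallKitHandling {
    private(set) var events: [String] = []

    func onIncomingCall(callId: String, callerName: String) { events.append("onIncomingCall") }
    func onAnswerCall(callKitUUID: UUID) { events.append("onAnswerCall") }
    func onEndCall(callKitUUID: UUID) { events.append("onEndCall") }
    func reportOutgoingCall(callKitUUID: UUID, displayName: String) { events.append("reportOutgoingCall") }
    func reportCallEnded(callKitUUID: UUID) { events.append("reportCallEnded") }
}
```

Because the requirements are async, an actor can satisfy them with ordinary isolated methods, which is one way to read the "Async methods for actor compatibility" note.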

Component Responsibilities

1. GenericCallCoordinator<Provider> (@MainActor)

Primary Responsibility: UI state management and user interaction coordination

Key Functions:

  • Maintains @Published properties for SwiftUI reactivity
  • Handles user call actions (start, join, accept, reject, leave)
  • Manages audio/video controls and ready state
  • Bridges async operations to UI thread
  • Manages outgoing call ringtone and ring timeout behavior
  • Coordinates between RTCProvider and CallKitHandling

Threading: Runs on MainActor for UI updates, spawns Tasks for async operations

State Management:

  • REACTIVE ONLY: Subscribes to provider state changes via callStatePublisher
  • Publishes UI-ready state through @Published properties for SwiftUI
  • NO DIRECT STATE MUTATIONS: All state changes originate from the provider
  • Handles state-triggered side effects (ringtones, timeouts, CallKit reporting)

2. CallKitCoordinator (actor, implements CallKitHandling)

Primary Responsibility: Business logic orchestration between CallKit and RTC systems

Key Functions:

  • UUID Management: Maps backend callIds to unique CallKit UUIDs per session
  • State Synchronization: Reports RTC state changes to CallKit via reporter
  • Participant Tracking: Implements "Cancelled Call" behavior - delays reporting "connected" to CallKit until remote participants join, ensuring calls that end before anyone joins show as "Cancelled Call" rather than a timed call in iOS call history
  • Ring Timeout Management: Enforces 30-second timeout for unanswered calls with "No Answer" feedback
  • Cache Management: Stores call join information for CallKit flow
  • Event Delegation: Receives events from CallKitEventReceiver and coordinates responses
  • Composition: Uses CallKitEventReporter and CallKitEventReceiver via dependency injection

Threading: Actor for thread-safe state management

3. CallKitEventReporter (implements CallKitEventReporting)

Primary Responsibility: Pure outbound communication TO iOS CallKit

Key Functions:

  • Direct CallKit API interface for reporting call states
  • No business logic - just translation between app events and CallKit APIs
  • Uses CXCallController and CXProvider for CallKit operations
  • Handles CallKit transaction creation and execution

Threading: Can be called from any thread, handles CallKit's threading requirements internally

Design: Pure reporting class with no receiving logic or business state

4. CallKitEventReceiver (implements CallKitEventReceiving + CXProviderDelegate)

Primary Responsibility: Pure inbound communication FROM iOS CallKit

Key Functions:

  • Receives callbacks from iOS CallKit via CXProviderDelegate
  • Translates CallKit events into app-specific call events
  • Forwards events to CallKitHandling delegate for business logic
  • Handles CallKit action fulfillment/failure responses

Threading: Handles CallKit's threading requirements, forwards to delegate appropriately

Design: Pure receiving class that delegates all business logic to CallKitHandling

5. RTCProvider Protocol & Implementations

Primary Responsibility: SINGLE SOURCE OF TRUTH for all call state and vendor-agnostic RTC operations

Implementations:

  • DailyRTCProvider: Daily.co integration
  • StreamVideoRTCProvider: Stream Video SDK integration

Key Functions:

  • State Management: Complete ownership of call state transitions via callStatePublisher
  • Three State Control Methods: prepareForOutgoingCall(), failOutgoingCall(), handleCallTimeout()
  • Room joining/leaving operations
  • Audio/video control
  • Participant management
  • Video view creation

State Architecture:

  • Provider controls ALL state: CallCoordinator only reacts to state changes
  • Immediate UI feedback: prepareForOutgoingCall() provides instant .outgoing state
  • Consistent error handling: failOutgoingCall() reverts state on failures
  • Timeout management: handleCallTimeout() handles ring timeout scenarios

Threading: MainActor for UI integration, internal async operations

6. CallBackendService Protocol

Primary Responsibility: Backend API communication

Key Functions:

  • Call lifecycle management (start, join, leave)
  • Active call monitoring via Firestore listeners
  • Participant tracking on server side
  • Token management for RTC providers
  • Call metadata retrieval using all_participants field

7. Supporting Components

NoOpCallKitCoordinator: Simulator-safe CallKitHandling implementation that logs but doesn't interact with CallKit (iOS Simulator doesn't support CallKit and auto-hangs up calls immediately)
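
One plausible way to wire this up is a compile-time check on the target environment. In the sketch below the two classes are empty stand-ins (the real CallKitCoordinator is an actor) and makeCallKitHandler is a hypothetical factory name; only the #if targetEnvironment(simulator) condition is standard Swift.

```swift
/// Empty stand-ins so the sketch compiles on its own.
protocol CallKitHandling {}
final class CallKitCoordinator: CallKitHandling {}      // real one is an actor
final class NoOpCallKitCoordinator: CallKitHandling {}  // logs only, never touches CallKit

/// Hypothetical factory: picks the simulator-safe implementation at compile time.
func makeCallKitHandler() -> any CallKitHandling {
    #if targetEnvironment(simulator)
    return NoOpCallKitCoordinator()
    #else
    return CallKitCoordinator()
    #endif
}
```

Since both types share the CallKitHandling interface, callers do not need their own simulator checks.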

CallKitConfiguration: Shared CXProviderConfiguration between reporter and receiver

PushKitService: VoIP push notification handling with CallKitEventReceiver integration

Backend Architecture

Firestore Call Documents

Call state is persisted in Firestore at /calls/{callId} with the following schema:

/calls/{callId}
├── active_participants: string[] # Currently in the call
├── all_participants: string[] # Everyone invited (used for name resolution)
├── invited_participants: string[] # Invited but haven't joined yet
├── left_participants: string[] # Previously joined but left
├── caller_id: string # User who initiated the call
├── created_at: timestamp
├── updated_at: timestamp
├── ended_at: timestamp | null
├── ended_by: string | null
├── provider: string # "daily" or "stream"
├── provider_room_id: string # RTC provider's room identifier
└── status: "active" | "inactive"
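
A minimal Codable model for this schema could look like the sketch below. The Swift type and property names are assumptions, and so is the decoding path: the example decodes plain JSON with timestamps as epoch seconds to stay self-contained, whereas the app presumably reads these documents through the Firestore SDK.

```swift
import Foundation

/// Sketch of a call document model; names are illustrative.
struct CallDocument: Codable {
    enum Status: String, Codable { case active, inactive }

    let activeParticipants: [String]
    let allParticipants: [String]
    let invitedParticipants: [String]
    let leftParticipants: [String]
    let callerId: String
    let createdAt: Date
    let updatedAt: Date
    let endedAt: Date?       // null while the call is active
    let endedBy: String?
    let provider: String     // "daily" or "stream"
    let providerRoomId: String
    let status: Status
}

/// Decodes snake_case JSON; epoch-second timestamps are an assumption of this sketch.
func decodeCallDocument(from json: Data) throws -> CallDocument {
    let decoder = JSONDecoder()
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    decoder.dateDecodingStrategy = .secondsSince1970
    return try decoder.decode(CallDocument.self, from: json)
}
```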

Call ID Architecture

The system uses two distinct identifiers:

  1. call_id (UUID): Our internal tracking ID, matches Firestore document ID

    • Used for all internal operations and state tracking
    • Generated by our backend when call is created
    • Consistent across app restarts and rejoins
  2. provider_room_id (String): RTC provider's room identifier

    • Required for all RTC provider API calls
    • Different format per provider (Daily, Stream)
    • Must be paired with provider token for room access

Call Lifecycle (Server Perspective)

startCall()  →  Call document created with status: "active"
    │           caller added to active_participants
    │           others added to invited_participants
    ▼
joinCall()   →  Participant moved from invited → active
    │
    ▼
leaveCall()  →  Participant moved from active → left
    │
    ▼
Last participant leaves  →  status: "inactive"
                            ended_at: timestamp
                            RTC provider room closed
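
The participant bookkeeping above can be sketched as a small in-memory model. This is an illustration of the transitions only, not the backend's code; the CallRecord type and its method names are hypothetical, and rejoining after leaving is deliberately left out because the lifecycle above does not describe it.

```swift
/// In-memory illustration of the server-side participant transitions.
struct CallRecord {
    enum Status { case active, inactive }

    private(set) var status: Status = .active
    private(set) var active: Set<String>
    private(set) var invited: Set<String>
    private(set) var left: Set<String> = []

    /// startCall(): caller becomes active, everyone else is invited.
    init(caller: String, recipients: [String]) {
        active = [caller]
        invited = Set(recipients)
    }

    /// joinCall(): move a participant from invited to active.
    mutating func join(_ user: String) {
        guard status == .active, invited.remove(user) != nil else { return }
        active.insert(user)
    }

    /// leaveCall(): move a participant from active to left.
    /// Returns true if this was the last participant, which ends the call.
    @discardableResult
    mutating func leave(_ user: String) -> Bool {
        guard status == .active, active.remove(user) != nil else { return false }
        left.insert(user)
        if active.isEmpty { status = .inactive }
        return active.isEmpty
    }
}
```

The Bool returned by leave mirrors the callEnded flag that the leaveCall endpoint reports.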

CallBackendService Endpoints

  1. startCall(recipients): Creates call document, returns:

    StartCallResponse(
    callId: UUID,
    providerRoomId: String,
    providerToken: String? // Currently unused
    )
  2. joinCall(callId): Updates participant arrays, returns:

    JoinCallResponse(
    providerRoomId: String,
    providerToken: String?
    )
  3. leaveCall(callId, reason?): Updates participant arrays, returns:

    LeaveCallResponse(
    callEnded: Bool // true if last participant
    )

    Optional reason parameter tracks disconnect reasons for analytics:

    • .userLeft: Normal user hangup
    • .rejected: Call declined by recipient
    • .timedOut: 30-second ring timeout reached
    • .networkError: Connection lost
    • .serverError: Technical failure
  4. getCallMetadata(callId): Fetches call document

  5. listenForActiveCalls(): Real-time listener for calls where user is in all_participants

Server-Side Behavior

  • No automatic cleanup: Stale calls remain until explicitly ended
  • Participant arrays: Maintained by client-triggered backend calls
  • Room lifecycle: RTC provider rooms closed only when last participant leaves
  • No webhooks: Currently no integration with RTC provider webhooks
  • No server monitoring: Call quality/state monitoring requires explicit queries to RTC provider

Security Model

  • No token validation: Calls rely on security through obscurity; the only barrier to joining is knowing the call's UUID
  • No participant authorization: Any user can join any call if they have the callId
  • Future token support: Scaffolding exists for provider token validation
  • Firestore rules: Standard app-level authentication required

This backend architecture enables call persistence across app restarts and provides the foundation for features like rejoin, call history, and multi-device support.

Call Lifecycles

App-Initiated Outgoing Call Lifecycle

User Taps Call in App


┌─────────────────────────────────────────────────────────────────┐
│ GenericCallCoordinator.startCall() │
│ 1. provider.prepareForOutgoingCall() → Immediate .outgoing state│
│ 2. Call CallBackendService.startCall() → Backend creates room │
│ 3. Resolve recipient display name for CallKit │
│ 4. Call CallKitHandling.startOutgoingCall() → Report to CallKit│
│ 5. Call RTCProvider.joinCall() → Join vendor room │
│ 6. On failure: provider.failOutgoingCall() → Revert to .idle │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ CallKitCoordinator.startOutgoingCall() │
│ 1. Generate unique CallKit UUID for session │
│ 2. Cache call join info │
│ 3. Delegate to CallKitEventReporter.reportOutgoingCall() → Registers call with CallKit │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ Provider-Controlled State Flow │
│ .idle → .outgoing → .joining → .connected │
│ │
│ PROVIDER publishes state changes: │
│ • CallCoordinator reacts via callStatePublisher subscription │
│ • CallKit status updates triggered by state changes │
│ • UI updates via @Published properties in CallCoordinator │
│ • Participant monitoring and ringtone management │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ Participant-Aware CallKit Reporting │
│ │
│ Connected to room: │
│ ├─ Has remote participants → Report "connected" (timer starts) │
│ └─ No participants → Wait for join (shows "Calling...") │
│ │
│ Participant joins → Report "connected" (timer starts) │
│ Hang up before anyone joins → "Cancelled Call" in call log │
└─────────────────────────────────────────────────────────────────┘

Incoming Call Lifecycle

VoIP Push Notification
(wakes/opens app if backgrounded/closed)


┌─────────────────────────────────────────────────────────────────┐
│ PushKitService Processing │
│ 1. Parse notification payload (callId, caller, room details) │
│ 2. IMMEDIATELY report to CallKit via callKitAPI.reportIncomingCall() │
│ (Required by iOS - must happen before completion handler) │
│ 3. Call CallKitEventReceiver.onIncomingCall() for app state │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ CallKitEventReceiver.onIncomingCall() │
│ 1. Forward to delegate (CallKitCoordinator) │
│ 2. Delegate handles business logic │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ CallKitCoordinator.onIncomingCall() │
│ 1. Generate unique CallKit UUID │
│ 2. Cache call join info │
│ 3. Notify RTCProvider → Update state to .incoming │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ iOS CallKit UI │
│ Both native iOS and in-app incoming call screens appear │
│ User can: Answer | Decline | Ignore │
└─────────────────────────────────────────────────────────────────┘

┌───┴───┐
▼ ▼
Answer Decline
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ CallKitEventReceiver CXProviderDelegate │
│ 1. Receives CXAnswerCallAction or CXEndCallAction │
│ 2. Forwards to CallKitCoordinator via delegate │
│ 3. CallKitCoordinator coordinates with RTCProvider │
│ 4. Fulfills or fails the CallKit action │
└─────────────────────────────────────────────────────────────────┘

Available Call Banner Lifecycle

Active Call Exists (User Not Joined)


┌─────────────────────────────────────────────────────────────────┐
│ CallBackendService.listenForActiveCalls() │
│ 1. Firestore listener detects active call document │
│ 2. Fetch call metadata with all_participants field │
│ 3. Update GenericCallCoordinator.activeCalls │
└─────────────────────────────────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│ SwiftUI Available Call Banner │
│ 1. Shows contact name using all_participants field │
│ 2. Displays "Join" button for user action │
│ 3. Banner appears above main content │
└─────────────────────────────────────────────────────────────────┘

┌───┴───┐
▼ ▼
User Taps User Ignores
"Join" Banner
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ GenericCallCoordinator.joinCall() │
│ 1. Set UI to active state │
│ 2. Check for cached call join info from CallKitCoordinator │
│ 3. Fetch call metadata from backend if not cached │
│ 4. Call RTCProvider.joinCall() → User joins existing room │
└─────────────────────────────────────────────────────────────────┘

CallKit UUID Management

Critical Design: CallKit UUIDs must be unique per call session, not per call.

Backend Call ID: "ABC-123" (persistent across reconnects)

├─ Session 1: CallKit UUID "UUID-1" (first join)
├─ Session 2: CallKit UUID "UUID-2" (after disconnect/rejoin)
└─ Session 3: CallKit UUID "UUID-3" (another reconnect)

This prevents CallKit conflicts when users disconnect and rejoin the same call.
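
A minimal sketch of the per-session mapping, assuming a simple two-way dictionary. The CallKitSessionRegistry type and its method names are hypothetical; in the app this state lives inside the CallKitCoordinator actor, while the sketch uses a plain struct so it can be exercised synchronously.

```swift
import Foundation

/// Hypothetical registry: one fresh CallKit UUID per join session of a backend call.
struct CallKitSessionRegistry {
    private var uuidByCallId: [String: UUID] = [:]
    private var callIdByUUID: [UUID: String] = [:]

    init() {}

    /// Starts a new session for `callId`, retiring the UUID of any earlier session.
    mutating func beginSession(for callId: String) -> UUID {
        if let previous = uuidByCallId[callId] {
            callIdByUUID[previous] = nil
        }
        let uuid = UUID()
        uuidByCallId[callId] = uuid
        callIdByUUID[uuid] = callId
        return uuid
    }

    /// Resolves a CallKit UUID (for example from a CXAction) back to the backend call.
    func callId(for uuid: UUID) -> String? {
        callIdByUUID[uuid]
    }
}
```

Retiring the old UUID on rejoin means a late CallKit callback for a finished session no longer resolves to the live call.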

Ring Timeout Behavior

The ring timeout mimics FaceTime, with a 30-second ring duration:

Call Initiated (.outgoing or .incoming state)
    │
    ▼
Start 30-second timeout timer
    │
    ├─────── (Call answered/connected)
    │            │
    │            ▼
    │        Cancel timeout
    │        Continue normal call
    ▼
Timeout reached (30 seconds)
    │
    ▼
┌───────────────────────────────────────────────────┐
│ Ring Timeout Handler │
│ 1. Stop ringing sound │
│ 2. Show "No Answer" message (2.5 seconds) │
│ 3. provider.handleCallTimeout() → .disconnected │
│ 4. Report call end to CallKit │
│ 5. Send leaveCall(reason: .timedOut) to backend │
│ 6. Hide call overlay after message timeout │
└───────────────────────────────────────────────────┘

Implementation Details:

  • Timeout Duration: 30 seconds (configurable via ringTimeoutDuration constant)
  • User Feedback: Brief "No Answer" message before returning to previous screen
  • Backend Tracking: Disconnect reason sent for analytics and call quality monitoring
  • CallKit Integration: Proper call end reporting to show correct call history
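
The start/cancel part of this flow can be sketched with a cancellable Task. RingTimeout is a hypothetical helper, not the app's implementation; the six handler steps in the diagram would live inside the onTimeout closure. It relies on Task.sleep(nanoseconds:) throwing when the task is cancelled.

```swift
import Foundation

/// Hypothetical helper: runs `onTimeout` unless cancelled before `duration` elapses.
final class RingTimeout {
    private var task: Task<Void, Never>?

    /// Starts (or restarts) the timer. 30 seconds matches the documented ring duration.
    func start(after duration: TimeInterval = 30,
               onTimeout: @escaping @Sendable () async -> Void) {
        cancel()
        task = Task {
            do {
                try await Task.sleep(nanoseconds: UInt64(duration * 1_000_000_000))
            } catch {
                return // cancelled: the call was answered or ended first
            }
            await onTimeout()
        }
    }

    /// Call when the state leaves .outgoing/.incoming (answered, connected, ended).
    func cancel() {
        task?.cancel()
        task = nil
    }
}
```

A coordinator would call start when it observes a ringing state and cancel on any other state change, so a stale timer cannot end a call that has already connected.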

Participant-Aware CallKit Behavior

Implemented to mimic FaceTime behavior:

  1. Outgoing Call Started: Immediately reported to CallKit (enables native iOS call controls and call log integration) and opens full-screen live selfie view
  2. Room Connected: Check for existing participants
    • Has participants: Report "connected" → Timer starts
    • Empty room: Wait without reporting "connected" (CallKit keeps showing "Calling...")
  3. First Participant Joins: Report "connected" → Timer starts
  4. Hang Up Before Anyone Joins: Shows "Cancelled Call" in call log
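
The decision of when to report "connected" can be reduced to a small gate, sketched below. ConnectedReportGate is a hypothetical name. In the real flow, a true result would presumably trigger the reporter's CallKit call (CXProvider offers reportOutgoingCall(with:connectedAt:) for this), but that wiring is not shown here.

```swift
/// Hypothetical gate: decides when "connected" should be reported to CallKit.
struct ConnectedReportGate {
    private(set) var hasReportedConnected = false

    /// Returns true exactly once: the first time the local user is in the room
    /// and at least one remote participant is present.
    mutating func shouldReportConnected(isInRoom: Bool, remoteParticipantCount: Int) -> Bool {
        guard !hasReportedConnected, isInRoom, remoteParticipantCount > 0 else {
            return false
        }
        hasReportedConnected = true
        return true
    }
}
```

Evaluating the gate on both "room connected" and "participant joined" events covers steps 2 and 3 above with one code path. If the user hangs up while the gate has never fired, CallKit was never told the call connected, which is what yields the "Cancelled Call" entry.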

Threading Model

MainActor Components

  • GenericCallCoordinator: UI state management and SwiftUI integration
  • RTCProvider implementations: UI integration (video views, state publishing)

Actor Components

  • CallKitCoordinator: Thread-safe state management for CallKit coordination

Background Operations

  • CallBackendService: Network operations, Firestore listeners
  • Audio operations: Ring sounds played on background queue
  • CallKit reporting: Can be called from any thread

State Management Architecture

Provider-Centric State Control

The calling system implements a provider-centric state management architecture where the RTCProvider is the single source of truth for all call state. This design eliminates race conditions and ensures consistent state across the entire system.

Core Principles

  1. Single Source of Truth: Only the RTCProvider can change call state
  2. Reactive Updates: All other components subscribe to provider state changes
  3. Immediate UI Feedback: Critical user actions get instant state updates
  4. Consistent Error Handling: Standardized state reversion patterns

Three State Control Methods

The provider exposes three methods that give CallCoordinator controlled access to state management:

// 1. Immediate outgoing call preparation
func prepareForOutgoingCall()
// Sets state to .outgoing immediately for responsive UI

// 2. Error state reversion
func failOutgoingCall()
// Reverts to .idle if call setup fails

// 3. Timeout handling
func handleCallTimeout()
// Transitions to .disconnected for ring timeouts

State Flow Examples

Successful Outgoing Call:

.idle → prepareForOutgoingCall() → .outgoing → joinCall() → .joining → .connected

Failed Outgoing Call:

.idle → prepareForOutgoingCall() → .outgoing → joinCall() fails → failOutgoingCall() → .idle

Ring Timeout:

.outgoing → 30s timeout → handleCallTimeout() → .disconnected
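
These flows can be exercised with a small mock provider. The sketch below makes several simplifications that are assumptions, not the app's design: CallState is a trimmed stand-in for RTCCallState, a closure stands in for callStatePublisher to avoid a Combine dependency, and joinCall(succeeds:) simulates the vendor join.

```swift
/// Trimmed stand-in for RTCCallState (incoming and reconnecting omitted).
enum CallState: Equatable { case idle, outgoing, joining, connected, disconnected }

/// Mock provider that owns every state transition.
final class SketchProvider {
    private(set) var state: CallState = .idle {
        didSet { if state != oldValue { onStateChange?(state) } }
    }

    /// Stand-in for callStatePublisher: observers react, they never assign state.
    var onStateChange: ((CallState) -> Void)?

    func prepareForOutgoingCall() {
        if state == .idle { state = .outgoing }
    }

    func failOutgoingCall() {
        if state == .outgoing { state = .idle }
    }

    func handleCallTimeout() {
        if state == .outgoing { state = .disconnected }
    }

    /// Simulated vendor join; `succeeds` chooses the outcome.
    func joinCall(succeeds: Bool) {
        guard state == .outgoing else { return }
        guard succeeds else {
            failOutgoingCall()
            return
        }
        state = .joining
        state = .connected
    }
}
```

Guarding each method on the current state keeps out-of-order calls (for example a timeout firing after the call connected) from producing an invalid transition.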

Benefits

  • No Race Conditions: Only provider controls state transitions
  • Predictable State: All state changes flow through provider publishers
  • Testable: Mock providers can control exact state transitions
  • Maintainable: Clear ownership prevents scattered state mutations

Migration from Previous Architecture

The previous architecture had CallCoordinator directly setting callingState and isActive properties, leading to race conditions where state could flip-flop rapidly. The new architecture eliminates this by:

  1. Removing all direct state assignment from CallCoordinator
  2. Making provider the authoritative state source
  3. Converting CallCoordinator to purely reactive component
  4. Standardizing state control through the three provider methods

Key Design Decisions

1. Protocol-Oriented Architecture with Separation of Concerns

Why: Clear responsibilities, testability, maintainability

Implementation: Separate protocols for reporting TO and receiving FROM CallKit

Trade-off: More files and complexity vs clear interfaces

2. Composition Over Inheritance in CallKitCoordinator

Why: Avoid circular dependencies, single responsibility

Implementation: CallKitCoordinator uses CallKitEventReporter and CallKitEventReceiver

Trade-off: Thin wrapper functions vs clean separation

3. Actor for CallKitCoordinator Business Logic

Why: Thread-safe state management for CallKit integration

Implementation: Actor with async/await boundaries to MainActor components

Trade-off: Async boundaries vs race condition safety

4. Immediate CallKit Reporting

Why: Native iOS behavior matching FaceTime

Implementation: Report to CallKit before joining RTC room

Trade-off: Complexity in state management vs user experience

5. Participant-Aware Timer Logic

Why: Proper "Cancelled Call" behavior in call log

Implementation: Monitor participants and delay "connected" state

Trade-off: Implementation complexity vs correct UX

6. Provider-Centric State Management

Why: Eliminate race conditions and provide single source of truth

Implementation: Provider controls all state via three dedicated methods

Trade-off: Slightly more complex provider interface vs robust state consistency

7. Provider Abstraction

Why: Support multiple RTC vendors (Daily, Stream)

Implementation: Generic RTCProvider protocol with specific implementations

Trade-off: Generic interfaces vs vendor-specific features

8. Parallel Backend Operations

Why: Minimize user-visible latency during call operations

Implementation: Background Tasks for non-critical operations

Trade-off: Complexity in error handling vs perceived performance

Testing Strategy

Mock Architecture

  • MockCallKitCoordinator: Full CallKitHandling conformance with call recording
  • MockCallKitEventReceiver: CallKitEventReceiving simulation
  • MockCallKitEventReporter: CallKitEventReporting simulation
  • MockDailyRTCProvider: Controllable RTC state for UI testing
  • MockCallBackendService: Backend API simulation

Test Coverage Areas

  • State Transitions: All RTCCallState progressions
  • CallKit Integration: UUID mapping, state reporting, event delegation
  • Protocol Conformance: All protocol methods implemented correctly
  • Threading: MainActor isolation, actor boundaries
  • Error Scenarios: Network failures, invalid states
  • Edge Cases: Rapid connect/disconnect, participant changes

UI Testing

  • TestScenarioConfig: Configurable mock behaviors for different test scenarios
  • CallKit Simulation: NoOpCallKitCoordinator for simulator compatibility
  • State Verification: Published property changes in GenericCallCoordinator

Monitoring & Debugging

Performance Monitoring

  1. View Performance: Monitor SwiftUI view lifecycles during call state changes:

    • Simple struct view re-inits are cheap (expected)
    • body recomputes should be minimal
    • makeUIViewController is heavy (should be rare)
    • updateUIViewController should have efficient update logic
  2. CPU Usage: Use os_signpost markers with Instruments to verify:

    • CPU usage should stay below 50% during active calls
    • Look for spikes during state transitions
    • Profile video rendering performance on device
  3. @Published Optimization:

    • Check frequency of @Published field updates
    • Warning: Setting a field to its current value still triggers emission
    • Implement value checks before assignment to prevent redundant updates
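
The value check mentioned in the last point might look like this sketch. CallUIState and setActive are hypothetical names; the behavior shown relies on @Published emitting on every assignment, including one that repeats the current value.

```swift
import Combine

/// Illustrative observable state with a guarded setter.
final class CallUIState: ObservableObject {
    @Published private(set) var isActive = false

    /// Assigns only when the value actually changes, so subscribers
    /// (and SwiftUI views) are not notified redundantly.
    func setActive(_ newValue: Bool) {
        guard isActive != newValue else { return }
        isActive = newValue
    }
}
```

On the subscriber side, Combine's removeDuplicates() operator achieves a similar effect for a single pipeline, but it does not stop objectWillChange from firing, so guarding the assignment is the more thorough fix for SwiftUI.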

Testing Workflows

CallKit Testing

Best approach: Call FROM iOS Simulator TO physical device

  • Simulator can initiate outgoing calls (NoOpCallKitCoordinator bypasses CallKit, so dialing works)
  • Physical device receives full CallKit integration
  • Tests both outgoing and incoming flows effectively

Video Call Testing

Device Requirements:

  • Simulator ↔ Device: Limited usefulness
    • Simulator can't render incoming video (Daily.co OpenGL bug)
    • Simulator has no camera for outgoing video
  • Best: Two physical devices

Workaround for Solo Testing:

  1. Start call from iOS device
  2. Grab call info from Firestore database
  3. Join from laptop browser
  4. Use OBS Virtual Camera + Firefox for portrait video simulation
    • Chrome and Safari have compatibility issues
    • Firefox handles the virtual camera stream properly

File Organization

Core/Calling/
├── CallBackendService.swift # Backend API integration
├── CallCoordinator.swift # GenericCallCoordinator<Provider>
├── CallKit/
│ ├── CallKitConfiguration.swift # Shared CallKit config
│ ├── CallKitCoordinator.swift # Business logic orchestrator
│ ├── CallKitEventReceiver.swift # FROM CallKit (CXProviderDelegate)
│ ├── CallKitEventReporter.swift # TO CallKit (CXCallController)
│ ├── NoOpCallKitCoordinator.swift # Simulator-safe implementation
│ ├── Protocols/
│ │ ├── CallKitEventReceiving.swift # Inbound protocol
│ │ ├── CallKitEventReporting.swift # Outbound protocol
│ │ └── CallKitHandling.swift # Business logic protocol
│ ├── PushKitService.swift # VoIP push notifications
│ └── PushKitTokenManager.swift # Token management
├── RTC/
│ ├── RTCProtocols.swift # Provider abstraction
│ ├── RTCProviderFactory.swift # Factory protocol
│ ├── Providers/
│ │ ├── Daily/ # Daily.co implementation
│ │ └── StreamVideoConfiguration.swift # Stream Video implementation
│ └── Views/ # Video rendering components
└── Views/ # Call UI components

Integration Points

SwiftUI View Integration

The calling UI is integrated at the app root level using the withCallOverlay modifier in PajamaApp.swift:

PajamaApp (root)
├── mainAppContent
│ └── .withCallOverlay(callCoordinator)
│ ├── CallView (full-screen calling UI)
│ └── AvailableCallBanner (persistent banner for joinable calls)

This pattern ensures:

  • Call UI appears above all other content
  • Single source of truth via GenericCallCoordinator
  • Reactive updates via @Published properties

Experience System Integration

ExperienceManager relies on an active call to enable shared experiences between participants. The call provides the real-time communication channel that experiences build upon.

Common Issues & Troubleshooting

Testing & Development

  1. "Call immediately ends": There are three likely causes: you're running in the iOS Simulator without NoOpCallKitCoordinator, the "join call" request to our backend failed, or the third-party RTC provider failed to connect.

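  2. "No incoming call UI": Ensure VoIP push notifications are properly configured. CallKit has no user-facing permission prompt, so also check that the app target has the Push Notifications capability and the Voice over IP background mode enabled.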

  3. "Can't see contact names": Check that the all_participants field is populated in Firestore call documents.

Adding a New RTC Provider

  1. Create a new configuration implementing RTCProviderFactory
  2. Implement the RTCProvider protocol with your vendor's SDK
  3. Update the AppRTCConfiguration typealias in AppServices.swift
  4. Ensure your provider publishes state changes to the required publishers
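
The steps above might take roughly the following shape. Everything here is hypothetical: the protocol requirements shown for RTCProvider and RTCProviderFactory are placeholders (the real protocols are defined in RTCProtocols.swift and RTCProviderFactory.swift and will differ), and "Acme" is an invented vendor. Only the AppRTCConfiguration typealias pattern comes from this document.

```swift
/// Placeholder protocol shapes; the real requirements are richer.
protocol RTCProvider {
    var vendorName: String { get }
}

protocol RTCProviderFactory {
    associatedtype Provider: RTCProvider
    static func makeProvider() -> Provider
}

/// Step 2: a provider wrapping the (invented) Acme SDK.
struct AcmeRTCProvider: RTCProvider {
    let vendorName = "acme"
}

/// Step 1: a configuration type that knows how to build it.
enum AcmeConfiguration: RTCProviderFactory {
    static func makeProvider() -> AcmeRTCProvider {
        AcmeRTCProvider()
    }
}

/// Step 3: the compile-time switch, as in AppServices.swift.
typealias AppRTCConfiguration = AcmeConfiguration
```

Using an associated type keeps the concrete provider visible to the compiler, which fits the generic GenericCallCoordinator<Provider> design.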

Common Pitfalls

  • UUID confusion: CallKit UUIDs are per-session, not per-call
  • State timing: Don't report "connected" to CallKit until participants actually join
  • Threading: CallKit callbacks may come on any thread - use proper actor isolation

Error Handling & Recovery

Call Connection Failures

When a call fails to connect:

  1. UI shows error state via GenericCallCoordinator
  2. CallKit is notified to end the call
  3. User can retry via UI (creates new call session)

Network Disruptions

The system handles network issues through RTCCallState:

  • .reconnecting: Temporary disruption, automatic recovery attempted
  • .disconnected: Permanent failure, user must manually retry

State Recovery

  • App Crash/Kill: Active calls tracked in Firestore enable recovery on restart
  • CallKit Sync: CallKitCoordinator maintains UUID mappings for proper state sync
  • Available Calls: Firestore listeners automatically detect and display rejoinable calls