
Friday, September 12, 2025

Building the Holosuite: The Science and Technology Behind Real Immersive Worlds

 

what a real holosuite actually is (today)

It’s not one magic display. It’s an integrated stack:

  1. Visual immersion

    • Near-eye headsets for now (e.g., retina-class, eye-tracked, low-latency MR/VR) and, later, room-scale light-field/holographic walls.

    • Key problems to beat: latency (< 20 ms motion-to-photon) and the vergence–accommodation conflict (VAC), where the eyes converge at virtual depths but focus at a fixed screen distance. Apple Vision Pro shows the current state of high-end near-eye displays & spatial audio; genuinely reducing VAC requires true light-field/holographic displays or varifocal systems. PMC, ScienceDirect

    • Group-viewable light-field panels already exist (dozens of views, no glasses) and can tile into walls. lookingglassfactory.com

  2. Locomotion

    • Omnidirectional floors let you walk “anywhere” in a small room; Disney’s HoloTile demoed a multi-user version (research stage). YouTube

  3. Touch / haptics

    • Mid-air ultrasound haptics focuses acoustic pressure onto your skin (contactless buttons, shapes, gusts). Mature research & products exist; acoustic phased arrays can even levitate and steer tiny objects (“acoustic holograms”). ResearchGate, support.ultraleap.com, Bruce Drinkwater

    • Complement with body-worn haptics/exosleeves for force & weight illusions (commercial gear exists), plus fans/heat/cold for environmental cues.

  4. Spatial audio

    • Higher-order ambisonics or beamformed arrays + personalized HRTFs for believable distance/elevation; Vision Pro-style audio ray tracing shows the state of the art. Apple

  5. Scent & atmosphere

    • Controlled micro-dosing scent emitters; wind, temperature, humidity for presence.

  6. World capture & rendering

    • 3D Gaussian Splatting has become the “JPEG moment for spatial computing”: fast, photoreal scene capture and real-time rendering from ordinary video or phone footage, ideal for rapidly populating holosuite worlds. (NeRF successor; real-time with high fidelity.) The Verge, arXiv, repo-sam.inria.fr

  7. AI actors & simulation

    • On-device/edge LLMs + behavior trees for NPCs; physics for objects; safety guardian.


the core math you’ll actually use

1) display & optics (light-field / holography)

  • Light-field sampling (spatio-angular Nyquist): to avoid aliasing, choose the spatial pitch Δx and angular pitch Δθ so that the scene’s spatial frequency f_x and disparity stay under the Nyquist limit. Practical rule: pixels per degree (PPD) ≥ 60 and roughly 32–100 views for multi-viewer walls; otherwise VAC and visual “swim” occur. (Sampling analyses from the diffraction/light-field literature.) MDPI

  • Holography (Fourier optics): fringe spacing d ≈ λ / (2 sin(θ/2)). To steer light to angle θ, the SLM pixel pitch p bounds the maximum angle: θ_max ≈ sin⁻¹(λ/p) (see the numeric sketch after this list).

  • VAC mitigation target: provide true focus cues; varifocal: dynamically set the focal distance f(t) to the vergence depth; light-field/holography: render the correct wavefront for accommodation at depth z. VAC is what makes long sessions uncomfortable. PMC

  • Latency budget: motion-to-photon < 20 ms (preferably < 12 ms) to avoid motion sickness:

    t_{\mathrm{total}} = t_{\mathrm{head\ track}} + t_{\mathrm{render}} + t_{\mathrm{scanout}} + t_{\mathrm{display}}
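
To make these constraints concrete, here is a small Python back-of-the-envelope sketch: it evaluates the fringe-spacing and steering-angle formulas above and sums a motion-to-photon budget. The wavelength, pixel pitch, and per-stage timings are illustrative assumptions, not measured values.

import math

# Steering-angle limit for an SLM: fringe spacing d ≈ λ / (2 sin(θ/2)),
# max steering angle θ_max ≈ asin(λ / p) for pixel pitch p (formulas from above).
wavelength = 532e-9        # green light, meters (assumed)
pixel_pitch = 3.74e-6      # SLM pixel pitch, meters (assumed, LCoS-class)
steer_deg = 30.0           # desired deflection angle, degrees (assumed)

d = wavelength / (2 * math.sin(math.radians(steer_deg) / 2))
theta_max = math.degrees(math.asin(wavelength / pixel_pitch))
print(f"fringe spacing for {steer_deg} deg: {d * 1e6:.2f} um")
print(f"max steering angle at {pixel_pitch * 1e6:.2f} um pitch: {theta_max:.1f} deg")

# Motion-to-photon budget: sum the pipeline stages and compare to the 20 ms target.
stages_ms = {"head_track": 2.0, "render": 8.0, "scanout": 4.0, "display": 3.0}  # assumed
total_ms = sum(stages_ms.values())
print(f"motion-to-photon: {total_ms:.1f} ms ({'OK' if total_ms < 20 else 'over'} vs 20 ms)")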

2) locomotion (omni floor)

  • Control: closed-loop body tracking yields a desired floor velocity v_f that keeps the user near the room center while world-space motion v_w feels natural. Basic law:

    \mathbf{v}_f = G\left(\mathbf{p}_\text{user}\right) - \mathbf{v}_w,

    with stability via PID/LQR on user position. (Disney HoloTile demonstrates feasibility.) YouTube
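
As a concrete illustration of that control law, here is a minimal Python sketch: a proportional centering term G(p_user) plus cancellation of the user's intended world-space velocity. The gain, the sign conventions, and the function name floor_velocity are assumptions for illustration; a real system would use PID/LQR, filtering, and hard safety limits.

import numpy as np

def floor_velocity(p_user, v_world, p_center=np.zeros(2), kp=0.8):
    """Sketch of v_f = G(p_user) - v_w for a 2D omni floor (room-frame vectors, m and m/s)."""
    centering = -kp * (p_user - p_center)   # G(p_user): pull the user back toward the room center
    return centering - v_world              # cancel intended world-space motion so the user stays put

# Example: user 0.4 m off-center along x, walking forward at 1.2 m/s
print(floor_velocity(np.array([0.4, 0.0]), np.array([1.2, 0.0])))   # -> [-1.52  0. ]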

3) mid-air haptics (ultrasound)

  • Acoustic radiation pressure at a focal point (simplified):

    F \propto \frac{2\alpha I}{c},

    where I is the acoustic intensity, c the speed of sound, and α a coefficient that depends on the medium/skin. A phased array solves for per-transducer phase and amplitude to maximize focal pressure subject to safety limits. (Ultraleap describes the control-point solver.) support.ultraleap.com

  • Levitation / “acoustic holograms”: optimize the array phases φ_i so the superposed field yields target pressure nodes; Bristol’s work shows single-sided levitation and manipulation. University of Bristol
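
A minimal focusing sketch in Python (not the Ultraleap solver): drive each transducer with a phase that compensates its path length to the focal point, so all waves arrive in phase there. The array geometry, frequency, and unit amplitudes are assumptions for illustration.

import numpy as np

f = 40e3                       # 40 kHz, typical for mid-air ultrasound haptics
c = 343.0                      # speed of sound in air, m/s
k = 2 * np.pi * f / c          # wavenumber

# 16 x 16 transducer grid in the z = 0 plane, 10.5 mm pitch (assumed)
pitch = 0.0105
xs = (np.arange(16) - 7.5) * pitch
grid = np.array([[x, y, 0.0] for x in xs for y in xs])

focus = np.array([0.0, 0.0, 0.20])            # focal point 20 cm above the array
dists = np.linalg.norm(grid - focus, axis=1)
phases = (-k * dists) % (2 * np.pi)           # per-transducer drive phases

def field_magnitude(point):
    """Relative pressure magnitude at a point (unit amplitudes, 1/r spreading)."""
    d = np.linalg.norm(grid - point, axis=1)
    return np.abs(np.sum(np.exp(1j * (k * d + phases)) / d))

# The focus should be far "louder" than a point 2 cm off-axis.
print(field_magnitude(focus), field_magnitude(focus + np.array([0.02, 0.0, 0.0])))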

4) spatial audio

  • Ambisonics order N sets spatial resolution; the channel count is (N+1)². Real-time binaural rendering uses the listener’s pose to rotate the soundfield; time-of-flight and occlusion from scene geometry yield audio ray tracing.
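
Two small helpers make the bookkeeping concrete: channel count per order, and a first-order (B-format W/X/Y/Z) yaw rotation of the kind used to counter head motion. The rotation sign convention and the function names are illustrative assumptions; production renderers use full per-order rotation matrices and measured HRTFs.

import math

def ambisonic_channels(order: int) -> int:
    """Number of channels for a full 3D ambisonic signal of the given order: (N+1)^2."""
    return (order + 1) ** 2

def rotate_first_order_yaw(w, x, y, z, yaw_rad):
    """Rotate a first-order soundfield about the vertical axis by yaw_rad (convention assumed)."""
    cos_y, sin_y = math.cos(yaw_rad), math.sin(yaw_rad)
    return w, x * cos_y - y * sin_y, x * sin_y + y * cos_y, z

for n in (1, 3, 5, 7):
    print(f"order {n}: {ambisonic_channels(n)} channels")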

5) real-time world capture

  • 3D Gaussian Splatting objective (very high level): fit a set of Gaussians {G_k(μ_k, Σ_k, c_k)} to minimize photometric error across training views while respecting visibility; rendering alpha-composites depth-sorted splats. It trains in minutes and renders at 60–200 FPS on a good GPU. arXiv
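
A toy sketch of the compositing rule for a single pixel (not the reference CUDA rasterizer): splats are sorted front to back and alpha-composited, C = Σ_k c_k α_k Π_{j<k}(1 − α_j). The per-splat depths, opacities, and colors below are made-up placeholders; a real renderer derives α_k from the projected 2D Gaussian's value at the pixel and a learned opacity.

import numpy as np

# (depth, alpha, RGB color) for splats overlapping one pixel (placeholder values)
splats = [
    (2.1, 0.60, np.array([0.9, 0.2, 0.2])),
    (1.3, 0.35, np.array([0.1, 0.8, 0.3])),
    (3.0, 0.80, np.array([0.2, 0.3, 0.9])),
]

color = np.zeros(3)
transmittance = 1.0
for depth, alpha, rgb in sorted(splats, key=lambda s: s[0]):   # composite front to back
    color += transmittance * alpha * rgb
    transmittance *= (1.0 - alpha)

print(color, "remaining transmittance:", transmittance)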


reference architecture (modular holosuite)

Room shell (6–8 m² min):

  • Acoustic treatment, blackout, HVAC, power & thermal headroom (~1–2 kW per user for GPUs/displays/actuators).

Sensing:

  • Ceiling/floor depth cams + IMUs (inside-out tracking), eye tracking (for foveated rendering), SLAM.

Visual layer (choose path):

  • Path A (near-term): high-end MR/VR headsets (e.g., Vision Pro class) for each user + projection surfaces for peripheral ambience. PMC

  • Path B (mid-term): tileable light-field panels (e.g., Looking Glass-style) for group no-glasses 3D; add smaller near-eye for close-up tasks. lookingglassfactory.com

Locomotion:

  • Omni floor (HoloTile-like) with center-hold control; fall-prevention & emergency stop. YouTube

Haptics:

  • Ultrasound arrays at waist/desk height + ceiling for touch cues; wearable vibro/force bands for sustained forces. ResearchGate

Audio & atmosphere:

  • Beamformed speaker arrays + sub; scent/wind/heat modules.

Compute:

  • Multi-GPU server (path tracing + Gaussian splats), <12 ms pipeline; real-time physics & AI agents.

Content:

  • Library of Gaussian-splat scenes; photogrammetry; procedural worlds; AI-driven NPCs. arXiv
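
One way to sketch the modular layout above in code: each subsystem exposes a per-frame tick() and the frame loop checks the latency budget. The class and function names here are illustrative assumptions, not an existing engine API.

from dataclasses import dataclass
import time

@dataclass
class FrameState:
    head_pose: tuple = (0.0, 0.0, 0.0)     # from inside-out tracking (placeholder)
    user_position: tuple = (0.0, 0.0)      # from ceiling/floor cameras (placeholder)

class HeadsetDisplay:
    def tick(self, state): pass            # render + scanout for this frame

class OmniFloor:
    def tick(self, state): pass            # recentering control update

class UltrasoundHaptics:
    def tick(self, state): pass            # update focal points from engine events

class SpatialAudio:
    def tick(self, state): pass            # rotate soundfield to head pose, trace audio paths

def run_frame(subsystems, state, budget_ms=12.0):
    """Tick every subsystem once and flag frames that blow the latency budget."""
    start = time.perf_counter()
    for s in subsystems:
        s.tick(state)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    if elapsed_ms > budget_ms:
        print(f"frame over budget: {elapsed_ms:.2f} ms > {budget_ms} ms")
    return elapsed_ms

run_frame([HeadsetDisplay(), OmniFloor(), UltrasoundHaptics(), SpatialAudio()], FrameState())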


build it in phases (pragmatic roadmap)

Phase 1 — Foundational “room-VR lab” (1–2 months)

  • Headsets + tracking; spatial audio; fan/heat modules; < 20 ms motion-to-photon target. Capture a few spaces with 3D Gaussian Splatting to experience instant photoreal worlds. arXiv

Phase 2 — Atmosphere & haptics (2–4 months)

  • Add a mid-air ultrasound haptics unit for touchable mid-air buttons and textures; integrate it with engine events. support.ultraleap.com

Phase 3 — Locomotion (research/procurement)

  • Integrate an omnidirectional floor (or a lower-cost treadmill proxy) with safety rails & E-stop; tune the controller to keep the user centered while world motion feels natural. (Study HoloTile principles & demos.) YouTube

Phase 4 — Group view walls (pilot)

  • Install light-field panels for “helmet-off” scenes and spectators; sync with headset users so everyone shares one world. lookingglassfactory.com

Phase 5 — Reduce VAC / increase comfort (ongoing)

  • Experiment with varifocal optics or near-eye holography research techniques to lessen VAC symptoms for long sessions. ScienceDirect

Phase 6 — Content pipeline

  • Standardize capture via phone rigs or drones → Gaussian splats (minutes to train) → live in the holosuite; mix with physically based rendering & AI NPCs. arXiv

Safety & policy

  • Enforce exposure limits (lasers/IR, ultrasound SPL, scent allergens), fall protection, emergency lighting, and accessibility.


performance targets (rules of thumb)

  • Visual: PPD ≥ 35 (min), aim 60+; FOV ≥ 100°; MTP latency ≤ 20 ms; effective multi-view count ≥ 50 for walls.

  • Audio: localization error < 5°; RT60 < 300 ms (treated room).

  • Haptics: focal refresh ≥ 200 Hz; safe skin exposure; perceptible force peaks of roughly 100–300 mN on fingertips (device-specific). ResearchGate

  • Locomotion: center error < 0.5 m; braking < 300 ms; fall-probability minimized via predictive control.

  • Render: 90–120 FPS per eye per user; Gaussian-splat scenes run at 100–200 FPS on modern GPUs. arXiv
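
The same rules of thumb, expressed as a machine-checkable table for bring-up testing; the dictionary keys and the example measured numbers are assumptions for illustration.

# Targets mirror the list above: ("min", x) means the measurement must be >= x,
# ("max", x) means it must be <= x.
TARGETS = {
    "ppd":               ("min", 35),
    "fov_deg":           ("min", 100),
    "mtp_latency_ms":    ("max", 20),
    "audio_loc_err_deg": ("max", 5),
    "rt60_ms":           ("max", 300),
    "haptic_refresh_hz": ("min", 200),
    "center_error_m":    ("max", 0.5),
    "braking_ms":        ("max", 300),
    "render_fps":        ("min", 90),
}

def check(measured: dict) -> list:
    """Return the targets that a measured system currently misses."""
    failures = []
    for key, (kind, target) in TARGETS.items():
        value = measured.get(key)
        if value is None:
            continue
        ok = value >= target if kind == "min" else value <= target
        if not ok:
            failures.append((key, value, kind, target))
    return failures

print(check({"ppd": 34, "mtp_latency_ms": 18, "render_fps": 120}))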


how it “feels” together

  • Your eyes see correct parallax, occlusions, and (in time) focus cues.

  • Your ears hear sound rays bounce realistically from virtual geometry.

  • Your skin feels air, heat, and mid-air tactile points.

  • Your feet keep walking “through” worlds while staying in one room.

  • Your brain gets the right multisensory correlations at low latency—this is presence.


good research threads to follow

  • Mid-air ultrasound haptics (surveys & tech notes). ResearchGate

  • Acoustic levitation & “acoustic holograms.” University of Bristol

  • VAC: why VR makes some people ill & how light-fields help. PMC

  • Light-field/holographic sampling & diffraction constraints. MDPI

  • Vision Pro specs & spatial audio cues (today’s high end). Apple

  • Omnidirectional floors (Disney HoloTile demos). YouTube

  • Gaussian splatting for instant 3D worlds. arXiv, repo-sam.inria.fr


bottom line

A holosuite isn’t a single breakthrough; it’s an orchestra of displays, acoustics, haptics, locomotion, capture, and AI—played with ruthless attention to latency, focus cues, and multisensory alignment. Nearly every subsystem is here now in some form; stitching them into a safe, reliable, multi-user room is the engineering art.

A follow-up post could package this into a tight project brief (with a bill of materials per phase) or a pitch deck for funding a holosuite pilot.