Aligning the Real World with 3D Space

This week's test involved developing a system that lets me accurately align the real-world position of the camera with the virtual camera in Blender. Because the position ARKit reports for the iPhone is always (0,0,0) when tracking is initiated, the iPhone's starting location must be accurately recreated in 3D space beforehand, and its position must always be relative to the space you are in.

I approached this by positioning the camera and iPhone rig in approximately the center of the room (in this case the living room, where I am currently working). To get the measurements of the room easily and accurately, I did a quick 3D scan of the living room using just my iPhone, then used Meshroom to create a 3D model of the space. With that model, I placed the virtual camera in Blender at (0,0,0) and measured the distance from the real camera to a recognizable spot in the real world (I chose the bottom corner of the cabinet that sits directly in front of the camera). I then positioned the 3D model so that the same spot was the same distance from the virtual camera. The last step is to measure the initial height of the iPhone and add that value to the incoming Z (height) position. In my case, the starting coordinates of the virtual camera were (0,0,0.75), and the distance to the corner of the cabinet in both the real world and the virtual world was approximately 2 meters. (Click on the images below).
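To make the height offset and the distance check concrete, here is a minimal sketch in plain Python. It is not the code from my setup: the function names and the cabinet corner coordinates are hypothetical, and it assumes the incoming position has already been converted to Blender's Z-up axes, in meters.

```python
import math

# Starting height of the iPhone above the floor, in meters (0.75 in my case).
INITIAL_HEIGHT_M = 0.75


def apply_height_offset(incoming_xyz, initial_height=INITIAL_HEIGHT_M):
    """Add the measured starting height to the incoming Z value.

    Assumes incoming_xyz is (x, y, z) in Blender's Z-up axes, in meters,
    and that tracking started at (0, 0, 0).
    """
    x, y, z = incoming_xyz
    return (x, y, z + initial_height)


def landmark_error(camera_xyz, landmark_xyz, measured_distance):
    """Model distance to the landmark minus the real-world measurement."""
    return math.dist(camera_xyz, landmark_xyz) - measured_distance


# Tracking has just started, so the incoming position is (0, 0, 0).
start = apply_height_offset((0.0, 0.0, 0.0))
print(start)  # (0.0, 0.0, 0.75)

# Hypothetical cabinet corner, 2 m in front of the origin in the scan.
error = landmark_error((0.0, 0.0, 0.0), (0.0, 2.0, 0.0), 2.0)
print(f"landmark error: {error:.3f} m")
```

If the error is not close to zero, the scanned model needs to be moved or rescaled until the landmark sits at the measured distance.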

What’s great about this approach is that whenever the camera needs to be reset (for a new shot or take), it can be returned to the tripod and will still be accurately aligned in 3D automatically. To test accuracy, I compared the 3D viewport with the live camera footage - if all the measurements are correct, the footage from Blender should match what the live camera sees almost perfectly - and because the 3D viewport shows a 3D model of the same room, the two are very easy to compare. Here’s a video (keep an eye on the box on the floor - it’s an easy object for comparing position and rotation):

I am very happy with these results - the live camera feed and the viewport match very well. It looks like this approach will work!