Hardware Hack Lab

Exploring Virtual Reality and 3D Video

The tech consortium NYC MediaLab just released an interesting report on the current state of Virtual Reality. It covers some of the tech, a timeline of recent events and a discussion of business opportunities.

The Jaunt Neo 3D camera system

In this article I'll summarize some of the tech and experiential findings, and I'll dig in a little on some of the fascinating developments in 3D video.

Future Reality?

Before we start, it's worth noting the report kicks off with the phrase 'Future Reality', borrowed from Ken Perlin, as an alternative to 'Augmented Reality' and 'Virtual Reality'.

Future Reality is an odd phrase. It's really trying to say that this isn't a modified version of reality, this will actually be reality - once the tech is ubiquitous.

This is helpful when you are thinking medium to long term, but it doesn't work for the short to medium time frame this report is really looking at. The rest of the report only talks about VR. To learn about this 'future reality' trajectory check out my write-up of Ken Perlin's ideas here.

The visceral experience

As for interesting things happening right now, the report discusses the difference audience members feel between VR and regular video. It points to the development of studio-friendly 3D video and production platforms like Jaunt.

Clouds over Sidra

The 3D video Clouds over Sidra was shot with hardware like this. It allows you to spend time with a 12-year-old Syrian refugee in Jordan. At the other end of the spectrum, Game of Thrones is a synthetic, animated environment designed to portray a fictional world.

In both, the experience is more visceral for the audience. The report says:

"With virtual reality, the transaction between an audience and the story is fundamentally changed – and pared down. Established mediums ... still rely on the removed - and unmoved – perspective of their viewer. Virtual reality bridges that gap..."

The language here is one of the birth of a new medium. It reminds me of the stories of early silent film, in which cinema audiences were said to panic at the sight of a train coming towards them (on the screen).

Whether or not those audiences actually panicked, they probably had the same sort of bodily reactions as the ones we have now when we experience VR. This comparison makes me wonder how much these reactions will wear off as people become accustomed to VR over time.

Realtime 3D video streaming

The report also mentions investment in live streaming company NextVR. Leading on from this, it's not hard to imagine 3D video being distributed in a similar way to Netflix or Skype now.

Streaming video via NextVR (credit evrydayvr / public domain)

For this to make sense the broadcaster would need to have at least one 3D camera. I don't see this becoming a norm for Skype calls, but I can see it working for things like sporting events and concerts, especially if you have the option to switch between cameras at will.

The State of VR today

The report has a handy 'state of play' section that gives a bird's-eye view of the industry.

1. Distribution

It begins with distribution, for which the challenges appear to be bandwidth and the horsepower of playback devices such as phones. As you can imagine, both are below par right now, but they are expected to ramp up to the required levels in the 2-5 year timeframe.
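To get a feel for why bandwidth is the sticking point, here is a rough back-of-envelope sketch. The resolution, frame rate and compression ratio are my own illustrative assumptions, not figures from the report, and real codecs vary a lot with content.

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=24, eyes=2):
    """Uncompressed data rate in megabits per second for stereo video."""
    return width * height * fps * bits_per_pixel * eyes / 1_000_000


def compressed_bitrate_mbps(raw_mbps, compression_ratio):
    """Very rough estimate: divide the raw rate by an assumed codec ratio."""
    return raw_mbps / compression_ratio


# Assumed (hypothetical) stream: 3840x1920 panorama per eye at 60 frames/second.
raw = raw_bitrate_mbps(3840, 1920, 60)

# Assumed (hypothetical) 200:1 compression ratio.
streamed = compressed_bitrate_mbps(raw, 200)

print(f"raw: {raw:.0f} Mbit/s, compressed: {streamed:.0f} Mbit/s")
```

Even with generous compression, the assumed stream here lands well above what a typical video stream needs, which is consistent with the report's point that networks and phones have some catching up to do.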

Another distribution problem is how users find content. Currently there is relatively little content and it is distributed inconsistently over several small distribution outlets. Typically users find content through an app store or targeted promotional outreach.

2. Display

This is the part that most of us have already heard about, whether you are in the industry or not. But for completeness, let's go over it.

Displays are divided into smartphone-based, like the Samsung GearVR and Google Cardboard, and non-smartphone-based, like the Sony PlayStation VR, HTC Vive and Oculus Rift.

Unboxing Google Cardboard (credit brownpau / cc)

Of the latter three headsets, only the Oculus is seriously discussed, because its community has led the way in this space. The benefit of a unit like this is the sleek experience. The drawbacks are cost and platform requirements.

What this means is it's generally better to be running Windows on a beefed-up machine. By contrast, the smartphone-based systems are cheap and run with standard phones.

3. Production

Here's where it gets kind of interesting, and I'll mix in a little of my own research here. How do you produce content and what are the features of each technique?

The report says you have three options to produce VR content: record video, animate 3D or a hybrid between the two. That's true, but in reality there's a fair amount of gray area in the hybrid space.

3a. Recording 3D video

When it comes to video, the standard approach is to build a rig using 6 or more cameras facing different directions. You can see this in action with GoPro cameras hooked up using 360 Hero products.

GoPro and 360 Hero

This is great for giving you an immersive 3D 'bubble' to explore, but it gets frustrating for users because although you can look around the bubble, you can't move. The action is recorded from one spot via live synchronised video streams, so you have to be wherever the camera was.
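As a minimal sketch of the 'bubble' idea, assume a hypothetical rig with six cameras pointing along the axes of a cube. Given where the viewer is looking, the code picks the camera that faces that direction most directly. The camera names and layout are my own assumptions, and real rigs stitch overlapping footage together rather than picking a single camera.

```python
import math

# Hypothetical six-camera rig: one camera along each axis direction.
CAMERAS = {
    "front": (0.0, 0.0, 1.0),
    "back": (0.0, 0.0, -1.0),
    "right": (1.0, 0.0, 0.0),
    "left": (-1.0, 0.0, 0.0),
    "up": (0.0, 1.0, 0.0),
    "down": (0.0, -1.0, 0.0),
}


def view_direction(yaw_deg, pitch_deg):
    """Turn head yaw/pitch (degrees) into a unit direction vector."""
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )


def best_camera(direction):
    """Pick the camera whose axis has the largest dot product with the view."""
    def alignment(name):
        cx, cy, cz = CAMERAS[name]
        dx, dy, dz = direction
        return cx * dx + cy * dy + cz * dz

    return max(CAMERAS, key=alignment)


print(best_camera(view_direction(0, 0)))
print(best_camera(view_direction(90, 0)))
print(best_camera(view_direction(0, 80)))
```

Notice that the viewer's position never appears as an input. Only orientation matters, which is exactly why you can look around the bubble but can't walk through it.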

You can get around this by recording with two or more camera rigs in different locations. Then you have multiple spots to view the action from. But you have to flick awkwardly between fixed camera rig positions, and smooth transitions are seemingly not possible.

3b. A hybrid approach

Matterport hardware is all about architectural mapping. The product is basically three PrimeSense sensors and a hardware/software production suite.

Matterport product shot

Here's how it works. First, it uses the mounted depth sensors and RGB camera to allow you to build a grainy but accurate mesh of an architectural interior. This is a bit like the Structure sensor but specifically designed for architectural mapping.
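To illustrate the first step in general terms, here is a sketch of how a depth image can be turned into 3D points using a standard pinhole camera model. This is not Matterport's actual pipeline, which I haven't seen. The intrinsics (fx, fy, cx, cy) and the tiny depth image are hypothetical values, and building a mesh from the points is a further step not shown here.

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with a depth in metres to a 3D point in camera space."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def depth_image_to_points(depth, fx, fy, cx, cy):
    """Convert a 2D grid of depths into a list of 3D points.

    A depth of 0 is treated as 'no reading' and skipped, which is one
    source of the holes and graininess in scans like this.
    """
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d > 0:
                points.append(backproject(u, v, d, fx, fy, cx, cy))
    return points


# Hypothetical 2x2 depth image (metres); the top-left pixel has no reading.
depth = [
    [0.0, 2.0],
    [2.0, 2.0],
]

points = depth_image_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
for p in points:
    print(p)
```

Each scan only gives points as seen from one camera position, so several scans from different spots have to be aligned and merged before you get a full interior.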

Second, at set locations within that space it allows you to take what amount to 360 degree images. This part is effectively the same as the panorama apps on your phone. The end result is that you have a series of hi-res 3D 'bubbles' situated inside a lo-res navigable space.
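One common way to store a 360 degree image is as an equirectangular panorama, where the horizontal axis is yaw and the vertical axis is pitch. I'm assuming that layout here purely for illustration; I don't know which format Matterport uses internally. The sketch below looks up which pixel of such a panorama a given view direction lands on.

```python
def direction_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map a view direction to (column, row) in an equirectangular image.

    Assumed conventions: yaw 0 is the image centre, yaw increases to the
    right, pitch +90 is the top row and pitch -90 is the bottom row.
    """
    u = ((yaw_deg + 180.0) % 360.0) / 360.0 * width
    v = (90.0 - pitch_deg) / 180.0 * height
    # Clamp so that pitch -90 still lands on the last valid row.
    col = min(int(u), width - 1)
    row = min(int(v), height - 1)
    return col, row


# Hypothetical 4096x2048 panorama.
print(direction_to_pixel(0, 0, 4096, 2048))    # looking straight ahead
print(direction_to_pixel(90, 0, 4096, 2048))   # looking right
print(direction_to_pixel(0, 90, 4096, 2048))   # looking straight up
```

As with the video rigs, the lookup depends only on where you are looking, not where you are standing, so each image is a 'bubble' tied to the spot where it was taken.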

Matterport goes VR

This technique is pretty cool, but there are a couple of drawbacks. First, you can't do live action with it, since you don't have enough camera positions to take in the entire 360 degrees without moving the camera.

Second, as a default, you end up with a Google StreetView-style navigation between the 'bubbles' in the space. For fully 'immersive' applications this is a real pain and can feel clunky.

To get past this, Matterport offer a cloud service to which you upload your captured content. There they run a series of intensive computer vision algorithms, merging the hi-res 'bubble' data with the lo-res map. This produces what they say is a smooth and 'highly realistic' 3D mesh, which you can then traverse freely.

That sounds great, but I've yet to see a solid example of it. The processing probably takes a while. The video example of it on the Matterport website is really small, and all the promo videos seem to use some form of StreetView-style navigation.

3c. Another depth video hybrid

The DepthKit hardware allows you to record live-action volumetric video which is not 360 degrees. It uses the same technique that Matterport uses to build its grainy map, and embraces the graininess and lack of 360.

The guys at the Specular studio who made the DepthKit have been demonstrating that you can do interesting things even within these limitations. Check out the 'Blackout' kickstarter video:

'Blackout' by Specular

To situate the viewer in an immersive 360 environment they have meticulously recreated a subway car as a virtual model based on photos of the real thing. This allows you to navigate the virtual space like a game, getting you away from StreetView-style navigation.

On top of that they apply the grainy live-action technique, filming people in a studio who are then 'placed' in the subway car. These people have occlusions - portions of their bodies you can't see because the depth camera never captured them. However, they do have a realness that animated characters don't usually have.

3d. The obvious one

Last but not least is just building 3D virtual environments and animated characters. Here you are talking about established game development platforms like Unity and Unreal, or web platforms like WebGL and three.js.

This gives you a lot of creative control, but it's a long way from live action. These environments are much more mature as people have been playing 3D computer games and rendering 3D graphics for a long time now. It's not a huge stretch to now explore that in VR.

They grow up so fast

I'll end with the generic get-out clause that justifies interest in emerging technology. These trajectories are moving quickly, and converging on each other. It's easy to dismiss technologies when they are at an immature stage, but I'd say keep your eye on them, and learn about these limitations.

They will mature, so the only question is whether you want to hang back and let everyone else figure it out before you get involved.