Lessons Learned Creating a DIY Holographic Capture Rig using Multiple Kinect Sensors
I’ve always had a strong
interest in technology and photography and consider myself to be an early
adopter. I also have a passion for graphics and virtual reality, and a background in software architecture and development. Several years ago, I
acquired a Kinect for Windows device, but because of USB issues on my main PC,
I didn’t get around to playing with the hardware for a long time. Kinect can be notoriously fussy about which USB 3 controllers it will work with - a well-documented issue.
Eventually, I managed to get a USB expansion card that played nicely
with the Kinect and I fired up the SDK examples to see what it could do. I was impressed,
even if there was a lot of depth noise, particularly at the edges of objects - the so-called ‘flying pixels’.
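As an aside, the standard trick for taming flying pixels is to discard any depth sample that sits far from all of its immediate neighbours. Here’s a minimal sketch of that idea (my own illustration, not code from the Kinect SDK), with a random numpy array standing in for a real 512x424 Kinect v2 depth frame:

```python
import numpy as np

def filter_flying_pixels(depth, threshold_mm=80):
    """Zero out 'flying pixels': depth values that jump sharply away
    from all four axis-aligned neighbours (typical ToF edge noise).
    Edges wrap around via np.roll - good enough for a sketch."""
    d = depth.astype(np.float32)
    diffs = np.stack([
        np.abs(d - np.roll(d, 1, axis=0)),   # neighbour above
        np.abs(d - np.roll(d, -1, axis=0)),  # neighbour below
        np.abs(d - np.roll(d, 1, axis=1)),   # neighbour to the left
        np.abs(d - np.roll(d, -1, axis=1)),  # neighbour to the right
    ])
    # A pixel is 'flying' if even its most similar neighbour is far away.
    flying = diffs.min(axis=0) > threshold_mm
    cleaned = depth.copy()
    cleaned[flying] = 0  # 0 means 'no reading' in Kinect depth frames
    return cleaned

# Stand-in for a 512x424 Kinect v2 depth frame, in millimetres.
frame = np.random.randint(500, 4500, size=(424, 512)).astype(np.uint16)
print(filter_flying_pixels(frame).shape)
```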
I’m not sure when it happened,
whether it was prompted by getting an Oculus VR kickstarter headset (DK1) or by
the birth of my daughter, but at some stage I started dreaming of capturing the
world around me, for what I would call the ultimate home movies. Technically, I
guess this is most commonly referred to as holographic capture. I would think
about optimal configurations of Kinect cameras for live action capture (a
circle of Kinects facing inwards to create a capture volume of about 2x2x2m) or
environment capture (an arrangement of vertically mounted Kinects facing outwards, mounted on a tripod). In most of these scenarios, I thought I would need 6-8 Kinect
devices, due to the FOV of the sensor and a need to adequately capture the
subject. Of course, totting up the cost of such a system proved off-putting,
particularly for a hobby project, but the idea wouldn’t go away.
About 3 or 4 years ago,
I happened across an article about ultra SFF PCs that were on clearance at
various UK online stores for about £85 per unit. The model was the HP 260 G1. The specs and design of these PCs were attractive for use with the Kinect: they had USB 3 ports, an adequate integrated GPU, support for 4GB of RAM or more, an M.2 slot for fast drive access, space for an internal drive, mounting options, and a really compact case (177 x 175 x 34 mm / 6.9 x 7 x 1.3 in). There were two issues, though: would the USB 3 ports be compatible with the Kinect, and how would the processor (a humble Celeron 2957U) cope with the Kinect if the port worked? I caved almost immediately and bought one to test. It worked with the Kinect and achieved about 24fps in the Configuration Verifier. Like many systems, the Configuration Verifier often states “usb port with unknown bandwidth” but works fine. Over the next month, I justified buying another 2
of these PCs before they went out of stock and were impossible to source. I
upgraded the RAM, found the custom HDD cable required to add a larger internal HDD
and bought new Kinects, adapters and tripods. However, the amount of free time
I had to devote to this was limited. There were also still unresolved issues with the system: how could I trigger a recording across multiple nodes, and would interference be a problem? Kinects use time-of-flight (ToF) sensors that can interfere with each other, producing unstable depth values.
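The triggering half of the problem is the more tractable one. A minimal sketch of one way to do it (my own illustration - not how the Brekel software works, and the port and message are invented) is a UDP broadcast from a master to every node on the LAN:

```python
import socket

TRIGGER_PORT = 9999  # arbitrary choice, assumed free on every node

def send_trigger(message=b"START_RECORDING"):
    """Master: broadcast a start command to all nodes on the LAN."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.sendto(message, ("255.255.255.255", TRIGGER_PORT))
    sock.close()

def wait_for_trigger():
    """Node: block until the master broadcasts, then start capturing."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", TRIGGER_PORT))
    message, master = sock.recvfrom(1024)
    if message == b"START_RECORDING":
        print(f"Trigger received from {master[0]} - start Kinect capture")
```

Broadcast keeps the nodes loosely in step without the master needing to know their addresses; it does nothing for frame-level sync, which is a separate problem I come back to later.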
While I have the skills to code a solution for driving more than one Kinect, free time is a serious constraint these days. Researching point cloud processing solutions, I discovered Brekel’s PointCloud software and a post about its multi-sensor alpha. I bought a copy of PointCloud V2 and gained
access to the alpha. I also bought an 8-port gigabit Ethernet hub and the
necessary cables to connect the PCs.
At the same time, I
started thinking about system mobility. It would be an effort to move this system out of my office - even with only three nodes it was already a mess of cables and boxes. Could I find something to house the kit and achieve two objectives (with one constraint: that I could do the work myself)?
- Ideally, I wanted to have only one power cable going into a portable ‘box’ that contained all the PSUs, adapters, the network hub and PCs
- To have only one lead coming out of the box for each Kinect camera
I found an old wooden
storage box and realised that I could probably fit up to 5 PCs in there. Using
a normal power drill I could route cables as required. There were some size
constraints though and I needed to make things as compact as possible. For
maximum ‘compactness’ I bought:
- right angled USB3 adapters and right angled power adapters to minimise space for cable routing
- ultra-short network cables (and then ended up buying longer ones!)
- clover-leaf and figure-of-8 power connector adapters, plus C14 splitter cables and the necessary sockets, to reduce the number of power cables that had to fit in the box
- a compact multi-port power socket
- a C14 to UK 3 pin socket adapter for the network hub
By now I was on eBay
watching all HP 260 G1 auctions (and variants) to see what I could pick up cheaply
and managed to snag a bargain: two i3 PCs at a pretty good price.
A word on keeping
costs low. I’ve watched eBay for years now and know when I see a decent price
for an HP PC. However, Kinect sensors can be purchased more cheaply at second-hand gaming outlets, mainly because so many Xbox One gamers were selling them off - at launch the sensor was bundled with the console, but the requirement to have a Kinect was later dropped. I don’t think I’ve ever paid
more than £30 for a sensor. I was also fortunate that I bought most Kinect for
Windows (and later Xbox One S) adapters before they were discontinued and hit
crazy prices. They used to be £30.
So, I now had my five-unit set-up. However, one issue remained (and is still a problem): calibrating the relative sensor positions is difficult. The marker calibration system supported by the Brekel alpha can be unreliable and would need to be re-run each time the sensors were moved. This was a non-starter for my set-up.
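For context, marker calibration ultimately reduces to estimating a rigid transform (a rotation plus a translation) that maps the marker positions seen by one sensor onto the same markers seen by another. Below is a minimal sketch of the textbook SVD-based (Kabsch) solution - my own illustration, not necessarily what the Brekel alpha implements:

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that
    R @ p + t maps src points onto dst points (Nx3 arrays of
    corresponding marker positions) - the Kabsch/Procrustes solution."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c                   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Toy example: four markers seen identically by two sensors gives
# the identity rotation and zero translation.
markers = np.random.rand(4, 3)
R, t = rigid_transform(markers, markers)
```

With at least three non-collinear markers visible to a pair of sensors, this gives the pose of one sensor relative to the other; chain the transforms to put every sensor into one frame.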
About this time, my son was born and I lost access to my office (and had even less free time!). All the kit went up in
the loft and remained packed away due to lack of space. However, the idea of
the holoport refused to die.
I kept buying more PCs to get to my ideal 8-camera set-up. I cleared a space in the loft and decided to fix the position of each camera, rather than use tripods. I also began to tackle a few other issues. I didn’t find TeamViewer an ideal solution for remoting into each PC, so I found a cheap USB-and-VGA KVM-style hub on eBay. I also networked the slave PCs to my main PC for running the multi-sensor client. Finally, I investigated using
a Vive tracker to determine the relative position and orientation of each node.
In the case of the Vive tracker, I thought I might have issues with it working while the Kinects were running, due to both systems’ use of IR sources. This didn’t appear to be the case, but I feel it might have been pot luck (tip: you could always use the sync cable between lighthouse units). Unfortunately, the brief amount of time I devoted to the co-ordinate transform issue wasn’t enough to solve it and I couldn’t transform the data correctly to get the Brekel software to align the sensors, so I went back to the marker solution rather than invest any more time in an alternative approach.
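For the record, the maths I was attempting is easy to state, even though I never got it working: compose each tracker’s pose (reported by SteamVR in the lighthouse frame) with a fixed tracker-to-Kinect mounting offset to obtain every Kinect’s pose in one shared world frame. A minimal sketch with 4x4 homogeneous matrices is below; the values are placeholders, and measuring the mounting offset accurately is the part that defeated me.

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a 3x3 rotation and a translation into a 4x4 pose matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Pose of a Vive tracker in the SteamVR (lighthouse) frame, as reported
# by OpenVR - identity rotation and made-up translation as placeholders.
T_world_tracker = to_homogeneous(np.eye(3), np.array([1.0, 1.5, 0.0]))

# Fixed mounting offset from the tracker to the Kinect optical centre,
# measured once per rig - this is the hard part to get right.
T_tracker_kinect = to_homogeneous(np.eye(3), np.array([0.0, -0.05, 0.02]))

# Kinect pose in the shared world frame: compose the two transforms.
T_world_kinect = T_world_tracker @ T_tracker_kinect

# Map a point from this Kinect's camera frame into the world frame.
p_cam = np.array([0.1, 0.2, 1.5, 1.0])   # homogeneous point, metres
p_world = T_world_kinect @ p_cam
```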
Around this time, disaster struck.
Microsoft announced they were discontinuing the Kinect (I wasn’t too worried
about this) and the adapter. The
latter was a real issue. Stock rapidly disappeared, prices went crazy and I had
only 5 adapters - not enough for my 8-node ‘dream’ configuration. However, I
found that the Kinect can be hacked to take a
conventional USB3 lead and a DC power source. This hack worked and has some
advantages:
- I could decide the length of cables (USB and power)
- I didn’t need an annoying extra adapter break out box
- I could power 4 Kinects from a single CCTV power brick
- Most of the length of the ‘cable’ could be wrapped as a single cord
- I could sell my Kinect adapters and recoup some of my investment in this mad hobby project!
HP rack mount for SFF PCs (HP image)
Now, by this time I had even more PCs and bits and bobs strewn around the loft. I needed a new box. On this front, I had a couple of breakthroughs. By luck I discovered that HP had designed a server rack mount for their SFF PC line that fits 8 systems into a standard 19” rack, but once again these could only be sourced from eBay at inflated prices. With time and patience, I managed to snag one at a more sensible price. I also found a compact rack case that is not as deep as a standard server rack but is the same width (it appears audio-visual gear uses the same standard width as servers). This should give me a portable solution. A few other upgrades have also happened: a cheap 8-port HDMI switch plus DisplayPort adapter cables (I couldn’t stand the quality of the VGA output), a 16-port gigabit network hub and shelves for the rack case. I also wanted the system to be easy to configure, so I added a Keystone jack back plate to the rack that lets me mount DC and USB 3 ports for each Kinect, plus HDMI and network ports should I want to connect an external PC and display. It will be just ‘plug and play’ - nice. To ease management of the holoport hardware, I’ve adopted a naming system for each node based on a colour: the PC takes the colour as its name, the cables are colour coded, the desktop background is set to the node’s colour, and the Kinect, KVM switch boxes and ports are all colour coded too. It just makes diagnosing problems and checking the set-up so much easier.
Current Status
That brings things pretty much up to date. I’m currently in the process of installing all the kit and have one more Kinect camera to adapt and add as a node.
In terms of system performance, I still have issues with the sensor position calibration. I’m really hoping that Jasper Brekelmans’ research on the multi-sensor body alpha will lead to a better calibration solution coming to PointCloud V3, which looks likely. I have managed to configure the relative position and orientation of 6 out of 7 nodes, but it’s a frustrating process that has to be redone if a sensor is accidentally knocked.
For proof it works, here’s a point cloud of my loft, captured from six Kinects running simultaneously with the Brekel software.
A few other observations from my journey:
- At some stage I want to have a system that can work outside, ideally in sunlight. Kinect sensors are swamped by sunlight, but there are several alternatives. Microsoft themselves now recommend Kinect developers move to the Intel RealSense D400 series of depth cameras. I have one RealSense D435 camera to evaluate, which is the wide-angle version. The pros of these depth cameras relative to the Kinect are:
  - They can work outside (they use stereo IR cameras with an optional active IR projector to inject structure into a scene, if needed). This means they can work in sunlight.
  - They are really small.
  - They have a higher, configurable depth resolution than the Kinect v2.
  - In theory, they don’t interfere in a multi-sensor system. However, there are some unexpected system design quirks, particularly with the D435 regarding synchronisation. Read more here.
  - They are powered by USB, so just one cable and no need for a dedicated power source.
  - They are supported by PointCloud V2 and the multi-sensor alpha.
  - More than one can be connected to a PC and the performance has been characterised. See here. (There’s a minimal sketch of this below.)
  - However, they are relatively expensive compared to buying Kinects on the second-hand market (£30 vs £150+), particularly given that I now have the PCs to drive the Kinects. I also feel the depth performance is not as good as the Kinect’s, and there is no directional audio capture. There’s more about tuning the depth performance in this whitepaper.
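As an aside, here is roughly what driving several RealSense cameras from a single PC looks like with Intel’s pyrealsense2 Python wrapper. This is a minimal sketch: I’m assuming 848x480 @ 30fps is an acceptable depth mode for your particular model (it’s a commonly recommended one for the D435), so check the supported modes before reusing it.

```python
import pyrealsense2 as rs

ctx = rs.context()
pipelines = []

# Start one depth pipeline per connected RealSense device.
for dev in ctx.query_devices():
    serial = dev.get_info(rs.camera_info.serial_number)
    cfg = rs.config()
    cfg.enable_device(serial)  # bind this pipeline to one camera
    cfg.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)
    pipe = rs.pipeline(ctx)
    pipe.start(cfg)
    pipelines.append((serial, pipe))

# Pull one depth frame from each camera in turn.
for serial, pipe in pipelines:
    frames = pipe.wait_for_frames()
    depth = frames.get_depth_frame()
    print(serial, depth.get_width(), depth.get_height())
```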
- There are other solutions for capturing point cloud data from multiple Kinects, including LiveScan3D and one on SteamVR (sorry, I can’t recall the link).
- I’ve seen bad interference between Kinects, but most of the time they are fine.
  - There was a research paper, “Resolving Multipath Interference in Kinect: An Inverse Problem Approach”, that used custom firmware with configurable modulation frequencies to let multiple Kinect sensors perform better together. Unfortunately, this firmware was never made available to the public.
- Just getting the point cloud data is only the beginning. You (may) need to process it to get a mesh, and then you need to animate it. There’s loads of interesting research on solving this problem. Here are a few of my favourite references:
  - You can attempt an improved point cloud renderer that fills in the gaps, as done by Nurulize with their AtomView software. I have written shaders to project colour texture information onto geometry as a light source in Unity, which is not normally supported. This approach has some merit, as do geometry shaders.
  - You could process the data to get a mesh (see the sketch below).
  - For performance capture of people, you can then deform a mesh rather than recalculate it for each frame.
  - You need to deal with different lighting conditions across viewpoints. This is discussed in several solutions. One of the real leaders in this area is the team at the Visual Computing Lab at the ITI, and two of their papers are worth reading.
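On the point-cloud-to-mesh step above, here is a minimal sketch using Open3D’s Poisson surface reconstruction - one well-known approach, not the method from any of the papers or tools I’ve mentioned. The input file name is hypothetical (e.g. a fused cloud exported as a .ply):

```python
import numpy as np
import open3d as o3d

# Load a fused point cloud (hypothetical file exported from a capture).
pcd = o3d.io.read_point_cloud("capture.ply")

# Poisson reconstruction needs per-point normals.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)

# Poisson tends to hallucinate surface at the edges; trimming the
# lowest-density 5% of vertices removes most of it.
d = np.asarray(densities)
mesh.remove_vertices_by_mask(d < np.quantile(d, 0.05))
o3d.io.write_triangle_mesh("capture_mesh.ply", mesh)
```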
- I have some information about Kinect Configuration Verifier performance across my different hardware configurations (see the table below). NB these figures are from the verifier only, not while saving data.
  - I know I’ll have to deal with dropped frames or drift between sensors, but I don’t have much of a solution yet. Others have used a synchronised clock or a light source visible to all cameras to sync (see the sketch below).
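On that last point, one software-only fallback is to timestamp frames on every node (with the clocks disciplined via NTP), then group frames in post by nearest timestamp, treating a node as having dropped a frame when nothing lands within half a frame period. A minimal sketch of the matching step (my own illustration, hypothetical data layout):

```python
import bisect

FRAME_INTERVAL = 1.0 / 30.0  # Kinect v2 delivers depth at 30fps

def align_frames(streams):
    """streams: {node_name: sorted list of frame timestamps in seconds}.
    For each frame of the first node, yield the nearest frame from every
    other node, or None where that node dropped (or drifted past) it."""
    ref_name, ref_times = next(iter(streams.items()))
    for t in ref_times:
        group = {ref_name: t}
        for name, times in streams.items():
            if name == ref_name:
                continue
            i = bisect.bisect_left(times, t)
            # Nearest candidate is either just before or just after t.
            candidates = times[max(i - 1, 0):i + 1]
            best = min(candidates, key=lambda x: abs(x - t), default=None)
            if best is not None and abs(best - t) <= FRAME_INTERVAL / 2:
                group[name] = best
            else:
                group[name] = None  # dropped frame or drifted clock
        yield group

# Toy example: node 'blue' dropped its second frame.
streams = {"red": [0.000, 0.033, 0.066], "blue": [0.001, 0.067]}
for g in align_frames(streams):
    print(g)
```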
| Model | Processor | Clock | Memory | Primary HDD | HDD Type | GPU | FPS (low) | FPS (high) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ProDesk 600 G2 | i3-6300T (Skylake) | 3.3GHz | 4GB | 128GB | M.2 | HD Graphics 530 | 15* | 20* |
| HTPC | i7-3770T | 2.5GHz | 8GB | 128GB | SSD | HD Graphics 4000 | 29 | 30 |
| 260 G1 | Celeron 2957U | 1.4GHz | 4GB | 32GB | M.2 | HD Graphics | 26 | 30 |
| 260 G1 | Celeron 2957U | 1.4GHz | 4GB | 32GB | M.2 | HD Graphics | 24 | 30 |
| 260 G1 | Celeron 2957U | 1.4GHz | 4GB | 32GB | M.2 | HD Graphics | 28 | 30 |
| HP9 | i3-4330T | 3.0GHz | 4GB | 465GB | SATA | | 30 | 30 |
| HP9 | i3-4330T | 3.0GHz | 4GB | 465GB | SATA | | 30 | 30 |
| HP 260 G1 DM | Pentium 3558U | 1.7GHz | 4GB | 462GB | SATA | HD Graphics | 29 | 30 |
| EliteDesk 800 G1 | i5-4590T | 2.0GHz | 8GB | 500GB | SATA | | TODO | TODO |
* I had driver issues with this system. Now resolved and likely to hit and maintain 30fps.
----
Update 17th August
I've come to the conclusion that the Celeron 2957U-based nodes are too slow for anything other than playing around, and I've now finished the process of retiring them, replacing them with various flavours of i5 system. This is much better, as they hold their frame rate well. I'll update the table when I have more information.
----
- Holoportation and true holographic capture will become far more viable in the marketplace, and Microsoft have been very active on this front. However, sensor set-ups will remain costly due to the need for multiple viewpoints. I’ve listed a few sources here, but I know of more.
  - I can’t wait to get hold of the Project Kinect for Azure solution; in fact, this is one reason I chose to write up my work. Hopefully getting hold of one will be possible. It appears to have all the advantages of the RealSense, but I would expect less depth noise and better depth accuracy. It also has the option of 4K video and 360-degree audio capture. MS is also targeting holoportation as a potential application on their example application pages. Of course, there are still many important details we don’t know…
  - Other examples:
    - Holographic capture (not using depth sensors) at Microsoft's mixed reality capture studios or by 8i.
    - HoloLens applications and virtual teleconferencing.
- I remain very committed to the use case. Holographic capture will become the ultimate home movie, and recording your children as they grow up is going to provide such rich and emotive media, particularly when you add mid-air haptics... which is also on the list. Watch out for my virtual ghosts project.
More here when the system is finally put together. Honestly, it will look a lot better than the mess of bits shown in the earlier pictures. I have all the bits now and can shift to the software side, before everything becomes obsolete!
Getting closer to a working system. Just waiting on some USB 3 cables. What you see here is the front of the unit with 8 PCs mounted, plus access to an 8-port KVM for the keyboard and an 8-port HDMI switch.
At the back, on the left, we have connectors for power to the 7 Kinects* and USB 3 ports, so each Kinect camera connects using a pair of connectors - this is how I've cabled the Kinects. Internally, all cables are colour coded and the nodes themselves are named after their colour, with a theme set to match, so it's easy to tell them apart when debugging across the connected HDMI devices or when looking at the feeds on the master unit. To the right of the connector panel are ports to connect to a network (i.e. if you have an external master controlling the capture), an HDMI out port to connect a display, and there will be a USB port to connect the external keyboard. For the nerds out there, the keyboard in the picture is a Touchstream.
*At the moment I haven't cabled up the 8th Kinect node, simply because I might just have the 8th PC dedicated to management of the data from the other devices - i.e. a dedicated master.
It's a pretty neat set-up. I'll post capture results soon. Each PC currently boots and is connected to the others via an internal 16-port gigabit switch. The KVM and display are working too. I just need to wire up the Kinects and I'll be back in action...
Comments
Did you get any further in the multi sensor setup? We're currently investigating a similar "holoportation" setup and we think that the LiveScan3D project looks quite promising. Did you try it with your setup?