Lessons Learned Creating a DIY Holographic Capture Rig using Multiple Kinect Sensors
I’ve always had a strong
interest in technology and photography and consider myself to be an early
adopter. I also have a passion for graphics and virtual reality, and a background in software architecture and development. Several years ago, I
acquired a Kinect for Windows device, but because of USB issues on my main PC,
I didn’t get around to playing with the hardware for a long time. Kinect can be notoriously fussy about which USB 3 controllers it will work with - a well-documented issue.
Eventually, I managed to get a USB expansion card that played nicely
with the Kinect and I fired up the SDK examples to see what it could do. I was impressed,
even if there was a lot of depth noise, particularly at the edges of objects - the so-called ‘flying pixels’.
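As an aside, the standard trick for taming flying pixels is to discard any depth sample that sits far from all of its immediate neighbours. Here’s a minimal sketch of that idea (my own illustration, not code from the Kinect SDK), with a random numpy array standing in for a real 512x424 Kinect v2 depth frame:

```python
import numpy as np

def filter_flying_pixels(depth, threshold_mm=80):
    """Zero out 'flying pixels': depth values that jump sharply away
    from all four axis-aligned neighbours (typical ToF edge noise).
    Edges wrap around via np.roll - good enough for a sketch."""
    d = depth.astype(np.float32)
    diffs = np.stack([
        np.abs(d - np.roll(d, 1, axis=0)),   # neighbour above
        np.abs(d - np.roll(d, -1, axis=0)),  # neighbour below
        np.abs(d - np.roll(d, 1, axis=1)),   # neighbour to the left
        np.abs(d - np.roll(d, -1, axis=1)),  # neighbour to the right
    ])
    # A pixel is 'flying' if even its most similar neighbour is far away.
    flying = diffs.min(axis=0) > threshold_mm
    cleaned = depth.copy()
    cleaned[flying] = 0  # 0 means 'no reading' in Kinect depth frames
    return cleaned

# Stand-in for a 512x424 Kinect v2 depth frame, in millimetres.
frame = np.random.randint(500, 4500, size=(424, 512)).astype(np.uint16)
print(filter_flying_pixels(frame).shape)
```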
I’m not sure when it happened,
whether it was prompted by getting an Oculus VR kickstarter headset (DK1) or by
the birth of my daughter, but at some stage I started dreaming of capturing the
world around me, for what I would call the ultimate home movies. Technically, I
guess this is most commonly referred to as holographic capture. I would think
about optimal configurations of Kinect cameras for live action capture (a
circle of Kinects facing inwards to create a capture volume of about 2x2x2m) or
environment capture (an arrangement of vertically mounted Kinects facing outwards, mounted on a tripod). In most of these scenarios, I thought I would need 6-8 Kinect
devices, due to the FOV of the sensor and a need to adequately capture the
subject. Of course, totting up the cost of such a system proved off-putting,
particularly for a hobby project, but the idea wouldn’t go away.
About 3 or 4 years ago,
I happened across an article about ultra SFF PCs that were on clearance at
various UK online stores for about £85 per unit. The model was the HP 260 G1. The specs and design of these PCs were attractive for use with the Kinect: they had USB 3 ports, an adequate integrated GPU, support for 4GB of RAM or more, an M.2 slot for fast drive access, space for an internal drive, mounting options, and a really compact case (177 x 175 x 34 mm / 6.9 x 7 x 1.3 in). There were two issues, though: would the USB 3 ports be compatible with the Kinect, and how would the processor (a humble Celeron 2957U) cope with the Kinect if the port worked? I caved almost immediately and bought one to test. It worked with the Kinect and achieved about 24fps in the Configuration Verifier. Like many systems, the Configuration Verifier often states “usb port with unknown bandwidth” but works fine. Over the next month, I justified buying another 2
of these PCs before they went out of stock and were impossible to source. I
upgraded the RAM, found the custom HDD cable required to add a larger internal HDD
and bought new Kinects, adapters and tripods. However, the amount of free time
I had to devote to this was limited. There were also still unresolved issues with the system: how could I trigger a recording across multiple nodes, and would interference be a problem? Kinects use time-of-flight (ToF) sensors that can interfere with each other, producing unstable depth values.
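The triggering half of the problem is the more tractable one. A minimal sketch of one way to do it (my own illustration - not how the Brekel software works, and the port and message are invented) is a UDP broadcast from a master to every node on the LAN:

```python
import socket

TRIGGER_PORT = 9999  # arbitrary choice, assumed free on every node

def send_trigger(message=b"START_RECORDING"):
    """Master: broadcast a start command to all nodes on the LAN."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
    sock.sendto(message, ("255.255.255.255", TRIGGER_PORT))
    sock.close()

def wait_for_trigger():
    """Node: block until the master broadcasts, then start capturing."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", TRIGGER_PORT))
    message, master = sock.recvfrom(1024)
    if message == b"START_RECORDING":
        print(f"Trigger received from {master[0]} - start Kinect capture")
```

Broadcast keeps the nodes loosely in step without the master needing to know their addresses; it does nothing for frame-level sync, which is a separate problem I come back to later.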
While I have the skills to code a solution for driving more than one Kinect, free time is a serious constraint these days. Researching point cloud processing solutions, I discovered Brekel’s PointCloud software and a post about its multi-sensor alpha. I bought a copy of PointCloud V2 and gained
access to the alpha. I also bought an 8-port gigabit Ethernet hub and the
necessary cables to connect the PCs.
At the same time, I
started thinking about system mobility. It would be an effort to move this system out of my office - even with only three nodes it was already a mess of cables and boxes. Could I find something to house the kit and achieve two objectives (with one constraint: that I could do the work myself)?
- Ideally, I wanted to have only one power cable going into a portable ‘box’ that contained all the PSUs, adapters, the network hub and PCs
- To have only one lead coming out of the box for each Kinect camera
I found an old wooden
storage box and realised that I could probably fit up to 5 PCs in there. Using
a normal power drill I could route cables as required. There were some size
constraints though and I needed to make things as compact as possible. For
maximum ‘compactness’ I bought:
- right angled USB3 adapters and right angled power adapters to minimise space for cable routing
- ultra-short network cables (and then ended up buying longer ones!)
- clover-leaf and figure-of-8 power connector adapters, plus C14 splitter cables and the necessary sockets, to reduce the number of power cables that had to fit in the box
- a compact multi-port power socket
- a C14 to UK 3 pin socket adapter for the network hub
By now I was on eBay
watching all HP 260 G1 auctions (and variants) to see what I could pick up cheaply
and managed to snag a bargain: two i3 PCs at a pretty good price.
A word on keeping
costs low. I’ve watched eBay for years now and know when I see a decent price
for an HP PC. However, Kinect sensors can be purchased more cheaply at second-hand gaming outlets, mainly because so many Xbox One gamers were selling them off - at launch the sensor was bundled with the console, but the requirement to have a Kinect was later dropped. I don’t think I’ve ever paid
more than £30 for a sensor. I was also fortunate that I bought most Kinect for
Windows (and later Xbox One S) adapters before they were discontinued and hit
crazy prices. They used to be £30.
So, I now had my five-unit set-up. However, one issue remained (and is still a problem): calibrating the relative sensor positions is difficult. The marker calibration system supported by the Brekel alpha can be unreliable and would need to be re-run each time the sensors were moved. This was a non-starter for my set-up.
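For context, marker calibration ultimately reduces to estimating a rigid transform (a rotation plus a translation) that maps the marker positions seen by one sensor onto the same markers seen by another. Below is a minimal sketch of the textbook SVD-based (Kabsch) solution - my own illustration, not necessarily what the Brekel alpha implements:

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that
    R @ p + t maps src points onto dst points (Nx3 arrays of
    corresponding marker positions) - the Kabsch/Procrustes solution."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c                   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Toy example: four markers seen identically by two sensors gives
# the identity rotation and zero translation.
markers = np.random.rand(4, 3)
R, t = rigid_transform(markers, markers)
```

With at least three non-collinear markers visible to a pair of sensors, this gives the pose of one sensor relative to the other; chain the transforms to put every sensor into one frame.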
About this time, my son was born and I lost access to my office (and had even less free time!). All the kit went up in
the loft and remained packed away due to lack of space. However, the idea of
the holoport refused to die.
I kept buying more PCs to get to my ideal 8-camera set-up. I cleared a space in the loft and decided to fix the position of each camera, rather than use tripods. I also began to tackle a few other issues. I didn’t find TeamViewer an ideal solution for remoting into each PC, so I found a cheap USB-and-VGA KVM-style hub on eBay. I also networked the slave PCs to my main PC for running the multi-sensor client. Finally, I investigated using
a Vive tracker to determine the relative position and orientation of each node.
In the case of the Vive tracker, I thought I might have issues with it working while the Kinects were running, due to both systems’ use of IR sources. This didn’t appear to be the case, but I feel it might have been pot luck (tip: you could always use the sync cable between lighthouse units). Unfortunately, the brief amount of time I devoted to the co-ordinate transform issue wasn’t enough to solve it and I couldn’t transform the data correctly to get the Brekel software to align the sensors, so I went back to the marker solution rather than invest any more time in an alternative approach.
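For the record, the maths I was attempting is easy to state, even though I never got it working: compose each tracker’s pose (reported by SteamVR in the lighthouse frame) with a fixed tracker-to-Kinect mounting offset to obtain every Kinect’s pose in one shared world frame. A minimal sketch with 4x4 homogeneous matrices is below; the values are placeholders, and measuring the mounting offset accurately is the part that defeated me.

```python
import numpy as np

def to_homogeneous(R, t):
    """Pack a 3x3 rotation and a translation into a 4x4 pose matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Pose of a Vive tracker in the SteamVR (lighthouse) frame, as reported
# by OpenVR - identity rotation and made-up translation as placeholders.
T_world_tracker = to_homogeneous(np.eye(3), np.array([1.0, 1.5, 0.0]))

# Fixed mounting offset from the tracker to the Kinect optical centre,
# measured once per rig - this is the hard part to get right.
T_tracker_kinect = to_homogeneous(np.eye(3), np.array([0.0, -0.05, 0.02]))

# Kinect pose in the shared world frame: compose the two transforms.
T_world_kinect = T_world_tracker @ T_tracker_kinect

# Map a point from this Kinect's camera frame into the world frame.
p_cam = np.array([0.1, 0.2, 1.5, 1.0])   # homogeneous point, metres
p_world = T_world_kinect @ p_cam
```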
Around this time, disaster struck.
Microsoft announced they were discontinuing the Kinect (I wasn’t too worried
about this) and the adapter. The
latter was a real issue. Stock rapidly disappeared, prices went crazy and I had
only 5 adapters - not enough for my 8-node ‘dream’ configuration. However, I
found that the Kinect can be hacked to take a
conventional USB3 lead and a DC power source. This hack worked and has some
advantages:
- I could decide the length of cables (USB and power)
- I didn’t need an annoying extra adapter break out box
- I could power 4 Kinects from a single CCTV power brick
- Most of the length of the ‘cable’ could be wrapped as a single cord
- I could sell my Kinect adapters and recoup some of my investment in this mad hobby project!
HP rack mount for SFF PCs (HP image)
Now, by this time I had even more PCs and bits and bobs strewn around the loft. I needed a new box. On this front, I had a couple of breakthroughs. By luck I discovered that HP had designed a server rack mount for their SFF PC line that fits 8 systems into a standard 19” rack, but once again these could only be sourced from eBay at inflated prices. With time and patience, I managed to snag one at a more sensible price. I also found a compact rack case that is not as deep as a standard server rack but is the same width (it appears audio-visual gear uses the same standard width as servers). This should give me a portable solution. A few other upgrades have also happened: a cheap 8-port HDMI switch plus DisplayPort adapter cables (I couldn’t stand the quality of the VGA output), a 16-port gigabit network hub and shelves for the rack case. I also wanted the system to be easy to configure, so I added a Keystone jack back plate to the rack that lets me mount DC and USB 3 ports for each Kinect, plus HDMI and network ports should I want to connect an external PC and display. It will be just ‘plug and play’ - nice. To ease management of the holoport hardware, I’ve adopted a naming system for each node based on a colour: the PC takes the colour as its name, the cables are colour coded, the desktop background is set to the node’s colour, and the Kinect, KVM switch boxes and ports are all colour coded too. It just makes diagnosing problems and checking the set-up so much easier.
Current Status
That brings things pretty much up to date. I’m currently in the process of installing all the kit and have one more Kinect camera to adapt and add as a node.
In terms of system performance, I still have issues with the sensor position calibration. I’m really hoping that Jasper Brekelmans’ research on the multi-sensor body alpha will lead to a better calibration solution coming to PointCloud V3, which looks likely. I have managed to configure the relative position and orientation of 6 out of 7 nodes, but it’s a frustrating process that has to be redone if a sensor is accidentally knocked.
For proof it works, here’s a point cloud of my loft, captured from six Kinects running simultaneously with the Brekel software.
A few other observations from my journey:
- At some stage I want to have a system that can work outside, ideally in sunlight. Kinect sensors are swamped by sunlight, but there are several alternatives. Microsoft themselves now recommend Kinect developers move to the Intel RealSense D400 series of depth cameras. I have one RealSense D435 camera to evaluate, which is the wide-angle version. The pros of these depth cameras relative to the Kinect are:
  - They can work outside (they use stereo IR cameras with an optional active IR projector to inject structure into a scene, if needed). This means they can work in sunlight.
  - They are really small.
  - They have a higher, configurable depth resolution than the Kinect v2.
  - In theory, they don’t interfere in a multi-sensor system. However, there are some unexpected system design quirks, particularly with the D435 regarding synchronisation. Read more here.
  - They are powered by USB, so just one cable and no need for a dedicated power source.
  - They are supported by PointCloud V2 and the multi-sensor alpha.
  - More than one can be connected to a PC and the performance has been characterised. See here. (There’s a minimal sketch of this below.)
  - However, they are relatively expensive compared to buying Kinects on the second-hand market (£30 vs £150+), particularly given that I now have the PCs to drive the Kinects. I also feel the depth performance is not as good as the Kinect’s, and there is no directional audio capture. There’s more about tuning the depth performance in this whitepaper.
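As an aside, here is roughly what driving several RealSense cameras from a single PC looks like with Intel’s pyrealsense2 Python wrapper. This is a minimal sketch: I’m assuming 848x480 @ 30fps is an acceptable depth mode for your particular model (it’s a commonly recommended one for the D435), so check the supported modes before reusing it.

```python
import pyrealsense2 as rs

ctx = rs.context()
pipelines = []

# Start one depth pipeline per connected RealSense device.
for dev in ctx.query_devices():
    serial = dev.get_info(rs.camera_info.serial_number)
    cfg = rs.config()
    cfg.enable_device(serial)  # bind this pipeline to one camera
    cfg.enable_stream(rs.stream.depth, 848, 480, rs.format.z16, 30)
    pipe = rs.pipeline(ctx)
    pipe.start(cfg)
    pipelines.append((serial, pipe))

# Pull one depth frame from each camera in turn.
for serial, pipe in pipelines:
    frames = pipe.wait_for_frames()
    depth = frames.get_depth_frame()
    print(serial, depth.get_width(), depth.get_height())
```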
- There are other solutions for capturing point cloud data from multiple Kinects, including LiveScan3D and one on SteamVR (sorry, I can’t recall the link).
- I’ve seen bad interference between Kinects, but most of the time they are fine.
  - There was a research paper, “Resolving Multipath Interference in Kinect: An Inverse Problem Approach”, that used custom firmware with configurable modulation frequencies to let multiple Kinect sensors perform better together. Unfortunately, this firmware was never made available to the public.
- Just getting the point cloud data is only the beginning. You (may) need to process it to get a mesh, and then you need to animate it. There’s loads of interesting research on solving this problem. Here are a few of my favourite references:
  - You can attempt an improved point cloud renderer that fills in the gaps, as done by Nurulize with their AtomView software. I have written shaders to project colour texture information onto geometry as a light source in Unity, which is not normally supported. This approach has some merit, as do geometry shaders.
  - You could process the data to get a mesh (see the sketch below).
  - For performance capture of people, you can then deform a mesh rather than recalculate it for each frame.
  - You need to deal with different lighting conditions across viewpoints. This is discussed in several solutions. One of the real leaders in this area is the team at the Visual Computing Lab at the ITI, and two of their papers are worth reading.
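On the point-cloud-to-mesh step above, here is a minimal sketch using Open3D’s Poisson surface reconstruction - one well-known approach, not the method from any of the papers or tools I’ve mentioned. The input file name is hypothetical (e.g. a fused cloud exported as a .ply):

```python
import numpy as np
import open3d as o3d

# Load a fused point cloud (hypothetical file exported from a capture).
pcd = o3d.io.read_point_cloud("capture.ply")

# Poisson reconstruction needs per-point normals.
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))

mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)

# Poisson tends to hallucinate surface at the edges; trimming the
# lowest-density 5% of vertices removes most of it.
d = np.asarray(densities)
mesh.remove_vertices_by_mask(d < np.quantile(d, 0.05))
o3d.io.write_triangle_mesh("capture_mesh.ply", mesh)
```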
- I have some information about Kinect Configuration Verifier performance across my different hardware configurations (see the table below). NB these figures are from the verifier only, not while saving data.
  - I know I’ll have to deal with dropped frames or drift between sensors, but I don’t have much of a solution yet. Others have used a synchronised clock or a light source visible to all cameras to sync (see the sketch below).
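On that last point, one software-only fallback is to timestamp frames on every node (with the clocks disciplined via NTP), then group frames in post by nearest timestamp, treating a node as having dropped a frame when nothing lands within half a frame period. A minimal sketch of the matching step (my own illustration, hypothetical data layout):

```python
import bisect

FRAME_INTERVAL = 1.0 / 30.0  # Kinect v2 delivers depth at 30fps

def align_frames(streams):
    """streams: {node_name: sorted list of frame timestamps in seconds}.
    For each frame of the first node, yield the nearest frame from every
    other node, or None where that node dropped (or drifted past) it."""
    ref_name, ref_times = next(iter(streams.items()))
    for t in ref_times:
        group = {ref_name: t}
        for name, times in streams.items():
            if name == ref_name:
                continue
            i = bisect.bisect_left(times, t)
            # Nearest candidate is either just before or just after t.
            candidates = times[max(i - 1, 0):i + 1]
            best = min(candidates, key=lambda x: abs(x - t), default=None)
            if best is not None and abs(best - t) <= FRAME_INTERVAL / 2:
                group[name] = best
            else:
                group[name] = None  # dropped frame or drifted clock
        yield group

# Toy example: node 'blue' dropped its second frame.
streams = {"red": [0.000, 0.033, 0.066], "blue": [0.001, 0.067]}
for g in align_frames(streams):
    print(g)
```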
| Model | Processor | Clock | Memory | Primary HDD | HDD Type | GPU | FPS (low) | FPS (high) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ProDesk 600 G2 | i3-6300T (Skylake) | 3.3GHz | 4GB | 128GB | M.2 | HD Graphics 530 | 15* | 20* |
| HTPC | i7-3770T | 2.5GHz | 8GB | 128GB | SSD | HD Graphics 4000 | 29 | 30 |
| 260 G1 | Celeron 2957U | 1.4GHz | 4GB | 32GB | M.2 | HD Graphics | 26 | 30 |
| 260 G1 | Celeron 2957U | 1.4GHz | 4GB | 32GB | M.2 | HD Graphics | 24 | 30 |
| 260 G1 | Celeron 2957U | 1.4GHz | 4GB | 32GB | M.2 | HD Graphics | 28 | 30 |
| HP9 | i3-4330T | 3.0GHz | 4GB | 465GB | SATA | | 30 | 30 |
| HP9 | i3-4330T | 3.0GHz | 4GB | 465GB | SATA | | 30 | 30 |
| HP 260 G1 DM | Pentium 3558U | 1.7GHz | 4GB | 462GB | SATA | HD Graphics | 29 | 30 |
| EliteDesk 800 G1 | i5-4590T | 2.0GHz | 8GB | 500GB | SATA | | TODO | TODO |
* I had driver issues with this system. Now resolved and likely to hit and maintain 30fps.
----
Update 17th August
I've come to the conclusion that the Celeron 2957U-based nodes are too slow for anything other than playing around, and I've now finished the process of retiring them, replacing them with various flavours of i5 system. This is much better, as they hold their frame rate well. I'll update the table when I have more information.
----
- Holoportation and true holographic capture will become far more viable in the marketplace, and Microsoft have been very active on this front. However, sensor set-ups will remain costly due to the need for multiple viewpoints. I’ve listed a few sources here, but I know of more.
  - I can’t wait to get hold of the Project Kinect for Azure solution; in fact, this is one reason I chose to write up my work. Hopefully getting hold of one will be possible. It appears to have all the advantages of the RealSense, but I would expect less depth noise and better depth accuracy. It also has the option of 4K video and 360-degree audio capture. MS is also targeting holoportation as a potential application on their example application pages. Of course, there are still many important details we don’t know…
  - Other examples:
    - Holographic capture (not using depth sensors) at Microsoft's mixed reality capture studios or by 8i.
    - HoloLens applications and virtual teleconferencing.
- I remain very committed to the use case. Holographic capture will become the ultimate home movie, and recording your children as they grow up is going to provide such rich and emotive media, particularly when you add mid-air haptics... which is also on the list. Watch out for my virtual ghosts project.
More here when the system is finally put together. Honestly, it will look a lot better than the mess of bits shown in the earlier pictures. I have all the bits now and can shift to the software side, before everything becomes obsolete!
Getting closer to a working system. Just waiting on some USB 3 cables. What you see here is the front of the unit with 8 PCs mounted, plus access to an 8-port KVM for the keyboard and an 8-port HDMI switch.
At the back, on the left, we have connectors for power to the 7 Kinects* and USB 3 ports, so each Kinect camera connects using a pair of connectors - this is how I've cabled the Kinects. Internally, all cables are colour coded and the nodes themselves are named after their colour, with a theme set to match, so it's easy to tell them apart when debugging across the connected HDMI devices or when looking at the feeds on the master unit. To the right of the connector panel are ports to connect to a network (i.e. if you have an external master controlling the capture), an HDMI out port to connect a display, and there will be a USB port to connect the external keyboard. For the nerds out there, the keyboard in the picture is a Touchstream.
*At the moment I haven't cabled up the 8th Kinect node, simply because I might just have the 8th PC dedicated to management of the data from the other devices - i.e. a dedicated master.
It's a pretty neat set-up. I'll post capture results soon. Each PC currently boots and is connected to the others via an internal 16-port gigabit switch. The KVM and display are working too. I just need to wire up the Kinects and I'll be back in action...
Comments
Did you get any further in the multi sensor setup? We're currently investigating a similar "holoportation" setup and we think that the LiveScan3D project looks quite promising. Did you try it with your setup?