Towards Monocular 3D Pose Estimation

Traditionally, pose estimation and skeleton extraction were difficult tasks that required depth cameras and expensive hardware to post-process the captured data.
 
The introduction of Microsoft Kinect, which was initially aimed at the gaming community, had a significant impact on pose estimation and paved the way for similar sensors and more research in the field.
 
Lately, we are seeing more and more alternatives that not only require fewer resources but can also work with a simple RGB camera, without relying on depth information to help with the estimation. Of course, the lack of depth information would normally mean no Z-axis. However, that is also changing thanks to better ML models focused on inferring the missing depth from a monocular image. Together, these two developments could soon make a 3D sensor unnecessary for certain use cases, which not only lowers the overall hardware cost but also simplifies on-site installation thanks to the smaller footprint of a 2D camera.
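To illustrate why the Z-axis depends on depth, here is a minimal sketch of standard pinhole-camera back-projection. It is not our production pipeline: the function name back_project and the intrinsics (FX, FY, CX, CY) are hypothetical values chosen for illustration, and the depth values stand in for whatever a depth sensor or a monocular ML model would supply.

```python
from typing import Tuple


def back_project(u: float, v: float, depth: float,
                 fx: float, fy: float,
                 cx: float, cy: float) -> Tuple[float, float, float]:
    """Map a pixel (u, v) plus a depth estimate to a camera-space point (X, Y, Z).

    Uses the pinhole camera model: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. Depth is in the same unit as the output.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


# Hypothetical intrinsics for a 640x480 webcam (assumed, not measured).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0

# The same 2D keypoint maps to different 3D points depending on the depth,
# which is why a plain RGB image alone cannot fix the Z coordinate.
near = back_project(420.0, 240.0, 1.0, FX, FY, CX, CY)
far = back_project(420.0, 240.0, 2.0, FX, FY, CX, CY)
```

Both points lie on the same viewing ray through the pixel. A depth sensor measures where along that ray the joint sits, while a monocular model has to estimate it from image cues.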
 
Here at Munogu, we are excited to have finished converting some of our well-known gamification solutions, which traditionally required 3D sensors, to work with a simple webcam while maintaining accuracy and precision similar to what a time-of-flight camera has provided since we first started developing them. Many products in our gamification catalog are available today in both variants, providing virtually identical results.