Prof. Dr. Hao Li is a true innovator within the industry; he was named in the ‘C-Suite Quarterly NextGen 10: Innovators under 40’ and MIT Technology Review’s ‘35 Innovators Under 35’! Li’s contributions span multiple fields under the computer science umbrella, with leading visual effects studios and radiation therapy manufacturers alike benefiting from his work. The former Industrial Light & Magic (ILM) Research Lead and current Assistant Professor in the Computer Science Department at the University of Southern California (USC) made the trip to Sydney for a Vivid Ideas talk, ‘On the Future of Digital Characters’.
As was to be expected, the large majority of attendees had ties to 3D, animation or related specialties. Li began with an overview of the digital character creation process, which many gamers will be familiar with. Speaking on the amazing fidelity we are able to achieve in presenting lifelike models today, Li turned to EA DICE’s Frostbite 3 engine, as showcased in Battlefield 4. But what Li has been attempting to do – and is making strides in – is make the creation of digital content much faster and more efficient, ideally to the point of being virtually automatic. That reality is actually close at hand, rather than a ten-year goal. We’ve been privy to the processes of developers and filmmakers who utilise multi-view stereo to obtain a fully 3D data-set and reconstruction of a model. However, those methods come with limitations, not the least of which are financial. Microsoft took the first step towards democratizing this sort of technology with the release of the Kinect as a standalone peripheral for Windows back in February 2012. The appropriation of real-time depth sensors for varied purposes, particularly since the Kinect’s launch, has been widely documented, and Li sees similar-quality but much more compact scanners arriving in our laptops, tablets and even smartphones in the very near future.
Before that day comes to pass, we must look at the steps being taken to get there. You may have noticed in the header image (and promotional ads) that Li is holding a 3D-printed model of himself. While the cost of buying a 3D printer remains extremely high – though many companies are trying to change that – what Li is actually demonstrating is the user’s ability to capture the data needed for such physical recreations simply by using the Kinect or similar hardware. The problems with this approach are the inconvenience of capturing every side of the subject, the inconsistency of lighting and shading caused by the subject’s movements, and a certain quality ceiling. An effective model can nonetheless be stitched together from these rigid scans, and the digital uses for this information are the most intriguing. In a collaboration with USC’s Institute for Creative Technologies (ICT), software was developed that automatically rigs (adds a skeleton to) a mesh made from that 3D data, with clear implications for 3D animation pipelines.
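For the curious, the core step in stitching rigid scans together is estimating the rotation and translation that best aligns one partial scan with another. The following is a toy sketch of that alignment step (the classic Kabsch/Procrustes solution, assuming point correspondences are already known) – not Li’s actual pipeline, just an illustration of the underlying maths:

```python
import numpy as np

def rigid_align(source, target):
    """Find rotation R and translation t minimising ||R @ s_i + t - t_i||
    over corresponding point pairs (Kabsch algorithm).
    source, target: (N, 3) arrays of corresponding 3D points."""
    src_c = source.mean(axis=0)            # centroids
    tgt_c = target.mean(axis=0)
    # Cross-covariance of the centred point sets
    H = (source - src_c).T @ (target - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection solution (det = -1)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t
```

A real scan-stitching system would alternate this step with re-estimating correspondences (ICP) and handle noise and partial overlap, but the least-squares alignment above is the kernel of it.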
Looking at film practices: James Cameron and his crew, while innovating with new technologies, still had to rely on marker-based motion capture for Avatar – far from ideal, considering the added manual labour required to correct and animate the nuances and in-between movements of the face where there are no markers. And before ILM introduced a new motion-capture suit able to accurately capture motion outdoors, filmmakers could forget any ideas involving outdoor shoots. Li proceeded to present footage of ILM tests predating the Kinect, in which a stream of significantly higher-resolution data could be captured within a certain space using two machine-vision cameras and a video projector. But at that stage the data was raw, unstructured and meaningless. This leads into the main focuses of Li’s research and experimentation: how to extract low-level information from raw data via correspondences, and how to analyse and detect semantics.
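To make the idea of “correspondences” concrete: at its simplest, it means deciding which point in one scan matches which point in another. A brute-force sketch of that matching step (purely illustrative – production systems use spatial trees, GPU search and far smarter matching than nearest-neighbour):

```python
import numpy as np

def nearest_correspondences(source, target):
    """For each source point, return the index of its closest target point.
    source: (N, 3) array; target: (M, 3) array. Brute-force O(N*M)."""
    # (N, M) matrix of squared distances between every source/target pair
    d2 = ((source[:, None, :] - target[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)
```

Everything interesting in Li’s work lives above this layer – deciding which matches are trustworthy, and attaching semantics (this is a nose, this is an eyelid) to otherwise meaningless points.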
The talk becomes quite technical and, truly, the best way to understand all the details and processes is with visual aids. For those who were unable to attend, or simply unaware of the opportunity, Li’s own website is a fantastic resource: he generously offers videos accompanying his papers, succinct demonstrations, the almost two dozen journal and conference papers he has written, a free geometry processing application for Mac called BeNTO 3D, and even some older lecture materials. If you have a decided interest in digitizing the real world, or in the applications of such technology in medicine, the military (counter-terrorism in particular), SFX and other fields, you should really check it out at hao.li. Aside from a sneak peek of a top-secret prototype for effectively dealing with occlusion whilst maintaining consistent, unconstrained and high-quality facial performance capture, you’ll see practically everything we saw (sorry!). Gamers, don’t be surprised if he has a hand in bringing life-like hair, garment and facial performance capture to that upcoming next-gen game that will undoubtedly impress you beyond words.
Soon, taking a complete 3D image of yourself at the press of a button will be a reality. Oh, and take one quick glance at his resume for an instant inferiority complex…what have we done with our lives?!