The Next Big Step for AI? Understanding Video

Will Knight for MIT Technology Review: For a computer, recognizing a cat or a duck in a still image is pretty clever. But a stiffer test for artificial intelligence will be understanding when the cat is riding a Roomba and chasing the duck around a kitchen.

MIT and IBM this week released a vast data set of video clips painstakingly annotated with details of the action being carried out. The Moments in Time Dataset includes three-second snippets of everything from fishing to break-dancing.

“A lot of things in the world change from one second to the next,” says Aude Oliva, a principal research scientist at MIT and one of the people behind the project. “If you want to understand why something is happening, motion gives you a lot of information that you cannot capture in a single frame.”

The current boom in artificial intelligence was sparked, in part, by success in teaching computers to recognize the contents of static images by training deep neural networks on large labeled data sets (see “The Revolutionary Technique That Quietly Changed Machine Vision Forever”).

AI systems that interpret video today, including the systems found in some self-driving cars, often rely on identifying objects in static frames rather than interpreting actions. On Monday Google launched a tool capable of recognizing the objects in video as part of its Cloud Platform, a service that already includes AI tools for processing images, audio, and text.

