Approaching video forensics with fresh intelligence
New AI technology that mimics the human brain can help law enforcement and intelligence organizations rapidly identify patterns, objects and faces in large amounts of archived and live streaming video
Video is a critical element in crime prevention and investigation, yet current law enforcement systems are increasingly unable to cope. The sheer volume of surveillance material captured and stored every day is staggering, and set to rise dramatically. Adding more cameras to gather more information will only be useful if the processes used to search and analyze the mountain of data keep pace. As it stands, vital information may be missed because the vast majority of video is simply never viewed.
Information technology firm Cisco estimates that in 2021 it would take more than 5 million years to watch the amount of video traffic across the globe – each month. Market researcher IHS forecasts that 127 million surveillance cameras and 400,000 body-worn cameras will ship this year, in addition to the estimated 300 million cameras already deployed. By 2020 it is predicted there will be more than 1 billion cameras operated by smart cities worldwide, providing 30 billion frames of video per day. Internet video surveillance traffic alone increased 71 per cent in 2016 according to Cisco, and is set to increase sevenfold by 2021. Globally, 3.4 per cent of all video traffic crossing the internet will be video surveillance.
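To put the Cisco figure in human terms, the short calculation below converts "5 million years of video per month" into the number of people who would have to watch simultaneously just to keep up. It is a back-of-the-envelope sketch: the 5 million figure is the lower bound quoted above, and the 8-hour shift is an assumption added purely for illustration.

```python
# Lower bound quoted above: years of video crossing networks each month.
YEARS_OF_VIDEO_PER_MONTH = 5_000_000
MONTHS_PER_YEAR = 12

# One viewer watching non-stop gets through 1/12 of a year of video per month,
# so keeping pace needs 12 viewers for every year of video produced monthly.
round_the_clock_viewers = YEARS_OF_VIDEO_PER_MONTH * MONTHS_PER_YEAR

# Assumption for illustration only: each viewer works an 8-hour shift per day.
SHIFT_HOURS = 8
shift_viewers = round_the_clock_viewers * 24 // SHIFT_HOURS

print(f"{round_the_clock_viewers:,} viewers watching 24 hours a day")
print(f"{shift_viewers:,} viewers on {SHIFT_HOURS}-hour shifts")
```

On these assumptions, at least 60 million people would need to watch around the clock, or 180 million on 8-hour shifts, before allowing for any lapse in attention.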
Given that a major problem for surveillance operators is directed attention fatigue, where the brain naturally alternates between periods of attention and distraction, it would require a superhuman effort to identify and classify all these images. What is required is a system that is never distracted and can work in conjunction with people to reduce errors, which is what artificial intelligence-driven video systems promise.
AI in video surveillance can potentially deliver four times the performance of conventional video search – in contrast to human vigilance, which studies have shown can degrade by 95 per cent after about 20 minutes.
The cost of deep learning
Since 2012, when AI video analytics took off, the systems trained to recognize objects and facial IDs from different types of image have proved expensive to run and slow to compute, and require large datasets to generate results. These systems, which are based on convolutional neural networks (CNNs), employ an AI technique known as ‘deep learning’. They excel at churning through data but lack the ability to refine and react to streams of information gathered from the surrounding environment – which the human brain is extremely good at.
What’s more, CNNs exhibit limitations including poor noise immunity, particularly when random pixels appear in an image due to noisy sensors or lens contamination. They can return false classifications if the network becomes confused, for example by someone wearing glasses, and they cannot find a new face in a crowd unless a large set of labelled images of that face has been added to the database. The network parameters of CNNs need careful adjustment, and even then the accuracy rate for correct image classification may not be sufficient for video surveillance applications.
Spiking neural networks
A relatively new approach is the spiking neural network (SNN), which simulates and models the different aspects of the human brain’s operation much more closely than a CNN.
For instance, a police department that is looking for a suspect in live video streams does not have thousands of images of that suspect; nor does it have weeks to train a CNN system. An SNN-based system can find patterns and people in video in milliseconds and from a single image, which, importantly, can be as small as 24 x 24 pixels: it doesn’t need to be high definition. The system excels at recognition in low-light, low-resolution, noisy environments, making it ideal for the large number of previously installed video surveillance systems.
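This article does not describe the internals of any particular SNN product. As a rough illustration of what "spiking" means in general, the sketch below simulates a single leaky integrate-and-fire neuron, a common textbook model of a spiking neuron. The function name, threshold, leak factor and input values are all assumptions chosen for the example, not parameters of any real system.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    Returns a list with one entry per input sample: 1 if the neuron
    spiked at that step, 0 otherwise. All parameters are illustrative.
    """
    potential = reset
    spikes = []
    for current in input_current:
        # Leak away part of the stored potential, then add the new input.
        potential = leak * potential + current
        if potential >= threshold:
            # Crossing the threshold emits a spike and resets the neuron.
            spikes.append(1)
            potential = reset
        else:
            spikes.append(0)
    return spikes


# A weak, constant input never builds up enough potential to fire.
weak = simulate_lif([0.05] * 20)
# A stronger input crosses the threshold repeatedly.
strong = simulate_lif([0.6] * 20)

print("weak input spikes:  ", sum(weak))
print("strong input spikes:", sum(strong))
```

The neuron stays silent for the weak input and fires on every second step for the strong one, so information is carried by when and how often spikes occur rather than by a continuous output value.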
Unlike current CNN technologies that require extensive pre-labelled datasets and expensive cloud-based training and acceleration, an SNN system can be implemented in software with traditional computer processors (CPUs) and trained on-premises. The one-shot technology learns in real time and requires only modest processing power – typically a Windows- or Linux-based x86 desktop computer or server – as well as consuming little energy.
This enables a greater number of law enforcement organizations to capitalize on the opportunities offered by AI. It means AI algorithms can be used with legacy systems without requiring expensive hardware or infrastructure upgrades, and that they can be deployed in the field in highly secure environments that may not have cloud connectivity.
Tasks that seemed impossible for machines just a few years ago are becoming almost routine, and SNN technology has perhaps the greatest potential to bring valuable new capabilities into mainstream automated video surveillance today.
About the author:
Bob Beachler is Senior Vice President of Marketing and Business Development at BrainChip. He can be reached at: [email protected]