Video Analytics – An Introductory Overview

Co-Author: Benita Mordi, Artificial Intelligence and IoT Strategist

Overview

Growing volumes of video footage, combined with the limits of human biology, make video analytics software essential for processing large numbers of video streams. Information is most valuable at the moment it is needed, so the value of video depends on how many incidents can be captured and acted upon in real time. Video management systems assist security and surveillance personnel by monitoring video streams 24/7 and alerting them to activity that requires attention.

Advances in storage capabilities and in video and image resolution drove video analytics adoption over the last decade. The global video analytics market was valued at $2.77 billion in 2017 and is estimated to reach $8.55 billion by 2023. GPUs, as low-cost accelerators, make it easier to process video, strengthening the case for advanced video analytics. Signal stabilising technologies also improve the effectiveness of video analytics, which relies on quality video streams.

Video analytics has use cases across many industries. Some examples include:

  • Retail – Counting customers in a store, tracking movement, optimising store design and merchandise restocking
  • Transportation – Left luggage identification in airports
  • Healthcare – Thermal imaging for elevated temperature monitoring
  • Manufacturing/Industrial/Construction – Quality control, and health and safety compliance
  • Food Processing – Quality control
  • Entertainment and Sports – People counting to manage crowd traffic
  • City Operations – License plate recognition and vehicle counting for urban planning
  • Law Enforcement – Searching video content to help investigations

Multiple video streams are often paired to enable use cases and combined with IoT sensor data for comprehensive insight and situational awareness. Video management systems can also integrate with third-party security apps to help organisations take a holistic approach to their video analytics strategy.

How Video Analytics Works

Incoming streams of video are dense combinations of pictorial and audio data packaged as a collection of consecutive frames. For analytics, an individual frame provides no more insight than a typical photo. It is the sequential continuity of consecutive frames that provides the dynamics needed to extract insight from video data.

Video data is processed in two stages: motion detection and analysis, followed by pattern recognition. During motion detection and analysis, changes in pixel content are monitored to identify movement. Pattern recognition then classifies the objects in motion, determines their trajectories, and takes other moving objects into account.

Analytics-enabled cameras use a mathematical function to detect objects in motion by calculating the difference between frames. If the difference is anything but zero, movement is said to have occurred. Detecting movement this way is computationally simple; analysing that movement is far more demanding, and this is where AI comes in. Movement is viewed as a trajectory drawn from an initial frame and tracked to an object’s position in subsequent frames.
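The frame-difference idea can be sketched in a few lines of Python. This is a minimal illustration rather than what any particular camera runs: frames are assumed to be small grayscale grids stored as nested lists, and the names `changed_pixels`, `motion_detected`, `threshold` and `min_pixels` are hypothetical. The `threshold` parameter is an added assumption too; with its default of zero the check matches the "anything but zero" rule above, while a higher value lets small changes from sensor noise be ignored.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of pixel intensities, 0-255


def changed_pixels(prev: Frame, curr: Frame, threshold: int = 0) -> int:
    """Count pixels whose absolute intensity change exceeds `threshold`."""
    return sum(
        1
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
        if abs(c - p) > threshold
    )


def motion_detected(prev: Frame, curr: Frame,
                    threshold: int = 0, min_pixels: int = 1) -> bool:
    """Report motion when at least `min_pixels` pixels changed."""
    return changed_pixels(prev, curr, threshold) >= min_pixels


# Two tiny 3x3 frames: a bright "object" moves one pixel to the right.
frame_a = [[0, 0, 0], [200, 0, 0], [0, 0, 0]]
frame_b = [[0, 0, 0], [0, 200, 0], [0, 0, 0]]

print(changed_pixels(frame_a, frame_b))   # 2
print(motion_detected(frame_a, frame_b))  # True
print(motion_detected(frame_a, frame_a))  # False
```

The check only says that something moved. It says nothing about what moved or where it is heading, which is the harder problem.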

As an example, consider tracking one automobile’s movement in traffic for five seconds without confusing its trajectory with those of other automobiles in the extracted video stream. An image segmentation algorithm segments the vehicle in the initial frame and connects the identified image from frame to frame, thus drawing out the trajectory. Basic computer vision solutions handle this easily on CPU cores with a generous amount of RAM. In the case of 30 automobiles, 30 segmented images in 30 different spots need to be tracked without confusion when images overlap. For video analytics solutions to be valuable here, they need to process many objects per frame as well as their movement across frames, when the average rate is 60 FPS. Imagine looking at 60 photos in one second and fully understanding what was in each photo. More advanced computational capacity is needed to optimise for scale in scenarios like this.
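To make the frame-to-frame linking concrete, here is a rough sketch of one simple way to connect detections into trajectories. It is a swapped-in simplification, not the segmentation-based approach described above: each object is reduced to a centre point, and each existing track greedily claims its nearest detection in the new frame. The function `update_tracks` and the `max_dist` cut-off are hypothetical names chosen for this illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) centre of a detected object


def update_tracks(tracks: Dict[int, List[Point]],
                  detections: List[Point],
                  max_dist: float = 50.0) -> Dict[int, List[Point]]:
    """Extend each track with its nearest unclaimed detection.

    Detections that no track claims start new tracks.
    """
    unclaimed = list(detections)
    for path in tracks.values():
        if not unclaimed:
            break
        last = path[-1]
        nearest = min(unclaimed, key=lambda d: math.dist(last, d))
        if math.dist(last, nearest) <= max_dist:
            path.append(nearest)
            unclaimed.remove(nearest)
    next_id = max(tracks, default=-1) + 1
    for point in unclaimed:
        tracks[next_id] = [point]
        next_id += 1
    return tracks


# Two cars over three frames; the detection order changes between frames.
frames = [
    [(10, 10), (100, 10)],
    [(100, 12), (15, 10)],
    [(20, 10), (100, 14)],
]

tracks: Dict[int, List[Point]] = {}
for detections in frames:
    update_tracks(tracks, detections)

print(tracks[0])  # [(10, 10), (15, 10), (20, 10)]
print(tracks[1])  # [(100, 10), (100, 12), (100, 14)]
```

Greedy matching like this is easily confused when objects overlap or cross paths, which is exactly the difficulty in the 30-automobile case. At 60 FPS, detection and linking together would also have to finish within roughly 16.7 ms per frame (1000 ms / 60).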

Tools and Development

Deep learning techniques used in intelligent video analytics vary. One common approach is converting video frames to image files and applying Convolutional Neural Networks (CNNs) to detect objects in each frame. Hybrid models of CNNs and Recurrent Neural Networks (RNNs) are recommended for motion analysis. It’s outside the scope of this blog to cover all candidate tools and frameworks available to implement video analytics applications; however, here are some resources where a variety of tooling and development options are reviewed:
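For readers new to CNNs, the sketch below shows the sliding-window operation that gives convolutional layers their name, applied to a single tiny frame. It is only an illustration of that one building block, not an object detector: the kernel values are hand-picked here, whereas a trained CNN learns them from data, and the function name `conv2d` is an assumption for this example.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1).

    As in most deep learning libraries, the kernel is not flipped,
    so strictly speaking this is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# A 4x4 frame with a dark left half and a bright right half.
frame = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-picked kernel that responds to dark-to-bright vertical edges.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(frame, edge_kernel)
print(feature_map)  # [[0, 18, 0], [0, 18, 0], [0, 18, 0]]
```

The feature map is non-zero only in the middle column, where the edge sits. A CNN stacks many such learned filters to build up from edges to whole objects.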

  • Open Source Tools – 33 Open Source products on GitHub available for Video Analytics. The major difference amongst these is the use case they enable.
  • VidSaga, a global video marketers community, suggests 2020’s top 10 commercial Video Analytics tools for Business Intelligence.

The core strength of such tools and other popular frameworks is the APIs they offer to support algorithm implementation. Google Object Detection is popular for rapid creation of object detection models. Its APIs draw on the COCO data set, which contains roughly 330,000 images across 80 object categories, for object classification. It also works alongside libraries like OpenCV for segmentation and requires appropriate object labelling as a prerequisite for object tracking. This blog walks through a step-by-step approach to developing a video analytics application in Python, using the TensorFlow framework and OpenCV library, also applied via Google Object Detection APIs. For a complete implementation example, here’s how a soccer game is analysed. Players (objects) are detected by segmenting them in video frames using OpenCV to assign attributes like jersey colour, then labelled and tracked using Google Object Detection APIs.
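As a rough idea of what "assigning jersey colour" might involve, the sketch below labels a detected player by comparing the average colour of their pixels with a reference colour per team. It is purely illustrative and does not use OpenCV or the Google Object Detection APIs: the pixel lists stand in for a cropped player region, and `mean_colour`, `nearest_team` and the team colours are hypothetical.

```python
from typing import Dict, List, Tuple

Colour = Tuple[int, int, int]  # (R, G, B), each 0-255


def mean_colour(pixels: List[Colour]) -> Tuple[float, ...]:
    """Average each colour channel over the player's pixels."""
    n = len(pixels)
    return tuple(sum(p[ch] for p in pixels) / n for ch in range(3))


def nearest_team(pixels: List[Colour], team_colours: Dict[str, Colour]) -> str:
    """Return the team whose reference colour is closest to the mean colour."""
    avg = mean_colour(pixels)
    return min(
        team_colours,
        key=lambda team: sum(
            (a - b) ** 2 for a, b in zip(avg, team_colours[team])
        ),
    )


# Hypothetical reference jersey colours for the two sides.
team_colours = {"red": (200, 30, 30), "blue": (30, 30, 200)}

# Pixels sampled from two detected players (made-up values).
player_1 = [(190, 40, 35), (210, 25, 20), (180, 50, 60)]
player_2 = [(40, 35, 180), (20, 30, 220)]

print(nearest_team(player_1, team_colours))  # red
print(nearest_team(player_2, team_colours))  # blue
```

In a real pipeline the pixels would come from the segmented player region in each frame, and the resulting label would be attached to the tracked object.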

Related Solutions

At Dell, we have an extensive video analytics partner ecosystem powered by our solutions. See our partner validation page for details.

NVIDIA and Dell have a joint Intelligent Video Analytics solution that’s been implemented to support loss prevention in Retail, and another with Intellisite using thermal vision technology for a wide range of use cases. We also partner with companies like NTT in the Smart Cities space, and Converge, who bring great consultative and custom approaches to video analytics.

To find out more, you can read about our Edge/IoT and Analytics capabilities and customer success stories.

About the Author: Amir Bahmanyari

Amir Bahmanyari has more than 30 years of diversified experience in the IT industry. Over the last several years his passion and focus have been on Big Data Analytics, HPC, AI, BI, IoT, Cloud and related emerging technologies. During the last five years he has served as a Sr. Solutions Architect delivering Retailer IT solutions involving cloud-based Data Analytics and Platform Architecture, as well as evangelizing the Business Process Management methodology. Amir’s prior experience was as a contributing developer/architect for DARPA and several startups, and in the financial and automotive industries. He greatly enjoys hands-on work, sharing his findings, and helping his customers leverage leading edge technologies to their competitive advantage. Amir lives in the San Francisco Bay area with his wife and two children.