Vision AI

Taking Computer Vision Further With Booz Allen

Computer vision uses advanced artificial intelligence (AI) and machine learning (ML) techniques to identify, categorize, and analyze objects or changes within images and videos. Popular applications include automated object identification and tracking, pattern recognition, and real-time anomaly detection. Increasingly, these systems can also peer into non-visible transmissions, such as the infrared and radio parts of the spectrum. But efficiently scaling and operationalizing computer-vision applications means overcoming critical challenges, from cost-efficiently gathering data and fine-tuning models to achieving mission-critical performance outside of the enterprise cloud.

20%-80% Reduction in Real Data Need

Superior data engineering methods to label, organize, and augment data

Booz Allen’s Replicant™ synthetic data suite and its integrations with commercial off-the-shelf solutions reduce the amount of real data needed to build models, cutting the cost and time of data collection.

10 Times Faster for Label and Train Processes

Accelerated data pipelines

We bring agile model development to the mission by pairing our proprietary rapid training methods with a fast, semi-supervised labeling capability.

4 Times Faster for Typical Single-Shot Detectors

SWaP-Optimized AI

We provide flexibility in running AI from the Internet of Things (IoT) edge to the cloud, helping optimize models for specific chipsets to drive inference.

Multi-Spectral Data Fusion

Optimization to best-fit computer vision methods on different compute hardware

We enable organizations to build, deliver, and re-train a variety of electro-optical/infrared (EO/IR), light detection and ranging (LiDAR), radio frequency (RF), and synthetic aperture radar (SAR) imagery-based classification, detection, and tracking solutions. These solutions can be extended to support new data fusion algorithms within our existing pipelines.

Solution Demos

Vision Assistant

The Booz Allen Vision Assistant automates image and video analysis for accelerated decision making.

Video Transcript

Hi, I'm Britanya Wright-Smith, technical lead of the Vision AI team's Vision Assistant. The Vision Assistant is an AI-powered tool that couples human expertise with leading-edge computer vision models to analyze images and videos faster than ever. No AI model is perfect. That's why we've coupled our cutting-edge AI with the mission expert to create a powerful team. The Vision Assistant is available as part of our Vision AI platform and is built on open-source, industry-proven technology enhanced for our clients' missions.

It solves the primary problem many users face when extracting information from visual data: they have a large amount of video or imagery in which they need to accurately find, label, highlight, or measure objects, and there's just not enough time in the days, weeks, or even months to process it all. This is where the Vision Assistant comes in. Users of the Vision Assistant start by uploading their data to the Vision AI platform. Various data types can be uploaded, including, but not limited to, videos, images, image batches, and even satellite images. In this stage, the user also defines their labels. A label can be anything from a rectangle or bounding box to a polygon or even a skeleton. By default, it is defined as "any." Once this is completed, the user can go directly into the Vision Assistant and start their annotation process.

The user is taken to the job window, where they can view their data. In this stage, the user chooses a frame or image to start annotating. On this frame or image, they can use the labels they created to define an object of interest. They can define this object manually or with the help of AI. The AI model types that can be used include, but are not limited to, tracker models, object detection models, and object segmentation models. Let's walk through a few examples using the Vision Assistant to annotate objects within a video with AI assistance. This user wants to segment a vehicle of interest in this frame of the video. They do this by navigating to the AI Tools icon. Under the Interactors tab, the user chooses the appropriate label and AI model of choice. After clicking Interact, they are prompted to place points where the object is.

Once completed, the AI model predicts a segmentation mask or polygon of that object. This can be done for multiple objects in the same frame. But it doesn't just stop there. The user can also make modifications to the AI predictions to refine their results. This user modifies the segmentation results by placing a point on an area of the vehicle that needed to be added to the mask or polygon. In addition, the user can remove parts of the segmentation by placing a negative point on areas that do not include the object. If the user wants to detect all cars in this frame or image, for example, they do this by navigating to the AI Tools icon. Under the Detectors tab, the user chooses the appropriate label and AI model of choice. After clicking Detect, the model returns its best prediction on all cars within the current frame or image.

While AI tools can be used on individual frames, the Vision Assistant takes this further by incorporating AI tracking, further accelerating the annotation process. A user can do this by navigating to the AI Tools icon and choosing the appropriate model under the Tracker tab. This model will track all annotated objects in the video. The user clicks Track to start the prediction process. Once the model has made tracking predictions, the user is able to navigate frame by frame to see the results. The Vision Assistant can track objects throughout a video in many ways.

These objects can be defined as bounding boxes, masks, polygons, or even skeletons composed of many key points. After viewing the results and making modifications, users are able to export these results in multiple formats for other post-processing needs. Imagine doing in minutes what used to take hours. We've helped clients speed up their annotation process tenfold. With the Vision Assistant, you have fast AI-assisted annotating right at your fingertips, making your imagery workflow smoother and more efficient.
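The transcript does not say how the tracker models carry annotations from one frame to the next, so the following is only a minimal sketch of one common, generic approach: greedy matching of bounding boxes by intersection over union (IoU). The `Box` type, the `propagate_ids` helper, and the 0.3 threshold are hypothetical names and values chosen for illustration; they are not part of the Vision Assistant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical format)."""
    x1: float
    y1: float
    x2: float
    y2: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    iw = min(a.x2, b.x2) - max(a.x1, b.x1)
    ih = min(a.y2, b.y2) - max(a.y1, b.y1)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union


def propagate_ids(tracks, detections, min_iou=0.3):
    """Greedily carry each track ID forward to its best-overlapping detection.

    tracks: dict mapping track ID -> Box in the previous frame.
    detections: list of Box objects found in the current frame.
    Returns a dict mapping track ID -> matched Box in the current frame.
    """
    # Score every (track, detection) pair, best overlap first.
    candidates = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    updated = {}
    used = set()
    for score, tid, j in candidates:
        if score < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if tid in updated or j in used:
            continue  # each track and each detection is matched at most once
        updated[tid] = detections[j]
        used.add(j)
    return updated


# Two annotated objects in the previous frame, three detections in the next.
previous = {1: Box(10, 10, 50, 50), 2: Box(100, 100, 140, 140)}
current = [Box(102, 101, 142, 141), Box(12, 11, 52, 51), Box(300, 300, 320, 320)]
matched = propagate_ids(previous, current)
```

In this example the third detection overlaps no existing track, so it is left unmatched; a reviewer would label it as a new object or discard it, which mirrors the human-in-the-loop review the transcript describes.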

Vision AI Multi-Modal ISAC Sensing

Booz Allen Vision AI Multi-Modal ISAC Sensing integrates multiple data sources for enhanced situational awareness.

Video Transcript

Imagine we are at an intersection that we want to monitor for transportation and pedestrian safety. To meet this need, we will use combined vision and integrated sensing and communications, or ISAC, capabilities to sense pedestrian traffic. Booz Allen's Vision AI is a platform for edge inference that can ingest various data types such as imagery, video, and radio frequency signal data. To run AI on RAN, we are implementing Vision AI output for detection and classification of pedestrians.

For this demo, NVIDIA is implementing their ISAC capability. This system showcases how existing wireless signals can be used to sense and understand the environment, utilizing available 5G radio waves. Camera feeds provide excellent sensing in clear, unobstructed conditions. However, the ISAC sensing capability provides additional information where vision is weaker, such as at night or in low-visibility conditions like rain, snow, and physical obstructions. ISAC additionally provides information about the detected pedestrians, such as distance and location.

For the setup, we have an RU acting as the RF receiver, along with a camera positioned at the same location as the RU. We also use a commercial UE for uplink transmission. With this setup, we are able to sense a pedestrian through two modes, Vision and ISAC, and view the fused results overlaid. As we switch to conditions with reduced visibility, the camera no longer detects the pedestrian, while ISAC continues to detect them. Having multi-modal information produces stronger and more accurate outcomes. By bringing these capabilities to our AI-RAN stack, Booz Allen can improve sensing capabilities at the edge, meeting the challenge to use our resources effectively in complex environments.

Vision AI Edge Fusion

Booz Allen Vision AI Edge Fusion combines EO, IR, and ISAC over 5G for real-time detection and tracking in low visibility.

Video Transcript

Maintaining continuous awareness of the environment is critical for many industries such as public safety and emergency response. Visible light cameras, also known as electro-optical cameras, are valuable sensing tools. But just like human eyes on their own, they have limitations, especially in low-visibility conditions such as fog, obstructions, or darkness. The answer to this is sensor fusion, combining multiple data types to enhance certainty. Utilizing multimodal information produces stronger outcomes compared to relying on a single sensing modality.

The integration of vision, IR, and ISAC capabilities enhances overall scene comprehension, as each modality compensates for the others' limitations. Deploying multiple cameras expands coverage and reduces blind spots, while integrating infrared, or IR, and integrated sensing and communications, or ISAC, significantly enhances detection confidence when visibility is poor.


Consider a scenario where emergency responders need to perform search and rescue at night, in an environment with poor lighting and partially hidden victims. By performing sensor fusion and combining multiple electro-optical, or EO, cameras with IR and ISAC sensors, the system can reliably detect and track the individual, even in challenging conditions.


Booz Allen's Vision AI platform provides artificial intelligence model training and deployment and ingests various data types, or modalities, such as imagery, video, and radio frequency signal data. In this scenario, Vision AI brings AI to RAN through the use of these trained machine learning models for detection and classification of people for search and rescue. Here, EO and IR data are streamed with results displayed in real time.


Edge fusion builds on our existing ISAC capabilities by introducing IR for resiliency. A combined spatial view allows for rapid user understanding of the scene. Here we can sense a person with multiple modes, with fused results displayed as a unified detection. Additionally, sensor fusion combines information from multiple modalities, increasing confidence and providing reliable detections even in low visibility or complex conditions. As we switch to darkened conditions where vision alone no longer produces accurate detections, we see we are still able to detect the person due to the added resiliency of IR.
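How the platform fuses detections is not described in the transcript. As an illustration only, the sketch below shows one simple form of late fusion: detections from different sensors that fall close together are merged, and their confidences are combined with a noisy-OR rule, which assumes the sensors err independently. The `Detection` fields, the shared ground-plane coordinates in metres, and the 2-metre gate are all assumptions made for this example.

```python
from dataclasses import dataclass
from math import hypot, prod


@dataclass
class Detection:
    sensor: str        # e.g. "EO", "IR", or "ISAC"
    x: float           # metres in a shared ground-plane frame (assumed)
    y: float
    confidence: float  # 0..1, as reported by that sensor's model


def fuse(detections, gate_m=2.0):
    """Merge nearby detections from different sensors into unified detections."""
    # Zero-confidence reports carry no information and would break the weighting.
    usable = [d for d in detections if d.confidence > 0]
    clusters = []
    # Strongest detections seed clusters; weaker ones join the first seed in range.
    for det in sorted(usable, key=lambda d: d.confidence, reverse=True):
        for cluster in clusters:
            seed = cluster[0]
            if hypot(det.x - seed.x, det.y - seed.y) <= gate_m:
                cluster.append(det)
                break
        else:
            clusters.append([det])

    fused = []
    for cluster in clusters:
        total = sum(d.confidence for d in cluster)
        fused.append({
            # Confidence-weighted average position.
            "x": sum(d.x * d.confidence for d in cluster) / total,
            "y": sum(d.y * d.confidence for d in cluster) / total,
            # Noisy-OR: probability that at least one sensor is right,
            # under the independence assumption stated above.
            "confidence": 1.0 - prod(1.0 - d.confidence for d in cluster),
            "sensors": sorted(d.sensor for d in cluster),
        })
    return fused


# A darkened scene: the visible-light camera is unsure, IR and ISAC are not.
night = [
    Detection("EO", 9.8, 4.9, 0.2),
    Detection("IR", 10.0, 5.0, 0.7),
    Detection("ISAC", 10.5, 5.2, 0.6),
]
unified = fuse(night)
```

With these made-up numbers the three reports merge into a single detection whose fused confidence (about 0.90) is higher than that of any one sensor, which is the behavior the transcript attributes to sensor fusion. Real sensors are rarely fully independent, so a fielded system would likely need a calibrated combination rule.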

The integration of multiple data modalities such as vision, IR, ISAC, and others showcases how sensor fusion can transform response efforts, ensuring that even in challenging conditions, critical missions are executed with confidence.

Advanced Indications and Warnings Capability

Booz Allen uses computer vision and satellite imagery to deliver actionable geospatial intelligence for faster mission-critical decisions.

Video Transcript

Hi, I'm Sarah and I'm a technical lead on the Vision AI team at Booz Allen. Our Advanced Indications and Warnings capability combines computer vision and remote sensing to understand activities around the world. The Earth's surface spans nearly 200 million square miles, which is far too vast for any team to monitor alone. And in a world where change happens every second, decision makers can't afford to wait for analysis.


Innovations in computer vision and the exponential increase in available satellite imagery have heightened the demand for systems that can detect and contextualize anomalous activity or events of interest. These capabilities are essential for enabling downstream systems to send an alert, react autonomously, or deploy another appropriate response.


Our technology uses state-of-the-art computer vision models with satellite imagery to understand changes in activity at sites of interest. In this scenario, an unannounced launch of an adversary satellite is about to occur. Our pipeline detects visual changes in relevant locations that indicate an upcoming event, sending a notification or triggering an action without the need for continuous human monitoring. The system can then autonomously cue an increase in image collection frequency, resolution, and sensor types to expand and refine our understanding of what's happening on the ground.


Using the same analysis pipeline, we can also see large-scale changes at the launch site over time, giving us important insights such as the construction of a new launch pad or increased capacity along a supply route. Because our analysis has already established a baseline of the vegetation, soil, and pavement in the area, we can detect when roads are built or trees give way to paved surfaces, indicating an increase in the site's launch capacity.
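The baseline comparison described above can be illustrated with a small sketch. It assumes that each image has already been classified into a grid of land-cover labels (vegetation, soil, pavement) and that the two grids are aligned pixel for pixel; the function names and the tiny example grids are hypothetical and do not represent the actual pipeline.

```python
from collections import Counter

VEGETATION, SOIL, PAVEMENT = "vegetation", "soil", "pavement"


def transitions(baseline, current):
    """Count per-pixel class changes between two equally sized label grids."""
    same_shape = len(baseline) == len(current) and all(
        len(b) == len(c) for b, c in zip(baseline, current)
    )
    if not same_shape:
        raise ValueError("grids must have the same shape")
    counts = Counter()
    for row_before, row_after in zip(baseline, current):
        for before, after in zip(row_before, row_after):
            if before != after:
                counts[(before, after)] += 1
    return counts


def newly_paved_fraction(baseline, current):
    """Fraction of all pixels that changed from another class to pavement."""
    total = sum(len(row) for row in baseline)
    if total == 0:
        return 0.0
    changed = transitions(baseline, current)
    # transitions() only records changes, so 'after == PAVEMENT' implies new paving.
    paved = sum(n for (_, after), n in changed.items() if after == PAVEMENT)
    return paved / total


V, S, P = VEGETATION, SOIL, PAVEMENT
baseline_grid = [
    [V, V, S, P],
    [V, V, S, P],
    [V, S, S, P],
]
latest_grid = [
    [V, P, P, P],
    [V, P, S, P],
    [V, S, S, P],
]
fraction = newly_paved_fraction(baseline_grid, latest_grid)
```

Here three of the twelve pixels have become pavement since the baseline, two of them from vegetation. A downstream rule could raise an alert when this fraction crosses a site-specific threshold, though any such threshold would need to be tuned against real imagery.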


These early insights can inform future data collection requirements. At Booz Allen, we're focused on empowering our customers to make informed decisions swiftly and accurately, ensuring mission success. With Advanced Indications and Warnings, customers can be confident that they'll maintain strategic advantage with the ability to process large volumes of satellite imagery using the latest technology and our in-depth analysis capabilities. Learn more about what we're doing with computer vision at BoozAllen.com. 

Comprehensive. Integrated. Flexible.

As a pre-integrated offering that leverages modular, best-of-breed software components and AI technologies, the Vision AI stack reduces the time needed to scale specialized computer vision skill sets for complex use cases while streamlining delivery, reducing risk, and elevating quality. Our lean manufacturing approach to AI engineering, aiSSEMBLE™, provides an efficient foundation for Vision AI through tested AI reference architectures, data delivery and machine learning patterns, and reusable software capabilities that together drive momentum toward system deployment from day one. We deliver and deploy Vision AI flexibly to support any mission across cloud, on-premises, and edge environments.

Pipeline Model Graphic

Booz Allen’s Vision AI technology stack supports a number of additional solutions that offer cutting-edge performance, including:

Replicant™

Booz Allen’s custom-built synthetic data generation framework, Replicant, enables organizations to augment real-world datasets with synthetic imagery in order to address challenges such as privacy concerns, regulatory constraints, scarce financial resources, and accessibility limitations.

Computer Vision Use Cases

Multi-Modal

Integrated Content Analysis

Example: Managing text, audio, and video feeds as a single workstream for automated monitoring and analysis.

Electro-Optical/Infrared (EO/IR)

Rapid Search/Detection and Tracking

Example: Transforming the speed, efficiency, and safety of search-and-rescue missions.

Synthetic Aperture Radar (SAR)

Satellite Imagery Detection

Example: Scanning large-scale images to identify airfields and aircraft in a specified area.

Radio Frequency (RF)

Signal Classification

Example: Analyzing RF-based intelligence signals as images through computer vision algorithms.

Vision AI Services From Booz Allen

Multiple configuration and delivery options provide maximum agility to address your unique and evolving mission needs.

Contact Us

Contact us to learn more about harnessing the power of Vision AI to transform critical missions.
