The Science of Detection

How we combine advanced forensics with next-gen AI to detect the undetectable.

Our Hybrid Detection System

The Brain

Cognitive AI Analysis

Our AI brain thinks like a human expert, understanding context, lighting physics, skin-texture realism, and anatomical consistency. It can explain why an image feels "off" even without obvious giveaways like 6-fingered hands.

Gemini 2.5 Pro

The Eyes

Deep Pixel Forensics

A heavy-duty Vision Transformer specifically trained to catch the newest AI generators (Flux.1, Midjourney v6, DALL-E 3). It analyzes the invisible digital fingerprint left by the generation process.

SigLIP ViT (Flux Killer)

The Detector

Regional Manipulation Analysis

A specialized system that detects partial AI manipulation—catching face swaps, deepfake frames, and AI-enhanced edits by analyzing activation differences across 9 image regions.

Grad-CAM Heatmap

Enhanced Scanner Deep Dive

The Pro tier combines all three detection layers with intelligent fusion logic

1. Upload Image
2. Parallel Analysis (ViT + LLM + Grad-CAM)
3. Hybrid Fusion (intelligent merge logic)
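
The three-step flow above can be sketched in Python. The analyzer functions and their return values here are hypothetical stand-ins, not the production API; the point is step 2's concurrent fan-out across the three layers:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-layer analyzers standing in for the real models.
def vit_score(image):        # "The Eyes": pixel-level forensics
    return {"layer": "vit", "ai_prob": 0.92}

def llm_verdict(image):      # "The Brain": contextual reasoning
    return {"layer": "llm", "ai_prob": 0.15}

def gradcam_regions(image):  # "The Detector": regional activation
    return {"layer": "gradcam", "region_confidence": 0.74}

def parallel_analysis(image):
    """Step 2 of the pipeline: run all three detection layers concurrently."""
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(f, image)
                   for f in (vit_score, llm_verdict, gradcam_regions)]
        return [f.result() for f in futures]  # results kept in submission order

results = parallel_analysis("upload.jpg")
print([r["layer"] for r in results])  # ['vit', 'llm', 'gradcam']
```

The fan-out matters because the LLM call is network-bound while the ViT and Grad-CAM passes are compute-bound, so running them in parallel hides the slowest layer's latency.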

What Each Layer Detects

Manipulation Type | The Brain (LLM) | The Eyes (ViT) | The Detector (Grad-CAM)
Fully AI-Generated Images |  | ★★★ |
AI Face Swaps on Real Photos |  |  | ★★★
Deepfake Video Frames |  |  | ★★★
AI Inpainting / Local Edits |  |  | ★★★
Anatomical Errors (6 fingers, etc.) | ★★★ |  |
Lighting / Physics Inconsistencies | ★★★ |  |

★★★ = Primary detector | ✓ = Can detect | △ = Limited detection | — = Not applicable

Regional Heatmap Analysis

The Grad-CAM system divides each image into a 3×3 grid (9 regions) and calculates AI activation levels for each zone. This reveals:

  • Face vs background differences: real photos show uniform activation across the grid, while manipulated photos show high face activation against a low-activation background
  • Hotspots: Localized areas with >35% activation indicate AI-generated or edited regions
  • Regional spread: A >20% difference between min and max regions suggests partial manipulation
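
These three rules take only a few lines of numpy to check. The grid values below match the 3×3 example on this page; the constant names are ours, but the 35% and 20% cutoffs come directly from the rules above:

```python
import numpy as np

# Per-zone AI activation from the 3x3 Grad-CAM grid (the example values).
grid = np.array([[0.10, 0.25, 0.12],
                 [0.41, 0.28, 0.22],
                 [0.08, 0.15, 0.11]])

HOTSPOT_THRESHOLD = 0.35  # >35% activation marks an AI-generated/edited region
SPREAD_THRESHOLD = 0.20   # >20% min-max gap suggests partial manipulation

hotspots = np.argwhere(grid > HOTSPOT_THRESHOLD)  # (row, col) zone indices
spread = float(grid.max() - grid.min())           # regional spread

print("hotspots:", hotspots.tolist())  # [[1, 0]] -> the 41% middle-left zone
print("spread: %.2f" % spread)         # 0.33, above the 20% threshold
if len(hotspots) and spread > SPREAD_THRESHOLD:
    print("verdict: likely partial manipulation")
```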

Example: 3×3 Regional Grid

10%   25%   12%
41%   28%   22%
 8%   15%   11%

Red = hotspot: the 41% zone suggests face manipulation

Explore Our Scanning Tiers

Frequently Asked Questions

What is noise residual analysis?

Noise residual analysis looks at the tiny statistical patterns hidden in an image's pixels. Different cameras, editing tools, and generative models leave different "fingerprints" in this noise. By studying these patterns, we can tell whether an image behaves more like a natural photograph or a synthetic creation.
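
As a rough illustration (not our production pipeline), a residual can be isolated by subtracting a denoised copy of the image and comparing simple statistics. Everything here is a simplified assumption: the box-blur denoiser, the synthetic Gaussian noise, and the over-smoothed stand-in for a generated image:

```python
import numpy as np

def box_blur(img):
    """3x3 box blur via shifted sums (a toy denoiser for this sketch)."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0

def noise_residual(img):
    """The 'fingerprint': whatever survives after subtracting a denoised copy."""
    return img - box_blur(img)

rng = np.random.default_rng(0)
natural = 0.5 + 0.05 * rng.standard_normal((64, 64))  # camera-like sensor noise
synthetic = box_blur(box_blur(natural))               # unnaturally smooth render

# Natural sensor noise leaves a much stronger residual than a smooth synthesis.
print(noise_residual(natural).std() > noise_residual(synthetic).std())  # True
```

Real systems compare far richer statistics than the standard deviation, but the core idea is the same: the residual of a camera photo and the residual of a generated image are statistically distinguishable.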

What is Grad-CAM and how does it detect manipulation?

Grad-CAM (Gradient-weighted Class Activation Mapping) is a visualization technique that shows which parts of an image the AI model focuses on when making its decision. We've extended this to analyze regional activation patterns—dividing images into 9 zones and comparing AI "confidence" across regions. When one area (like a face) shows significantly higher AI activation than surrounding areas (like the background), it suggests that region was generated or manipulated separately from the rest of the image.
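
The core Grad-CAM computation is compact: average the gradients of the decision score over each feature map to get per-channel weights, then take a ReLU of the weighted sum of the maps. A minimal numpy sketch with made-up activations and gradients (a real system would pull these from the network's last convolutional layer), finishing with the 3×3 regional pooling described above:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins for the last conv layer's outputs on one image:
activations = rng.random((8, 6, 6))          # 8 feature maps, 6x6 each
gradients = rng.standard_normal((8, 6, 6))   # d(score)/d(activation)

# Grad-CAM: channel weight = global average of that channel's gradients,
# heatmap = ReLU of the weighted sum of the feature maps.
weights = gradients.mean(axis=(1, 2))                              # shape (8,)
cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)

if cam.max() > 0:
    cam = cam / cam.max()  # normalise to 0..1

# Pool the heatmap into the 3x3 regional grid (2x2 blocks of the 6x6 map).
regional = cam.reshape(3, 2, 3, 2).mean(axis=(1, 3))

print(cam.shape)       # (6, 6)
print(regional.shape)  # (3, 3)
```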

Can it detect deepfake video frames?

Yes! Deepfake video frames are particularly challenging because the base image is often a real photo that's been "animated" by AI. Our regional analysis system specifically looks for the tell-tale pattern: high AI activation in face regions with low activation in the background—exactly what happens when AI puppets a face onto existing footage.

Can it detect Photoshop edits?

Our system is optimized for AI-generated content and AI-assisted manipulation. Traditional Photoshop edits (cloning, color adjustment, cropping) may not be detected unless they involve AI-powered tools like generative fill or AI-enhanced features. For pure forensic analysis of traditional edits, specialized tools may be more appropriate.

Why Gemini 2.5 Pro?

Gemini 2.5 Pro is Google's advanced reasoning model, capable of understanding complex contexts and subtle clues that older AI models overlook. It combines deep learning with logical thinking, making it ideal for detecting sophisticated fakes by analyzing lighting physics, anatomical consistency, and contextual plausibility.

What happens when the detectors disagree?

Our hybrid fusion logic handles conflicts intelligently. If the ViT says "AI" but the LLM says "Real," the system checks the Grad-CAM regional analysis. If regional manipulation is detected (≥70% confidence), the system overrides to "Likely Manipulated." If no clear pattern emerges, the result is marked "Inconclusive" with a recommendation for manual review: we would rather admit uncertainty than give a falsely confident answer.
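
That decision flow is easy to express directly. A minimal sketch, with our own label strings standing in for the real output schema:

```python
def fuse(vit_label, llm_label, regional_confidence):
    """Hybrid fusion following the rules above.
    Labels are 'ai' or 'real'; regional_confidence is the Grad-CAM score (0..1)."""
    if vit_label == llm_label:
        # Both primary detectors agree: trust the consensus.
        return "AI-Generated" if vit_label == "ai" else "Authentic"
    # Detectors disagree: let the regional analysis break the tie.
    if regional_confidence >= 0.70:
        return "Likely Manipulated"
    return "Inconclusive (manual review recommended)"

print(fuse("ai", "ai", 0.10))    # AI-Generated
print(fuse("ai", "real", 0.74))  # Likely Manipulated
print(fuse("ai", "real", 0.40))  # Inconclusive (manual review recommended)
```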

Ready to Test the Technology?

Upload an image and see how our triple-layer system works in action.

Try a Scan Now