An expert consultation
A missile crossed a blue sky and struck a school building in Iran.
The video had already been seen more than a million times when it reached Hany Farid, one of the world’s leading digital-forensics researchers. His first instinct was that it was probably fake. In the preceding days, he had examined a stream of convincing synthetic videos depicting bombings, fires, executions and plane crashes.
But instinct was no longer enough.
Farid slowed the footage down and examined it frame by frame. The movement of the handheld camera appeared plausible. The shadows were geometrically consistent. The delay between the explosion and the sound matched the physics of distance. He geolocated the scene, stabilized the footage, tracked the missile’s trajectory, estimated its speed and measured its apparent size.
Each check made the video look more credible. Yet even after a full day of analysis and consultation with other experts, Farid did not declare it unquestionably authentic. His conclusion remained narrower: he had found no compelling evidence that it was fake or manipulated. [1]
That distinction matters.
The central problem today
The central problem of modern image verification is no longer simply that convincing fabrications exist. It is that the time required to investigate an image or a video now exceeds the time required for that media to shape public reality.
By the time an expert has checked the shadows, compression, geography, audio, metadata and physical consistency, the video may already have been viewed millions of times. People have formed conclusions. News organizations have repeated it. Political actors have weaponized it. Synthetic variations may already be circulating alongside the original.
The collapse of expert visual intuition
For much of the history of image verification, trained observation occupied a privileged position. An experienced photographer, journalist or forensic analyst knew where to look: at the direction of the light, the structure of reflections, the consistency of perspective, the texture of skin, the anatomy of hands and the physical plausibility of the scene. Expertise meant developing a mental model of how photographs behave—and noticing when an image violated it.
The expert eye has not become worthless. It has become too slow, too isolated and too vulnerable to serve as the complete verification system. What should replace it is not another eye—human or automated—but an auditable process.
Not useless, insufficient
The expert eye has not become useless. It has become insufficient.
Human judgment remains essential for recognizing context, motive, narrative plausibility and inconsistencies that automated systems may overlook. But generative models are increasingly capable of reproducing many of the surface characteristics that once distinguished photographs from synthetic images. At the same time, authentic images are becoming less photographically straightforward. Smartphone cameras routinely combine multiple exposures, reconstruct detail, suppress noise, sharpen edges, blur backgrounds and separate subjects from their surroundings.
The result is a convergence from two directions. Synthetic images increasingly imitate photographic imperfection, while real photographs are increasingly shaped by computational reconstruction. The visual boundary between the two is therefore not simply becoming harder to see; in some cases, it is becoming conceptually less distinct.
Why “spot the artifact” advice is expiring
For several years, public guidance on synthetic images focused on conspicuous mistakes: malformed hands, unreadable text, asymmetric eyes, impossible reflections or unnaturally smooth skin. Such advice was useful when image generators failed in predictable and visible ways.
But visible defects are temporary characteristics of particular generations of technology, not permanent properties of synthetic media. As early as 2024, The Guardian warned that the days of relying on tell-tale signs were “nearly over.” It quoted counter-disinformation specialist Mike Speirs as saying that manual detection was already time-consuming, difficult to scale and losing ground as models improved. [2]
Artifact spotting also creates a second problem: it encourages a sense of false confidence. Once people learn a checklist of supposed AI indicators, they may begin treating their absence as proof of authenticity.
Conversely, genuine photographs containing motion blur, aggressive denoising, imperfect anatomy, unusual lighting or computational artifacts may be wrongly accused of being synthetic. An editorial in The Guardian last May reported that readers increasingly, and wrongly, accuse the paper of publishing “fake images.” [3]
The problem is therefore not merely that old clues disappear. It is that any static list of visual clues will eventually become unreliable when applied without context.
Why automated scores are not sufficient replacements
It may seem natural to replace fallible human vision with an automated detector. Yet a detector that returns a single result—“AI-generated: 87%”—does not solve the epistemic problem. It relocates it.
A probability score does not explain what the system observed, which parts of the image influenced the result, which assumptions were made or where the detector is likely to fail. It may also conceal whether the result was driven by one strong signal or by several weak and contradictory ones.
Detection systems face distribution shift: a model trained on yesterday’s generators may perform poorly on tomorrow’s. Recompression, resizing, screenshots, social-media processing and ordinary editing can remove or distort the traces on which a classifier depends. Hybrid images create a further complication. A photograph may be authentic overall but contain a generated background, an inpainted object or a replaced face. A whole-image score can obscure precisely the localized manipulation an investigator needs to understand.
An unexplained score asks the user to replace trust in their own eyes with trust in another opaque authority. That may be faster, but it is not necessarily more accountable.
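A toy example makes the concealment concrete. The sketch below is not a detector. It assumes four hypothetical signals that each report a value from 0 (camera-like) to 100 (synthetic-like), and it uses a plain average as the whole-image score. The signal names and numbers are invented for illustration.

```python
def whole_image_score(signal_scores):
    """Collapse per-signal scores (0 = camera-like, 100 = synthetic-like) into one number."""
    return sum(signal_scores.values()) / len(signal_scores)


def spread(signal_scores):
    """Gap between the most and least suspicious signal, which the single score hides."""
    return max(signal_scores.values()) - min(signal_scores.values())


# Hypothetical signal names and values, chosen only to make the point.
one_strong = {"frequency": 95, "compression": 35, "patch_recurrence": 35, "boundary": 35}
contradictory = {"frequency": 90, "compression": 10, "patch_recurrence": 80, "boundary": 20}
uniformly_unsure = {"frequency": 50, "compression": 50, "patch_recurrence": 50, "boundary": 50}

for label, scores in [
    ("one strong signal", one_strong),
    ("contradictory signals", contradictory),
    ("uniformly unsure", uniformly_unsure),
]:
    print(f"{label:22s} score={whole_image_score(scores):.0f}%  spread={spread(scores)}")
```

All three cases report the same 50 percent, yet they describe very different evidence. A report that showed only the score would give an investigator no way to tell them apart.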
Provenance versus content-based forensics
A more resilient verification system should begin by distinguishing two complementary questions.
The first is provenance: where did the image come from, which device or application created it, and what happened to it afterwards? Cryptographically signed capture, Content Credentials, C2PA records and documented chains of custody can provide strong evidence about origin and editing history.
The second is content-based forensics: what does the image itself reveal when provenance is missing, incomplete or untrustworthy? Metadata may have been stripped. A platform may not preserve credentials. An image may arrive only as a screenshot. In those cases, analysts must still examine compression behaviour, noise, frequency structure, local texture, repeated patches, boundaries and inconsistencies between regions.
Provenance should generally be preferred when it is credible because it provides positive evidence of origin rather than merely searching for suspicious patterns. But provenance is not a universal solution. Credentials can be absent, unsupported or detached from the file. Metadata can be modified. A valid capture history may establish where an image began without proving that every visible element remained untouched.
The practical answer is therefore not provenance or content analysis. It is provenance first, followed by content-based examination where necessary, with neither treated as infallible.
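The ordering can be sketched as a small planning function. Everything here is an assumption made for illustration: the three provenance states, the `covers_full_edit_history` flag and the step names are hypothetical, and no real credential library is called.

```python
from enum import Enum


class Provenance(Enum):
    VERIFIED = "signed credentials present and valid"
    UNVERIFIED = "credentials present but could not be validated"
    ABSENT = "no usable provenance"


def plan_examination(provenance, covers_full_edit_history):
    """Return the ordered steps for one image; step names are illustrative only."""
    steps = ["record provenance status"]
    if provenance is Provenance.VERIFIED and covers_full_edit_history:
        # Positive evidence of origin leads, but it is logged as evidence, not proof.
        steps.append("report origin and edit history from credentials")
        steps.append("spot-check content for contradictions with that history")
    else:
        # Missing, unverifiable or partial provenance: the pixels must carry the analysis.
        steps.append("run full content-based forensics")
    # Neither route is treated as infallible, so every plan ends with its limits.
    steps.append("report findings with stated limitations")
    return steps


for status, complete in [(Provenance.VERIFIED, True),
                         (Provenance.VERIFIED, False),
                         (Provenance.ABSENT, False)]:
    print(status.name, complete, "->", plan_examination(status, complete))
```

Note that a verified capture record with an incomplete edit history still falls through to full content analysis, which reflects the point that a valid origin does not prove every visible element is untouched.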
Signals as fallible witnesses
A useful model is to treat each forensic signal as a witness.
A witness reports an observation, but also has a level of confidence, a limited field of view and known conditions under which its testimony becomes unreliable. Error-level analysis may identify uneven compression, but ordinary editing and resaving can produce similar patterns. Patch similarity may reveal repeated synthetic structure, but architectural textures, fabrics and decorative backgrounds are naturally repetitive. Smooth frequency behaviour may be suspicious in one context and entirely normal after smartphone denoising in another.
The task is therefore not to count how many signals vote “synthetic.” It is to evaluate what each signal is qualified to say.
That requires context-sensitive weighting. A signal should disclose the observation it made, the evidence supporting it, its known failure modes and the counter-evidence that weakens its interpretation. Signals should also be able to abstain. When the available evidence is insufficient, the honest result is not a forced classification but an explicitly inconclusive finding.
The central technical problem becomes arbitration: how should the system reason when multiple imperfect witnesses disagree?
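One possible shape for such arbitration is sketched below. It is a minimal illustration, not a recommended algorithm: the `Testimony` fields, the reliability weights and the two cut-offs (`min_weight`, `margin`) are all assumptions chosen for readability. A signal abstains by reporting a leaning of `None`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Testimony:
    signal: str
    observation: str
    leaning: Optional[float]  # -1.0 camera-like ... +1.0 synthetic-like; None = abstains
    reliability: float        # 0..1, how far this signal can be trusted in this context
    failure_modes: List[str] = field(default_factory=list)


def arbitrate(testimonies, min_weight=1.0, margin=0.3):
    """Weigh the witnesses; return (finding, balance). Cut-offs are arbitrary assumptions."""
    heard = [t for t in testimonies if t.leaning is not None and t.reliability > 0]
    weight = sum(t.reliability for t in heard)
    if weight < min_weight:
        # Too little trustworthy testimony: do not force a classification.
        return "inconclusive", None
    balance = sum(t.leaning * t.reliability for t in heard) / weight
    if abs(balance) < margin:
        # The witnesses roughly cancel out.
        return "inconclusive", balance
    return ("leans synthetic" if balance > 0 else "leans camera-consistent"), balance


resaved_wall_photo = [
    Testimony("error_level", "uneven compression", 0.6, 0.2,
              ["file was edited and resaved"]),
    Testimony("patch_similarity", "repeated patches", 0.7, 0.3,
              ["brick wall is naturally repetitive"]),
    Testimony("frequency", "too few valid patches to measure", None, 0.0),
]
print(arbitrate(resaved_wall_photo))
```

Both active signals lean toward "synthetic", but each is operating under a known failure mode, so their combined weight stays below the cut-off and the finding is inconclusive. A simple vote count would have reached the opposite result.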
What an auditable system should disclose
An auditable forensic system should reveal substantially more than its conclusion.
It should disclose which analyses were performed, which could not be performed and why. It should also identify the regions examined, the measurements produced and the thresholds applied. It should distinguish direct observations from interpretations built on those observations.
It should also preserve contradictory evidence. If metadata suggests a conventional camera workflow while boundary analysis identifies a localized inconsistency, both facts belong in the report. The system should not silently discard whichever one conflicts with its preferred answer.
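As a rough illustration of these requirements, the sketch below models a report as plain data and serializes it to JSON. The class names, field names and values are hypothetical and do not describe any particular tool's format. The example mirrors the conflict described above: metadata points one way, boundary analysis the other, and both are kept.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class Observation:
    signal: str
    region: str
    finding: str
    measurement: Optional[float] = None  # None when the signal yields no number
    threshold: Optional[float] = None


@dataclass
class Interpretation:
    statement: str
    based_on: List[str]                  # signals whose observations it rests on
    weakened_by: List[str] = field(default_factory=list)


@dataclass
class AuditReport:
    performed: List[str]
    not_performed: Dict[str, str]        # analysis -> reason it could not run
    observations: List[Observation]
    interpretations: List[Interpretation]
    conclusion: str


report = AuditReport(
    performed=["metadata", "boundary"],
    not_performed={"provenance": "no credentials attached to the file"},
    observations=[
        Observation("metadata", "whole file", "consistent with a conventional camera workflow"),
        Observation("boundary", "subject outline", "localized inconsistency", 0.62, 0.5),
    ],
    interpretations=[
        Interpretation("origin looks camera-like", ["metadata"], ["metadata can be modified"]),
        Interpretation("possible localized edit", ["boundary"],
                       ["smartphone subject separation can leave similar traces"]),
    ],
    conclusion="inconclusive: metadata and boundary evidence conflict",
)
print(json.dumps(asdict(report), indent=2))
```

Keeping observations and interpretations in separate lists means a reader can reject an interpretation without losing the measurement it was built on.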
A particularly useful requirement is the counterfactual: What evidence would have changed the conclusion?
Would the interpretation change if a suspicious measurement fell below a threshold? Would credible signed provenance outweigh weak texture anomalies? Would a larger number of valid subject patches increase confidence? Would the absence of a complete metadata history matter less if the image had passed several independent camera-origin checks?
Counterfactual disclosure makes the decision process inspectable. It shows whether the conclusion rests on robust convergence or on one fragile assumption.
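A counterfactual check of the first kind, a measurement crossing its threshold, can be sketched in a few lines. The decision rule used here (two or more measurements above threshold means "suspicious") is a deliberately crude assumption, and the measurement names, values and thresholds are invented.

```python
def verdict(flags):
    """Hypothetical rule: two or more raised flags make the image 'suspicious'."""
    return "suspicious" if sum(flags.values()) >= 2 else "no strong evidence of manipulation"


def counterfactuals(measurements, thresholds):
    """Return the verdict plus each single flag whose reversal would change it."""
    flags = {name: measurements[name] > thresholds[name] for name in thresholds}
    base = verdict(flags)
    decisive = []
    for name in thresholds:
        flipped = dict(flags)
        flipped[name] = not flags[name]
        if verdict(flipped) != base:
            # How far the measurement sits from the line it would have to cross.
            distance = round(abs(measurements[name] - thresholds[name]), 2)
            decisive.append((name, distance))
    return base, decisive


measurements = {"boundary_inconsistency": 0.62, "patch_recurrence": 0.41, "frequency_smoothness": 0.20}
thresholds = {"boundary_inconsistency": 0.50, "patch_recurrence": 0.40, "frequency_smoothness": 0.60}

base, decisive = counterfactuals(measurements, thresholds)
print(base)
for name, distance in decisive:
    print(f"  would change if {name} crossed its threshold (distance {distance})")
```

In this example the verdict depends on a patch-recurrence measurement that sits barely above its threshold. Reporting that distance alongside the conclusion is what lets a reader see that the result rests on a fragile margin.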
How SignalLens approaches the problem
SignalLens is being developed around this evidence-aggregation model rather than as a binary AI detector.
It examines multiple forensic domains, including metadata, compression behaviour, frequency-domain structure, patch recurrence, subject and background characteristics, boundary regions and localized processing differences. Visual overlays, masks and heatmaps help users inspect where particular observations originated.
For every analyzed image, SignalLens creates a local audit trail containing structured measurements, summaries and a reasoning trace. Its purpose is not merely to state that an image appears suspicious or camera-consistent, but to document which signals contributed, which signals conflicted and which limitations reduced confidence.
The system is designed to separate evidence from interpretation. A texture anomaly is first recorded as an observation. Only then is it considered in relation to scene context, metadata, other forensic signals and known failure modes. Where the evidence does not support a reliable conclusion, the result should remain qualified rather than being converted into artificial certainty.
SignalLens is still an early beta, and its analyses should not be treated as proof. Its value lies in making the investigative process visible: allowing users to inspect the evidence, question the reasoning and revisit the conclusion.
Why uncertainty remains part of the answer
The loss of trust in the expert eye does not mean that expertise should be replaced by software. It means expertise must be supported by a broader evidence system.
No individual clue, classifier, provenance standard or forensic module will solve the problem on its own. Some authentic images will look synthetic. Some synthetic images will preserve convincing camera-like traces. Some manipulations will affect only a small region. Some files will arrive without usable provenance. And some cases will remain genuinely unresolved.
Uncertainty is therefore not necessarily a defect in the analysis. It can be evidence that the system is respecting the limits of what the image can support.
What should replace the expert eye is not another unquestionable eye.
It should be an auditable process: provenance where available, content-based forensics where necessary, multiple signals treated as fallible witnesses, explicit arbitration between competing evidence and a clear account of why the conclusion is certain, uncertain or inconclusive.
The goal is not to make human judgment obsolete. It is to give human judgment something more reliable than appearance alone.
Notes
[1] The World’s Leading Deepfake Expert No Longer Trusts His Own Eyes, The New York Times, 14 June 2026.
[2] Alex Hern’s Guardian article reports that familiar visual defects are disappearing as generators improve. It quotes Mike Speirs warning that manual detection is time-consuming, unscalable and running out of time.
[3] Your lying eyes? The perils of picture editing in the age of generative AI, Katharine Viner (editor-in-chief), The Guardian, 23 May 2026.