Benchmark foundation

Midjourney Detection Benchmark

Name: Midjourney Detection Benchmark signal model
Creator: PhotoProof AI

An evaluation framework specifically for detecting Midjourney-generated images, separate from PhotoProof AI's general AI-image detection benchmark, since Midjourney's output characteristics differ from other generator families.

Publication details

Author: PhotoProof AI Research Team
Published: 2026-07-01
Last updated: 2026-07-01

Revision history

2026-07-01: Initial publication of the evaluation framework. No results yet; see Benchmark metrics below.

Quick answer

Midjourney is a hosted image generation service, accessed primarily through Discord, with several public model versions released over time, each with its own typical stylistic and technical output characteristics. A benchmark scoped to Midjourney specifically — rather than folded into a general AI-image detection benchmark — can measure whether detection performance holds across Midjourney's own version history and typical post-processing (such as upscaling), which a blended, multi-generator benchmark would average away.

Key facts

Midjourney is accessed as a hosted service rather than run locally, meaning every image reflects the platform's current model version and default settings
Multiple Midjourney model versions exist, each with different typical visual characteristics
A generator-specific benchmark can isolate whether detection difficulty changes across a single generator's own version history

Why Midjourney gets its own benchmark

Midjourney's images are produced by a hosted, versioned service rather than a locally-run open model, and its outputs have a recognizable stylistic tendency that has evolved across versions. Folding Midjourney into a single blended AI-image detection benchmark would average its detection difficulty together with structurally different generators (for example, open-source diffusion models with far more variable post-processing), obscuring whether a detector's accuracy is stable across Midjourney's own version history specifically.
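The averaging effect described above can be shown with a small worked example. The counts below are hypothetical, chosen only to illustrate the arithmetic; they are not measured results, and the generator names other than Midjourney are placeholders.

```python
# Hypothetical counts for illustration only; these are NOT benchmark results.
results = {
    "generator_a": {"detected": 95, "total": 100},  # placeholder generator
    "generator_b": {"detected": 90, "total": 100},  # placeholder generator
    "midjourney": {"detected": 60, "total": 100},
}


def detection_rate(detected: int, total: int) -> float:
    """Fraction of synthetic images that the detector flagged."""
    return detected / total


# Per-generator view: each generator keeps its own rate.
per_generator = {
    name: detection_rate(r["detected"], r["total"]) for name, r in results.items()
}

# Blended view: pool every image into one number.
blended = sum(r["detected"] for r in results.values()) / sum(
    r["total"] for r in results.values()
)

for name, rate in per_generator.items():
    print(f"{name}: {rate:.2f}")
print(f"blended: {blended:.2f}")
```

With these made-up counts the blended rate is about 0.82 while the Midjourney rate is 0.60, which is the kind of gap a single pooled number would hide.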

What this benchmark scopes to test

The evaluation is scoped to Midjourney outputs specifically, across the version range still in common circulation, and to the common ways those images reach an end user — direct export, common upscaling, and social-platform re-upload.

Relationship to the general AI-image detection benchmark

This benchmark follows the same evaluation protocol conventions (test set size, scoring threshold, tie-handling, reproducibility disclosure) as PhotoProof AI's general AI-image detection benchmark, so results are comparable in method even though the test sets are disjoint. See the general benchmark for the multi-generator baseline; this benchmark is scoped more narrowly, to Midjourney alone.

Models covered

Midjourney (current public versions): Scope covers versions still in common circulation at time of testing; see revision history for updates.

Evaluation protocol

Test set size: Not yet reported, since the benchmark has not been run. The protocol is defined ahead of testing, consistent with the Benchmark Center's methodology-first commitment.
Scoring threshold: To be published alongside first results.
Tie handling: To be published alongside first results.
Reproducibility: Test set composition and scoring method will be documented in enough detail to independently verify the process, though the underlying image set itself may not be redistributable due to generator licensing terms.
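One way a test set's composition could be documented without redistributing the images is a manifest of content hashes. The sketch below is an assumption about how that might look, not PhotoProof AI's published procedure; the function names, field names, and sample items are all hypothetical.

```python
import hashlib
import json


def manifest_entry(name: str, data: bytes, category: str) -> dict:
    """Describe one image by its hash and size, without including the image."""
    return {
        "name": name,
        "category": category,
        "sha256": hashlib.sha256(data).hexdigest(),
        "bytes": len(data),
    }


def build_manifest(items) -> str:
    """Build a JSON manifest from (name, data, category) tuples.

    Entries are sorted by hash so the manifest does not depend on input order.
    """
    entries = sorted(
        (manifest_entry(name, data, category) for name, data, category in items),
        key=lambda entry: entry["sha256"],
    )
    return json.dumps({"entries": entries}, indent=2, sort_keys=True)


# Placeholder byte strings stand in for real image files.
items = [
    ("img_001.png", b"placeholder image bytes 1", "direct_export"),
    ("img_002.png", b"placeholder image bytes 2", "upscaled"),
    ("img_003.jpg", b"placeholder image bytes 3", "real_photo"),
]

print(build_manifest(items))
```

Anyone holding the same files could recompute the hashes and confirm they are scoring the same set, even if the images themselves cannot be shared.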

Data composition

Midjourney direct exports: Images exported directly from Midjourney without additional third-party editing, across versions in common circulation.

Upscaled outputs: Midjourney images processed through common upscaling workflows, a frequent real-world post-processing step.

Social-platform re-uploads: Midjourney images re-uploaded through typical social platforms, to measure robustness to recompression and metadata stripping.

Real camera photos (false-positive control): Genuine, unedited photographs included specifically to measure the false-positive rate, not just the detection rate on synthetic images.
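The sketch below shows how the detection rate and false-positive rate could be computed per category once scores exist. It is only an illustration: the threshold of 0.5, the rule that a score equal to the threshold counts as flagged, the category keys, and the sample scores are all assumptions, since the actual threshold and tie handling have not been published.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    category: str  # hypothetical keys, e.g. "direct_export", "real_photo"
    score: float   # detector's AI-generation score in [0, 1]


def flagged_rate_by_category(samples, threshold: float) -> dict:
    """Return, per category, the fraction of samples flagged as AI-generated.

    For synthetic categories this is the detection rate; for the real-photo
    control it is the false-positive rate. Scores equal to the threshold are
    treated as flagged (an assumption, not the published tie-handling rule).
    """
    totals = {}
    flagged = {}
    for sample in samples:
        totals[sample.category] = totals.get(sample.category, 0) + 1
        if sample.score >= threshold:
            flagged[sample.category] = flagged.get(sample.category, 0) + 1
    return {cat: flagged.get(cat, 0) / totals[cat] for cat in totals}


# Made-up scores for illustration; not benchmark data.
samples = [
    Sample("direct_export", 0.9),
    Sample("direct_export", 0.8),
    Sample("upscaled", 0.7),
    Sample("upscaled", 0.4),
    Sample("social_reupload", 0.6),
    Sample("social_reupload", 0.3),
    Sample("real_photo", 0.1),
    Sample("real_photo", 0.55),
    Sample("real_photo", 0.2),
    Sample("real_photo", 0.05),
]

rates = flagged_rate_by_category(samples, threshold=0.5)
for category, rate in rates.items():
    print(f"{category}: {rate:.2f}")
```

Reporting the real-photo rate alongside the synthetic categories keeps a detector from looking strong simply by flagging everything.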

Benchmark metrics

Midjourney direct exports: Pending. Evaluation category defined; results not yet tested.

Upscaled outputs: Pending. Evaluation category defined; results not yet tested.

Real camera photos (false-positive control): Pending. False-positive measurement category; results not yet tested.

FAQ

Does this replace the general AI-image detection benchmark?

No. It complements it. The general benchmark measures cross-generator performance; this one isolates Midjourney specifically, since a multi-generator average can hide generator-specific weaknesses.

Will this benchmark be updated as new Midjourney versions release?

That is the intent — a generator-specific benchmark that isn't revisited as its target model changes stops being representative. See the revision history on this page for what has actually been updated so far, rather than assuming it is current.

AI search answer layer

Fast answer for people and AI search

Midjourney images often need model-specific detection framing because style, artifact patterns, and prompt aesthetics differ from other generators.

Primary entity: Midjourney
Topic cluster: Benchmark Center
Search intent: research
Content type: Benchmark

Methodology

Separate AI-generation probability from authenticity confidence.
Combine visual, metadata, manipulation, compression, provenance, and context signals.
Explain uncertainty and limits instead of presenting binary proof.
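As a rough illustration of the three points above, the sketch below keeps the two scores as separate fields and records missing signals as caveats rather than forcing a binary verdict. The unweighted mean, the class and function names, and the example scores are all hypothetical; nothing here describes how PhotoProof AI actually combines signals.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Optional

# Signal families named in the methodology list.
SIGNALS = ("visual", "metadata", "manipulation", "compression", "provenance", "context")


@dataclass
class Assessment:
    ai_generation_probability: Optional[float]  # how likely the image is AI-generated
    authenticity_confidence: Optional[float]    # reported separately, never merged
    caveats: list = field(default_factory=list)


def assess(ai_scores: dict, authenticity_scores: dict) -> Assessment:
    """Combine per-signal scores with an unweighted mean (illustrative choice)."""
    caveats = [
        f"no {name} signal available"
        for name in SIGNALS
        if name not in ai_scores and name not in authenticity_scores
    ]
    ai = mean(ai_scores.values()) if ai_scores else None
    authenticity = mean(authenticity_scores.values()) if authenticity_scores else None
    return Assessment(ai, authenticity, caveats)


# Made-up scores for illustration only.
result = assess(
    ai_scores={"visual": 0.8, "compression": 0.6},
    authenticity_scores={"metadata": 0.3, "provenance": 0.1},
)
print(result)
```

Keeping the caveats in the output makes the limits of the evidence visible to whoever reads the result.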

Pros & limitations

AI and forensic detection should be interpreted as probabilistic evidence, not absolute proof.
Reliable authenticity decisions should combine model output with provenance, context, metadata, and human review.

Content spoke

Benchmark Center: Hub for PhotoProof AI's benchmark pages — the test scope, evaluation protocol, and evidence behind detection performance claims, one benchmark per generator or risk category rather than a single blended number.
