RedlineBench evaluates AI models on a single, objective task: reviewing a real residential architectural drawing set and finding coordination errors, specification conflicts, code concerns, and omissions — the kind that generate RFIs and change orders.
Net score after penalties for incorrect findings. Maximum possible: points.
| Rank | Model | Net Score | Score / Max | % Recall | Incorrect (Penalty) | Items Flagged | Cost / run |
|---|---|---|---|---|---|---|---|
Each bar shows that model's score as a percentage of the category's total possible points — making it easy to compare models across categories of different sizes.
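The normalization behind the bars can be sketched in a few lines. This is an illustrative assumption about the computation (simple per-category totals), not the benchmark's published code:

```python
def category_percentage(scores_by_category, max_by_category):
    """Express each category's raw score as a percentage of that
    category's total possible points, so categories of different
    sizes plot on the same 0-100 scale."""
    return {cat: 100.0 * scores_by_category[cat] / max_by_category[cat]
            for cat in max_by_category}
```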
Every scored issue across all models. Hover a cell for the issue description and score.
Four representative issues illustrating the range of difficulty and what AI models tend to find — or miss.
The specifications call for Andersen 400 Series windows, while the window schedule lists Marvin Essential (Ultrex) products. These are completely different manufacturers with different rough-opening dimensions, frame profiles, and weather-sealing details. Every window detail in the A4.10–A4.13 series would need to be reconciled before procurement.
The Level 2 guest suite sits directly above the garage. IRC requires 5/8" Type X gypsum on the garage ceiling when habitable space is above. Floor assembly F-1 — the floor between these levels — specifies standard 1/2" gypsum board. This is a code-required life safety item, not a judgment call. Two models found it vaguely; two missed it entirely.
On enlarged plan A2.01A, keynote tag "13" appears on the drawing with no corresponding entry in the keynote legend. A contractor encountering this tag in the field cannot determine what it refers to. This is concrete and objectively verifiable — yet 6 of 7 models missed it entirely. Only Sonnet flagged something vague in this area (0.5 pts). Models largely skip annotation auditing.
The stair is L-shaped with 10 risers in the east run and 8 risers in the south run — 18 total, yielding compliant ~6-11/16" risers. Four models misread one run as the entire stair and reported impossible or irreconcilable geometry. Confident misreading of a drawing is penalized more harshly than a missed finding, reflecting the real cost of acting on wrong information.
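The riser arithmetic is easy to verify. A minimal sanity check, assuming a total rise of 120-3/8" (a hypothetical figure consistent with the ~6-11/16" risers reported; the actual floor-to-floor dimension is not given in the write-up):

```python
# Hypothetical total rise chosen to match the reported ~6-11/16" risers.
TOTAL_RISE_IN = 120.375

risers = 10 + 8                         # east run + south run = 18 total
riser_height = TOTAL_RISE_IN / risers   # 6.6875" = 6-11/16"

# IRC R311.7.5.1 caps riser height at 7-3/4"; 6-11/16" is compliant.
assert riser_height <= 7.75
```

Counting only one run (10 risers) against the full rise would imply ~12" risers, which is the kind of "impossible geometry" the four models reported.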
A repeatable, controlled evaluation using a single real project drawing set with known issues.
A real residential project drawing set (34 sheets) and specifications (11 pages) were seeded with known issues across 7 categories — including clear errors, coordination failures, code concerns, and ambiguous conditions that warrant professional review. The test set is not published to preserve benchmark integrity.
Each model receives the same prompt: an experienced-architect framing with 7 review perspectives, guidance on how to structure findings, and instructions to rate confidence. Models receive both PDFs simultaneously. No hints about seeded issues are provided.
Each response is scored against the benchmark answer key at the issue level. Every finding is classified as correct, vague, missed, incorrect, or neutral. Neutral observations — reasonable but out-of-scope or debatable — carry no score impact in either direction.
The −1 penalty is intentional: a wrong, confident conclusion is more damaging than a missed finding, and reflects the real cost of acting on misinformation.
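Putting the scoring rules together, a minimal sketch of the per-response tally. The point values for correct and vague findings are assumptions inferred from the write-up (e.g. the 0.5-pt vague flag noted in the missing-keynote example), not a published rubric:

```python
# Assumed per-finding values: only the -1 incorrect penalty is stated
# explicitly; correct/vague weights are illustrative.
VALUE = {"correct": 1.0, "vague": 0.5, "incorrect": -1.0,
         "missed": 0.0, "neutral": 0.0}

def net_score(classifications):
    """Sum per-finding values: correct earns full credit, vague partial,
    incorrect costs -1, and missed/neutral contribute nothing."""
    return sum(VALUE[c] for c in classifications)

def recall(classifications, total_seeded_issues):
    """Fraction of seeded issues a model surfaced (correct or vague)."""
    found = sum(1 for c in classifications if c in ("correct", "vague"))
    return found / total_seeded_issues
```

Note that `net_score` can go down as a model flags more items: two correct findings plus three confident misreadings nets −1, which is the behavior the penalty is designed to produce.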