Recently we discussed Quantitative Coin Grading's digital microscope for coin grading. An ad in the June 2026 Numismatist alerted me to another company, Vardera, that uses computer vision to identify, grade and price coins and other collectibles. An article about coins from their blog is republished here with permission. Vardera positions its system as a high-end tool for third-party grading companies, auction companies and marketplaces.
-Editor
How AI Measures Strike, Luster, and Surfaces
Professional graders agree on a coin's grade only 85-90% of the time within one point. That is not a flaw in your team. It is the inherent ceiling of human visual assessment applied to a 70-point scale where a single grade difference can mean thousands of dollars. Now multiply that challenge by submission volumes that grow faster than you can hire, and the math becomes unavoidable: AI coin condition assessment is not a threat to your standards. It is the only way to maintain them.
This guide breaks down exactly how AI evaluates the same four factors your graders evaluate: strike quality, luster, surface preservation, and eye appeal. You will see the specific techniques, the accuracy data, and how production-grade systems differ from the consumer apps making headlines.
Why Grading Bodies Need Coin Condition Analysis at Scale
Your graders are good. The problem is arithmetic.
Between them, PCGS and NGC have certified over 112 million coins. PCGS alone has graded 42.5 million coins valued at over $36 billion, while NGC has certified more than 70 million. The coin grading services market reached $935.9 million in 2024 and is growing at 9.3% annually. Yet turnaround times for economy-tier submissions still stretch to 45 business days, and an estimated 90% of viable coins in the United States remain ungraded.
The bottleneck is not demand. It is supply of qualified graders:
- Training a competent grader takes years of mentorship under senior specialists
- Each coin requires focused attention across multiple condition factors
- Processing time per coin compounds across thousands of daily submissions
- Turnaround times range from 2 to 65 business days depending on service level
These are not problems you can solve by hiring faster. They are structural constraints that require a new approach, one that augments your existing grading expertise with consistent, scalable AI coin condition assessment.
The Four Pillars of Sheldon Scale AI Grading
The Sheldon Scale, the 70-point system developed in 1948, evaluates every coin across four core factors: strike, luster, surfaces, and eye appeal. Any credible AI grading system must analyze these same factors. Anything less, and you are not grading; you are guessing.
Here is how production-grade AI actually evaluates each one.
Strike Quality Assessment: How AI Measures Detail Definition
Strike quality indicates how well the dies transferred design details to the planchet. A strong strike means every hair strand, feather barb, and letter edge is crisp. A weak strike leaves devices flat or indistinct, particularly on high-relief areas.
AI measures strike quality through a technique called edge detection. Specifically, Sobel edge detection operators quantify the sharpness of transitions between raised devices and flat fields. The system computes continuous gradient values that measure how abruptly brightness changes at device edges, providing a numerical proxy for what your graders assess visually.
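To make the idea concrete, here is a minimal Python sketch of Sobel gradient magnitude on a grayscale patch. It is an illustration only, not Vardera's implementation, which the article does not describe; the function names and the toy 5x5 patches are assumptions made for the example.

```python
import math

# 3x3 Sobel kernels for horizontal and vertical brightness gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the gradient magnitude at each interior pixel.

    `image` is a list of rows of brightness values; border pixels are skipped.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * pixel
                    gy += SOBEL_Y[dy][dx] * pixel
            row.append(math.hypot(gx, gy))
        result.append(row)
    return result


def mean_edge_strength(image):
    """Average gradient magnitude: a single-number sharpness proxy."""
    magnitudes = [m for row in sobel_magnitude(image) for m in row]
    return sum(magnitudes) / len(magnitudes)


# Toy patches (assumed data): an abrupt field-to-device step vs. a gradual ramp.
sharp_edge = [[0, 0, 100, 100, 100] for _ in range(5)]
soft_edge = [[0, 25, 50, 75, 100] for _ in range(5)]

print(mean_edge_strength(sharp_edge), mean_edge_strength(soft_edge))
```

On these toy patches the abrupt step yields a higher mean edge strength than the gradual ramp, which is the direction a sharper strike would push the number.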
More advanced systems use wedge-based spatial analysis, dividing the coin into angular sections (typically eight per side) and evaluating strike sharpness independently across each zone. This matters because strike quality often varies across a single coin: the center may show a full strike while peripheral details remain soft. By analyzing each wedge separately, AI captures the same positional nuances your graders notice when they rotate a coin under magnification.
The AI then compares these measurements against reference images of known full-strike examples for the same type and date. The delta between the submission and the reference produces a strike quality score calibrated to the Sheldon scale.
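The wedge analysis and reference comparison described above might be sketched as follows. This is a hedged illustration under stated assumptions: the input is a per-pixel sharpness map (for example, gradient magnitudes), the per-wedge result is a simple ratio capped at 1.0, and all function names are hypothetical. The article does not say how the result is calibrated to the Sheldon scale, so that step is left out.

```python
import math


def wedge_index(x, y, cx, cy, wedges=8):
    """Map a pixel to one of `wedges` equal angular sectors around (cx, cy)."""
    angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
    return min(int(angle / (2 * math.pi / wedges)), wedges - 1)


def wedge_means(sharpness, wedges=8):
    """Average a per-pixel sharpness map separately within each wedge."""
    height, width = len(sharpness), len(sharpness[0])
    cx, cy = (width - 1) / 2, (height - 1) / 2
    totals = [0.0] * wedges
    counts = [0] * wedges
    for y in range(height):
        for x in range(width):
            if x == cx and y == cy:
                continue  # the exact centre has no defined angle
            w = wedge_index(x, y, cx, cy, wedges)
            totals[w] += sharpness[y][x]
            counts[w] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def strike_ratios(submission, reference, wedges=8):
    """Per-wedge ratio of submission sharpness to reference, capped at 1.0."""
    sub = wedge_means(submission, wedges)
    ref = wedge_means(reference, wedges)
    return [min(s / r, 1.0) if r else 0.0 for s, r in zip(sub, ref)]


# Assumed toy data: the submission is half as sharp in its lower two rows.
# Image rows run top to bottom, so wedges 0-3 cover the lower half here.
reference = [[10.0] * 4 for _ in range(4)]
submission = [[10.0] * 4, [10.0] * 4, [5.0] * 4, [5.0] * 4]
ratios = strike_ratios(submission, reference)
print(ratios)
```

Keeping the ratios per wedge, rather than averaging them into one number, preserves the positional information: a coin that is fully struck at the top but soft at the bottom shows up as such.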
Coin Luster Analysis: Quantifying How Light Interacts with Surfaces
Luster is the first feature to deteriorate as a coin circulates. Original mint luster creates that distinctive cartwheel effect, a pattern of light reflection caused by flow lines in the metal from the striking process. Your graders tilt coins under focused light to evaluate this. AI does something analogous, but with mathematical precision.
HSV color space clustering separates hue, saturation, and value (brightness) components of each pixel in the coin image. For gold coins, researchers have identified five distinct color categories through this clustering, each corresponding to different levels of luster preservation. Perceptually-weighted brightness computation then models how the human eye perceives reflectivity across the coin's surface.
This approach lets AI distinguish between:
- Original mint luster: Consistent, directional reflectivity following flow lines
- Cleaned surfaces: Unnaturally uniform brightness with disrupted flow lines
- Artificial retoning: Color patterns inconsistent with natural oxidation chemistry
- Cabinet friction: Localized luster loss on high points only
The ability to differentiate original luster from artificial treatments is critical for certification integrity. A coin that has been cleaned or artificially retoned may appear bright to a quick visual inspection but fails under algorithmic analysis of its reflectivity patterns.
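The HSV decomposition and perceptually weighted brightness mentioned in this section can be sketched in a few lines of Python. This is an assumption-laden illustration: the Rec. 709 luma weights are one common choice of perceptual weighting (the article does not name the weights used), the two category centroids and their names are invented for the example, and only the assignment step of a clustering method is shown, not the fitting of the clusters.

```python
import colorsys


def to_hsv(rgb):
    """Convert an 8-bit (R, G, B) pixel to (hue, saturation, value) in 0..1."""
    r, g, b = (channel / 255 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)


def perceived_brightness(rgb):
    """Weighted brightness using Rec. 709 luma weights (green counts most)."""
    r, g, b = (channel / 255 for channel in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def hsv_distance(a, b):
    """Euclidean distance in HSV, treating hue as circular."""
    dh = abs(a[0] - b[0])
    dh = min(dh, 1.0 - dh)  # hue wraps around at 1.0
    return (dh ** 2 + (a[1] - b[1]) ** 2 + (a[2] - b[2]) ** 2) ** 0.5


def nearest_category(rgb, centroids):
    """Assign a pixel to the closest category centroid in HSV space."""
    hsv = to_hsv(rgb)
    return min(centroids, key=lambda name: hsv_distance(hsv, centroids[name]))


# Hypothetical centroids for illustration; real ones would be learned from data.
CENTROIDS = {
    "bright_gold": (0.12, 0.60, 0.90),
    "dull_brown": (0.08, 0.50, 0.30),
}

print(nearest_category((230, 190, 80), CENTROIDS))
print(nearest_category((90, 60, 40), CENTROIDS))
```

Separating value (brightness) from hue and saturation is what lets a system ask "how bright?" and "what color?" as independent questions, which matters when a cleaned coin is bright but the wrong color.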
Coin Surface Preservation AI: Detecting Contact Marks, Hairlines, and Damage
Surface preservation is where grading precision has the highest dollar impact. The price difference between an MS-63 and MS-65 on a key-date coin can reach five figures, and the primary differentiator is often surface quality. An MS-64 coin might sell for $100 while the same coin at MS-65 commands $200 or more, according to coin grading guides. On rare dates, those multiples grow dramatically.
AI evaluates surfaces by analyzing high-resolution images for:
- Contact marks (bag marks): Indentations from coin-to-coin contact during storage and transport
- Hairlines: Fine, parallel scratches typically from improper cleaning or wiping
- Rim dings: Damage to the coin's edge from handling or drops
- Environmental damage: Corrosion, spotting, or discoloration from chemical exposure
The system classifies each detected mark by severity (depth, length, width), location (on a primary focal area vs. the field), and quantity. This mirrors the methodology your graders use, but eliminates the variability that comes from fatigue, lighting conditions, or the subjective weight different graders give to different mark types.
For grading bodies processing thousands of submissions daily, this consistency is the point. Two human graders might disagree on whether a particular contact mark on Liberty's cheek drops a Morgan Dollar from MS-65 to MS-64. The AI applies the same threshold every time, creating a reproducible baseline your team can calibrate against.
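As a rough sketch of what classifying marks by severity, location, and quantity and then applying a fixed threshold could look like, consider the Python below. Every name, field, weight, and threshold here is hypothetical; the article does not describe the actual scoring formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurfaceMark:
    kind: str            # e.g. "contact", "hairline", "rim_ding"
    length_mm: float
    width_mm: float
    depth: float         # relative depth, 0 (faint) to 1 (deep)
    in_focal_area: bool  # True if on a primary focal area, False if in the field


# Assumed weighting: a mark on a focal area counts double.
FOCAL_AREA_MULTIPLIER = 2.0


def mark_severity(mark):
    """Combine size, depth, and location into one severity number."""
    severity = mark.length_mm * mark.width_mm * (1.0 + mark.depth)
    if mark.in_focal_area:
        severity *= FOCAL_AREA_MULTIPLIER
    return severity


def surface_penalty(marks):
    """Total penalty grows with both the number and the severity of marks."""
    return sum(mark_severity(m) for m in marks)


def exceeds_threshold(marks, threshold):
    """The same threshold is applied to every coin, every time."""
    return surface_penalty(marks) > threshold


cheek_mark = SurfaceMark("contact", 2.0, 0.5, 0.5, in_focal_area=True)
field_mark = SurfaceMark("contact", 2.0, 0.5, 0.5, in_focal_area=False)

print(mark_severity(cheek_mark), mark_severity(field_mark))
print(exceeds_threshold([cheek_mark, field_mark], threshold=4.0))
```

The point of the sketch is reproducibility: the same two marks always produce the same penalty, so any disagreement with a human grader can be traced to a specific weight or threshold.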
Eye Appeal: The Subjective Factor AI Is Learning to Quantify
Eye appeal is the factor most graders call subjective, and the one that makes skeptics most doubtful about AI. It encompasses the overall visual impression: color, toning, balance, and the indefinable quality that makes certain coins stand out at the same technical grade.
AI approaches eye appeal as a composite score derived from the three technical factors above, plus toning analysis. Color algorithms evaluate whether toning patterns are aesthetically desirable (rainbow crescent toning, for example, typically adds appeal) versus detracting (dark, splotchy oxidation).
The key to AI's ability to quantify eye appeal is training data volume. When a model trains on 200M+ unique items, it does not learn a single grader's preferences. It encodes the collective judgment of thousands of expert evaluations, effectively averaging out individual biases while preserving the consensus on what constitutes positive eye appeal for each coin type.
This is where population data and market intelligence strengthen the model further. Coins with high eye appeal consistently trade at premiums above their technical grade. By incorporating pricing data, AI learns the market's implicit definition of eye appeal, measured in dollars, not opinions.
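The composite eye-appeal score described above could, in its simplest form, be a weighted average of the component scores. The sketch below assumes scores normalized to the range 0 to 1 and equal weights; both are assumptions for illustration, and the pricing-data signal the article mentions is not modeled.

```python
# Hypothetical equal weights; a production model would learn these from data.
EYE_APPEAL_WEIGHTS = {
    "strike": 0.25,
    "luster": 0.25,
    "surfaces": 0.25,
    "toning": 0.25,
}


def eye_appeal_score(scores, weights=EYE_APPEAL_WEIGHTS):
    """Weighted average of component scores, each expected in 0..1."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(weights[name] * scores[name] for name in weights) / total_weight


example = {"strike": 0.8, "luster": 0.6, "surfaces": 0.4, "toning": 1.0}
print(eye_appeal_score(example))
```

Dividing by the total weight keeps the result in the same 0 to 1 range even if the weights are later changed and no longer sum to one.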
How Automated Coin Grading Accuracy Compares to Human Experts
The central question for any grading body evaluating AI: is it accurate enough?
The data is encouraging. Published academic studies of automated coin grading report:
- 86% exact-grade match using feature-engineered artificial neural networks, in a peer-reviewed study of Saint-Gaudens Double Eagles
- 98% accuracy within three grades on the Sheldon scale in that same study
- 91.3% classification accuracy across five coin types in a separate academic study
For context, professional human graders agree within one point of each other only 85-90% of the time. That means the best automated systems already match or exceed inter-grader consistency on specific coin types.
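Because "exact-grade match" and "within N grades" are different measurements, it helps to see how each is computed. The Python sketch below uses made-up grades purely to illustrate the arithmetic; the numbers are not from any of the studies cited.

```python
def agreement_rate(predicted, reference, tolerance=0):
    """Fraction of coins where the two grades differ by at most `tolerance`."""
    if not predicted or len(predicted) != len(reference):
        raise ValueError("grade lists must be non-empty and of equal length")
    hits = sum(1 for p, r in zip(predicted, reference) if abs(p - r) <= tolerance)
    return hits / len(predicted)


# Illustrative grades on the 70-point scale (assumed data).
model_grades = [65, 64, 63, 66, 62]
expert_grades = [65, 63, 63, 63, 62]

print(agreement_rate(model_grades, expert_grades))               # exact match
print(agreement_rate(model_grades, expert_grades, tolerance=1))  # within one point
print(agreement_rate(model_grades, expert_grades, tolerance=3))  # within three
```

The same set of grades scores very differently depending on the tolerance, so accuracy figures are only comparable when they use the same one.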
Production-grade systems push further. Vardera Labs' deep category models achieve 97-99% authentication accuracy by analyzing mint marks, casting variances, and edge cases that generic image recognition misses. Their coin model is live and in production, processing submissions in seconds rather than the weeks or months of traditional turnaround.
The 97-99% figure represents authentication accuracy (genuine vs. counterfeit, correct attribution). Condition grading on the full 70-point Sheldon scale is a harder problem with more granularity. But the trajectory is clear: AI is approaching and, in some metrics, exceeding human consistency.
AI Coin Grading at Production Scale: From Consumer App to API Infrastructure
Most AI coin grading tools today are consumer apps: upload a photo, get a grade estimate, pay a monthly subscription. These tools serve individual collectors well, but they are not built for grading body operations.
The difference between a consumer app and production-grade API infrastructure is the difference between a calculator and an accounting system:
- Throughput: Consumer apps process one coin at a time. API infrastructure handles thousands per hour with consistent latency.
- Integration: Apps require manual photo uploads. APIs plug directly into existing submission intake workflows, imaging systems, and grading management software.
- Data feedback: Apps are static. Production systems implement a compounding data flywheel where every graded coin improves model accuracy for the next one.
- Accuracy depth: Apps use generic image models. Production-grade deep category models are purpose-built for numismatics, trained on 200M+ items with domain-specific feature engineering.
Vardera Labs built the world's first coin category model as API infrastructure, the same integration pattern as Stripe for payments or Twilio for communications. Their system is live and in production, with 6+ category models on the roadmap covering coins, comics, cards, and more.
Augmenting Human Graders, Not Replacing Them
Here is the concern no grading body executive will say publicly but every one of them is thinking: if AI can grade coins, why do we need human graders at all?
The answer is that you need both, and AI makes your human graders more valuable, not less.
Consider the workflow AI enables:
- Triage: AI processes every submission and flags confidence levels. Coins where the model assigns high confidence (clear-cut grades with no anomalies) move through an expedited path. Coins with lower confidence, those with potential cleaning, questionable toning, or grade-borderline characteristics, route to your senior graders.
- Pre-screening: Before a human grader touches a coin, they see the AI's assessment with specific rationale: strike quality score, luster analysis, surface mark inventory. This front-loads the information gathering that currently eats into grading time.
- Consistency baseline: AI provides a reproducible starting point. Your graders can calibrate against it, and you can identify when individual graders drift from consensus over time.
- Edge case focus: Your most experienced graders spend their time where expertise matters most. Problems like cleaned coins, environmental damage, and altered surfaces require human judgment informed by decades of experience. AI handles the volume so your experts handle the complexity.
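The triage and consistency-baseline steps in the workflow above can be sketched as two small Python functions. The 0.9 confidence threshold, the queue names, and the flag strings are all hypothetical choices for illustration, not values from the article.

```python
from statistics import mean


def route_submission(confidence, flags, threshold=0.9):
    """Send clear-cut coins down the expedited path, everything else to experts.

    `confidence` is the model's confidence in its grade (0..1); `flags` lists
    any anomalies it raised, such as possible cleaning or questionable toning.
    """
    if flags or confidence < threshold:
        return "senior_review"
    return "expedited"


def grader_drift(grader_grades, baseline_grades):
    """Mean signed difference between a grader and the AI baseline.

    Positive means the grader runs high relative to the baseline, negative low.
    """
    if not grader_grades or len(grader_grades) != len(baseline_grades):
        raise ValueError("grade lists must be non-empty and of equal length")
    return mean([g - b for g, b in zip(grader_grades, baseline_grades)])


print(route_submission(0.97, []))
print(route_submission(0.97, ["possible_cleaning"]))
print(route_submission(0.70, []))
print(grader_drift([65, 64, 66], [64, 64, 65]))
```

Note that any flag sends a coin to senior review regardless of confidence, which matches the article's point that problem coins are where human judgment is most needed.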
The result is not fewer graders. It is each grader producing higher-quality output on the coins that actually need their expertise, while the overall operation processes more submissions with shorter turnaround times and tighter consistency.
Your grade is your product. AI does not change what you grade. It changes how fast and how consistently you can deliver it.
See the full article online for an additional section of Frequently Asked Questions. The company also works with other collectibles including sports cards and comic books.
Questions or press inquiries may be sent to Vardera's Tommy Barth, VP, Operations, at tommy@vardera.com.
-Editor
To read the complete article, see:
Strike Quality, Luster, and Surfaces: How AI Assesses Coin Condition at Scale
(https://www.vardera.com/blog/ai-coin-condition-assessment)
To visit the Vardera website, see:
https://www.vardera.com/
To read earlier E-Sylum articles, see:
ON COMPUTER COIN GRADING
(https://www.coinbooks.org/v21/esylum_v21n08a08.html)
DIGITAL COIN GRADING MICROSCOPE LAUNCHES
(https://www.coinbooks.org/v29/esylum_v29n05a08.html)
MORE ON QUANTITATIVE COLLECTORS GROUP
(https://www.coinbooks.org/v29/esylum_v29n06a17.html)
NOTES FROM E-SYLUM READERS: FEBRUARY 15, 2026: Quantitative Coin Grading Indiegogo Campaign
(https://www.coinbooks.org/v29/esylum_v29n07a12.html)
Wayne Homren, Editor
The Numismatic Bibliomania Society is a non-profit organization
promoting numismatic literature. See our web site at coinbooks.org.
To submit items for publication in The E-Sylum, write to the Editor
at this address: whomren@gmail.com
Copyright © 1998 - 2025 The Numismatic Bibliomania Society (NBS)
All Rights Reserved.