In addition to studying the history of the period in which a hoard was deposited, numismatists can learn quite a lot from the coins themselves, and artificial intelligence is helping ease the chore. A new paper published on PCI Archaeology describes how AI can assist with sifting through a large trove of hoard data for new insights.
Thanks to Ted Banning for passing this along.
-Editor
In the project "Classifications and Representations for Networks: From types and characteristics to linked open data for Celtic coinages" (ClaReNet) we had access to image data for one of the largest Celtic coin hoards ever found: Le Câtillon II, with nearly 70,000 coins. Our aim was not to develop new processes, but rather to demonstrate how existing tools can be used to support the numismatic task of processing and analysing large complexes of coins, thus validating the enormous potential of IT-based methods. Our pipeline included the following steps:
- Pre-sorting and size estimation based on Object Detection - Since the focus of our work was the image of the coin itself, the coins needed to be detected and cropped from the photos. At the same time, the size of each coin could be calculated by detecting the scale bar in the photo, allowing a first sorting pass to identify the staters on the basis of their being larger than the quarters and petits billons.
Figure 4 - Optimal Prediction of the model. Calculated values: height: 2.321cm, width: 2.194cm. (Photo: Jersey Heritage. Graphic: C. Deligio, Big Data Lab)
- Further pre-sorting based on Unsupervised learning - The intention was to use only the images as input, so that initially we employed methods that do not require any further domain knowledge. This step of the pipeline was repeated in order to identify groups of high similarity that corresponded to the expert's classification, while removing corroded and worn coins to eliminate any bias they might cause.
- Classifying the coins by class based on Supervised learning - The results of step two were checked against the classification of the expert, and the groups that corresponded to the expert's classification were then used to train a model to assign the coins from the other batches to the numismatic classes. The domain expert was also involved in the process.
- Implementing a die study - By checking the results against the die study already carried out by the numismatists for one of the classes, we compared different approaches (unsupervised, supervised and feature detection) for their effectiveness in the task. Finally, we implemented a system based on Orange Data Mining in order to support the expert in their ongoing work on the remaining five classes to accelerate the process.
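As a miniature illustration of the die-comparison step, one simple baseline (a hypothetical sketch, not the project's actual pipeline, which compared unsupervised, supervised and feature-detection approaches) is to score aligned, equally sized coin images by normalized cross-correlation and greedily group likely die matches:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized grayscale images."""
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float((a * b).mean())

def group_by_die(images, threshold=0.8):
    """Greedy grouping: each image joins the first group whose
    representative it matches above the similarity threshold."""
    groups = []  # list of (representative_image, [member indices])
    for i, img in enumerate(images):
        for rep, members in groups:
            if ncc(rep, img) > threshold:
                members.append(i)
                break
        else:
            groups.append((img, [i]))
    return [members for _, members in groups]

# Toy demo: two noisy copies of one pattern (same "die") and one distinct pattern
rng = np.random.default_rng(0)
base = rng.normal(size=(32, 32))
other = rng.normal(size=(32, 32))
imgs = [base, base + 0.1 * rng.normal(size=(32, 32)), other]
print(group_by_die(imgs))  # → [[0, 1], [2]]
```

Real die studies need far more robust matching (alignment, local features), but the grouping logic is the same: candidate die links are proposed automatically and confirmed by the expert.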
For steps one to three we used a divide and conquer approach, meaning that we divided the dataset step by step in order to facilitate better analysis at the next step. The first division into large/small and broken coins (second row) is the result of object detection and the calculation of the approximate size of the coins. The second division into high and low quality is produced by the unsupervised method; the last is the result of applying size identification (step one) together with the quality split of step two.
Figure 2 - Using the divide and conquer methodology, the data set could be divided step by step into more easily analysable parts (Graphic: C. Deligio, Big Data Lab).
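In code, this divide-and-conquer splitting can be sketched as successive filters over per-coin records; all field names and thresholds below are illustrative assumptions, not taken from the project:

```python
# Hypothetical per-coin records produced by earlier pipeline steps.
coins = [
    {"id": 1, "diameter_cm": 2.3, "broken": False, "quality": "high"},
    {"id": 2, "diameter_cm": 1.1, "broken": False, "quality": "high"},
    {"id": 3, "diameter_cm": 2.2, "broken": False, "quality": "low"},
    {"id": 4, "diameter_cm": None, "broken": True,  "quality": "low"},
]

# Division 1: object detection separates broken coins and splits by size.
broken = [c for c in coins if c["broken"]]
large  = [c for c in coins if not c["broken"] and c["diameter_cm"] >= 2.0]
small  = [c for c in coins if not c["broken"] and c["diameter_cm"] < 2.0]

# Division 2: the unsupervised step splits the large coins (staters) by quality.
high = [c for c in large if c["quality"] == "high"]
low  = [c for c in large if c["quality"] == "low"]

print([c["id"] for c in high])  # well-preserved staters for further analysis
```

Each split shrinks and homogenizes the subset passed to the next, more expensive stage.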
The process from digitisation of a hoard as images to an actual die study is lengthy and work-intensive. We started this research by treating the dataset as if it were a case study of a new find, with no information about the coins available at the outset. With the first step of object detection it was possible to automatically crop the images and, using the scale bar, to calculate the size of the coins and to carry out pre-sorting, in this way helping identify the staters, which were the coins with which we wished to work.
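The size calculation from the scale bar reduces to a simple unit conversion once the detector has returned pixel measurements. A minimal sketch, assuming a hypothetical function name, a 1 cm scale bar, and detector outputs in pixels:

```python
def pixels_to_cm(bbox_px, scale_bar_px, scale_bar_cm=1.0):
    """Convert a detected coin bounding box from pixels to centimetres,
    using the detected scale bar as the reference length.
    bbox_px      -- (width_px, height_px) of the coin's bounding box
    scale_bar_px -- measured pixel length of the scale bar
    scale_bar_cm -- real-world length the bar represents (assumed 1 cm)
    """
    px_per_cm = scale_bar_px / scale_bar_cm
    return tuple(round(v / px_per_cm, 3) for v in bbox_px)

# A 232 x 219 px coin box next to a 100 px scale bar gives roughly
# 2.32 x 2.19 cm, in line with the stater dimensions quoted for Figure 4.
print(pixels_to_cm((232, 219), scale_bar_px=100))  # → (2.32, 2.19)
```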
The IT methods used are standard and in our case required little effort or training.
The next step of unsupervised learning still does not need any input from numismatists as domain experts, but the resulting clusters have to be evaluated manually. A first pass with 100 clusters allowed us to exclude about 12% of the coins, which were identified as unsuitable, badly preserved pieces. The data set was further narrowed down by taking only staters with a calculated diameter of 22mm ±2mm (the standard size as defined by the numismatist), and the unsupervised method was repeated to produce 25 clusters. Since we had received a spreadsheet from the numismatic expert providing his classification of the coins, it was possible to verify a) that this dataset did indeed contain only staters, and b) that 18 of the 25 clusters mainly contained coins of the same class. The best result was 99.7% (cluster 20: only two of 772 coins were not of the same class). In a situation where the domain expert has not yet classified the coins, it is clear that this presorting into clusters would significantly speed up his task (as is also the case for the method to support a die study that we developed). However, generating a ground truth is mandatory for training a supervised classification model.
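The cluster-purity check described here is straightforward to compute once the expert's spreadsheet is joined to the cluster assignments. A toy sketch (hypothetical data layout) that reproduces the reported best figure for cluster 20:

```python
from collections import Counter

def cluster_purity(clusters):
    """For each cluster, the share of coins belonging to its majority class.
    clusters: dict mapping cluster id -> list of expert class labels."""
    purities = {}
    for cluster_id, classes in clusters.items():
        majority_count = Counter(classes).most_common(1)[0][1]
        purities[cluster_id] = majority_count / len(classes)
    return purities

# Mirrors the reported best cluster: 770 of 772 coins share one class.
clusters = {20: ["I"] * 770 + ["II"] * 2}
print(round(cluster_purity(clusters)[20], 3))  # → 0.997
```

High-purity clusters like this one can be accepted as training data for the supervised step almost wholesale; low-purity clusters are the ones that repay expert attention.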
Re-evaluating the data with a supervised method led to a significant improvement in data quality. It also showed that experts and AI had similar problems, in particular in distinguishing classes IV and V, which may indicate that the border between the classes is not sharp. Our experience showed that in such cases it is necessary to involve the domain expert in the evaluation process. Specific modifications can be made to influence the areas or features on which the AI concentrates in order to create a model that is closer to the criteria employed by the numismatic expert, for example focusing on specific areas (e.g. the nose). But since this was not within the scope of our project, we only briefly looked into this direction.
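To illustrate the supervised step, here is a minimal nearest-centroid classifier over toy feature vectors; the project worked with trained image models, so everything below (feature dimensions, class labels, data) is a stand-in for the general idea of learning class prototypes from expert-verified clusters and assigning the remaining coins to them:

```python
import numpy as np

def fit_centroids(X, y):
    """Learn one prototype (mean feature vector) per numismatic class
    from the expert-verified training coins."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict(centroids, X):
    """Assign each coin to the class with the nearest prototype."""
    classes = list(centroids)
    dists = np.stack([np.linalg.norm(X - centroids[c], axis=1) for c in classes])
    return [classes[i] for i in dists.argmin(axis=0)]

rng = np.random.default_rng(1)
# Two well-separated toy "classes" of 8-dimensional image features
X_train = np.vstack([rng.normal(0, 0.1, (20, 8)), rng.normal(1, 0.1, (20, 8))])
y_train = np.array(["IV"] * 20 + ["V"] * 20)
model = fit_centroids(X_train, y_train)
print(predict(model, np.array([[0.05] * 8, [0.95] * 8])))  # → ['IV', 'V']
```

With real coin features the IV/V boundary would be far less clean than in this toy data, which is exactly the fuzziness the paragraph above describes.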
The results of our work clearly demonstrate that semi-automatic processes can be extremely helpful in sorting and classifying large complexes of coins, and can even support a work-intensive and time-consuming die study. We believe that the system we built around Orange Data Mining will speed up the die study for the other five classes of staters in the Le Câtillon hoard. Furthermore, our experience has shown that a human-centric approach that involves close cooperation with domain (numismatic) experts can be a good way to increase trust and acceptance of IT methods and achieve a high success rate.
To read the complete article, see:
Supporting the analysis of a large coin hoard with AI-based methods
(https://zenodo.org/records/11187474)
To read the earlier E-Sylum articles, see:
JERSEY GROUVILLE HOARD BEING DISMANTLED
(https://www.coinbooks.org/esylum_v17n35a32.html)
JERSEY GROUVILLE HOARD WAS TWO COLLECTIONS
(https://www.coinbooks.org/v22/esylum_v22n12a24.html)
Wayne Homren, Editor
The Numismatic Bibliomania Society is a non-profit organization
promoting numismatic literature. See our web site at coinbooks.org.
To submit items for publication in The E-Sylum, write to the Editor
at this address: whomren@gmail.com
To subscribe go to: https://my.binhost.com/lists/listinfo/esylum
Copyright © 1998 - 2023 The Numismatic Bibliomania Society (NBS)
All Rights Reserved.