Restoring and attributing ancient texts using deep neural networks
Introduction
The research from the article shows how models such as
Ithaca can unlock the cooperative potential between artificial intelligence and
historians, impacting the way that we study and write about one of the most
important periods in human history.
Here we present Ithaca, a deep neural network for the
textual restoration, geographical attribution and chronological attribution of
ancient Greek inscriptions. Ithaca is designed to assist and expand the
historian’s workflow.
Fig. 1 | Restoration of a damaged inscription. This inscription
(Inscriptiones Graecae, volume 1, edition 3, document 4, face B (IG I3 4B))
records a decree concerning the Acropolis of Athens and dates to 485/4 bc.
Marsyas, Epigraphic Museum, WikiMedia CC BY 2.5.
While Ithaca alone achieves 62% accuracy when restoring
damaged texts, the use of Ithaca by historians improved their accuracy from 25%
to 72%, confirming the synergistic effect of this research tool. Ithaca can
attribute inscriptions to their original location with an accuracy of 71% and
can date them to within 30 years of their ground-truth ranges, redating key
texts of Classical Athens and contributing to topical debates in ancient
history.
Problem (Why?)
Ancient history relies on disciplines such as epigraphy, the study of
inscribed texts known as inscriptions, for evidence of the thought, language,
society and history of past civilizations. However, over the centuries, many
inscriptions have been damaged to the point of illegibility or transported far
from their original location, and their date of writing is steeped in
uncertainty.
Specialist epigraphers must then reconstruct the missing
text, a process known as text restoration, and establish the original place and
date of writing, tasks known as geographical attribution and chronological
attribution, respectively. These three tasks are crucial steps towards placing
an inscription both in history and within the world of the people who wrote and
read it.
These tasks are non-trivial, and traditional methods in
epigraphy involve highly complex, time-consuming and specialized workflows.
When restoring damaged inscriptions, epigraphers rely on accessing vast
repositories of information to find textual and contextual parallels. These
repositories primarily consist of a researcher’s mnemonic repertoire of
parallels and, more recently, of digital corpora for performing ‘string
matching’ searches. However, differences in the search query can exclude
or obfuscate relevant results, and it is almost impossible to estimate the true
probability distribution of possible restorations. Attributing an inscription
is equally problematic: if it was moved, or if useful internal dating elements
are missing, historians must find alternative criteria to attribute the place
and date of writing (such as letterforms or dialects). Inevitably, a high level
of generalization is often involved (chronological attribution intervals can be
very long).
Proposed Solution (What?)
Deep learning for epigraphy
Here we overcome the constraints of current epigraphic
methods by using state-of-the-art machine learning research. Inspired by
biological neural networks, deep neural networks can discover and harness
intricate statistical patterns in vast quantities of data. We focus on Greek
epigraphy for two main reasons:
· First, the variability of contents and context of the Greek epigraphic
record, which makes it an excellent challenge for language processing;
· and second, the availability of digitized corpora for ancient Greek, an
essential resource for training machine learning models.
Methodology (How?)
We developed a pipeline to retrieve the unprocessed Packard Humanities
Institute (PHI) dataset, which consists of the transcribed texts of 178,551
inscriptions. This process required rendering the text machine-actionable,
normalizing epigraphic notations, reducing noise and efficiently handling all
irregularities.
Each PHI inscription is assigned a unique numerical ID, and
is labelled with metadata relating to the place and time of writing.
PHI lists a total of 84 ancient regions, whereas the chronological
information is noted in a wide variety of formats, varying from historical eras
to precise year intervals, written in several languages, lacking in
standardized notation and often using fuzzy wording.
After crafting an extended ruleset to process and filter the
data (Methods), the resulting dataset I.PHI is to our knowledge the largest
multitask dataset of machine-actionable epigraphical text, containing 78,608
inscriptions.
Ithaca is a model for epigraphic tasks
To begin, contextual information is captured more
comprehensively by representing the inputs as words; however, parts of words
could have been lost over the centuries. To address this challenge, we process
the input text as character and word representations jointly, representing
damaged, missing or unknown words with a special symbol ‘[unk]’.
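The joint representation described above can be sketched in a few lines. This is only an illustration, not Ithaca's actual preprocessing: the function name `char_word_inputs` is hypothetical, and the use of a hyphen to mark a lost character is an assumption made for the example. Each character is paired with the word it belongs to, and any word that contains a lost character is replaced by ‘[unk]’.

```python
UNK = "[unk]"   # special symbol for damaged, missing or unknown words
MISSING = "-"   # assumed marker for a lost character (illustrative only)


def char_word_inputs(text):
    """Return a list of (character, word) pairs, one per character.

    Each character is aligned with the word it sits in. Words that contain
    a missing character are replaced by the [unk] symbol, while their
    surviving characters are kept unchanged.
    """
    pairs = []
    for i, word in enumerate(text.split(" ")):
        if i > 0:
            # Keep the space as its own character, with no word attached.
            pairs.append((" ", ""))
        word_token = UNK if MISSING in word else word
        for ch in word:
            pairs.append((ch, word_token))
    return pairs


# The second word has two lost characters, so its word token becomes [unk].
for ch, word in char_word_inputs("δημος αθην--ων"):
    print(repr(ch), word)
```

The point of the pairing is that a partly lost word still contributes its surviving characters, even though it can no longer be looked up as a whole word.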
Next, to enable large-scale processing, Ithaca’s
torso is based on a neural network architecture called the transformer, which
uses an attention mechanism to weigh the influence of different parts of the
input (such as characters or words) on the model’s decision-making process. The
attention mechanism is informed of the position of each part of the input text
by concatenating the input character and word representations with their
sequential positional information.
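To make the idea of attention over concatenated inputs concrete, here is a toy sketch in plain Python. It is not Ithaca's architecture: it uses a single attention head, made-up two-dimensional embeddings and no learned projections (queries, keys and values are simply the inputs themselves), and the function names are hypothetical. It only shows how character, word and positional vectors can be concatenated per position and how attention weights are then computed between positions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def build_inputs(char_vecs, word_vecs, pos_vecs):
    """Concatenate character, word and positional vectors per position."""
    return [c + w + p for c, w, p in zip(char_vecs, word_vecs, pos_vecs)]


def self_attention(inputs):
    """Single-head scaled dot-product self-attention.

    Queries, keys and values are the inputs themselves (no learned
    projections), which keeps the sketch short.
    """
    dim = len(inputs[0])
    outputs, weights = [], []
    for query in inputs:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in inputs]
        attn = softmax(scores)
        weights.append(attn)
        outputs.append([sum(a * value[j] for a, value in zip(attn, inputs))
                        for j in range(dim)])
    return outputs, weights


# Toy 2-dimensional embeddings for a three-character input (made-up numbers).
chars = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
words = [[0.5, 0.5], [0.5, 0.5], [0.0, 1.0]]
positions = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0]]

x = build_inputs(chars, words, positions)
out, attn = self_attention(x)
print(attn[0])  # how strongly position 0 attends to positions 0, 1 and 2
```

Each row of `attn` sums to 1, so it can be read as the share of influence that every input position has on one output position.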
In the example shown in Fig. 2, the restoration head predicts the three
missing characters; the geographical attribution head classifies the
inscription among 84 regions; and the chronological attribution head dates it
to between 800 bc and ad 800.
Interpreting the outputs
Our intention was to maximize the collaborative potential
between historians and deep learning. Ithaca’s architecture was therefore
designed to provide intelligible outputs, while featuring multiple
visualization methods to augment the interpretability of the model’s predictive
hypotheses.
For the task of restoration, instead of providing
historians with a single restoration hypothesis, Ithaca offers a set of the top
20 decoded predictions ranked by probability (Fig. 3a). This first
visualization facilitates the pairing of Ithaca’s suggestions with historians’
contextual knowledge, therefore assisting human decision-making. This is
complemented by saliency maps, a method used to identify which unique input
features contributed the most to the model’s predictions, for both the
restoration and attribution tasks (Fig. 3d and Extended Data
Fig. 5a). For the geographical attribution task, Ithaca classifies
the input text among 84 regions, and the ranked list of possible region
predictions is visually implemented with both a map and a bar chart
(Fig. 3b). Finally, to expand interpretability for the chronological
attribution task, instead of outputting a single date value, we predict a
categorical distribution over dates (Fig. 3c). By so doing, Ithaca can
handle ground-truth labels more effectively, as the labels correspond to date
intervals. More precisely, Ithaca discretizes all dates between 800 bc and ad
800 into 10-year bins, resulting in 160 decades. For example, the date range
300–250 bc is represented as 5 decades of equal 20% probability, whereas an
inscription dated to 305 bc would be assigned to the single-decade-bin 300–310
bc with 100% probability.
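The binning of dates can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: bc years are written as negative integers, the absence of a year 0 is ignored, a date range is treated as half-open (the end year itself is excluded), and the names `bin_index` and `date_target` are hypothetical.

```python
FIRST_YEAR = -800   # 800 bc, written as a negative year
LAST_YEAR = 800     # ad 800
BIN_WIDTH = 10
NUM_BINS = (LAST_YEAR - FIRST_YEAR) // BIN_WIDTH  # 160 decades


def bin_index(year):
    """Index of the decade bin that contains the given year."""
    return (year - FIRST_YEAR) // BIN_WIDTH


def date_target(start, end):
    """Uniform target distribution over the decades covered by [start, end).

    A single-year date (start == end) falls into exactly one bin.
    """
    if not (FIRST_YEAR <= start <= end <= LAST_YEAR):
        raise ValueError("dates must lie between 800 bc and ad 800")
    first = min(bin_index(start), NUM_BINS - 1)
    last = min(bin_index(end - 1), NUM_BINS - 1) if end > start else first
    covered = last - first + 1
    target = [0.0] * NUM_BINS
    for i in range(first, last + 1):
        target[i] = 1.0 / covered
    return target


range_target = date_target(-300, -250)    # 300-250 bc
single_target = date_target(-305, -305)   # 305 bc
print([(i, p) for i, p in enumerate(range_target) if p > 0])
print([(i, p) for i, p in enumerate(single_target) if p > 0])
```

With these conventions the range 300–250 bc spreads its probability evenly over five bins, and the single year 305 bc puts all of it in one bin, matching the two cases given above.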
Experimental evaluation
To compare performance in the three epigraphic tasks, we use
four methods:
· First, we evaluate the difficulty of the restoration task by assigning two
evaluators with epigraphical expertise (‘ancient historian’) a set of damaged
inscriptions to restore, using the training set to search for textual
parallels.
· Second, we provide the human experts with a ranked list of Ithaca’s top 20
restoration hypotheses to inform their predictions (‘ancient historian and
Ithaca’), therefore assessing the true impact of our work as a cooperative
research aid.
· Third, as a computational baseline we reimplement our previous work Pythia,
a sequence-to-sequence recurrent neural network for the task of ancient-text
restoration.
· Finally, for the attribution tasks, we introduce an ablation of the
epigrapher’s workflow, the ‘onomastics’ baseline: annotators were tasked with
attributing a set of texts, exclusively using the known distribution of Greek
personal names across time and space to infer geographical and chronological
indicia.
We introduce the following metrics to measure each method’s performance.
For restoration, to obviate the lack of ground truths in damaged inscriptions,
we artificially hide 1 to 10 characters of undamaged input text and treat the
original sequences as the target.
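The masking step can be illustrated with a short sketch. It rests on assumptions that the text above does not confirm: the hidden characters form one contiguous span, the placeholder for a hidden character is ‘?’, and the function name `hide_characters` is hypothetical.

```python
import random

MASK = "?"  # assumed placeholder for a hidden character (illustrative only)


def hide_characters(text, rng, min_len=1, max_len=10):
    """Hide a random contiguous span of min_len..max_len characters.

    Returns the masked text, the start offset of the span and the hidden
    characters, which serve as the restoration target.
    """
    if len(text) < min_len:
        raise ValueError("text is shorter than the minimum span length")
    span = rng.randint(min_len, min(max_len, len(text)))
    start = rng.randint(0, len(text) - span)
    target = text[start:start + span]
    masked = text[:start] + MASK * span + text[start + span:]
    return masked, start, target


rng = random.Random(0)  # fixed seed so the example is repeatable
masked, start, target = hide_characters("εδοξεν τηι βουληι και τωι δημωι", rng)
print(masked)
print(start, repr(target))
```

Because the hidden characters come from undamaged text, the model's prediction can be compared directly against `target`.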
The first metric used is the character error rate (CER), which counts the
normalized differences between the top predicted restoration sequence and the
target sequence. Furthermore, we use top-k accuracy to measure whether the
correct restoration or region label for geographical attribution is among the
top k predictions, therefore quantifying Ithaca’s potential as an assistive
tool.
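Both metrics are easy to sketch. The version below assumes that "normalized differences" means the Levenshtein edit distance divided by the length of the target, which is a common definition of CER but is not spelled out in the text above. The function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(prediction, target):
    """Edit distance normalized by the target length."""
    if not target:
        raise ValueError("target must not be empty")
    return edit_distance(prediction, target) / len(target)


def top_k_hit(ranked_predictions, target, k):
    """True if the target is among the first k ranked predictions."""
    return target in ranked_predictions[:k]


# One wrong character out of five gives a CER of 0.2.
print(character_error_rate("δημοσ", "δημος"))

# Made-up ranked region predictions, for illustration only.
ranked_regions = ["Attica", "Delos", "Ionia"]
print(top_k_hit(ranked_regions, "Delos", 1), top_k_hit(ranked_regions, "Delos", 3))
```

Top-k accuracy over a test set is then just the fraction of examples for which `top_k_hit` returns True.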
As shown in Table 1, for the task of restoration, Ithaca consistently
outperforms the competing methods, scoring a 26.3% CER and 61.8% top 1
accuracy. Specifically, our model achieves a 2.2× lower (that is, better) CER
compared with human experts, whereas Ithaca’s top 20 predictions achieve a
1.5× improved performance compared with Pythia, with an accuracy of 78.3%.
Notably, when pairing historians with Ithaca (ancient historian and Ithaca),
human experts achieve an 18.3% CER and 71.7% top 1 accuracy, therefore
demonstrating a considerable 3.2× and 2.8× improvement compared with their
original CER and top 1 scores. Regarding the attribution to regions, Ithaca
has 70.8% top 1 and 82.1% top 3 predictive accuracy.
Finally, for chronological attribution, whereas the onomastics human baseline
predictions are within an average of 144.4 and median of 94.5 years from the
ground-truth date intervals, Ithaca’s predictions, based on the totality of
texts, have an average distance of 29.3 years from the target dating brackets,
with a median distance of only 3 years.
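A distance "from the ground-truth date intervals" can be sketched like this. The sketch assumes that each prediction has already been reduced to a single year (how Ithaca does that is not described above), that a prediction inside the interval counts as distance zero, and that bc years are negative integers. The data are made up and `interval_distance` is a hypothetical name.

```python
from statistics import mean, median


def interval_distance(predicted_year, start, end):
    """Years between a predicted date and the interval [start, end].

    The distance is zero when the prediction falls inside the interval.
    """
    if predicted_year < start:
        return start - predicted_year
    if predicted_year > end:
        return predicted_year - end
    return 0


# Made-up (prediction, interval start, interval end) triples.
examples = [(-420, -430, -400), (-350, -330, -300), (120, 100, 110)]
distances = [interval_distance(p, s, e) for p, s, e in examples]
print(distances, mean(distances), median(distances))
```

Reporting both the mean and the median, as the results above do, is useful because a few badly misdated texts can pull the mean far above the median.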
Conclusion
Historians may now use Ithaca’s interpretability-augmenting
aids (such as saliency maps) to examine these predictions further and bring
more clarity to Athenian history.
Bibliography: https://www.nature.com/articles/s41586-022-04448-z.pdf