Towards a Reading of the Vindolanda Stylus Tablets
Engineering Science and the Papyrologist
We introduce a collaborative project between the Department of
Engineering Science and the Centre for the Study of Ancient Documents at the University of
Oxford regarding the analysis and reading of the Vindolanda Stylus Tablets. We sketch the
imaging and image processing techniques used to digitally capture and analyse the tablets,
the development of the image analysis tools to aid papyrologists in the transcription of
the texts, and lessons that can be learned so far from such an inter-disciplinary project.
1. The Vindolanda texts
2. Image Processing Techniques
3. The need for an interface
4. Papyrology and Computing
5. Knowledge Elicitation
6. The Papyrology Process
7. Developing the tools
8. Project Aims
9. Lessons from the inter-disciplinary project
About the Author
The discovery of the ink and stylus
tablets from Vindolanda, a Roman Fort built in the late 80s AD near Hadrian's Wall at
modern-day Chesterholm, has provided a unique resource regarding the Roman occupation of
northern Britain and the use and development of Latin around the turn of the first century
AD. However, although papyrologists have been able to transcribe and translate most of the
ink tablets, the majority of the stylus tablets remain unread because of their physical
characteristics. An EPSRC jointly funded project at the Department of Engineering Science
and the Centre for the Study of Ancient Documents (CSAD), University of Oxford, was
initiated three years ago to analyse these tablets and develop new image processing
techniques to retrieve information from small incisions in damaged surfaces, the
techniques under development being applicable to a wide variety of engineering problems.
Some significant progress has been made using wavelet filtering to remove woodgrain in
images of the stylus tablets, and developing and appropriating Shadow Stereo techniques to
identify candidate writing strokes. However, to aid in the complex cognitive and
perceptual processes involved in papyrology, an appropriate knowledge-based interface is
being developed to ensure that the papyrologists can utilise such algorithms, and this
interface will also incorporate the lexical and visual knowledge papyrologists rely on to
help them read such ancient texts. This paper gives a brief background to the project,
before discussing the steps taken towards developing the computer application to aid the
papyrologists in the transcription of the texts. Focusing on the interaction between the
engineers, the papyrologists, and the techniques used to identify the computational tools
required, the paper discusses the benefits and problems surrounding such a
multi-disciplinary project, and what lessons can be learned from such a project that are
relevant to the field of Humanities Computing.
1. The Vindolanda texts
The two types of texts discovered at
Vindolanda are unparalleled resources for classical historians since textual sources for
the period in British history around AD 90 to AD 120 are rare. The ink and stylus tablets
from Vindolanda are a unique and extensive group of written documents from the Roman Army
in Britain, and provide a personal, immediate, detailed record of the Roman Fort at
Vindolanda from around AD 92 onwards (Bowman and Thomas 1994; Bowman 1997).
The ink tablets, carbon ink written on thin leaves of wood cut from the
sapwood of young trees, have proved the easiest to decipher. In most cases, the faded ink
can be clearly seen against the wood surface by the use of infrared photography, a
technique used frequently in deciphering ancient documents (Bearman and Spiro 1996). A
major digitisation project is currently being undertaken by the CSAD to produce high
resolution infrared scans of these texts (now held at the British Museum, London) to
enable their further transcription and reading. The majority of the six hundred writing
tablets that have been transcribed so far contain personal correspondence, accounts and
lists, and military documents (Bowman and Thomas 1994).
The two hundred stylus tablets found at Vindolanda appear to follow the
form of official documentation of the Roman Army found throughout the Empire (Turner 1968;
Fink 1971; Renner 1992). It is suspected that the subject and textual form of the stylus
tablets will differ from those of the ink writing tablets, as similar finds indicate that
stylus tablets tended to be used for documentation of a more permanent nature, such as legal
papers, records of loans, marriages, contracts of work, sales of slaves, etc. (Renner 1992).
The tablets were manufactured from softwood with a recessed central surface, and the hollow
panel was filled with coloured beeswax. Text was recorded by incising this wax with a metal
stylus, and tablets could be re-used by melting the wax to form a smooth surface.
Unfortunately, in nearly all surviving stylus tablets [1] the wax has perished, leaving a
recessed surface showing the scratches made by the stylus as it penetrated the wax [2]. In
general, the small incisions are extremely difficult to decipher.
Worse, the pronounced woodgrain of the fir wood used to make the stylus tablets, staining
and damage over the past two thousand years, and the palimpsestic nature of the re-used
tablets further complicate the problem; a skilled reader can take several weeks to
transcribe one of the more legible tablets, whilst some of the texts defy reading
altogether. Prior to the current project, the only way for the papyrologists to detect
incisions in the texts was to move the text around in a bright, low raking light in the
hope that indentations would be highlighted and candidate writing strokes become apparent
through the movement of shadows. This proved frustrating, time-consuming, and
insufficient for the transcription of the texts.
Figure 1. Stylus tablet 836, one
of the most complete stylus tablets unearthed at Vindolanda. The incisions on the surface
can be seen to be complex, whilst the woodgrain, surface discoloration, warping, and
cracking of the physical object demonstrate the difficulty papyrologists have in reading
2. Image Processing Techniques
From an Engineering Science viewpoint, rather than
a Classics one, the stylus tablets can be regarded as "noisy",
uneven surfaces with shallow, narrow indentations which contain information that needs to
be retrieved. There are many cases where subjects might fall into this category: looking
for geological faults or valleys in satellite images, or looking for surface scratches on
machinery in the case of industrial inspection, for example. Many techniques have been
developed to find such incisions in objects, such as three-dimensional microscopy of
various sorts, or indirect methods from computer vision such as structured light or
binocular stereo (Horn 1982). However, any analysis of images of the stylus tablets is
difficult: the images are very noisy, textured, and large (because of the need for
detail), and there is low contrast between what may and may not be a candidate
handwriting signal. Because the writing is incised it invites three-dimensional analysis,
particularly as the analysis of only a single image of the tablet taken from a single
viewpoint would make it difficult to distinguish incisions of interest from fine lines
such as woodgrain or stains. The woodgrain and warping of the tablet complicate the shape
of the surface and make any image capture and subsequent processing more difficult. At the
outset of the project, we evaluated a wide range of alternative imaging and analysis
techniques. Some were rejected on the grounds of cost, others on the grounds of potential
damage to what are priceless documents. As a result, we concluded that the currently
optimal approach would be image analysis using one or more light sources and a single
high-resolution camera, positioned (nominally) vertically above the tablet.
In 1998 the Department of Engineering Science and the Centre for the
Study of Ancient Documents at the University of Oxford were jointly awarded a research
grant by the Engineering and Physical Sciences Research Council (EPSRC) to develop
techniques for the detection, enhancement and measurement of narrow, variable depth
features inscribed on low contrast, textured surfaces (such as the Vindolanda stylus
tablets). To date, a wavelet filtering technique has been developed that enables the
removal of woodgrain from images of the tablets to aid in their transcription (Bowman,
Brady et al. 1997). In addition, we have developed a technique called "Shadow
Stereo", in which the camera position and the tablet are kept fixed, but a number of
images are taken in which the tablet is illuminated by a strongly orientated light source.
If the azimuthal direction of the light source (that is, the direction to the light
source if the light were projected directly down on to the table) is held fixed, but the
light is alternated between two elevations, the shadows cast by incisions will move but
stains on the surface of the tablet remain fixed. This strongly resembles the technique
used by some papyrologists who use low raking light to help them read the incisions on the
tablet (Molton 1999).
Figure 2. A diagram of
the image capture system. Whilst the elevation of the light source changes around the
tablet, the camera and tablet remain stationary.
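The principle behind Shadow Stereo can be illustrated with a toy one-dimensional model. The sketch below is not the project's algorithm: it assumes an idealised groove whose near wall casts a shadow of length depth / tan(elevation), a light coming from the left, and stains that keep the same brightness whatever the lighting. The function names and brightness values are invented for illustration.

```python
import math


def render_profile(length, stains, grooves, elevation_deg):
    """Return a 1-D brightness profile across the tablet surface.

    1.0 = lit wood, 0.3 = stain, 0.0 = cast shadow.
    stains  : list of (start, end) pixel ranges, unaffected by lighting
    grooves : list of (start, width, depth) in pixel units (assumed geometry)
    """
    profile = [1.0] * length
    for start, end in stains:
        for x in range(start, end):
            profile[x] = 0.3
    tan_e = math.tan(math.radians(elevation_deg))
    for start, width, depth in grooves:
        # A lower light casts a longer shadow, capped by the groove width.
        shadow = min(width, int(round(depth / tan_e)))
        for x in range(start, start + shadow):
            profile[x] = 0.0
    return profile


def moving_pixels(low, high, threshold=0.1):
    """Pixels whose brightness differs between the two elevations."""
    return [x for x, (a, b) in enumerate(zip(low, high)) if abs(a - b) > threshold]


stains = [(5, 10)]
grooves = [(20, 8, 2.0)]
low = render_profile(40, stains, grooves, elevation_deg=20)
high = render_profile(40, stains, grooves, elevation_deg=60)
print(moving_pixels(low, high))  # [21, 22, 23, 24]
```

Only the pixels where the groove's shadow shortens are reported; the stain at pixels 5 to 9 is identical in both profiles and so drops out of the comparison.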
Edge detection is accomplished by noting how shadow-to-highlight transitions
move between the two images of the same tablet, and so candidate incised
strokes can be identified by finding shadows adjacent to highlights which move in the way
that incised strokes would be expected to (Schenk 1998). The difficulty comes in making
these steps precise. Although this is not a standard technique in image processing,
encouraging results have been achieved so far, and a mathematical model has been developed
to investigate which are the best angles to position the light sources (Molton 1999). Work
currently being undertaken is extending the performance and scope of the algorithms, and
the papyrologists are beginning to trust the results and suggestions which are being made
about possible incisions on the tablets. Future work will relate the
parameters of analysis to the depth profile of the incisions to try to identify different
layers of overlapping writing on the more complex texts.
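The idea of looking for shadows adjacent to highlights can be sketched on a single row of pixels. This is a simplified illustration, not the method of Schenk (1998): it assumes light arriving from the left (so a groove shows a dark wall followed by a bright one), a known background brightness, and a fixed tolerance. All names and thresholds are hypothetical.

```python
def label_runs(row, background=0.5, tol=0.2):
    """Group a row of brightness values into labelled runs.

    Returns a list of (label, start, end) with end exclusive, where label is
    'shadow', 'highlight' or 'flat' relative to the assumed background level.
    """
    runs = []
    for x, value in enumerate(row):
        if value < background - tol:
            label = "shadow"
        elif value > background + tol:
            label = "highlight"
        else:
            label = "flat"
        if runs and runs[-1][0] == label:
            runs[-1][2] = x + 1          # extend the current run
        else:
            runs.append([label, x, x + 1])
    return [tuple(run) for run in runs]


def candidate_strokes(row, background=0.5, tol=0.2):
    """Return (shadow_start, highlight_end) for each shadow directly followed
    by a highlight. A dark run with no highlight next to it (for example a
    stain) is ignored."""
    runs = label_runs(row, background, tol)
    return [
        (first[1], second[2])
        for first, second in zip(runs, runs[1:])
        if first[0] == "shadow" and second[0] == "highlight"
    ]


# Pixels 3-5: a groove (shadow then highlight). Pixels 8-9: a dark stain.
row = [0.5, 0.5, 0.5, 0.1, 0.1, 0.9, 0.5, 0.5, 0.1, 0.1, 0.5, 0.5]
print(candidate_strokes(row))  # [(3, 6)]
```

In practice such candidates would still need to be checked across the two lighting elevations, since only shadows that move as an incision's shadow would are of interest.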
3. The need for an interface
However, whilst this technique has had
some success in analysing the surfaces of the tablets, there needed to be some way of
helping the papyrologists to analyse its results. Granted, the
resulting algorithms can easily be coded and added to PhotoShop as a Visual C++
plugin. Although this would allow others to apply the algorithms themselves and use
the image processing tools that are already available to (and familiar to) the papyrologists,
it would do little to provide a tool that actively helps the papyrologists
in the transcription of the texts by incorporating their knowledge into the system and
building on the process they already go through to obtain a reading of such texts.
4. Papyrology and Computing
The use of computing in the field of
Papyrology [3] has enjoyed some notable successes. There are many established imaging
projects, such as those at the CSAD [4]
and the Oxyrhynchus Papyri Project [5]; excellent database projects and systems such as the
Duke Databank of Documentary Papyri [6] and APIS [7] (Advanced Papyrological Information
System); and repositories of
information in a user-friendly format such as the Perseus Project [8]. Many standards are
already in place for the digitisation and markup of ancient texts, and papyrologists are
making more use of the kind of image manipulation tools provided by the likes of PhotoShop
(Bagnall 1997). However, currently no systems exist to support papyrologists in the
process of reading ancient texts. Indeed, little work [9] has been done to discuss how
information is
actually extracted from these texts (in whatever shape or form they may be), and there
are no detailed cognitive or perceptual information processing models of the
papyrology process. From a Cognitive Psychology stance, although there has been a lot of
study of the processes involved in reading (Gibson and Levin 1976; Eysenck and Keane
1997), few conclusions have been drawn as to how a reader would approach such damaged,
fragmentary, foreign language texts and construct a logical, acceptable meaning. Also,
although image processing is an expansive field in the discipline of Engineering Science
(see Gonzalez and Woods 1993) little work has been done on the role of knowledge and
reasoning in the analysis and understanding of complex images; proposals for integrating
image analysis algorithms with techniques for the representation and mobilisation of
knowledge (the subject of the field of Artificial Intelligence) remain few.
5. Knowledge Elicitation
The problem with trying to discover the
process that papyrologists go through whilst reading an ancient text is that experts are
notoriously bad at describing what they are expert at (McGraw and Harbison-Briggs 1989).
Experts utilise and develop many skills which become automated and so they are
increasingly unable to explain their behaviour, resulting in the troublesome
"knowledge engineering paradox": the more competent domain experts become, the
less able they are to describe the knowledge they use to solve problems (Waterman 1986).
Added to this problem is the fact that, although knowledge elicitation and acquisition
from experts is becoming increasingly necessary for the development of computer systems,
there is no consensus within the field as to the best way to proceed in undertaking such a
study. In this project, it was decided to utilise the outline suggested in McGraw and
Harbison-Briggs (1989) as it is the most comprehensive protocol developed to date. Steps
taken were the building up and understanding of domain area literature (a "Knowledge
Library"), records kept of meetings and conclusions drawn from these, and the
adoption of some suitable knowledge acquisition tools relevant to the project, such as:
walking through the process with the papyrologists, viewing them working and discussing
with other experts, documenting their notes and working hypotheses as they reached a final
reading of some example texts, analysis of transcripts of spoken discussions, etc. At
Oxford we have the luxury of having one of the highest concentrations of papyrologists in
the world, and this has aided the knowledge elicitation process a great deal.
6. The Papyrology Process
Through this it has been possible to
build up a model of how papyrologists approach and start to understand ancient texts
(although more work remains to be done with Cognitive Psychologists to interpret how this
relates to current theories about the resolution of ambiguity in texts and what this can
bring to our understanding of the reading process). For example, it can be shown that
papyrologists look at the whole of a text, not just at individual letters one at a time,
and are continually putting any hypotheses or predictions to the test by relating them to
the wider context of the document, and corpus of texts, as a whole. There is continual
reference to other lexical and grammatical sources. The readings are based on prediction
and the narrowing of the ambiguity surrounding the texts by recursive re-evaluation and
reasoning. Comparisons at the stroke, letter, grammatical unit, and word level are
continually made [10], and a final reading of the text is produced when all ambiguities are
resolved as far as they can be: in many respects the transcriptions of some texts
will always remain working hypotheses. From this it can be seen that although an image
processing system such as the one described above is extremely helpful in highlighting
areas of the text which may be possible letters, there are many more contextual and
recursive elements in the transcription of such documents which are not reflected in the
viewing of an image by existing tools (such as PhotoShop), and an essential part of this
project is to construct a system which will take into account the techniques the
papyrologists use to read these texts.
7. Developing the tools
Two types of tools are being developed to
aid in the papyrology process: one to assist in the prediction of possible language
structures, which will help the papyrologists to resolve the ambiguity in the
texts faster than the present manual process allows, and
another to aid in the tagging, cross referencing, and annotation of the images for
comparison of the visual data. These will then be combined to provide a tool where the
papyrologist can identify an area of the image that may be a possible writing incision,
and use the expertise contained within the system to suggest what language structure this
may represent. It should be stressed here that this is not an attempt to build an
"expert system" that will automatically "read" and provide the best
transcription of the texts; it is a means by which papyrologists will be able to mobilise
disparate knowledge structures, such as linguistic and visual clues, and use these in the
prediction process to aid in the resolution of the ambiguity of the texts. Any
recommendation made by the system will only be a suggestion to the papyrologist, who will
control the parameters through a graphical user interface, so that no complex
use of computers is required. The advantage of developing such a system is that it enables
the papyrologist to maintain an explicit record of the alternative hypotheses developed,
and to switch effortlessly between these initially competing hypotheses, allowing them to
see the development of their reading of the texts and trace any conclusions back to their
initial thought processes.
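One way to picture such an explicit record of competing hypotheses is as a small tree in which each proposed reading points back to the reading it revises. The sketch below is purely illustrative and is not the project's design (the planned system is described as LISP and Java based); the class names, fields, and example readings are all invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Hypothesis:
    reading: str                   # proposed letters or word
    rationale: str                 # why the reader proposed it
    parent: Optional[int] = None   # index of the hypothesis this one revises


@dataclass
class ReadingLog:
    hypotheses: List[Hypothesis] = field(default_factory=list)
    current: Optional[int] = None  # hypothesis the reader is working with

    def propose(self, reading, rationale, parent=None):
        """Record a new hypothesis and make it the current one."""
        self.hypotheses.append(Hypothesis(reading, rationale, parent))
        self.current = len(self.hypotheses) - 1
        return self.current

    def switch_to(self, index):
        """Return to an earlier, competing hypothesis."""
        if not 0 <= index < len(self.hypotheses):
            raise IndexError("no such hypothesis")
        self.current = index

    def trace(self, index):
        """List the readings that led to a hypothesis, oldest first."""
        chain = []
        while index is not None:
            chain.append(self.hypotheses[index].reading)
            index = self.hypotheses[index].parent
        return chain[::-1]


log = ReadingLog()
first = log.propose("s?l", "three strokes, middle one unclear")     # made-up reading
second = log.propose("sal", "middle stroke resembles a", parent=first)
rival = log.propose("sol", "middle stroke could be o", parent=first)
log.switch_to(second)
print(log.trace(second))  # ['s?l', 'sal']
```

Because nothing is overwritten, the reader can move between the rival readings and still trace any conclusion back to the observation that prompted it.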
Work is nearing completion on gathering the statistical information for
the construction of the language tool. Because of the nature of the tablets there are no
existing comprehensive corpora of texts which directly relate to the language written in
the stylus tablets. However, the majority of the wooden writing tablets have been
transcribed, resulting in almost 20,000 characters of text, and although the subject
matter between the texts is expected to differ, the form and structure of the language
used in both will be closely related due to them being from the same physical and temporal
source, and also due to the fact that it would probably have been the same people who were
involved in the writing of both sets of texts (due to the levels of illiteracy in the
society). The transcriptions of the writing tablets have been marked up in XML to retain
the conventions by which papyrologists transcribe documents [11], converted to COCOA
encoding using an XSLT
stylesheet (Piez 2000), and analysed using TACT and WordSmith Tools to draw up indexes, a
concordance, and lemmas of the text, whilst providing detailed statistical analysis of
what combinations of letters are most and least likely, which letters and combinations are
most likely to be deemed ambiguous by the papyrologists, which letters/combinations are
most likely to be scribal corrections, or indications of abbreviations, etc. A set of
rules will now be drawn up and constraint sets identified to allow a system, based in
LISP, to help the papyrologist access the linguistic data to assist in the prediction of
the language sequences.
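The kind of letter-combination statistics described above can be illustrated with a simple bigram count. The project itself used TACT and WordSmith Tools; the sketch below only shows the underlying idea in Python, and the four-word sample is an arbitrary list of Latin words chosen for the example, not data from the tablets.

```python
from collections import Counter


def bigram_counts(words):
    """Count adjacent letter pairs within each word."""
    counts = Counter()
    for word in words:
        w = word.lower()
        counts.update(w[i:i + 2] for i in range(len(w) - 1))
    return counts


def likely_next(counts, letter):
    """Rank the letters seen after `letter`, with relative frequencies."""
    followers = Counter(
        {pair[1]: n for pair, n in counts.items() if pair[0] == letter}
    )
    total = sum(followers.values())
    if total == 0:
        return []
    return [(nxt, n / total) for nxt, n in followers.most_common()]


sample = ["salutem", "suo", "frater", "rogo"]  # illustrative words only
counts = bigram_counts(sample)
print(counts["te"])              # 2
print(likely_next(counts, "t"))  # [('e', 1.0)]
```

Given a securely read letter, a table like this could be used to rank the candidates for an ambiguous neighbouring letter, leaving the final judgement to the papyrologist.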
Plans for the visual interface include building a prototype interface
utilising XML to mark up and cross reference images and their transcriptions to allow easy
consultation, labelling, and annotation. The final visual interface will utilise Java (for
image viewing and processing) and LISP (for the predictive elements), while other image
processing algorithms will be implemented in C++, with all modules of the system being
tied together by the Java Native Interface. Rapid prototyping and integrated user
evaluation will be used to ensure that the developing interface is user-friendly, and the
building of this system is planned to take around 12 months.
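As a rough picture of how XML might cross-reference an image region with its transcription, the sketch below builds a small annotation record with Python's standard library. The element and attribute names (tablet, image, region, annotation, certainty) and the file name are invented for illustration; the article does not specify the project's actual schema.

```python
import xml.etree.ElementTree as ET


def annotate_region(tablet_id, image_file, box, reading, certainty):
    """Build an XML record linking a rectangular image region to a reading.

    box is (x, y, width, height) in pixels. All names here are hypothetical.
    """
    tablet = ET.Element("tablet", id=tablet_id)
    image = ET.SubElement(tablet, "image", href=image_file)
    x, y, width, height = box
    region = ET.SubElement(
        image, "region",
        id=f"{tablet_id}-r1",
        x=str(x), y=str(y), width=str(width), height=str(height),
    )
    # The annotation points back at the region by its id.
    note = ET.SubElement(
        tablet, "annotation", target=region.get("id"), certainty=certainty
    )
    note.text = reading
    return tablet


record = annotate_region("836", "836_low_elevation.tif", (120, 40, 30, 18), "a", "uncertain")
print(ET.tostring(record, encoding="unicode"))
```

Keeping the region and the annotation as separate elements joined by an id means that several competing readings could be attached to the same area of an image.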
8. Project Aims
The aims for this part of the project
are, then, to produce a stand-alone application which will assist the papyrologists in
interrogating the visual information created during the Shadow Stereo analysis of the
texts: a set of image analysis tools to work in tandem with the image processing
tools being developed by the other engineers. The overall aim is of course to aid in the
transcription of the stylus tablets, but it is hoped that the tools may also be used by
papyrologists working on other texts, and through this process we are gaining a further
insight into the papyrology process itself which is relevant to researchers working in
9. Lessons from the inter-disciplinary project
The project is also interesting to the
humanities computing scholar because of its inter-disciplinarity. It is suspected, and
hoped, that in the future such collaborations between disparate disciplines will become
more commonplace. The attempt to solve complex humanities-based problems by utilising the
technical skills of a scientific discipline is an intriguing and fruitful prospect;
granted, other projects utilising similar collaborations exist in IT and the Humanities at
present, but not many on this technical or physical scale. This project also marks a
seven-league-boot step away from the "look, I made a web page!" plateau where
too many humanities computing projects are still stuck regarding their application of IT
in their projects. Working in such an interdisciplinary field can be rewarding, the
facilities available with such collaborations excellent, the environment stimulating, and
the conclusions reached illuminating. However, the partnership between such disparate
disciplines can throw up some interesting, and unforeseen, challenges. This project
benefits from a good relationship between the supervisors of both parties, but it is
obvious that without such an association communication between the two groups would be
very much hindered. Communication between the teams is still at times difficult, partly
due to the different working practices of the two, and the two different languages,
linguistic and mathematical, used to communicate ideas. A humanities computing individual
required to bridge such a gap needs to have the suitable personal and domain based skills
to communicate at a desirable level with both sides of the equation, and it can be
difficult to acquire and maintain suitable expertise across both disciplines. From an
administrative viewpoint, being placed between (or across) two academic disciplines can
prove problematic, as the academic system (in a traditional UK environment, anyway) is not
ready to cope with such an un-pigeon-hole-able working practice, and however trivial these
problems may be, they can add to a feeling of disenfranchisement: as always, the
humanities IT scholar is neither one thing nor another.
This project interlaces many different
computational and literary based techniques to try and find a way of accessing the
information contained within the Vindolanda Stylus texts. However, although this project
is an example of a collaboration which has resulted from trying to find a solution to a
certain technical problem, it is also an indication of the way diverse and disparate
disciplines can collaborate in order to overcome more difficult problems in their own
respective fields. In doing so, the results can often be relevant to other disciplines, or
applicable to other problems, as well as their own. It is the place of the scholar in
computing and the humanities to provide the skills needed to facilitate communication in
such inter-disciplinary environments, and, however challenging at times, such a role can
be rewarding, both for the individual, and for the discipline as a whole.
Thanks to Prof. J.M. Brady and Dr Alan Bowman who are currently
supervising (and encouraging) this research. Thanks also go to all others involved in this
project: Xiaobo Pan, Nick Molton, and Veit Schenk from the Department of Engineering
Science, and Charles Crowther, Roger Tomlin, and John Pearce from the Centre for the Study
of Ancient Documents, University of Oxford. Thanks are also due to Dr Seamus Ross from the
Humanities Advanced Technology and Information Institute, University of Glasgow, for his
continuing support and advice.
Melissa Terras is a doctoral student
at the University of Oxford, pursuing a D.Phil in Engineering Science and working with the Centre for the Study
of Ancient Documents. She graduated from the University of Glasgow in History of Art and
English Literature, going on to complete an MSc in IT and the Humanities. Her
subjects of interest include the application of Virtual Reality to Humanities problems,
knowledge elicitation, software development, and the role of image processing in humanities
1. It is suspected that around
2000 such tablets exist outside of Egypt (Renner 1992).
2. Only one stylus tablet, 836, has been found so far with its wax
intact. Unfortunately, this deteriorated during conservation, but a photographic record of
the waxed tablet remains to compare the visible text with that on the re-used tablet.
3. Papyrology, simply defined as obtaining "a body of knowledge
from the study of papyri", is now taken to cover "as a matter of [...]
the study of all materials carrying writing
done by a pen" (Turner 1968).
9. Youtie (1963) and Youtie (1966) are the only discussions published
as yet as to what the papyrology process actually entails, with some higher level
discussion available in Turner (1973).
10. This can be seen to corroborate the interactive activation model of
visual word recognition developed by McClelland and Rumelhart (McClelland and Rumelhart
1986).
11. The Leiden system (Turner 1973).
Bagnall, R. S. (1997): "Imaging of Papyri: a Strategic View."
In: Literary and Linguistic Computing 12(3), p. 153-154.
Bearman, B. H. and S. Spiro (1996): "Archaeological Applications
of Advanced Imaging Techniques." In: Biblical Archaeologist 59(1).
Bowman, A. and J. D. Thomas (1994): The Vindolanda Writing-Tablets. Tabulae
Vindolandenses II. London: British Museum Press.
Bowman, A. K. (1997): The Vindolanda Writing Tablets. XI
Congresso Internazionale di Epigrafia Greca e Latina. Roma.
Bowman, A. K., J. M. Brady, et al. (1997): "Imaging Incised
Documents." In: Literary and Linguistic Computing 12(3), p. 169-176.
Eysenck, M. W. and M. T. Keane (1997): Cognitive Psychology: A
Student's Handbook. Hove: Psychology Press.
Fink, R. O. (1971): Roman Military Records on Papyrus. Case
Western Reserve University.
Gibson, E. J. and H. L. Levin (1976): The Psychology of Reading.
Cambridge Mass: MIT Press.
Gonzalez, R. C. and R. E. Woods (1993): Digital Image Processing.
Horn, B. K. P. (1982): Robot Vision. Cambridge, Mass: MIT Press.
McClelland, J. L. and D. E. Rumelhart (1986): "A distributed model
of human learning and memory". In: Parallel distributed processing: Vol. 2.
Psychological and biological models. D. E. Rumelhart, J. L. McClelland and the PDP
Research Group. Cambridge Mass: MIT Press.
McGraw, K. L. and K. Harbison-Briggs (1989): Knowledge Acquisition:
Principles and Guidelines. London: Prentice-Hall International Editions.
Molton, N. (1999): "The Choice of Light Position for Shadow Stereo
with Inscribed Tablets." Forthcoming.
Piez, W. (2000): XSL Characteristics, Status, and Potentials for
Text Processing Applications in the Humanities. Forthcoming (presented at the
Association of Literary and Linguistic Computing Glasgow 2000 conference).
Renner, T. (1992): The Finds of Wooden Tablets from Campania and
Dacia as Parallels to Archives of Documentary Papyri from Roman Egypt. Copenhagen
Schenk, V. U. B. (1998): Visual Identification of Fine Surface
Incisions. Oxford: St Cross College, University of Oxford.
Turner, E. G. (1968): Greek Papyri, An Introduction. Oxford:
Turner, E. G. (1973): The Papyrologist at Work. Duke University,
Durham, North Carolina: The J.H.Gray Lectures, GRBS Supplement.
Waterman, D. A. (1986): A Guide to Expert Systems. Reading, MA:
Youtie, H. C. (1963): "The Papyrologist: Artificer of Fact".
GRBS 4(1963), p. 19-32.
Youtie, H. C. (1966): "Text and Context in Transcribing
Papyri". GRBS 7(1966), p. 251-258.
© Melissa Terras 2000
to Human IT 2-3/2000