The Cuneiform Digital Library Initiative (CDLI) represents the efforts of an international group of Assyriologists, museum curators and historians of science to make available through the internet the form and content of cuneiform tablets dating from the beginning of writing, ca. 3350 BC, until the end of the pre-Christian era. We estimate the number of these documents currently kept in public and private collections to exceed 500,000 exemplars, of which now more than 290,000 have been catalogued in electronic form by the CDLI.
In its early phases of research, the project concentrated on the digital documentation of the least understood archives of ancient cuneiform, those of the final third of the 4th, and of the entire 3rd millennium BC that contained texts in Sumerian, in early Akkadian and in other, still undeciphered languages. For despite the 160 years since the decipherment of cuneiform, and the 110 years since Sumerian documents of the 3rd millennium BC from southern Babylonia were first published, such basic research tools as a reliable paleography charting the graphic development of archaic cuneiform, and a lexical and grammatical glossary of the approximately 120,000 texts inscribed during this period of early state formation, remain unavailable even to specialists, not to mention scholars from other disciplines to whom these earliest sources on social development represent an extraordinary hidden treasure. The CDLI, directed by Robert K. Englund of the University of California at Los Angeles and Jürgen Renn of the Max Planck Institute for the History of Science, Berlin, is pursuing the systematic digital documentation and electronic dissemination of the entire cuneiform text corpus bearing witness to 3500 years of human history. Cooperative partners include leading experts from the field of Assyriology, curators of European and American museums, and computer specialists in data management and electronic text annotation. The CDLI data set consists of text and image, combining document transliterations, text glossaries and digitized originals and photo archives of cuneiform.
This electronic documentation should be of particular interest to cuneiform scholars distant from collections, and to museum personnel intent on archiving and preserving fragile and often decaying cuneiform collections. An important subset of the data will form the basis for the development of representations of the structure of 3rd millennium administrative and lexical documents, making the contents of the texts accessible to scholars from other disciplines. We are developing a typology of accounting procedures, graphical representations of the formal structures of bookkeeping documents, and extensive glossaries of technical terms, to be supplemented later by linguistic tools that give non-Assyriologists access to the primary sources. Data formats, including Extensible Markup Language (XML) text descriptions, with vector-based image specifications of computer-assisted tablet copies, will be chosen to ensure high conformance with ongoing digital library projects. Metadata-based lexemic and grammatical analysis of Sumerian in the CDLI markup environment will not only put at the disposal of specialists in the fields of Assyriology and Sumerology the available cuneiform documents from the first thousand years of Babylonian writing; general linguists, semioticists, and historians of communication and cognition, of administration and early state formation, will also for the first time have access to the form and content of these records.
In an initial three-year phase funded by the Digital Library Initiative of the National Science Foundation and the National Endowment for the Humanities for the period 2000-2003, project staff and associates completed the digitization of the early cuneiform collections of the Vorderasiatisches Museum (VAM), Berlin, the Hermitage, St. Petersburg, the Institut Catholique, Paris (ICP), the Hearst Museum of the University of California at Berkeley, and the University Museum of the University of Pennsylvania, Philadelphia. Dual-track internet presentations of these collections (conforming on the one hand with individual museum presentation, on the other with archival data sets of the CDLI) are being implemented incrementally. The ca. 3200 tablets of the VAM, representing one of the finest collections of early cuneiform known to us, with representative text groups from all of the major phases of writing in Mesopotamia, went online in 2001. The following year, the collection of the ICP was presented, and in 2003 both the Hearst and the Hermitage followed, the latter with its substantial archives of pre-Sargonic Lagash (ca. 2400-2350 BC) and Ur III (ca. 2050-2000 BC) administrative documents, supplemented with searchable data sets of all collections of tablets deriving from the period of proto-cuneiform (ca. 3350-3000 BC). Such research tools as a reliable paleography of the first twelve hundred years of cuneiform, and a lexical and grammatical glossary of the wide-ranging records from the period of early Babylonian history, will follow from the cooperative research on these data sets sponsored by the CDLI.
With funding from the National Endowment for the Humanities and the Institute for Museum and Library Services for the period 2004-2006, the CDLI continued to implement scalable access systems for a wide array of users, including researchers, museum staff, internet users, and even law enforcement officials. A proposed online educational component combining the support of the NEH Iraq Initiative and a Learning Federation effort of the Federation of American Scientists fulfilled two goals of the CDLI. On the one hand, our resources came to provide a rich learning environment for K-12 and advanced students and their families, for whom ancient Iraq can still seem impenetrable. On the other, an interactive, English/Arabic presentation of shared cultural heritage dating back five millennia assists an Iraqi nation still struggling to re-establish its historical and social unity.
The list of collaborating institutions expanded dramatically in April 2009 with the initiation of the CDLI-managed collaboration “Creating a Sustainable Cuneiform Digital Library,” funded by the Andrew W. Mellon Foundation. The conceptual framework for the project was agreed to during workshops and subsequent communications among participating partners from the US, the UK, and continental Europe. Collaborators agreed on a set of general goals to serve as a strategic plan for future research in cuneiform:
• The community of professional cuneiformists and scholars of the ancient Near East must be encouraged to embrace and exploit the promise of research strategies based on increasing access to digital content, currently termed “cyberscholarship.” This encouragement should take many forms, including leading by good example and moving increasing amounts of professional communication to online forums; developing future-oriented data infrastructures for the continued internet projection of cultural information and heritage; but also in such unsubtle ways as making the command of information technologies a desirable qualification for future faculty and cultural heritage hires and advancement.
• A networked community of researchers lives by accepted technical and content standards. Cuneiformists should be exposed to the technical standards of XML coding, metadata specification, data storage and image processing employed by leading initiatives in the field; they should be encouraged to engage with these standards, to communicate their experience to a networked audience of field experts, and to propose refinements to the content standardization of gathered transliterations, of artifact catalogue fields (in particular in cooperation with experienced collection managers), and of the means of tagging such content features as dictionary headwords (lemmatization), prosopography and numero-metrological notations. An integral feature of this strategy is the implementation of basic electronic tools that assist all learning communities to better understand the data that cuneiformists offer.
• Hand-in-hand with the adoption of technical advances and standards, the communities of researchers and administrators of cuneiform collections must be encouraged to accept policies of open access. No less than in other Humanities disciplines, cuneiform studies must work to alleviate the pitfalls of publication and data dissemination policies that hamper the free exchange of content and interpretation, and that, in broader cultural heritage terms, can delay the digital capture of fragile or vulnerable artifact collections worldwide.
The first phase of the Mellon-funded project focused on developing and implementing methods of collection capture and internet applications designed to foster a new form of collaboration among academic researchers and curatorial staffs of European, American, and Middle Eastern research and cultural institutions. Efforts resulted in the establishment of standardized methods in the electronic capture and permanent data archiving of often fragile cuneiform collections across a broad array of public institutions. By combining XML text description with a variety of high-resolution raster and scalable vector graphic images, the project put in place a flexible and interactive access system containing tools for a networked presentation of early writing, for both individual text collections and virtually reconstituted ancient archives. The first phase succeeded in formalizing an international network of cuneiform researchers and curators to digitize and archive a critical mass of cuneiform data content. In addition, funding supported equipment purchase; the collection, archiving, and processing of data; and the development and implementation of a data management infrastructure to ensure permanent and reliable access to the cuneiform digital library.
The CDLI counts three highlights among the achievements of the period 2009-2011. First, we created digital content documenting the cuneiform collection of the British Museum’s Middle East Department that at once nearly doubled the assets CDLI held before this project began, including more than 14,000 composite archival images of Nineveh tablets (called “fatcrosses” at CDLI after the form of the composites). In addition to digital dissemination, the British Museum images found use in print, and feature regularly in university courses. Perhaps most importantly, the British Museum’s participation in this project sent a strong message of encouragement to the international museum community to follow suit in making their own cuneiform collections generally and freely accessible via the Internet, regardless of the status of their publication. Second, the project’s partnership with the University of Pennsylvania proved to be an exemplary research and open access effort in the United States. The Penn collection of Nippur artifacts represents the core of the intellectual and literary history of the Sumerian civilization of the 3rd millennium BC, yet has remained largely inaccessible to research since the latter part of the 19th century AD. This first phase enabled the capture and web posting of more than 18,000 digital facsimiles of Penn’s holdings of 28,000 cuneiform artifacts, images that will facilitate research in Babylonian history for generations. Third, the project-funded staff centers at MPIWG and at UCLA enabled the capture and dissemination of artifact corpora from, to name only the more significant collections, the Syrian National Museums (Deir ez-Zor, Raqqa and Aleppo); the University of Leiden; the Bibliothèque Nationale et Universitaire de Strasbourg; Harvard University; and the University of Chicago.
In addition to raw digital capture, processing and web dissemination, this first phase of the project supported the creation of a persistent digital repository infrastructure within UCLA Library’s Digital Library Program. This infrastructure now provides for the permanent archiving of, and access to, individual collection archives, and will offer collection officials privileged access to full packages of metadata and digital facsimiles of their own collections. It further establishes persistent archival resource keys within the California Digital Library, encouraging Humanities scholars to exploit CDLI’s digital content in paper as well as electronic publications in the future.
The second phase of the project, currently underway, was conceived as a continuation project dedicated to the digital capture of cuneiform collections worldwide, while leveraging these digital collaborations and the research assets that they create to seek funding for greater IT functionality of our web resources. Thus, the project proposed to create and post on the web new digital content including 15,000 texts from the British Museum, imaged at four 600ppi images per text; 8,000 from the University of Pennsylvania Museum; 2,000 from Middle Eastern collections; 2,000 from other European collections; and 4,000 from American collections; as well as 5,000 line art files of published texts. Transliteration of 10,000 texts entered in the now widely accepted CDLI ATF (“ASCII transliteration format”; 100,000 lines at 10 lines per text) would add philologically critical new material to the two million lines of text transliteration of the continuing project.
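As a rough check on the phase-two targets listed above, the following minimal Python sketch tallies the stated figures. It is an illustration only, not part of any CDLI tooling: the labels and the function name are hypothetical, the "four images per text" figure is assumed to apply to the British Museum texts alone (as the sentence is worded), and "two million lines" is treated as a rounded total.

```python
# Phase-two targets as stated in the project description. The dictionary
# keys are illustrative labels only, not CDLI collection identifiers.
PLANNED_TEXTS = {
    "British Museum": 15_000,
    "University of Pennsylvania Museum": 8_000,
    "Middle Eastern collections": 2_000,
    "other European collections": 2_000,
    "American collections": 4_000,
}

IMAGES_PER_BM_TEXT = 4        # assumption: applies to British Museum texts only
TRANSLITERATED_TEXTS = 10_000
LINES_PER_TEXT = 10           # average stated in the description
EXISTING_LINES = 2_000_000    # "two million lines", a rounded figure


def summarize_targets() -> dict:
    """Tally the planned texts, images and transliteration lines."""
    total_texts = sum(PLANNED_TEXTS.values())
    bm_images = PLANNED_TEXTS["British Museum"] * IMAGES_PER_BM_TEXT
    new_lines = TRANSLITERATED_TEXTS * LINES_PER_TEXT
    return {
        "total_texts": total_texts,
        "british_museum_images": bm_images,
        "new_transliteration_lines": new_lines,
        "corpus_growth_percent": 100 * new_lines / EXISTING_LINES,
    }


if __name__ == "__main__":
    for name, value in summarize_targets().items():
        print(f"{name}: {value}")
```

Under these assumptions the targets amount to 31,000 newly imaged texts, 60,000 British Museum images, and 100,000 new transliteration lines, or roughly 5 percent growth over the existing two million lines.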
Among the highlights of this phase of the project, the following achievements may be counted: creation and transfer of a total of ca. 1.5 TB of new data representing 22,000 archival fatcrosses, prominently of the collections of the British Museum, the Penn Museum, the Oriental Institute, Chicago and Harvard University; 11,300 archival line art files; and in total ca. 132,000 raw image files stored for purposes of eventual reconstruction of defective “fatcrosses,” RTI files and line arts; the successful digitization of a good majority of collections in Jerusalem; the cooperative imaging, processing and posting of the full collection of Prague’s Charles University, including easily the largest online presentation of Assyrian trading archives unearthed in Anatolia; full capture and dissemination of the Turin and Geneva collections; and, now underway, the capture of the Lewis Collection of the Free Library of Philadelphia, of which 1,500 texts are edited in some form, while 1,500 appear to be fully unknown.
The project further revamped the CDLI web pages that offer persistent access to all files created during phase one and phase two of the project; replaced the Zope/Python core database with dynamic MySQL; and transferred top-level website control to Drupal, an open source Content Management System also based on SQL. These changes create a more stable platform for our data presentations, and one that is aligned with the computing services support offered by UCLA’s Center for Digital Humanities, and by most Humanities divisions in the US generally. The project has begun to tag CDLI assets online for a greater overview of and easier access to such higher-level textual data as literary scores, monumental inscriptions, and administrative sealing practice; and on May 1, 2013, the project launched a new learning application, “cdli tablet” for the iPad, combining short text with images of cuneiform inscriptions and related archaeological artifacts.
Cuneiform studies, while a relatively young discipline within the range of Humanities, have established a solid record of the traditional use of research and academic publications. Members of the relatively small community of cuneiformists have, at the same time, exploited information technologies with a drive that is not common among Humanities scholars. Indeed, the eminent historian of science, Jürgen Renn, Director of the MPIWG, has remarked that “in the context of the new possibilities opened up by digital scholarship in the Humanities, Assyriology—by having early embraced a technologically innovative open-access based strategy—constitutes a model community in which new ideas for the next generation of cyberscholarship across disciplines may develop.” The field has in place a number of web-based cuneiform studies researchers who have created and made freely accessible a substantial amount of Humanities content. With the support of the Mellon Foundation, these and more Humanities researchers are joining the emerging networked community of global partners, a community that has been lauded as the next revolution in social, political and economic development. A realistic goal of the next decade of cuneiform cyberscholarship will be to transfer much scholarly reference and communication to the web, and to make cultural heritage universally accessible through new digital technologies; the project can play a decisive role in this development.
We believe that with knowledge acquisition and distribution moving from print books to interactive web research, we are witnesses to a dramatic change in the role of books and libraries. However, this shift does not necessarily mean the end for libraries as centers of knowledge. The Google Books Library Project has stated clearly its goal to “make it easier for people to find relevant books—specifically, books they wouldn’t find any other way such as those that are out of print—while carefully respecting authors’ and publishers’ copyrights. [The project’s] ultimate goal is to work with publishers and libraries to create a comprehensive, searchable, virtual card catalog of all books in all languages that helps users discover new books and publishers discover new readers.” While this represents a modest assessment of learning in the future, meant to assuage the anxiety of copyright holders in the publishing world, surely no one will challenge the fact that research is transitioning to online resources. Through the development of ever-growing raw source data and online toolsets for their research use, through automatic uploads to partners at the Oracc consortium, through the facilitated exposure of our web files to Google and other aggregating harvesters, and most clearly through its support of shared and transparent file management and of the only three extensively hyperlinked online journals in the field, the project is playing a leading role in how cuneiformists view their intellectual input in the process of text annotation. In this, the clear physical boundaries represented by printed text editions are blurred, but the end product is enhanced. Web-based scholarship and the cyber-infrastructure it implies promise, more than any other economic or transborder educational initiatives, to add scholars and informal learners in these regions to the international dialogue that the Humanities envision.