Dear all,

FYI - I would like to share with you an email from Martin Mueller.

Best,

Martina

From: Martin Mueller <martinmueller@northwestern.edu>
Sent: Saturday, 18 July 2020 00:25
To: Scholger, Martina (martina.scholger@uni-graz.at) <martina.scholger@uni-graz.at>
Subject: TEI and meeting the needs of most users most of the time

Dear Martina,

I have been working with TEI Publisher 6.0 (or rather trying to wrap my aging brain around its complexities), and in doing so I have spent the past two days looking at TEI documentation. This has confirmed my long-held conviction that TEI documentation is very good for folks who know what they are doing but need to look up this or that. It does not do much for “newbies”. Luther’s “Kleiner Katechismus” (1529) has often been on my mind when thinking about TEI documentation. The TEI badly needs its own Small Catechism in which everything that you need most of the time is clearly described then and there without reference to a very large document, which becomes useful only after weeks of using it.

Within a few years of the first release of the TEI, Lou and Michael published TEI Lite with the goal of meeting 90% of the uses of 90% of the users, a classic and extremely successful implementation of the 80/20 rule. For quite a few years there was a very nice guide to it on an Oxford site. I think it was Sebastian’s work. It told you everything about the ~150 elements of that schema without distracting the users’ minds with stuff they did not (yet) need to know.

In the years since then the TEI has spent a lot more time refining this and that but has spent hardly any time or energy on “preaching beyond the choir”. There are many reasons for the absolute and relative decline of the TEI as an institution, but the lack of attention to ordinary users has probably had a good deal to do with it. One of my friends tells me that in medical school he got the advice “When you hear hooves clattering behind you, think horses not zebras”. The TEI has been all about zebras. An almost funny example is the documentation for TEI Simple, the project that Sebastian and I worked on and that suffered greatly from his illness and death, although its Processing Model was eagerly picked up by Wolfgang Meier and Joe Wicentowski. In the final report to the Mellon Foundation I wrote the following paragraph about it:

At an earlier stage the project was called TEI Nudge, and throughout it has taken inspiration from Richard Thaler's and Cass Sunstein's Nudge with its persuasive argument that in many walks of life people will make better choices if they are offered well-designed default solutions, as long as they are free to opt out of them. TEI Simple focuses on interoperability, machine generation, and low-cost integration. The TEI architecture facilitates customizations of many kinds; TEI Simple aims to produce a complete ‘out of the box’ customization that meets the needs of the many users for whom the task of creating a customization is daunting or seems irrelevant. TEI Simple in no way intends to constrain the expressive liberty of encoders who do not think that it is either possible or desirable to follow this path. It does, however, promise to make life easier for those who think there is some virtue in travelling that path as far as it will take you, which for quite a few projects will be far enough. Some users will never feel the need to move beyond it, others will outgrow it, and when they do they will have learned enough to do so.

On the Customization page of the TEI site, TEI simplePrint is described as “An entry-level customization, focused primarily on the needs of those encoding Western European early modern printed material”. Instead of stressing what you can do with it, the description dwells on what you cannot do. This is plain wrong in several ways. You can encode the Iliad in Greek, the Torah in ancient Hebrew, Anna Karenina in the Cyrillic alphabet, the Koran in Arabic (I believe), and a lot of other stuff. It is neither “Western” nor “early modern”. But the summary of TEI Lite is not much better, describing it as “basic elements for simple documents”. Its 150 elements are not particularly ‘basic’, and the documents encoded by it are anything but simple.

I wrote an early draft of the documentation for TEI Simple, using (with their permission) Lou’s and Michael’s excellent TEI Lite documentation, but did not see the final version, and indeed until this morning never read the following paragraphs, to which I would have strongly objected at the time:

Like every other TEI customization, TEI simplePrint was designed for use with a particular type of material. If the material you are planning to encode matches the following criteria, then TEI simplePrint is for you. If it does not, it may not be.

You are encoding print material, rather than manuscript: simplePrint provides no way of encoding manuscript features such as correction, deletion, or scribal variation.

You are encoding material from the Early Modern period (i.e., up to the end of the nineteenth century): some of the features for which simplePrint provides encodings are rarely found in modern materials.

You are encoding material written, broadly speaking, within the Western European tradition, using largely Western European characters. simplePrint does provide facilities for encoding short passages in non Western European languages, but many features needed to cope with Asian or ancient scripts are missing.

Your intention is to provide a relatively simple encoding for a large amount of material, rather than a rich encoding of a small amount of material: simplePrint is intended to help libraries and archives wishing to go beyond basic digital facsimiles, rather than to support specialist research. It does not, for example, include features for detailed linguistic tagging beyond simple word-level tagging, nor for specialised text types such as dictionaries, historical or biographical databases, etc.

These paragraphs are condescending in tone and inaccurate in substance. The term “Early Modern” is misused, the reference to “Western European characters” is grossly inaccurate, and the claim that texts encoded in this fashion do not “support specialist research” does not hold up to much examination.

My mother, a pediatrician, did not care much for Dieter Borsche, especially in one of his films where as a doctor he proclaims that “Der ist kein wahrer Arzt der noch kein Serum hat erfunden” (“He is no true doctor who has not yet invented a serum”). Medicine most of the time is about horses, not zebras, a useful lesson right now when the daily coronavirus deaths in Florida (20 million people) are as many as in the European Union (400 million people). But the TEI is also about horses most of the time, a dangerous thing to forget.

The language of the Simple documentation and of quite a few paragraphs in the Guidelines carries an undertone of indifference to encoding schemes that fall below some level of complexity, as if the interest of a TEI-encoded document were only a function of the complexity of its encoding. But this is exactly the wrong way of looking at it. Most encounters of end users with TEI-encoded documents involve documents with moderately complex encoding. I can certainly appreciate the challenges of virtuoso encoding and the pleasures of a difficulté vaincue, but the thriving of the TEI depends primarily on providing quite mundane services to a growing community of readers and users.

This is not a new topic for me: some years ago I warned that the TEI might become an Orchideenfach. https://sites.northwestern.edu/scalablereading/2016/09/20/whither-tei-the-next-thirty-years/ Chapter 23 of the Guidelines has the sentence “it is almost impossible to use the TEI schema without customizing it in some way.” That is not true: a large majority of TEI-encoded texts have been encoded in schemas that are identical with or slight variants of TEI Lite, TEI Guidelines for Libraries, Level 4, the EEBO DTD, the DTA schema, the CLARIN schema and perhaps a few others. “Standardize where you can, customize where you must” might be good advice. It would not be difficult to derive from these schemas a new version of TEI Lite. I would call it TEI Vanilla because it does for encoding what vanilla does for ice cream: it works most of the time in the sense of supporting moderately complex encodings of quite complex documents.

If the TEI wants to thrive it needs to have some version of TEI Vanilla with a built-in Processing Model and documentation that has the virtues of Luther’s Small Catechism: enough to give ordinary users a clear pathway to heaven. Feel free to share this letter with the Council or Board.

With best wishes

Martin Mueller

Professor emeritus of English and Classics

Northwestern University