Monday, 24 September 2012

Markup and Enrichment Reader Response

Discussion of the texts:
Two years ago the markup of texts was my first foray into Humanities Computing, and it is interesting, looking back, to reflect on just how little I understood about what I was doing. As this week's readings discuss, one of the benefits of TEI is its simplicity; given basic instruction, a person can mark up text without any knowledge of the principles behind the process. That being said, it's nice to finally have an explanation of why Oxygen rejected certain arrangements of tags.

My main takeaway from these articles is an extension of the discussion in the class on digitization: what is gained by text markup and what is lost? In Text Encoding, Alan Renear outlines numerous advantages of descriptive markup, my favourites being information retrieval and the support of analytical tools. He then launches into a discussion of OHCO, SGML, XML and TEI, and breezes by several potential drawbacks of markup. For instance, he states that in OHCO, the foundational idea behind the others, texts "...are not things like pages, columns, (typographical) lines, font shifts, vertical spacing, horizontal spacing, and so on". For modern books and papers this may be the case, but in older documents, particularly handwritten manuscripts, the layout of pages can be just as important as the text itself, depending on the interests of the researcher. When transcribing and encoding texts, I myself have struggled with the information lost by not indicating, for example, that a line is centered on the page.
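For what it's worth, here is a minimal sketch of how a layout detail like a centered line could be carried along in the markup. It assumes TEI's general-purpose rend attribute is used for the purpose; the sample text, the value "center" and the helper name layout_hints are my own inventions for illustration, and whether this counts as good descriptive markup is exactly the open question.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the heading records that it was centered on the page.
# The "rend" attribute name comes from TEI; the value "center" is an assumption.
sample = """<div>
  <head rend="center">A Title Line</head>
  <p>Body text that follows the title.</p>
</div>"""


def layout_hints(root):
    """Return (element name, rend value) for every element that records layout."""
    return [
        (el.tag, el.get("rend"))
        for el in root.iter()
        if el.get("rend") is not None
    ]


root = ET.fromstring(sample)
print(layout_hints(root))
```

The point of the sketch is only that the layout information survives as data a researcher can query later, rather than vanishing in transcription.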

In the third reading, Renear, Mylonas and Durand examine the evolution of OHCO ideals and the continued problems with them. The authors explain a founding principle of OHCO: that texts naturally conform to a set structure based on their type, made up of objects that nest inside each other and do not overlap. They then show how this was refuted through counterexamples, and how the theory softened as a result. The most intriguing idea presented in this article, in my opinion, is one that the authors introduce and then abandon in their discussion. The 'Theoretical' defense of OHCO they present states that a "layout feature" of a text can change without affecting the content, but the structure cannot. As discussed in the previous paragraph, I disagree with this assertion, since layout can be integral to a text. The underlying principle (as described by the authors), however, is in my mind the key to understanding a text and creating a good markup: differentiating between "essential and accidental properties".
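The "nest, don't overlap" rule is easy to see in practice, and I suspect it is the reason an XML editor rejects certain arrangements of tags. Below is a rough sketch using Python's standard XML parser. The element names (p, s, q) are borrowed from TEI, but the sample sentences and the helper name is_well_formed are made up for illustration, and this checks only well-formedness, not validity against a TEI schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, i.e. its elements nest cleanly."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# A quotation that sits entirely inside one sentence: a clean hierarchy.
nested = "<p><s>One sentence <q>with a quotation</q> inside it.</s></p>"

# A quotation that starts in one sentence and ends in the next: the <q>
# element would have to overlap two <s> elements, which XML does not allow.
overlapping = (
    "<p><s>A sentence <q>with a quotation.</s>"
    "<s>That runs on</q> into the next.</s></p>"
)

print(is_well_formed(nested))
print(is_well_formed(overlapping))
```

The second fragment is the kind of counterexample the authors describe: the text itself is perfectly ordinary, but its two structures (sentences and quotation) cannot both be expressed as a single hierarchy.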


My first reaction to A very gentle introduction to the TEI markup language was to wonder how far I should trust an author who, in his own words, hasn't "...quite learned how to write an XML document and display it with links in a frameset". This trepidation soon passed, however, first because I don't know how difficult it would be to do that, and more importantly because this document is a great example of why you don't want an expert to explain technical concepts. I can't speak for a reader who had no previous knowledge of XML or TEI, but as someone who has had some experience with it, I feel this explanation was simple enough to be easily followed and at the same time brief enough not to be frustrating or repetitive. The numerous examples, a feature often underused by 'experts', were a big part of its clarity. [I know these blogs are supposed to be prompting discussion about the texts, not just praising them, but I really have nothing else on this one]. It was a great way to launch into HuCo 520's instruction on XML tomorrow, and I hope it is a solid enough foundation for the upcoming HuCo 500 discussion.
