Tuesday, 13 November 2012

Serious Games



Discussion of Readings:
  • Woods, Loading the Dice: The Challenge of Serious Videogames
  • de Castell and Jenson, Serious Play

The Society, Economics, and Culture of World of Warcraft, Or, Can a Commercial Game be "Serious"?

World of Warcraft (WoW) is, according to the delimiting factors set by Woods, the very definition of a non-educational, non-serious game. Its genre, MMORPG, names it specifically as a role-playing game. It is extremely character-centric, with players forming an almost cult-like attachment to their avatars (see the numerous companies that now make t-shirts depicting a player's toon). It is set in an entirely fictitious world populated by unrealistic species and creatures designed to be killed with positive repercussions. And, worst of all, it is a commercial product, made to be repeatedly and continually playable, with a plethora of character and quest options, and an ever-increasing number of expansion packs. All of this seems the trademark of a game meant solely for entertainment, where any interaction between real-life players is included only to increase the game's interest and keep users coming back. An examination of how the game is actually played, however, of the social interactions that actually occur and of the world that has been built, raises the question: are commercial MMORPGs actually non-serious, or are researchers simply asking the wrong questions when analyzing the potential of games for education?

[Note that this discussion will extend only as far as the Cataclysm expansion, as I have no knowledge of later additions. Also, bear with me through the explanations and descriptions: I am going somewhere with this.]

Each character or avatar in WoW belongs to a series of social groups composed of more and more limited segments of the population of the game.

  • Factions: there are two factions, Alliance and Horde, which are, have been, and likely always will be at war. Each character is born into one or the other of these factions, and the dynamics of their conflicts and interactions form the framework for the game. There are other 'factions' outside these two, composed only of non-player characters (NPCs), but from a character's perspective the world consists of their own faction (Alliance or Horde) and a world of enemy 'others'.
  • Races: each faction has six races, which the game designers have provided with rich backstories of how they came to be members of their respective faction, and which differ from each other heavily in appearance and moderately in abilities.
  • Class: each character is a member of a class (e.g. warrior, mage, rogue) that defines their skill set, and which spans races and factions, occasionally superseding them (for example, Tauren and Night Elf Druid NPCs are peaceful toward one another even though they belong to different factions).
  • Occupations: players can also choose for their avatars to belong to certain trades (e.g. miner, tailor, alchemist), but are limited to two in total.
  • Guild: a group composed of other players' avatars, whose members support each other on quests, share resources, and can undertake challenging missions to earn renown for their group.
  • Player Alts: a collection of all a player's avatars forms, by default, its own social group in which money, resources and skills are shared equitably as they all benefit the player behind the characters.


I could stop right here and point out the potential for teaching about Medieval society, let's say the Crusades specifically, where there were two warring religions (factions) each composed of multiple countries (races). The people involved included knights, priests and merchants (classes), many of whom banded together into, for example, the Templars, Hospitallers and Teutonic Knights (guilds), all of it resting on the intense bonds of the Medieval family unit (player alts).

Instead, however, I want to look more closely at how WoW reflects, and can be used to study, modern society, a topic more akin to Woods' point. For this I will focus on the things in the game of which a player is in control ("the ability to alter the rule structure of a game while that game is in progress" as Woods puts it), and the built-in social dynamic (avatar interactions, chat functions, guilds, etc.). The ultimate question here is: if they were prompted to do so, what might a player realize about the way society functions, and the way they are part of that functioning, based on how they behave in WoW? I've neither the time nor the inclination to come up with an exhaustive list of examples, but the following are a few interesting considerations.

How do players in WoW handle what they consider to be unacceptable social behavior?

There is a built-in complaints and punishment system for severe offenses in WoW, handled by the game designers and maintainers. This has, in fact, echoes of the real-world justice system: a player is 'found guilty' only after enough complaints are filed to form a consensus, or after the 'law enforcing' game designers find evidence to support the claim, and punishment takes the form of a 'jail term' of account suspension or an 'execution' of account termination. Interesting and educational as this is, however, I am speaking here not of what amounts to 'law breaking', but of the less overseen social training we all undergo as part of family, school, and work life that teaches us how to fit into our own cultures.

The way this training has developed in WoW is fascinating primarily because it is not a designed part of the game-play. There are no programmed repercussions for certain behavior; it is shown to be 'wrong' only by the way other players respond to it. There are too many possible examples of 'bad behavior' to even attempt a list, and ultimately this is less important than what the responses to the actions are. This section will thus outline a few methods developed and employed by players in the game to curb poor social behavior, and will demonstrate parallels to real-world equivalents.

Social Ostracism

A natural and effective response in WoW to a player being rude, selfish, or counterproductive is their expulsion from a group or guild. Ultimately the game does not require these sorts of social attachments to be played, but there are numerous rewards to being part of such teams, from conversation to easier questing to financial benefits. The loss of these social connections is thus interpreted, much as in real life, as an extremely negative event, and the threat or reality of it can cause a player to rapidly change their behavior. Ostracism has long been a foundational modifier of behavior in real-world societies; it is the basis of excommunication, for example, and the purpose of Medieval stocks. In modern settings cliques and social groups still use the technique rampantly, most noticeably among teens. The carry-over of ostracism to a game world helps demonstrate just how ingrained in our society it is.

Derogatory Terms

I'm not sure it's pointed out clearly enough in attempts to staunch the use of terms like "queer" or "retard" that they stem from a perhaps unconscious belief that the character traits with which they are associated are wrong. This is not the place for such a discussion, however; the point is that an entire array of derogatory terms has developed in WoW and other MMORPGs to describe and discourage unwanted behavior. Labels like:
  • Noob or Newb - an inexperienced player who hasn't learned obvious and easy parts of game play
  • Twink - a tricked out low-level 'vanity' character whose activities are supported by the money of a much higher levelled alt (notably this can be a negative or positive term depending on whether a group supports or discourages such behavior)
  • Ninja - a player who steals loot or materials from a guild bank without regard for other members of their group/guild
These terms are not only used to insult a player, but can be used in conjunction with social ostracism (particularly in the case of a guild bank ninja) to discourage other groups from interacting with the offending player.

An interesting factor in this is the wider cultural context, which de Castell and Jenson noted was important. These terms developed as part of game play but have since jumped into real-world use. Noob has become especially common to describe a person who is new and unskilled at any activity. Derogatory terms are thus a social tool that not only translated from the real world into the game, but also came back out again.

What is the economic system in WoW and why did it become so?

To spare any suspense: the economics of WoW are, on the widest level, capitalistic. But this is hardly a shocking observation; the main system of trading in the game is an auction house, a method designed to function through outbidding and underselling, and this was bound to result in a capitalistic system. The interesting development here is two-fold: just how extreme the economic behavior around the auction house has become; and the alternative methods for buying and selling goods that players have developed to, dare I say, circumvent the game design.

The game designers built a suggested value for every item into the game, both through the prices of NPC merchants and through the initial price that appears when an object is placed in the auction house 'sell' window. It was, I'm sure, predicted that economics would push the prices higher, but the degree to which this has occurred is fairly shocking. Rare items might be understandable, but trade goods, items a player could get for free simply by killing creatures while questing, are sold at truly mind-boggling prices. Wool, for example, a second-tier tailoring material, frequently sells for well above silk or even mageweave (the third- and fourth-tier materials respectively). There are players who spend entire sessions just 'farming' such materials to sell at the auction house in order to increase their wealth. This activity raises fascinating questions about the concept of wealth in real society, and how the values associated with it are transferred to the game world (for, as Woods quotes from Greenblat, "any social actor has a history, and, hence, definitions of a situation are partly biographically determined, affected by the individual's unique stock of previous experiences and recollections"), as well as about the lazy nature of consumerism.

There are alternatives to the auction house in WoW which unmask other types of economic exchange in the real world. Announcements of items to trade in city chat windows not only help avoid auction house prices but also introduce a barter system of goods for goods. Guild banks function like a charitable system within a societal sect, where 'wealthy' players earn social clout by donating items which can be taken at will by low level characters in need. There is also, as noted above, the communism-like equitable sharing among a player's numerous alts, where a grander 'common good' is easy to recognize.

How and why does a player choose to create the avatar they do?

To answer this question, it must be understood that a user who plays WoW socially, meaning that they regularly or exclusively quest, enter battlegrounds, or run through dungeons as part of a group, will have an avatar designed to fill one of three roles:
  • Tank - meant to attract enemies and take the brunt of the assault, surviving through high defensive capabilities
  • DPS (damage per second) - meant to attack the enemies and kill them as quickly as possible
  • Healer - meant to maintain the health of the other players, particularly the tank
The different classes, skill sets and equipment a player can choose for their character define which of these roles they will play.

When a player is designing their avatar (again, assuming the intent is to play socially) they thus make minor choices about appearance, gender, etc., but the most important factor is 'what role do I want to take in groups?'.

Implicit in this is an entire series of questions with parallels to real life. First there's the basic decision that 'I want to be part of a social structure' rather than playing alone. Next there is a balanced consideration of 'what do I want to do?', 'what will I be good at?', and 'what will make me a desirable member of a group?' (tanks, for instance, are much rarer than DPS players). An obvious comparison is choosing a career, but joining sports teams and social clubs can be considered similar as well.



I think this is quite enough background and explanation for me to come to my point: in both this week's articles the authors draw a very clear line between 'commercial' games and 'educational', and therefore serious, games. At least in the realm of the social games that Woods discusses, I dispute that any such line exists. A game doesn't need to purposefully model modern society in order to demonstrate how it functions: as soon as you let players loose, their interactions, naturally founded on the way they behave in real life, provide the parallels to society. Players only need to be given reason to think about what they are doing and why. I would contend that many of the things the authors pointed out about commercial games, i.e. that they're better at drawing players in and keeping them coming back, or that they're meant to be played repeatedly and continually (allowing an in-game society to develop), make them a better learning tool than an 'educational game' where the player must be lied to (and the designer has to hope the lie isn't discovered) for the game to work, and which is only meant to be played once, so that the player either learns the lesson or must be told what it was afterwards (making it no better than a lecture). This was admittedly a very long road to take for such a disputation, but I think the examples I provided, only a small smattering of the potential questions that could be asked of a game like World of Warcraft (I didn't even get into the portrayal of races and genders, for example), stand as evidence that if researchers are attempting to find ways to create educational games, they would perhaps be better off asking how they can make commercial games educational.

Friday, 26 October 2012

Electronic Publishing



Discussion of texts:

Kay and Goldberg, Personal Dynamic Media
John Willinsky, Toward the Design of an Open Monograph Press
Peter Suber, A Primer on Open Access to Science and Scholarship


The more I read for this class the more amazed I am by, as is pointed out in the introduction to the Kay and Goldberg reprint, the uncanny ability of some people to predict the future of technologies long before they come to fruition. I must point out, however, that there remains a fine line between a visionary and a crackpot, and for every person with a fantastical idea that actually comes true there are 99 whose ideas remain crazy (Xanadu comes to mind specifically).

If we treat Kay and Goldberg as visionaries, though, there are several key aspects of their ideas and experiences as described in the article that perhaps help explain why they didn't end up in the loon camp. Their research letting children program computers was particularly intriguing for its results: youth walk into a project with no preconceptions of what is and isn't possible, and are far more inclined to original innovation as a result. The article states directly that the kids they worked with looked at programming for what it "ought" to do, not what it had previously done and therefore might do. I think part of what the authors brought to notebook computing was a similar perspective (whether as a result of this research or independent of it, I'm not sure). Specifically, in the line that reading on a computer "...need not be treated as a simulated paper book since this is a new medium with new properties" they show a tendency toward true innovation; they aren't just slicing bread and calling it a new creation, they're questioning if and why 'sliced' was ever the best way for bread to be in the first place. Vitally, they also add the catch that the resulting product must "...not be worse than paper in any important way", something that I'm sure many failed designs did not manage, particularly those that seek to imitate the real world but collapse it into (effectively) two dimensions.

Their genius only went so far, however, as other functions they describe are less innovative. In the painting function, for example, "a brush can be grabbed with the “mouse,” dipped into a paint pot…" and then used to draw. This fundamentally goes against the standard they set for the reading application: it replicates arguably the biggest drawback of real-world painting while adding no improvement to compensate for what is lost (e.g. fine control of motion).
Kay and Goldberg may have been visionaries, and may have been proven right by time, but they were also ultimately just a single domino in the line that has taken computing technology this far.

This topic overlaps conveniently with some very interesting discussions I've been having in my LIS course about online publication, and I can't help but incorporate some of the things pointed out in those conversations. So consider this my disclaimer that not everything that follows is my own original idea; some of it extends things that have come up elsewhere.

I am suspicious of Willinsky's use of statistics to demonstrate the switch from monographs to journals. Everything he states is in percentages and ratios, both of which can easily misrepresent the raw numbers. Even if the figures he gives aren't misleading, however, I think the switch may be more representative of the consistent rise of the Sciences in universities (at least comparatively, if not directly to the detriment of the Humanities). Scientific research involves frequent publication at a high turnover, which (as pointed out in Suber's article) is better suited to journal publication. If there are an increasing number of students in scientific disciplines who need access to these journals, then naturally more of them will be purchased, and in increasing proportions over time. Monographs, however, remain important for Humanities scholars, and Willinsky admits that sales in many Humanities disciplines remain strong; but if the number of students enrolling in English or History, for example, remains constant, then it is only logical that the number of books purchased by libraries to support them would also remain constant.

Furthermore, Willinsky's solution to a perhaps misunderstood problem is itself flawed in some respects. As he states, the benefit of the monograph format is that it can "...work out an argument in full...provide a complete account of consequences and implications, as well as counter-arguments and criticisms". The Humanities and monographs have a symbiotic relationship for this very reason: Humanities scholars go back, sometimes hundreds of years, to longstanding sources to build comprehensive arguments that can only be expressed in monograph form; as a result of this thoroughness, the monographs they produce may still be of use to future scholars hundreds of years down the road who are building similarly thorough arguments. Willinsky's attempt to move monographs online, where their lifespan will in all likelihood be much shorter, all but defeats the purpose of writing lengthy works to begin with. It makes the electronic monograph in many ways as transient as a journal.

The push to move online as outlined in both Willinsky's and Suber's articles poses several intrinsic problems, only some of which the authors discuss. The issue of peer review, for example, is explored fairly thoroughly by Suber, and his arguments hold well enough for a reputable open access journal or repository. The nature of these projects, however, is that anyone with enough effort can create an online journal, and there is no enforcing body beyond reputation to ensure a thorough peer review process; but reputation takes time to accrue, and at its best can be fairly subjective. This means that researchers run the risk of either accessing an open repository with no verification that its contents have been properly vetted, or accessing only those that are acknowledged as reliable, potentially creating an oligarchy of open access journals. This takes us to Suber's point about pay journals being more likely to be corrupt than open ones. If you have only a few trusted open access journals, whose output is limited because of their screening process, what is to stop them from beginning to charge more for each article considered in order to ensure their overhead costs are covered even during times of lean submissions and publications? And why would they suddenly reduce those costs again when the number of incoming papers increases? Greed is an unfortunate portion of human nature, and I wonder if any 'open access journals' can or will maintain the values on which they were founded.

My final point is about the current importance of citation analysis in universities for hiring and maintaining professorships. The topic is mentioned in passing by Willinsky, but I think it is a far larger consideration for the future of electronic publishing than either author suggests. As much as some evaluating bodies are moving towards an analytics approach rather than a 'times cited' model, at this point tracking source use through unindexed sites is far more difficult than in print. More importantly though (as per a LIS discussion), some institutions are apparently now setting standards as to exactly which publications count in terms of performance reviews. In this context open access repositories may let scholars 'take back' the forum of journals, but that power is for naught if what they're publishing doesn't help promote their careers. I fear online publishing efforts may well hit a wall soon if this becomes a norm, and some collective agreement, however unspoken, will need to be reached about the worth of open access journals before they can gain real standing.

Sunday, 21 October 2012

Multimedia

Discussion of text:
Manovich, The Language of New Media (first two chapters, pages 43 to 114)

My first comment on this text has to be about the distracting number of grammatical errors. I'm not even sure 'grammatical' is the way to describe them; they are more like typos where the incorrect letter still resulted in a real word and the spellchecker didn't catch it. Apparently the author wasn't into proofreading. I bring them up first because of quantity (there are at least four on pages 50-51 alone, and they only become more frequent thereafter) and severity: some of them were bad enough that I had to reread the sentence a few times to figure out what he actually meant to say. I hope the international students in the class aren't bogged down by them. The author's name, Manovich, suggests that his first language may not be English, and therefore a certain level of mistakes is understandable, but some of the errors are grievous enough that they cannot be pinned on a language barrier alone (Van Gog, for example, on page 53). [An addition as per the discussion in class: I maintain that even as a first draft there are an unbelievable number of mistakes. Particularly those affecting proper names are mind-boggling.]


As for the content, honestly I walked away from the text with a very limited understanding of what the author's point was. At the beginning of each section I could follow what topic he was going to be discussing, but by the time I had slogged through a series of cultural references (most of them, in my opinion, used to keep the book 'hip' and not because they helped explain the idea) I had completely lost the thread of what he meant. For example, I understand that he disagrees with the five points that supposedly define 'new media', but I have no idea what he thinks actually defines it. But this is starting to sound like a book review, and that is not the point of this blog.


I fundamentally disagree with the author's assertion (on page 87) that the printed word is being surmounted by cinema. In the context of what he's talking about, I am pro multimedia: being exposed to the same issue through different formats can only increase understanding. But neither in computers nor in the rest of the world is text losing importance. If you want proof, just look at the slew of movies being produced that are based on books; if printed works weren't being produced, cinema would be lost. Even in terms of webpages, while the YouTubes and Flickrs of the world have become incredibly popular, I would need to see some VERY convincing statistics to believe that sites where text is a major component (if not the main component) are not multiplying faster.


I did, however, enjoy his discussion of modern computer screen capabilities and limitations, for example that, unlike previous media, modern screens can present realtime changes but are still limited to a single perspective and limited actions. I remain skeptical that virtual reality monitors will negate the latter problem; ultimately the user is still tied to what was programmed into the system, and I maintain that no matter how diligent the programmer, a user will inevitably find a way to move which was unanticipated and which the system won't be able to handle. As amazing as a holodeck would be, I don't think anyone alive today is going to live to see one.


One of the most interesting ideas the author brings up, for me, though one that strays somewhat from multimedia, is the punchcard-controlled Jacquard loom. Manovich discusses it as an early example of computers being used to produce images, but I find it fascinating in the context of computerized production. In a world of mass production and automated assembly lines, computer use in industry is now a commonplace occurrence, but I had no idea it went back to 1800. Maybe earlier; I'm not really well informed on history past about 1700. Bringing it back to multimedia, though, I'm thinking less of making cars, for example, or even textiles (though this is the obvious continuation of Jacquard's invention), and more of the creation of precise three-dimensional didactics by first modeling them in a computer and then having said computer build a physical representation (usually out of plastic) layer by layer. I have seen such projects (out of the U of A, actually) done to recreate archaeological sites, showing what the buildings would have looked like thousands of years ago. This is more an example of computers being used to create multimedia than of multimedia existing in computers, but if Manovich gets to go off on a long discussion of performance art then why not.

Sunday, 14 October 2012

Hypertexts

Discussion of Readings:
  • Conklin, Hypertext
  • Bush, As We May Think


I'll begin with Conklin's article, since it describes the general concept of hypertexts, while Bush's "As We May Think" is an example of one. The most difficult part of reading "Hypertext" for me was that, having been published in 1987, its ideas and examples are at least 25 years old at this point. Computer displays and input devices have changed so much in that time that I have trouble visualizing what he means, or, in cases where screenshots are provided, understanding how the programs would have been used. I will start, therefore, by trying to provide modern examples of the "four broad applications" Conklin names, though this is as much to solidify the ideas in my own mind as it is to comment upon them.

The 'macro literary systems' are simple enough to compare (unless I've completely missed the mark) to a wiki: a database that holds multiple text-based articles (files), with links providing shortcuts between files as well as within a file. Where the graphical representation of links that Conklin maintains must be present comes in, I don't know (although he goes on to list numerous examples where it isn't), but maybe there is such a feature that I have just never used.

The 'problem exploration tools' are somewhat more difficult for me to understand. Conklin describes them as "early prototypes of electronic spreadsheets", easy enough to picture as a user of Excel, and says that users can hide sections and subsections, which makes me think of iTunes and showing only the song information you care about as columns in a playlist. He goes on, however, first to discuss "teleconferencing" and moving between threads, then ranking other people's posts, which sounds like a message board: a series of posts and comments provided by multiple users which form a tree structure as people comment and then comment on comments. The last examples he gives, though, 'outline processors', are effectively interactive tables of contents, such as can be found in some Adobe pdf files, suggesting that the technology he's describing split into two streams (quite possibly before this article was written, though no one was aware of it yet).

The examples Conklin provides describing 'structured browsing systems' explain fairly clearly what they are; the help manuals attached to word processors apparently haven't changed much (except to add search functions, perhaps). The processes he outlines also sound very similar to how internet browsers work: the user follows a series of links through different pages, and the browser keeps track of the path followed to allow backtracking.
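That kind of backtracking boils down to a very small data structure: a stack of visited pages. A minimal sketch in Python (my own illustration of the general idea, not drawn from any system Conklin describes; the page names are made up):

```python
class History:
    """Record of pages visited by following links, so the reader
    can retrace their path one step at a time."""

    def __init__(self, start):
        self._stack = [start]  # pages visited, oldest first

    def follow(self, link):
        """Follow a link to a new page, recording the step."""
        self._stack.append(link)

    def back(self):
        """Return to the previous page (stay put if at the start)."""
        if len(self._stack) > 1:
            self._stack.pop()
        return self._stack[-1]

    def current(self):
        return self._stack[-1]


h = History("index")
h.follow("chapter1")
h.follow("footnote3")
print(h.back())  # retraces one step, back to "chapter1"
```

Real browsers complicate this considerably (forward stacks, branching histories), but the core of the feature Conklin outlines is just this push-and-pop record.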

In the 'general hypertext technology' section, the first few examples seem to have aspects of a course management system, or of bibliographic tools such as Zotero, in that they offer a place to electronically collect webpages, pdfs, files, etc. that are relevant to a topic, and to organize them into a tree structure using headings and subheadings. These do not, however, have what I would call links between the files, rather links to the files from a main page. Furthermore, the later examples vary from this significantly. As Conklin says, this section represents tools meant to explore hyperlink capabilities, not to perform any specific task, so trying to find a general modern counterpart is a futile effort. It remains interesting, though, that some modern applications did apparently evolve from these tools.

These modern counterparts having been established, it is incredible to think, if Conklin's assertion is correct, that computer scientists were wary for decades of implementing hypertext technology, as it is now one of the most widely used forms of text analysis. Whether this change started out of increased computer use (as a result of reduced cost) or from a few programmers having faith enough in the technology to undertake projects showing what it could do, it certainly continued because of the capabilities and advantages Conklin lists. The ability to navigate through multiple parts of multiple texts with a few clicks or keystrokes was a genuine step beyond printed reading, one I'm not sure has been matched by any other developments.

The discussion of hierarchical vs. non-hierarchical structures takes me back to HTML and XML theory, where all boxes must fit inside other boxes for the system to function, and to the limitations this leads to for overlapping features (e.g. paragraphs spanning multiple pages). In many ways this is the same, of course; non-hierarchical arrangements are effectively just a series of empty tags, but the parallel concerns are intriguing.
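The boxes-inside-boxes constraint is easy to demonstrate: any strictly hierarchical parser rejects markup in which elements overlap instead of nesting. A quick sketch using Python's standard XML parser (my own illustration of the general point, with made-up sample markup):

```python
import xml.etree.ElementTree as ET

# Well-nested markup: every element closes inside its parent.
nested = "<doc><p><em>overlap-free text</em></p></doc>"
ET.fromstring(nested)  # parses without complaint

# Overlapping markup: <em> tries to close after its parent <p>.
# A hierarchical model cannot represent this, so parsing fails.
overlapping = "<doc><p><em>overlapping text</p></em></doc>"
try:
    ET.fromstring(overlapping)
    rejected = False
except ET.ParseError:
    rejected = True
print(rejected)  # True: the overlap is rejected
```

This is exactly why overlapping features like a paragraph spanning a page break have to be encoded with empty 'milestone' tags rather than as ordinary nested elements.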

On to Bush's article:

I once went on a trip to England with a girl who would take in the ballpark of 1000 pictures a day of everything from architecture, to people, to her meals. The only real reason for most of the photos was that there was no reason not to take them: the memory card in her camera could easily accommodate them, each snapshot took only a moment's time, and at the end of the day she could instantly erase any she no longer wanted with no added expense incurred. Bush's example of advances in photography allowing the scientific process to be easily recorded is apt, as is his assertion that the problem then becomes slogging through such a plethora of data.

With computers and the internet we face a similar problem. SLIS courses extol to their students that online access to resources will not replace librarians, precisely because someone needs to organize and catalogue the information. Bush wrote this article with no idea what was to come, but in his own terms still hit upon "a future device for individual use, which is a sort of mechanized private file and library". His prophetic descriptions of hypertexts are almost eerie, but they make me wonder whether there is a subconscious collective concept of how information should be displayed and interacted with, or whether there is rather an unbroken line of scholarship extending back this far: if Bush had conceived of his memex differently, would computers and hypertexts look as they do today?

Finally, I also need to point out a few fantastic lines. If Bush's other works are similar to this one in writing style, a book of quotations could be made from them.
"truly significant attainments become lost in the mass of the inconsequential"
"for at that time and long after, complexity and unreliability were synonymous"
"One might as well attempt to grasp the game of poker entirely by the use of the mathematics of probability"


Sunday, 30 September 2012

Text Analysis

Description of the texts:
  • What is Text Analysis - by Geoffrey Rockwell and Ian Lancashire
  • Computer-Assisted Reading: Reconceiving Text Analysis - by Stéfan Sinclair
  • The Measured Words - by Geoffrey Rockwell and Stéfan Sinclair
I'm not sure whether to acknowledge or ignore that the person marking this blog is also the co-author of the majority of this week's readings. It will undoubtedly affect my comments on some level. Let's start with Stéfan Sinclair's work and delay the issue temporarily.

I feel that in Computer-Assisted Reading: Reconceiving Text Analysis Sinclair presents several valid and noteworthy points. However, I also believe that his underlying principle is incorrect: there are not, in my mind, any fundamental "...incongruities between humanistic and scientific traditions". Theoretical physics, for example, functions very much like the traditional humanities subjects: the scholar observes a peculiarity and speculates as to why it is so, using previous research to support the theory. For a philosophy scholar that previous research might be George Grote; for a physicist it could be Niels Bohr; regardless, the 'scholarly tradition' being followed is the same. An experimental physicist might pick up his theoretical cohort's work and design an experiment (perhaps one based in computers) to test it, but so might a humanities computing scholar apply text analysis techniques to a colleague's theory. My point is that Sinclair (and much of academia) treats the realm of subjects as divisible among sciences, social sciences, and humanities. I think in reality you could better understand research and researchers not by arranging them by discipline, but on a continuum running (for the purposes of Sinclair's argument) from computer-based analysis to purely theoretical research,
on which each scholar could be plotted according to their preferred research techniques, with no regard for their subject matter. If it should happen that there are more science-based researchers on the computer side of this spectrum, that is likely a result of more software existing to support their regularly quantity-based research, and it is in this that I wholeheartedly agree with Sinclair's conclusions: we do need "tools that accentuate reading rather than counting" if humanities computing is to progress.

That being said, I am troubled by a possible implication in Sinclair's article that while in "...traditional text-analysis tools, data completely displace the text...", in his proposed tools this would not be the case, allowing a researcher to read and analyze at the same time. I am not certain whether he is suggesting that, since this is the case, a scholar should proceed directly to a text-analysis tool without first having read the text sans markup. I sincerely hope this is not the case, as text analysis tools can't help, in my opinion, but bias the reader towards particular features. The word frequency feature of HyperPo, for example, which changes the colour of the text depending on how often a word appears, would naturally cause a reader to place more importance on the more common terms that appear in darker colours. A word that appears only once, however, might be of the utmost importance in its usage, yet be overlooked because the programming implies it is unimportant. I have trouble believing that Sinclair meant such text-analysis tools could replace unadorned text, so it is perhaps the fact that this is not clearly stated that bothers me, particularly since he felt it necessary to point out clearly that "whimsical and playful musings" were not a replacement for literary criticism.
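To illustrate the bias I mean: the shading in a tool like HyperPo reduces, at bottom, to mapping each word's count onto a darkness scale. This is not HyperPo's actual code, just a minimal sketch of the kind of frequency-to-shade mapping described, with a function name of my own invention:

```python
import re
from collections import Counter

def frequency_shades(text):
    """Map each word to a shade from near 0.0 (light, rare) up to
    1.0 (dark, most common), mimicking a word-frequency display."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    top = max(counts.values())
    return {word: count / top for word, count in counts.items()}

shades = frequency_shades("the cat sat on the mat and the dog sat too")
# 'the' occurs most often, so it gets the darkest shade (1.0), while
# a word used once, like 'dog', is rendered faintly no matter how
# important its single use might be.
```

The last comment is exactly the worry: the display encodes a judgement (frequent = important) before the reader has made one.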

I'll start my discussion of The Measured Words with a list of typos and grammatical mistakes:
  • pg 5 - "Strings are sequences of character that either have a beginning and end."
    • I'm guessing it was supposed to be either 'beginning or end' or 'beginning or end or both'. There should have been an 'or' somewhere to complete the 'either', anyway.
  • pg 7 - "...non-printing characters like "line feed" and "return" than could be used..."
    • 'that could be used'
  • pg 7 - "<emph>...</emp>"
    • I don't know offhand what the correct tag is, but I'm pretty sure the start and end tags should match
  • pg 11 - "Most scholarly projects use open formats based on XML and following guidelines..."
    • 'and follow'
 I don't know if this article has been published yet or not, but in case it hasn't, enjoy these corrections.

I'd also like to compliment a few of the examples of difficulties faced when treating words as strings of characters: specifically, the crossword puzzle reference (pg 6) and contractions ruining 'end word upon punctuation' commands (pg 20). I doubt I would have thought of either of these scenarios. It makes me wonder whether the authors thought them up ahead of time, stumbled across them while writing, or racked their brains afterwards trying to think of exceptions. Similarly, I enjoyed the comparison of foreign language texts to how computers read (pg 8). I have definitely ended up just looking for patterns in letters when trying to read Latin and Greek.
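The contraction problem is easy to reproduce. Here is a small sketch (my own illustration, not code from the article) contrasting a naive 'end word upon punctuation' split with one that tolerates an internal apostrophe:

```python
import re

def naive_words(text):
    """Split on anything that is not a letter -- the 'end word upon
    punctuation' approach, which breaks contractions apart."""
    return [w for w in re.split(r"[^A-Za-z]+", text) if w]

def smarter_words(text):
    """Allow a single internal apostrophe so contractions survive
    as one word."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)

naive_words("Don't stop.")    # ['Don', 't', 'stop']
smarter_words("Don't stop.")  # ["Don't", 'stop']
```

Even the 'smarter' version fails on possessives like "James'" or quoted speech, which is the article's larger point: every tokenizing rule encodes a decision about what counts as a word.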

As for the content of the article, I find myself wondering whether research tools can ever truly be useful to a person who merely goes to a website and accesses them. The person who designs the program has the benefit of the "thinking through of formalizing and modeling" process, whereas a user sees only the end product: a text which has been broken and rearranged in the ways the encoder deemed informative. The quality of this is ultimately dependent on both the skill of the encoder and the effort they're willing to put into the work. Depending on the programmer's goal this also becomes a question of productivity: if they can encode 3 texts adequately in the time it takes to encode 1 thoroughly, which is the better option?

The question posed (albeit rhetorically) of "when is a variant way of writing a letter actually a new letter?" was particularly intriguing because of an issue I am facing in marking up a text in XML for another class. The typeface of the text includes odd letter forms such as an elongated 's' and 'VV' for W. If the end goal of this project were text analysis, then these features would likely have to be ignored to ensure the computer could process each letter for its intended meaning, even though this means information regarding the printer would be lost. For the 's' specifically, an alternative Unicode character would have to be substituted (likely &#643;), which would replicate the design of the letter but carries an entirely different purpose. I am therefore left with, as the article says, a choice as to "what is important, and what is not essential to the representation".
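A normalization pass makes the trade-off tangible. This sketch (my own, with an invented function name) assumes the transcription uses the dedicated long-s codepoint U+017F (ſ) rather than the visually similar esh mentioned above; collapsing it to a modern 's' lets analysis tools count the intended letter, at the price of erasing the printer's typography:

```python
def normalize_early_print(text):
    """Collapse early-print letter forms to modern equivalents so
    analysis tools see the intended letters. The typographic
    information (long s, VV ligature) is lost in the process."""
    return (text.replace("\u017f", "s")   # long s -> s
                .replace("VV", "W")
                .replace("vv", "w"))

normalize_early_print("VVhoſoeuer")  # 'Whosoeuer'
```

Note that the u/v spelling ('Whosoeuer') is deliberately left alone here: deciding whether that too should be modernized is exactly the "important vs. not essential" choice the article describes.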

Monday, 24 September 2012

Markup and Enrichment Reader Response

Discussion of the texts:
Two years ago the markup of texts was my first foray into Humanities Computing, and it is interesting, looking back, to reflect on just how little I understood about what I was doing. As this week's readings discuss, one of the benefits of TEI is its simplicity: given basic instruction, a person can mark up a text without any knowledge of the principles behind the process. That being said, it's nice to finally have an explanation of why Oxygen rejected certain arrangements of tags.

My main takeaway from these articles is an extension of the discussion in the class on digitization: what is gained by text markup, and what is lost? In Text Encoding, Allen Renear outlines numerous advantages of descriptive markup, my favourites including information retrieval and the support of analytical tools. He then launches into a discussion of OHCO, SGML, XML and TEI, and breezes by several potential downfalls of markup. For instance, he states that in OHCO, the foundational idea behind the others, texts "...are not things like pages, columns, (typographical) lines, font shifts, vertical spacing, horizontal spacing, and so on". For modern books and papers this may be the case, but in older documents, particularly handwritten manuscripts, the layout of pages can be just as important as the text itself, depending on the interests of the researcher. I myself have struggled when transcribing and encoding texts with the information lost by not indicating, for example, that a line is centered on the page.
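For what it's worth, TEI does offer an escape hatch for exactly this case: the global rend attribute can record a presentational fact without promoting it to a structural element. A minimal fragment (my own example) for the centered line I mentioned might look like:

```xml
<!-- The rend attribute carries the layout fact (the line is
     centered) alongside, not instead of, the content. -->
<l rend="center">FINIS</l>
```

Whether a given project's processing ever does anything with that attribute is another matter; the information survives only if someone decides up front that it is among the "essential" properties worth capturing.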

Renear, Mylonas and Durand examine the evolution of OHCO ideals, and the continued problems with them, in the third reading. The authors explain a founding principle of OHCO: that texts naturally conform to a set of structures, determined by their type, which nest inside each other and do not overlap. They then show how this was refuted through counter-examples, and how the theory was softened as a result. The most intriguing idea presented in this article, in my opinion, however, is one that the authors introduce and then abandon in their discussion. The 'theoretical' defense of OHCO they present states that a "layout feature" of a text can change without affecting the content, but the structure cannot. As discussed in the previous paragraph, I disagree with this assertion; layout can be integral to a text. But the underlying principle (as described by the authors) is in my mind the key to understanding a text and creating a good markup: differentiating between "essential and accidental properties".


My first impression of A very gentle introduction to the TEI markup language was how far should I trust an author who in his own words hasn't "...quite learned how to write an XML document and display it with links in a frameset". This trepidation soon passed, however, first because I don't know how difficult it would be to do that, and more importantly because this document is a great example of why you don't want an expert to explain technical concepts. I can't speak for a reader that had no previous knowledge of XML or TEI, but as someone whose had some experience with it I feel this explanation was simple enough to be easily followed and at the same time brief enough to not be frustrating or repetitive. The numerous examples, a feature often underused by 'experts', were a big part of its clarity. [I know these blogs are supposed to be prompting discussion about the texts, not just praising them, but I really have nothing else on this one]. It was a great way to launch into HuCo 520's instruction on XML tomorrow, and I hope is a solid enough foundation for the upcoming HuCo 500 discussion.