why TEI?

After reading this week, I still don’t feel extremely competent in my understanding of the TEI beyond a conceptual level. That being said, I think the idea of necessity in relation to TEI standards is an interesting one. All of the benefits of using an encoding system are eclipsed by one major benefit: symbiosis. For computers of varying architectures and people of varying educational backgrounds alike, using the TEI allows harmony among all. Because they are not tailored to specific people/places/things, the TEI standards have allowed “a standardized way to structure and express [information] for machine processing, publication and implementation” since their inception in the late 1980s. Helping the computer separate information (the text in the document) from meta-information/metadata (the text about the document) makes search, retrieval, and review a smooth, streamlined process.

Think about the idea of form and content, or the relationship between them. I think the claim that purpose is directly related to the performance or utility of the TEI is a valid one. In “A very gentle introduction to the TEI markup language,” an assigned reading for my ENGL 530 class, the archives of Supreme Court materials were mentioned. They are predominantly audio files; however, when “the TEI is used to encode the written transcripts and synchronize them with the audio files so that you can listen and read at the same time,” this doesn’t separate form and content; it increases your ability to embrace the relationship between the two. The triple translation lets you see the content (read) while hearing the content in its original form (listen). The only thing missing is the reality of being in the courthouse when the opinions are read. This allows for not just search and retrieval with ease, but the power to more readily consume data.

Another reason (although not so sound) that the TEI may be deemed useful is its longevity. “The use of markup goes back to the beginnings of electronic text technology…Without markup, only very simple searches can be carried out on a text… When large quantities of text are being searched, markup becomes more crucial…. It is the only way to indicate the location of words that have been retrieved or to restrict the search to portions of the textbase, for example works by a particular author or within a specific period of time. Attempting to search a text without markup is rather like searching a library catalogue which is a continuous sequence of text, where the records and fields (author, title, subject, etc.) are not distinguished at all.” Because markup languages and encoded texts have proved themselves useful time and time again, I think that the possible detriments (too many options, not specific enough, loss of form or alteration of content and context) are outweighed by the benefits.
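The library-catalogue comparison in that quotation can be made concrete with a few lines of code. Below is a minimal sketch (the element names and the two records are my own invention, not from the reading) of how markup lets a search be restricted to a field such as author, something a continuous, undifferentiated stream of text cannot support:

```python
import xml.etree.ElementTree as ET

# A toy "catalogue" with records and fields marked up explicitly.
catalogue = """
<catalogue>
  <record><author>Woolf</author><title>Orlando</title><date>1928</date></record>
  <record><author>Orwell</author><title>1984</title><date>1949</date></record>
</catalogue>
"""

root = ET.fromstring(catalogue)

def titles_by_author(name):
    # Restrict the search to <author> fields only; a naive full-text
    # search for "Orwell" could not tell an author apart from, say,
    # a book *about* Orwell mentioned in a title.
    return [rec.findtext("title")
            for rec in root.iter("record")
            if rec.findtext("author") == name]

print(titles_by_author("Orwell"))  # ['1984']
```

The same idea scales to restricting a search by date range or any other encoded field.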

Something to consider when talking about the utility of the TEI, however, is whether or not it will continue to stand the test of time. It has been upheld as the standard for so long that one cannot help thinking that at some point, something else will challenge its position.

link to ENGL 530 reading, if interested: http://www.tei-c.org/Support/Learn/mueller-index.htm

Posted in Week 9 blog post | Leave a comment

Daniel White – Week 9 Blog Post: Computers Are Dumb

When I was a junior in high school, I had an English teacher who would always stress the importance of reading back through our papers carefully and editing them for any subtle typos or grammatical errors that spellcheck may have missed.  He would say, “Check your paper for typos!  Spellcheck doesn’t get everything.  It is a computer and it is dumb.”  He was a fantastic teacher and a great guy, but it always bothered me when he said this.

I guess I just grew up believing that computers knew more than humans, that they could never be wrong.  Obviously I know that spellcheck isn’t 100% accurate all the time—no, I do not want to capitalize that word, stop asking me to—but in general computing efficiency and reasoning, computers seemed pretty infallible.  I don’t want to speak for anyone else, but I feel as though most people in my generation grew up thinking the same thing: that computers were superior to humans in knowledge, computing power, and intelligence.

The section ‘Procedural and Descriptive Markup’ from this week’s reading stuck out to me.  I couldn’t help but think of my teacher’s claims of how computers were dumb (or at least imperfect) when reading about how “Computers…need to be informed about these issues [such as how to interpret italics] in order to be able to process them.”  The reading explains how humans “Through their cognitive abilities…usually have no problems selecting the most appropriate interpretation of an italic string of text,” because we have human reasoning that includes things like context, past experience, and inferencing.  Computers can’t do all of these things on their own.  They need to be told what to do, how to think.  And who’s telling these computers how to think and work and process information?  Humans.  Human computer coders are the real brains behind the “intelligence” of a computer, not the computer itself.
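That distinction between human interpretation and machine processing is exactly what descriptive markup addresses: instead of recording only "render this in italics," the encoder records *why* a string is italic. Here is a hedged sketch of the idea; the element names follow TEI conventions (the TEI Guidelines define `<title>`, `<emph>`, and `<foreign>` for these distinctions), though the sentences themselves are invented for illustration:

```python
import xml.etree.ElementTree as ET

# Procedural markup: every italic looks the same to a machine.
procedural = "I read <i>Moby-Dick</i> with <i>great</i> pleasure, <i>n'est-ce pas</i>?"

# Descriptive markup: each italic string is labeled with its meaning.
descriptive = ("<p>I read <title>Moby-Dick</title> with <emph>great</emph> "
               "pleasure, <foreign>n'est-ce pas</foreign>?</p>")

root = ET.fromstring(descriptive)

# Now a program can pull out just the book titles -- something it could
# not do reliably from the procedural version, where title, emphasis,
# and foreign phrase are indistinguishable.
titles = [el.text for el in root.iter("title")]
print(titles)  # ['Moby-Dick']
```

The "informing" the reading describes is precisely this labeling step, done by a human encoder in advance.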

I think this is what my teacher was getting at.  Of course computers are extremely good at guessing and predicting (and they only get better as technology advances—I mean, just look at how autocorrect improves with every phone update), but computers can only guess and predict and compute as well as their coders.  My teacher knew that there would inevitably be flaws within this computer coding, so he warned us all to be cautious and not rely 100% on computers for catching every little mistake of ours.

Posted in Week 9 blog post | 2 Comments

Digital Abstractions

As we delve deeper into the constructs governing digital representation, specifically this week with the TEI (Text Encoding Initiative), I find myself constantly amazed by the ingenuity displayed by the nameless groups or individuals who defined the systems in use today – all underneath the surface, all requiring a certain level of interest and perseverance to understand. Modern computers have been abstracted so far away from the electric pulses, Boolean algebra, and binary digits that allow everything to work underneath the hood, providing the beautiful graphical user interface that all of us utilize regularly. But it is not only digital geniuses who have the ability to create in this digital space. Similar to TEI, text-based languages and unique schemes functioning on top of this graphic window abstraction give us and any other regular person the power to create content as well.

Although the TEI article is titled as an “introduction,” I think it would be helpful to read a few other pieces to understand XML a bit better before beginning. I found Introduction to XML for Text on the TEI website very useful.

In it, XML is described as “an open and non-proprietary standard that specifies ‘a simple data format that balances the needs of people to read/write data with the needs of machines to read/write data.’” The fact that XML gives us a general framework to define our own markup languages and data schemas makes it much easier to save and share documents (filled with data) on the Web.
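A small sketch of that “define our own language” point: XML fixes only the syntax (tags, attributes, nesting), while the vocabulary is up to you, so one generic parser can handle documents in entirely different invented languages. The two documents below and their element names are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Two home-grown vocabularies -- one for weather, one for news.
weather = '<report city="Chapel Hill"><temp unit="F">68</temp></report>'
news = '<item><headline>TEI turns thirty</headline></item>'

# The same standard parser reads both, because both follow XML's
# shared syntactic rules, whatever their tag names mean.
for doc in (weather, news):
    root = ET.fromstring(doc)
    print(root.tag)  # report, then item
```

This is what makes XML a *meta*language rather than a markup language in its own right.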

For a better understanding of how XML can be very useful, check out the two examples of how XML has been used to create standard ways of sharing data concerning news and weather here.


Posted in Week 9 blog post | Leave a comment

Symbolic Symbols and Coding

I’m not going to lie: seeing all those movies, television shows, and the like when I was younger (and maybe even still, if I’m honest) that portrayed hackers and computer geniuses who created and destroyed and created again, all in a digital medium, was enticing. I could be like Felicity Smoak in the DC Comics series Arrow. She’s constantly hacking into traffic cameras and government feeds to save the day. That could be me. I could save the day without having to put myself on the front line, just online. I often dreamt of being able to code like that, to create and invent spaces on the internet for fun and for passion. Unfortunately for me, I’m bloody awful at coding, even at just understanding it. This week’s reading has taught me that much. I understand the basics, but the most I’ve ever done is copy and paste code from one website to another.

I suppose that’s why the idea of learning coding, even in its most basic form, still draws me in. I mean, there’s a ‘metalanguage’ for mark-up languages to be created for different purposes. It makes sense, yes, but the first time I read that premise I’m pretty sure I was blown away that it took enacting a standard for everything to fall into place. People didn’t simply have a standard; one had to be created. It took until 1998 for the metalanguage XML to come about. For perspective, I was four. I was four before there was an effective way to standardize markup.

Granted, not many people had computers at this time, but honestly, why wasn’t this happening simultaneously with the invention of computers? Why did we wait to standardize? Why wasn’t that at the forefront of the process?

It gets me thinking about this, though: an ampersand starts an entity reference, so to display an ampersand itself you have to type out the full reference, both the beginning and end delimiters and the code for the symbol you wish to appear: &amp;. We have codes for symbols because symbols mean something else entirely in this context.
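The escaping described above can be seen in action with Python’s standard library, which converts between literal characters and their entity references (a quick sketch, not part of the reading):

```python
from xml.sax.saxutils import escape, unescape

# A literal ampersand must become the entity &amp; inside XML,
# because a bare & would start an entity reference.
print(escape("AT&T"))           # AT&amp;T
print(unescape("AT&amp;T"))     # AT&T

# < and > get the same treatment, since they delimit tags.
print(escape("1 < 2 & 2 > 1"))  # 1 &lt; 2 &amp; 2 &gt; 1
```

So the “symbol for a symbol” exists precisely because &, <, and > are reserved as delimiters of the markup itself.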

What else do we do with symbols? (I ask introspectively, as I suddenly begin to contemplate the universe through a series of cyclical thoughts initiated by this question.) I’m slowly understanding this whole ‘mark-up language’ thing we’ve just read about, but I’m also just curious as to why our symbols need symbols. Why can’t we just make things easy? Why can’t our symbols just be symbols? Why is there a double meaning?

Also, why can’t coding be easy? Even the rules have rules. I just want to help vigilantes and be like Justin Long in “Live Free or Die Hard” or Neo from “The Matrix.”

Posted in Week 9 blog post | 4 Comments

XML Language Pros & Cons

As we reach the midpoint of the semester, this class has had an impact on the way I think about the web. It is a little strange, but I am more fearful about my web use. Yesterday, I looked at a shirt on an online store, and the shirt followed me around the web, appearing in the margins of my Facebook, articles, Twitter, and just about everywhere else I surfed. When I finally thought I had escaped, I opened my phone and there it was! It had jumped from my computer to my phone! Freaky stuff.

So, this week’s reading has given me a little bit of a better understanding of how the meta-information of life is created. I am thankful to be able to understand the mystery of the stalking-shirt. I find the TEI community’s structure both relieving and concerning. I like how the article, TEI by Example, explains that there is creative freedom in markup languages: “The conclusions and the work of the TEI community are formulated as guidelines, rules, and recommendations rather than standards, because it is acknowledged that each scholar must have the freedom of expressing their own theory of text by encoding the features they think important in the text.” There is some level of democratization when it comes to the creation of the markup language. The authors are not restricted to one organization’s understanding of what is correct, which is great because we all know that a strict structure could not possibly fit the diversity of information out there.

But the danger of manipulation is not gone. We know this from our many discussions of algorithms, etc. The article mentions that while no one can own the XML language itself, corporations can own the XML vocabularies that they create. Some of these markups are immensely powerful and influential. So, while there is freedom in XML creation, we are still susceptible to the XML maker’s agenda. As we start to become markup artists, we should keep in mind what it is like to be on the receiving end of information when one does not know who is behind the meta-information.

Posted in Week 9 blog post | Leave a comment

Complications with TEI

This reading, though apparently a mere introduction to TEI, proved difficult to digest.  Though many terms, such as entities, delimiters, and markup language, were explained, as I progressed through the reading I had trouble remembering their meanings.  I found myself overwhelmed by how little knowledge I have regarding coding, and I hope that lectures will provide some insight that I did not grasp in the reading.  Having read this and found how little I understood the concepts, I continue to believe that coding is a necessary part of education for younger generations nowadays.

Despite the difficulties, I was able to understand some information we previously discussed in class that applied to TEI markup language.  For example, TEI catalogues both the actual text and the information about the text, known as meta-information.  Therefore, I could apply my previous knowledge regarding metadata to this technology of tagging and meta-information in TEI.  Some of the twenty-one modules the reading discussed also made sense when connected to the concepts about metadata that we have discussed previously, such as the TEI header, which I believe we will have to tackle when we look at the life histories for our project.
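The text/meta-information split mentioned above can be sketched in a few lines. The element names below follow the TEI Guidelines (a `<teiHeader>` holds metadata about the file; `<text>` holds the transcription itself), but this is a bare-bones illustration with invented content, far simpler than a real life-history encoding would be:

```python
import xml.etree.ElementTree as ET

doc = """
<TEI>
  <teiHeader>
    <fileDesc>
      <titleStmt><title>A Life History</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text><body><p>I was born on a farm...</p></body></text>
</TEI>
"""

root = ET.fromstring(doc)

# Metadata and content live in separate branches of the document tree,
# so each can be searched or processed on its own.
meta_title = root.find("./teiHeader/fileDesc/titleStmt/title").text
first_para = root.find("./text/body/p").text
print(meta_title)   # A Life History
print(first_para)
```

This separation is what would let future researchers search the header fields of many life histories without wading through the transcriptions themselves.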

The most interesting part of this reading, however, stems from the history of its standardization, because it relates once again back to the notion of rhetoric and its power.  Similar to Couch’s outline for his writers in the Federal Writers’ Project, TEI as a standard tagging markup language provides the same rhetorical power for writers.  TEI organizes metadata for a text in a standard that is meant to be efficient, yet it remains under the bureaucratic control of a minority of experts in its creation and adaptation over time.  How do we adapt the standard to fit our needs when we tackle the life histories?  Just as we questioned indigenous ontologies and more democratic means of documenting native history, how will we document these life histories so that we can claim full authenticity while also making these stories easy for future researchers to search?

Posted in Week 9 blog post | 1 Comment

Democracy in Encoding

To be honest, I’m still confused about the differences between XML and TEI and all of the other loud, angry, high-tech acronyms that this reading shouted out at me. I looked up videos on YouTube to try to clarify things, and I found a few useful basic introductions (https://www.youtube.com/watch?v=Q0k5ySZGPBc), but I still don’t fully grasp all of the concepts and examples provided by the reading. I reread sentences and tried to transform my mind from a human brain into a robotic intellect. I somehow expected to immediately understand and remember every little aspect of encoding. But learning these metalanguages seems to very closely resemble learning any language; at first glance, it seems impossible to comprehend, but then once you start to learn the basics and gradually progress into more complex territory, suddenly it all starts to make sense (or so I’m hoping). I’m fascinated by this new behind-the-scenes look into the world of computers and software. I want to learn how these metalanguages work and how people have been using them. I’m eager to apply these skills to the life histories that we have been entrusted with. But first I have to realize that I won’t immediately get all of this overnight, that this is a whole new way of thinking that has been invisible to me for years.

The TEI Consortium seems to contain democratic ideals. Its “four fundamental principles” include the requirements that “The TEI guidelines, other documentation, and DTD should be free to users,” “Participation in TEI-C activities should be open (even to non-members) at all levels,” “The TEI-C should be internationally and interdisciplinarily representative,” and “No role with respect to the TEI-C should be without term.” These principles are impressive with their out-right focus on open access and widespread representation. Yet my mind goes back to a previous week of class, in which we discussed whether or not coding should be taught in schools, and if so, to what extent. How democratic can these metalanguages be if I barely knew they existed until now, in college, an institution that many people don’t have easy access to? How many people actually put in the effort to learn these skills, and how useful are they in the everyday world, regardless of one’s career? Computers are everywhere; should most of those who use them also know the behind-the-scenes aspects of them, such as encoding?

Posted in Week 9 blog post | 2 Comments

Land Before TEI

We can’t go a class period without questioning how to present the past authentically.  Over the past several weeks, we’ve all made and read suggestions such as shifting back to the list-like form of the medieval annals, making the recording process collective, or, like Couch, gathering history from the mouths of the people.  Whatever methodology, it seems that we’ve all agreed on one thing: there’s no “one size fits all” strategy.  Instead, just as when representing data visually, different recipes of techniques are needed depending on context; however, this week’s reading, “TEI by Example,” seems to warn against such an anarchic system.

Before standardized markup languages were developed in the 1980s, every project employed its own system, as well as the software needed to analyze those patterns, rendering it nearly impossible for humanities researchers to share texts and programs.  It was as if every set of scholars wrote their notes in a different brand of invisible ink, each of which could only be read when placed beneath the glowing rays of a brand-specific, highly expensive light bulb.  While it might sound absurd, I’d like to extend this analogy to history, the markup language of the past.  If all historians record, and attempt to make sense of, events using random combinations of various methods as formerly proposed, will we be left with a view of days gone by that is as diverse, but also as murky and useless, as encoding in a time before TEI?

The answer, I believe, is yes.  Picture this: one historian walks into a World War I convention with a neatly bound narrative, and he is soon joined by hundreds of others, some clutching poster boards marked with timelines and others waving CDs on which they’ve recorded propaganda songs.  There are men in the corner waving guns in the air and decked out in uniforms while nearby, a woman holds a stack of love letters sent from soldiers to worried wives.  They all have the same passion but very different ways of displaying and assigning value to it.  Who gets to present first? And once this has been decided, how can those used to studying garments join in a scholarly analysis of music?  The convention falls into chaos, and everyone goes home frustrated, unable to share valuable bits of knowledge because of too much diversity in format.

What’s the solution?  This is the million-dollar question to which, as I’ve previously stated, there is no single response.  Here’s my stab at it, for what it’s worth.  Taking TEI as an example, historians should create a list of guidelines (not rules or standards) which will help cut through the chaos while still, like the markup language, providing scholars the choice of customizable elements with which to represent their research.  Just as the TEI Guidelines are generated by a consortium, these historical representation guidelines should be compiled by an international panel of historians.  Once put forth, they would be voted on and amended.  Hopefully, the end result will be something which will enable the study of history to come together in one democratic and representative way.  Let’s learn from example.

Posted in Week 9 blog post | 1 Comment

The Art of Curating

All throughout this week’s readings, this idea of representation kept returning. What interested me most was the link between illiteracy and who was underrepresented: in short, farmers and slaves during the 20th century, people who couldn’t curate their stories themselves. As time progresses, however, an idea of self-representation has emerged: if you believe that you or a group you belong to has a narrative worth curating, you must do it yourself. Yet the organization of this information sheds light on why it was curated in the first place.

A common thread of the publications by the Federal Writers’ Project was this idea of authentic depiction of a group. Much like in the Soapes article, my immediate reaction, especially when reading the slave narratives, was that the interviewers would somehow muddle or inaccurately tell a person’s story. However, after spending a few minutes on the Photogrammar website, I saw this multimodal archive in contrast to the narratives of the Federal Writers’ Project. In a flash I saw nothing but room for subjectivity or for error on the part of the person trying to capture this moment in time. It seemed to me that it was very inaccurate to stage photographs; in essence, a staged photograph is not capturing a still moment, but rather the idealized art that the photographer is trying to display.

Yet I realized that there is an art to curating, both for the Federal Writers’ Project and for the photographs on the Photogrammar website. I realized that there is a difference between a photographer and a photojournalist, a difference between an investigator and merely a writer. Knowing this allowed me to realize that the people in charge of capturing and documenting the stories, narratives, pictures, etc. want to depict them accurately and authentically, but also in a way that is meaningful. What is the benefit of having story upon story, or picture upon picture, without first making sure it’s something striking, informative, and worth reading?

Posted in Week 9 blog post | 1 Comment

Daniel White, Blog Post 8 – The Struggle of Authenticity

Achieving true objectivity through photographic work is nearly impossible.  Once the camera is there, the scene, the people involved, the situation: it’s altered, it’s different from what it would be like if the camera weren’t there.  Yes, most of the time it’s such a minuscule change, a very slight difference, but the more aware people are that their situation or event is being captured or recorded, the more subjectively they’re going to represent themselves based on who is there documenting them or their situation.

I thought of all this after reading pieces of William Couch’s These Are Our Lives and while looking through a bunch of the pictures on the Yale Photogrammar website.  I wonder how many of these pictures are actually “objective” and how many of them are staged.  The idea that all these pictures could be objective really gets me thinking about what the Photogrammar website intends to preserve through archiving these images.

I also thought of the word “authenticity”: the authenticity of Couch’s stories, the authenticity of the Photogrammar photographs, the authenticity of the work of the Federal Writers’ Project.  Two years ago, during my freshman year, I took the class FOLK 202 (also ENGL 202), Introduction to Folklore.  The class mainly focused on Southern Appalachian folklore and history, but some of the themes and stories we are discussing in class overlap with things we learned about and read in FOLK 202.

The main reason I thought back to that class, though, was because we also talked a lot about objectivity and authenticity.  My professor believed that once anything was taken out of its original context, it’s no longer authentic, no matter how hard you try to preserve its original state.  He was mostly talking about physical objects that wind up in museums, but I feel the same goes for these photographs.  As objective as the pictures may have been when they were taken, viewing them now, we’re going to have a different view on them, a different vantage point.  When we see these pictures, we are automatically going to have a completely unique and different reaction to them than we would have if we had actually been alive during the time of the photograph and experienced it for ourselves.  We are so far removed from those photographs that sometimes it’s hard to imagine that we really understand what they are supposed to mean at all.

Methodist church, Unionville Center, Ohio

For example, this photograph of a Methodist church in Unionville Center, Ohio, from the Photogrammar website looks candid enough, but how are we really supposed to know what was going on, and whether those actions were truly portrayed in the final photograph, or whether people altered themselves because they knew the photo was being taken?

The only way to discover their authentic meaning would be to go back in time and be in that very moment ourselves.  But alas, we do not have a time-travelling DeLorean that would enable us to do so…

Posted in Week 8 blog post | 2 Comments