Digital Dorothy

As I described in the last class, I’m going to use a data set that is a text. At first, I wanted to create a “diachronic” map of a particular place, the English Lake District, which is a popular destination for hikers, walkers, photographers, and Romantic literature enthusiasts. This last category also includes a great many Japanese tourists.

My first plan was to create a corpus of 18th- and 19th-century poetry and prose related to the Lake District (read: dead white males), explore the way landscape was treated, map locations mentioned in these texts or create a timeline, and then add excerpts of text along with the present-day visual data.

For the present-day component, I was thinking about how to scrape and incorporate data and photos from Flickr and Twitter that were tagged with the names of local landmarks and landscape features of the area.

An image from Mapping the Lakes in Google Earth

Early on, I discovered Mapping the Lakes – a 2007-2008 project (apparently still in pilot phase) at the University of Lancaster that uses very similar strategies to explore constellations of spatial imagination, creativity, writing, and movement in the very same landscape. From the pilot project:

The ‘Mapping the Lakes’ project website begins to demonstrate how GIS technology can be used to map out writerly movement through space. The site also highlights the critical potentiality of comparative digital cartographies. There is a need, however, to test the imaginative and conceptual possibilities of a literary GIS: there is a need to explore the usefulness of qualitative mappings of literary texts… digital space allows the literary cartographer to highlight the ways in which different writers have, across time, articulated a range of emotional responses to particular locations … we are seeking to explore the cartographical representation of subjective geographies through the creation of ‘mood maps’.

The interactive maps are built on Google Earth and do not display in Chrome, so use another browser or the desktop version of Google Earth. The project is quite instructive in its aims as well as its faults and failures, and the process and outcomes are described on the website. (Actually, the pilot project might be a very good object lesson on mapping creative expression with GIS.)

However, if you’re interested in this kind of mapping, you should take a look at the Lancaster team’s award-winning research presentation poster on their expanded Lakes project:

I wrote to one of the authors to ask her about the methodology, data set, and so on. She was happy to respond, and was encouraging. Although the methodology is way beyond my technical chops at present, she referred me to a helpful semantic text-tagging resource that they used, which I’m sure will come in handy at some point.

After some floundering around, I defined a data set and project that is challenging but more manageable. It will involve a map and one text: an excerpt of Dorothy Wordsworth’s journals from 1801-1803, not long after the second edition of Lyrical Ballads was published and she and her brother moved to the area with their friend Samuel Taylor Coleridge.

The journals are a counterpoint to William Wordsworth’s early poetry, in that she kept them as much for her brother as for herself: she recorded experiences they had together, and personal observations that she knew would inspire him, to provide the raw material for his poems. There is a small but established body of scholarship on the subject. She even describes this collaborative process in her journal, although she doesn’t call it collaboration, and until recently critics didn’t characterize it as such.

To prepare the data set, I downloaded the text file of the most complete edition of her journals from Project Gutenberg, took out everything not related to this time period, and did a lot of “find and replace” work to remove extra spaces, stray characters, and editorial footnotes, standardize some spellings, and change initials to full names. Following the advisories on the semantic tagging and corpus analysis sites, I also saved the file in both ASCII and UTF-8 text formats, with line breaks. (This may or may not prove necessary, depending on the tools I use later on.) I have considered using a concordance tool of some kind (like AntConc) to visualize the connections between the journals and her brother’s poems, since I don’t think that has been done. However, this would entail creating a second data set with the book of poems, and it’s a secondary interest.
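I did this cleanup by hand, but the same steps could be scripted. What follows is only a minimal Python sketch, not what I actually ran. It assumes the editorial footnotes appear in a bracketed "[Footnote 3: ...]" form with no nested brackets, and the initials table and the function names (clean_journal, save_both) are hypothetical placeholders of my own.

```python
import re

# Hypothetical expansions; the real list would come from reading the journal.
INITIALS = {
    "Wm.": "William",
    "S. T. C.": "Samuel Taylor Coleridge",
}

def clean_journal(raw, initials=INITIALS):
    """Strip footnotes, expand initials, and tidy spacing, keeping line breaks."""
    # Drop bracketed editorial footnotes (assumed form: "[Footnote 3: ...]").
    text = re.sub(r"\[Footnote \d+:.*?\]", "", raw, flags=re.DOTALL)
    # Expand initials, longest key first so longer forms are matched first.
    for short in sorted(initials, key=len, reverse=True):
        text = text.replace(short, initials[short])
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Remove spaces that sit directly before or after a line break.
    text = re.sub(r" *\n *", "\n", text)
    return text.strip()

def save_both(text, stem):
    """Write the cleaned text as UTF-8 and as ASCII."""
    with open(stem + "_utf8.txt", "w", encoding="utf-8") as f:
        f.write(text)
    # errors="replace" writes "?" for any character ASCII cannot represent.
    with open(stem + "_ascii.txt", "w", encoding="ascii", errors="replace") as f:
        f.write(text)
```

Calling clean_journal on the raw text and then save_both(cleaned, "journal") would produce the two files. Spelling standardization is left out here because it depends on a word list I would have to build by hand.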

My primary goals are these:

  • I’m hoping this project will confirm or complicate existing assumptions about Dorothy and her journals. As far as I know, those assumptions have so far been developed only through close reading, not visualization.
  • Using this text, I want to map her life in the Lake District during this period – socially, physically, and emotionally. (In her brother’s case, his poetry does a good job of that, and stacks of books have been written about his relationships to other people, women, landscape, time, etc.)
  • I want the map to be interactive to some degree, so that users can trace these different aspects of her life geographically, by clicking on related keywords. Ideally, I would like to include supplementary images (paintings, engravings, and portraits) that were created in the era, to provide a contemporaneous visual component. Including related excerpts of journal or poetic text would also be helpful: it would be a means of mapping her creativity, in a way. A similar map of William Wordsworth’s creativity exists. It is more extensive but not very user-friendly.

On the cartographical front, I have been considering CartoDB and Mapbox. I also looked at the British Ordnance Survey topographical map of the area, which, like all the Ordnance Survey maps, is now online. The OS website includes a feature similar to Google Maps, whereby you can personalize maps to some degree, and connect text and image data. Of course, Google Earth can be used this way too. Mapbox has nice backgrounds, but fewer options. CartoDB is visually pleasing, versatile, and allows for more elegant “pop ups,” which I could use to include bits of text, images from the time, etc. But it can’t be embedded into a webpage. As they come into focus, the project goals will ultimately determine what I use.

In the meantime, I’m using Voyant to explore the text/data set. It is a great resource to help you define the parameters for a more focused project. You can see what I’m working with here. Eventually I will geocode the locations, either by hand or via Google Maps; enter the location data, temporal data, and data about her social interactions (all in the text) into a CSV file that can be uploaded into a mapping program; and figure out how to connect everything. (Or I will die trying.) I also plan to study the new and improved “Mapping the Lakes” project more carefully, for ideas on how best to present my own, less ambitious project.
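To make the CSV step concrete, here is a rough Python sketch of what one row per journal entry might look like. Everything in it is an assumption on my part: the column names, the function names (to_rows, to_csv), and the small hand-built gazetteer, whose coordinates are approximate and would need checking against a proper map. The sample entry is an invented placeholder, not a quotation from the journals.

```python
import csv
import io

# Hand-built gazetteer (hypothetical): approximate latitude and longitude only.
GAZETTEER = {
    "Grasmere": (54.46, -3.02),
    "Rydal": (54.45, -2.98),
    "Keswick": (54.60, -3.13),
}

# Assumed column layout; a mapping program may want different names.
FIELDS = ["date", "place", "lat", "lon", "people", "excerpt"]

def to_rows(entries, gazetteer=GAZETTEER):
    """Turn tagged journal entries into flat rows with coordinates attached."""
    rows = []
    for entry in entries:
        coords = gazetteer.get(entry["place"])
        if coords is None:
            # Place not in the gazetteer yet: skip it and geocode by hand later.
            continue
        rows.append({
            "date": entry["date"],
            "place": entry["place"],
            "lat": coords[0],
            "lon": coords[1],
            # Several people share one cell, separated by semicolons.
            "people": "; ".join(entry["people"]),
            "excerpt": entry["excerpt"],
        })
    return rows

def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Invented placeholder entry, for illustration only.
sample_entries = [
    {"date": "1802-01-01", "place": "Grasmere",
     "people": ["William", "Coleridge"], "excerpt": "(journal text goes here)"},
]
csv_text = to_csv(to_rows(sample_entries))
```

Keeping one row per entry and place means the same file can drive a map layer (lat, lon), a timeline (date), and a rough social network (people), which is roughly the three-way view of her life I am after.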

Along the way I’ve encountered some other software that may be useful for those of us who like working with olde bookes: VARD is a spelling normalization program for historic corpora, also from Lancaster University. It requires permission to download, but that is easy to get.

That is all.