Author Archives: Sara Deniz Akant

Data Presentation: Neologisms in Two Manuscripts

For my dataset project, I eventually used a combination of – – – Voyant Tools, Sublime Text, and Excel – – – to generate / visualize the words that DO NOT appear in the dictionary (based on a list of words from the MacAir file) – that is, “neologisms” in two manuscripts of my own poetry: PARADES (a 48 pg chapbook, about 4000 words total, fall 2014), & BABETTE (a 100 pg book, about 5500 words total, fall 2015).

The process looked like this :

  • Voyant Tools (to generate word frequencies in manuscripts)
  • Sublime Text (to generate plain text and CSV files)
  • Excel (to compare words in manuscript to words in dictionary)
  • (& back to) Voyant Tools (to generate word clouds with new data set)
  • (& back to) Excel (to generate column graphs with new data set)
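The same Voyant/Sublime/Excel pipeline could be sketched in a few lines of Python — a minimal version, assuming the manuscript is a plain-text file and the dictionary is a one-word-per-line list (the file paths here are hypothetical placeholders, not the actual MacAir file):

```python
from collections import Counter
import re

# Hypothetical paths -- substitute your own manuscript and word list
MANUSCRIPT = "parades.txt"
DICTIONARY = "/usr/share/dict/words"

def neologisms(manuscript_path, dictionary_path):
    # Build a lowercase set of "real" words from the dictionary file
    with open(dictionary_path) as f:
        known = {line.strip().lower() for line in f if line.strip()}
    # Tokenize the manuscript and count word frequencies
    with open(manuscript_path) as f:
        words = re.findall(r"[a-z']+", f.read().lower())
    counts = Counter(words)
    # Keep only words absent from the dictionary, most frequent first
    return {w: n for w, n in counts.most_common() if w not in known}
```

The resulting dictionary (word → count) could be saved as a CSV and fed back into Voyant for the word clouds, or into Excel for the column graphs.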


Here are the results for neologisms that occur more than once in each manuscript, in 4 images :





PARADES neologism column graph (screen shot)


BABETTE neologism column graph (screen shot)


What did I learn about the manuscripts from comparing their use of neologisms this way?

  1. Contrary to what I thought, I actually used MORE neologisms in Babette than I did in Parades.
  2. The neologisms I used in each manuscript differ in nature (do they sound like Latin, like a “real” word in English, like a “part of a word” in English, or like an entirely different thing altogether?).
  3. … SINCE I actually only finished creating these visualizations today (!), this kind of “interpretation” is much to be continued!


I ALSO tried to visualize the “form” (shape on the page) of the poems in each manuscript using ImageJ – here are a few images and animations from my experiments with PARADES (you have to click on the links to get the animations… not sure they will work if ImageJ isn’t installed) :

ImageJ sequence and projection images of PARADES (screen shots): MIN_parades, Projection of MIN_parades, Projection of parades, Projection of parades min z, SUM_Projection of MIN_parades, and parades min v.

Report from the Eng. dept – First Year Comp Exam

Kathleen Fitzpatrick’s book, Planned Obsolescence, and the class discussion with her last Monday, have recently become relevant in my own path through academia. Over the course of last week and over the holiday weekend, I was asked by the English Dept. for my input on their proposals to re-model the first-year comprehensive exams.

You all may know about these in some way, but let me first briefly describe how the exam works, especially at CUNY / the GC. This is the exam that all PhD candidates in English must take before moving on to the next stage of their program. Not every department has them, but almost every PhD program in English seems to. I can’t speak to too many programs, but I do know that at Harvard, for example, they have a “Comprehensive Exam” they dub the “100-book” exam: you must read and know a 100-book canon (gag me) like the back of your hand, and then go into a timed room with 3-4 faculty and spit out all your knowledge.

When I looked at the GC program, I was glad to see that it used a different model, one which didn’t seem to favor any particular canon. That said, it is still a full day, 8-hr, timed “exam,” in which you speed-respond to given essay prompts on an “empty” (brainless) computer.

After reports from students about the uselessness of this exam in measuring their skills as thinkers / writers / teachers, and / or preparing them for “advanced study” (not to mention the fact that it penalizes students with learning disabilities and students for whom English is a second language), the Department has recently decided to try to change the model.

(Don’t worry, I’m going to get back to Kathleen’s points soon).

The model now under review is a “Portfolio” in the place of a test. It would consist of: one “conference paper,” one “review essay,” and one teaching syllabus.

As someone who has tested as “learning disabled,” I was certainly happy to hear that we were moving away from the timed exam.

And yet, looking back at Kathleen’s arguments made me re-think how “great” the Portfolio model really would be. As a poet, I’m interested in creative + critical teaching and practice… in building new “forms.” I’ve never written a review essay, and I’ve never attended an academic conference. I always worried that my lack of desire to do so would prevent me from getting my degree. But maybe I’m right: as Kathleen prescribes, we should be focusing more on the “process” of research, rather than the finished “product” (the review / conference papers). Maybe those are obsolete forms – forms that work towards the obsolete academic dissertation – which in turn work toward the obsolete academic book. Or am I just screaming in my head, “Don’t make me write a conference paper! I’m just a poet! Get me out of academia now!”

I have two answers to these questions. The first is: great, I finally have some smart argumentative backing (from Kathleen’s book, and our DH discussions all semester) to encourage my program to move away from the purely academic model of scholarship that is merely required, rather than wanted or needed. The second is: rather than wasting my time worrying that “pure academia” would come to get me, I should believe that I can actually interrogate these forms to create the type of work I want to do and see.

If we are given the Portfolio model, I have options, not limits. I can write, let’s say, an open-access review essay. I can work collaboratively with other thinkers, perhaps even non-academic thinkers, online. I can write a conference paper both “about” and “demonstrating” joint creative and critical practice, and thereby question the form of the “paper” itself. I can certainly be grateful that I don’t have to spend all summer sweating about “failing” a biased timed exam, and that I didn’t go to Harvard. Most importantly, I can think about the question of whether, by fixing the broken parts of a broken machine (rather than throwing them all away out of frustration, fear, and anxiety)… perhaps the machine will eventually start running well again; running somewhere new.

Teaching and Learning with Blogs

Kathleen Fitzpatrick’s emphasis on the importance of blogs in the maintenance, creation, and development of critical thought and academic communities led me to consider the function of “blogs” in academic teaching – particularly, first-year writing.

I’ve taught writing courses and seminars using a WordPress blog for 3 or 4 semesters now (funny that I can’t really remember) – – – and have always struggled to get my students to use it. I’ve even struggled to get them to join it. Part of the issue was, clearly, the fact that I didn’t really know how to use these blogs myself – – – at least not “optimally.” (I’m hoping to attend a WordPress workshop before teaching again next year!)

But another part of the problem seems to run deeper – even, as Kevin pointed out on Monday – in our own DH Praxis class.

Along with student resistance to engaging with new (or really, unknown and thus intimidating) technical skills, the problem seems to be linked to the fear of exposing oneself online (as discussed last night). Exposing oneself in writing (which we are taught must be perfect, or precious), exposing oneself in permanence (rather than aloud, with no recording), and exposing oneself in front of peers and teacher(s), who might pass judgement for all sorts of reasons (this post is too long (I know), this post is too academic or too casual, this post is too short, this post is offensive, this post is irrelevant etc.)…

I myself have struggled to post on this blog, and this of course feeds my interest in the matter. Why? Perhaps it’s because, when I asked my own students to “post on the blog before every class,” it led to a very difficult classroom situation. We all ended up repeating the same ideas over and over again. Because of this experience, I may have some illogical fear of being somehow forced to repeat myself, or to choose between ideas expressed “in class” and “on the blog.”

I admit that I have, at times, withheld a thought in class, deeming it “better for the blog.” I decide that I need more time to think it out; that I can express it better in writing. I’ll take copious notes, then go home with every intention of posting my thoughts. But then when I type it out, I get in over my head. Is the comment still relevant, has it become too heavy, long, or intricate in writing… too “developed”? Not blog-worthy. Turns out that if you “hide” a thought in order to work it out alone, expressing it can become a far more difficult task. I think this speaks to Kathleen’s ideas of being transparent rather than hidden, thinking and writing “in real time” rather than in time… delays.

I wonder how we can make classrooms – and academic communities – work both “in person” and “online.” How do you teach effectively both in person and with a blog? Matt & Kevin’s suggestion to post on this blog only 4 times, on subjects that are not often addressed in class discussions, is a far better model than the ones I have used in my own classes. I’m definitely going to try to take this strategy to my writing classes. I’ve addressed the classroom community with “real time” “draft workshops” for each student’s paper, but I’d love to create an online community for the students to communicate about undiscussed topics, too. Perhaps the “draft workshop” can even go online. I see some connections here.

And as for the (serious) issue of self-consciousness in “public,” in writing, or “online” – that’s probably just a matter of getting used to the blog form. I still have far to go as both a student and a teacher – – –

Workshops: What I learned, and how…

… it helped me this semester :

I learned what I didn’t want to know.

Which is valuable! Here’s a quick run-down, for anyone who’s interested :

First, I went to “Scraping Social Media,” on October 19th, which was taught by a very energetic and helpful woman named Michelle Johnson-McSweeney. The workshop moved quickly, but I was able to keep up, especially by sneaking questions to the excellent lab partner next to me (JoJo). After learning about the various interests, reasons, and concerns about gathering data from the likes of Twitter and Facebook, we moved on to actually “scraping” those sites – which worked for the most part, and felt quite satisfying. There were, of course, some issues, and these became more apparent towards the end of the workshop. The main disappointment I remember was that I couldn’t “scrape” Twitter on a Mac… at that point I hadn’t been considering doing a project exclusively on CUNY computers. Nevertheless, the workshop was encouraging enough to lead me to think of projects for which I could use this tool. This led me to my first (overwhelming) data set proposal: scrape the web for data regarding a controversy in Best American Poetry. Unfortunately, as soon as I went down that rabbit hole, I ended up composing a project that was totally unmanageable, about “Appropriation” in contemporary poetry. It was way too big. So I moved on to something else:

I had a book come out November 1st, and I thought, why not just use my own poems? This was a moment of anxiety for me – I felt that I could “thick map” my book, create a hypertext version of it, disclose more information and “be transparent,” perhaps take some responsibility for my own “appropriations.”

So, the next workshop I attended was “Text Encoding,” taught by the ever-wonderful Mary Catherine Kinniburgh. I was pretty excited to learn about “code,” excited about the prospect that I might one day learn to “code,” excited overall to lock down some acronyms at the start, such as HTML, TEI (the focus of this workshop), and XML. However, as the workshop progressed, I naturally started wondering whether I was up-to-speed enough to be there. Or rather, whether my “hypertext” project idea would actually benefit from TEI. If HTML stood for “hypertext mark-up language,” wasn’t that what I needed to learn first? The TEI projects we looked at were Shakespeare plays, and some Latin / Greek texts, and it was great to learn more about the “backbone” of how text is encoded, with plenty of examples and explanations.

But even more than realizing HTML was what I would probably need for my hyper-text project, I realized once again that hyper-texting my book of poems wasn’t really a “data set.” I went back to my idea of “deformance” (interpretation + performance). I wanted to try to learn something about the language in my poems, and to simultaneously make “art from art.” I regretted that I had forgotten to register for the “Data Visualization” workshop a week prior, before it filled up.

So, although my path through these workshops may have felt like a bunch of (gentle) dead-ends, I do think that they helped me arrive at a project, albeit late to the game. I’d imagine that if I had gone into the semester knowing more about the digital terms (why did I have to miss the “DH Lexicon” workshop! And why was it so late in the semester, too?) – I might have been able to learn tools that would actually help me conceive of a project / start conducting it more quickly.

There’s a kind suggestion here: have more workshops early on that might help students get grounded without prior knowledge of DH and digital tools. That said, I did learn a lot from each workshop, even if it wasn’t what I “wanted” to learn. And there’s a lesson in that: I should have gone to more workshops, or at least done better research on my own before just “following my gut.”

Deformance / Hypertext Project

This is a sort of two-pronged post, addressing Matt’s question towards the end of last class, re: how the readings / class discussions are helping me think more about my data (or final) project.

I’m really interested in the ideas and examples of “deformance” (in Jerome McGann’s definition = interpretation + performance) that have come up recently, especially and most recently in Lev Manovich & Kevin’s digital work. I suppose I think of “deformance” as a way of turning art into new art… the purpose of which is beyond just “playing around” and being creative (good purpose in itself), but also, as Kevin pointed out, to ask questions of the “data” (the art, or the world in which it was produced) that you wouldn’t have known to ask before. Disordering the work of art (text, photo, or film) in order to change its questions, its answers, its “rules.” I have also been interested in the way that digital “deformance” tends to produce “aesthetically pleasing” results – Kevin and Lev’s work simply “looks good,” and I’d love it if one of my projects in this course (i.e., a project fully executed) could aspire to that type of artistic attention (which seems to derive from direct intention + skills + a level of pure play or “accident”).

Along these lines, it is now my intention to do a “deformance” project that is focused on my own writing / creative process. That is, rather than trying to uncover and work with the huge and somewhat impossibly impenetrable “data set” I previously proposed (Appropriation in Contemporary Poetry), I would like to either:

  • 1 – Make a digital hypertext edition of my book manuscript (Babette, recently published in print this month), adding one or more layers of text to discover more information about the language on the page. This may include anecdotes, links, or perhaps even other “poems,” that seem to enrich, deconstruct, or disorder the present text. Thus the “data set” would be the original text (+ the new text?). I would like this hypertext edition to move the reader away from the “search” (for meaning) and towards the “browse” function, revealing both writing and reading as dynamic, non-linear, and layered, with interconnected information and experiences. On that note, a final goal would be to open the text to “community, relationship, and play” (Stephen Ramsay) by allowing “users” to add their own interpretations, experiences, links, etc. (though I understand this might be beyond the scope of this project).


  • 2 – Create a digital hyper-text edition of my three published manuscripts (Babette, Parades, and Latronic Strag) and do a data-visualization of the neologisms I’ve used in these works. The “data set” would thus be these “neologistic” words, about which I could ask starting questions such as: “how often do they appear in each book,” “how much do they sound like one another,” “how closely are they “related” to each other (by the computer’s definition),” how closely are they “related” to “real” words, what words do these associate with in my mind (or the computer’s, or in the minds of other readers)… what “real” language do they sound like, and is there some sort of “neologistic” conversation going on between the words, phrases, poems, manuscripts? Again, the aim would be to use the language as data to “browse” for new questions about the text, rather than “search” for these answers, and one ultimate goal would be to have the project allow for “users” to add in their own experience of these words (creating more data).
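One way to start on the “how much do they sound like one another / like ‘real’ words” questions in option 2 is simple string similarity. A sketch using Python’s standard difflib — the word lists here are placeholders for illustration, not actual data from the manuscripts:

```python
from difflib import SequenceMatcher

def similarity(a, b):
    # Ratio of shared characters between two words:
    # 0.0 = nothing in common, 1.0 = identical
    return SequenceMatcher(None, a, b).ratio()

def closest_real_word(neologism, dictionary):
    # The dictionary word that most resembles the neologism
    return max(dictionary, key=lambda w: similarity(neologism, w))

# Placeholder data for illustration
made_up = ["strag", "glimber"]
real_words = ["strap", "parade", "electronic", "timber"]

for word in made_up:
    print(word, "->", closest_real_word(word, real_words))
```

The same pairwise scores could feed a “relatedness” matrix of neologisms against each other, which is exactly the kind of table Excel (or a network graph) could then visualize.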

Allowing others to add reactions, data, or personal experience is one way for me to get away from the fear that this would be a “vanity project” (in which the data in the set is simply my own data). Another way would be to see this project as a starting point for hypertext-ing or disordering other texts, texts that are not my own. Perhaps I see this project as one that might move me closer to that more “research”-like or scholarly question of how language is appropriated or repurposed in contemporary poetry.

As for creating a “digital edition” of one (or more) of my books, I found a tool called Ediarum on the DIRT site, which claims to help authors “transcribe, encode, and edit” manuscripts.

As for the second (and I’d imagine, more fun and elaborate) task of “hypertexting” the book(s), I had to do a little more research to see what’s out there, and where it’s coming from. What “kind” of hypertext am I looking to produce? Based on the Wikipedia definitions of “forms of hypertexts,” I’d surely like to create something that is “networked,” i.e. “an interconnected system of nodes with no dominant axis of orientation… no designated beginning or designated ending.” And, if I wanted to be able to add that user interaction, I’d want something “layered”: a structure with two layers of linked pages in which readers could insert data of their own.

Searching for tools to create networked / layered hypertext led me to two options on DIRT: Mozilla Thimble, and TiddlyWiki. (It also led me to investigate what software is or has been available for hypertext, starting with Ted Nelson’s Project Xanadu, and ending, it seems, with the popular (and expensive, at $300) program from Eastgate called Storyspace, neither of which I think will be very helpful).

I’d love any thoughts on which project (1 or 2) seems more interesting, appropriate, or feasible for this project… I’m going to make an appointment with the Digital Fellows to get their advice (and guidance on the tools).


– Sara


Poetry, Appropriation, and the “Avant-Garde”

For my data project (and perhaps leading into the final project), I’m interested in finding a way to map, graph, or visualize a set of linguistic / formal trends in contemporary poetry. Based on a set of inter-connected issues and ideas, I’ve arrived at the following question, which is probably still too large: what is the relationship between “appropriation,” race, and gender in poetry of the “avant-garde”?

In coming up with this question, my first idea was to use digital tools to see how certain words or trends in language have been appropriated (or repurposed) in poetry. How much is “creative” (i.e., original) and how much is “uncreative” (i.e., stolen), and where is the line between the two?

As a writer who has almost always used other texts to generate my own, I know the politics, practice, and implications of this question are complicated. It’s certainly not a question that can or should be cleanly “solved,” but perhaps that makes it fertile for a digital project. And a number of recent course readings (e.g., “Topic Modeling and Figurative Language”), workshops (Web Scraping Social Media) and blog posts (Matt’s on “Poemage”; Taylor’s on “Hypergraphy”) have led me to believe there are some tools I could explore to attempt this kind of “close”-and-“distant” reading.

But what do I mean by “language,” “appropriation,” and “contemporary poetry”? Which text(s) do I want to analyze, and how?

I thought of looking at poems in the most recent issue of Best American Poetry, if only because of the controversy surrounding a particular poem by a white man named Michael Derrick Hudson in that anthology. Hudson submitted his poem 40 times under his own name, and then 9 times under the pseudonym of Yi-Fen Chou, hoping that by posing as a Chinese woman (“appropriating” a particular name and identity), his poem would be accepted. And eventually, it was. According to Hudson, “it took quite a bit of effort to get (this poem) into print, but I’m nothing if not persistent.”

(The idea of “persistence” led me to a (possibly) related question: how often do men, women, and POC submit the same piece of writing for publication, and how often are they published? Is this even a type of data that I could find?)

On a language level, an analysis of “appropriation” in a selection of texts could look something like this: take a set of poems (perhaps from that same issue of Best American Poetry), and use tools like Poemage or Topic Modeling to identify certain language trends: words, phrases, or perhaps bigger-picture patterns, like syntax, formal constraints, or rhyme. Then, scrape the web to see how and where these language and formal trends have been used prior to these poems: in literature, and / or in other places (blogs, social media, etc.). And if this dataset is too unmanageable, perhaps just look for how language from non-literary texts gets “appropriated” into poetic texts. This doesn’t yet relate “appropriated” language to race and gender, but I’m getting there.
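At its simplest, the “where has this phrase appeared before” step could be a check for overlapping n-grams (consecutive word sequences) between a poem and a reference text. A sketch with made-up strings — the real version would compare against scraped corpora, not these two invented lines:

```python
from collections import Counter

def ngrams(text, n=3):
    # Count every consecutive n-word sequence in a text, lowercased
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def shared_phrases(poem, source, n=3):
    # n-grams appearing in both texts -- candidate "appropriations"
    return set(ngrams(poem, n)) & set(ngrams(source, n))

# Made-up illustration, not actual lines from any manuscript
poem = "the parade moves through the dark water slowly"
source = "she watched the parade moves through town"
print(shared_phrases(poem, source))
```

Longer n (4- or 5-grams) would flag only substantial borrowings; shorter n would surface common idioms as well, so the threshold itself becomes an interpretive choice.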

Another idea I had for the dataset (which I originally thought of as separate, but now seems related), was to use language-analysis to ask: what “is” (or marks) the “avant-garde” in poetry?

It’s hard, and perhaps silly, to try to define or locate a set of poems that are somehow representative of the “avant-garde,” which is itself a problematic term: most likely only historical, and not really in or of contemporary use. But the reason I thought of this question was my interest in an essay titled “Delusions of Whiteness in the Avant-Garde” by Cathy Park Hong, published almost a year prior to the Michael Derrick Hudson case in the journal Lana Turner – a journal “of poetry and opinion” in which I have published often, and which might also be thought of as a home for the “avant-garde.” In this essay, Hong claims that “to encounter the history of avant-garde poetry is to encounter a racist tradition.” A second dataset would then be in service of documenting and supporting this claim (and those that follow) through maps, graphs, or even hypertexts.

To do so, I might first take the poems in (let’s say, that issue of Lana Turner), and use (let’s say, “topic modeling”) to see if any words, syntax, or forms can be constituted into a pattern that might be used to define “the avant-garde” (or at least, the machine’s perception of it). To be more specific by borrowing some terms from Hong’s essay: what are the “radical languages and forms” that have been “usurped” (appropriated) without proper acknowledgement? What are “Eurocentric practices” in poetry these days? How can I use digital tools to further define these terms, and then map them against the race and gender of their authors? How does this information relate to the “persistence” with which (these poets) tend to submit their work? Is this part of the same dataset, or related?

The biggest issue with my proposal seems to be figuring out the scope of the project; how many and which texts to analyze. If I just use the most recent issues of Best American Poetry and / or Lana Turner, would I have enough, or too much data? And if I’m exploring these greater social issues, should I instead be mapping the controversies surrounding this discourse (on social media, for example), rather than trying to analyze any particular text itself? How could I possibly choose texts that are representative of such large claims? One tentative thought I had was to analyze my own poems. This approach is appealing, not only because I’m most comfortable targeting myself, but also because it could offer the clearest dataset. That said, I hesitate to make this “critical” project about my own creative work, about which I may know or think too much, or at least, too much more than an algorithm or computer. And my poems are certainly not “representative” on their own.

That’s enough of this meandering post for now – – (I’m glad I can edit this) and welcome any thoughts or feedback – –

– – Sara

PS – and if this “dataset” proves too large or complicated with its various tools and politics (which I’m starting to think it very well might), another idea I have (not related!) is to analyze the language (again: words, syntax, and structural forms) that teachers (adjuncts and / or full-time professors) use in their writing composition syllabi (let’s say, within the CUNY network), as well as looking at the texts that they teach. I’m pretty sure that “official” student evaluations are made public (at CUNY, teachers need to agree to this) – but there is also the (problematic, but possibly useful) Rate My Professors, among other blogs and social media where student reactions might occur. It could be interesting to look at the relationship between how composition syllabi are written and how students perform and / or react. And I think this project could lead to the kind of “browsing” that Stephen Ramsay describes, whereas my long, previous proposal above might constitute too much of a “search,” and one that is overloaded, at that.