Author Archives: Davide G. Colasanto

Necropolis – Group Post #2

During our last meeting we identified more concretely the next steps needed to get the project done. The most important points are listed here:

  1. We shared all the information we gathered during the previous week
  2. After discussing the pros and cons, we selected two potential cemeteries for the project: Shearith Israel (three small burial grounds, all in Manhattan) and Prospect Cemetery in Jamaica, Queens (a single but larger site).
  3. In a very useful brainstorming session we identified the major categories of information we would like to gather and display on the website/project:
    1. Biographical details of the people buried (names, dates, age, photo, epigraphy)
    2. Geolocation of the graves
      1. Identify technology needed
      2. Rent a device
    3. Environment (cemetery, surroundings)
    4. Historical data
  4. We have developed a very rough calendar for the future
    1. 1st phase – Planning
      1. Contact potential partners
      2. Find out what they have
      3. Visit the site
    2. 2nd phase – Research
      1. Create database
      2. Polish the data
    3. 3rd phase – Development
      1. Design the project and its components
      2. Development
    4. 4th phase – Launch
  5. Division of tasks (still evolving)
    1. Conn -> outreach
    2. Lisa -> presentation and contacts, general manager
    3. Davide -> group post, research
    4. Taylor -> research on theoretical structure and design

Since Tuesday we have already received some important news. Shearith Israel’s representatives seemed very welcoming and open to our proposals, which were very well phrased by Conn, who also sent us many very instructive images of the Shearith Israel and Prospect sites. In addition, he is in contact with the Queens Historical Society and has received encouraging news about the data available for Prospect Cemetery in Queens.
Things are moving, and you will all get a more concrete sense of the project next week with Lisa’s very well-developed presentation.

Workshop – Data Visualization

The data visualization workshop (10/29) was particularly helpful in understanding how data visualization should be approached. Indeed, the digital fellows stressed the importance of thinking and visualizing in our minds before working with the data.

Assuming that we already have a particular set of data, it is essential to understand what we would like to visualize and how. It is therefore useless to start right away with Excel (and it was indeed interesting how many of us had the impulse to open that software immediately, without really knowing what to do with it).

We started with a sample data set providing different information about the Titanic’s passengers (gender, survived, age, name, class, etc.). The next step was to decide which relationships we wanted to visualize. So (and this is just an example) we picked a few categories, such as survival and class, in order to see their relationship. How did class affect the survival rate?

Then, using some crayons and a white piece of paper, we tried to imagine and sketch how we wanted to visualize the questions we asked of our data and the relationships among them.

Pretty soon the “how” became the big issue. How does a particular visualization convey certain information? How can we avoid confusion and provide a clear understanding of a particular issue?
So, for instance, we moved quite quickly from the pie chart to different kinds of graphs. The pie chart does its best when we want to show two things, one dependent and one independent; in other words, how large one thing is as a percentage of the other.
In order to provide a clearer representation of more than two categories, we looked for a different kind of visualization.
In this process we learned to look carefully at four visual elements that can help us represent different categories and variables in the same visualization:

1) line
2) color
3) shape
4) area/fill

[Figure: line, shape, color, area]

After choosing the data and the relationship we wanted to represent, we entered our data in Excel. From this point on, we learned to use a powerful Excel tool: pivot tables. This function allows users to sort and summarize data according to different criteria. The resulting selection is then shown in a second table. In other words, pivot tables help in understanding particular relationships and trends in larger data sets.
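
For those who prefer code to spreadsheets, here is a minimal sketch of what a pivot table does behind the scenes, written in plain Python. The handful of rows below are hypothetical values I made up for illustration (they are not the workshop data set), and the function name `pivot_survival` is my own. It simply groups passengers by class and summarizes survival in a second table.

```python
from collections import defaultdict

# Hypothetical sample rows, NOT the real workshop data: (passenger class, survived)
passengers = [
    (1, True), (1, True), (1, False),
    (2, True), (2, False),
    (3, True), (3, False), (3, False), (3, False),
]

def pivot_survival(rows):
    """Summarize rows into {class: {"total": n, "survived": k, "rate": k / n}}."""
    counts = defaultdict(lambda: {"total": 0, "survived": 0})
    for pclass, survived in rows:
        counts[pclass]["total"] += 1
        if survived:
            counts[pclass]["survived"] += 1

    # Build the "second table": one summary row per class, sorted by class.
    table = {}
    for pclass in sorted(counts):
        total = counts[pclass]["total"]
        alive = counts[pclass]["survived"]
        table[pclass] = {"total": total, "survived": alive, "rate": alive / total}
    return table

table = pivot_survival(passengers)
for pclass, row in table.items():
    print(f"class {pclass}: {row['survived']}/{row['total']} survived ({row['rate']:.0%})")
```

The summary table, not the raw rows, is what we would then hand to a chart, which is probably why the fellows had us decide on the question (class versus survival) before touching the software.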

[Figure: pivot tables]

From this point on, we could use Excel’s previews of different graphs to understand which visualization would give the most effective representation of the relationship we wanted to emphasize.

Finally, the digital fellows suggested many different software tools that are particularly helpful for data visualization:

GUI Based:
– Google Charts
– Tableau

R-ggplot
Python-matplotlib/seaborn/basemap/cartopy
Javascript-D3

Mapping:
GIS-ArcGIS
CartoDB —> change over time

P.S. I’m sorry for the Excel picture in Italian 🙂

Clio, Mnemosyne and Internet

Cohen and Rosenzweig’s Introduction to Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web (2005) was a very interesting read because it echoed some texts I have read in the last few months.
In this post I would like to discuss two main assumptions about the potential of the web: that it offers infinite possibilities for recording everything, and that it gives easy access to all the information it holds.

Can we really record everything?
I thought so. Then a few months ago I read this article in The New Yorker, in which the author tells the wonderful story of the Internet Archive and its extreme usefulness in tracking and recording the digital world. What made me understand the importance of this task were a few illuminating sentences:
– “No one believes any longer, if anyone ever did, that “if it’s on the Web it must be true,” but a lot of people do believe that if it’s on the Web it will stay on the Web.”
– “Web pages don’t have to be deliberately deleted to disappear. Sites hosted by corporations tend to die with their hosts. When MySpace, GeoCities, and Friendster were reconfigured or sold, millions of accounts vanished.”
– “Facebook has been around for only a decade; it won’t be around forever. Twitter is a rare case: it has arranged to archive all of its tweets at the Library of Congress.”
– “The Web dwells in a never-ending present. It is—elementally—ethereal, ephemeral, unstable, and unreliable. Sometimes when you try to visit a Web page what you see is an error message: “Page Not Found.” This is known as “link rot,” and it’s a drag, but it’s better than the alternative. More often, you see an updated Web page; most likely the original has been overwritten.”
– “According to a 2014 study conducted at Harvard Law School, “more than 70% of the URLs within the Harvard Law Review and other journals, and 50% of the URLs within United States Supreme Court opinions, do not link to the originally cited information.””
– “Last month, a team of digital library researchers based at Los Alamos National Laboratory reported the results of an exacting study of three and a half million scholarly articles published in science, technology, and medical journals between 1997 and 2012: one in five links provided in the notes suffers from reference rot. It’s like trying to stand on quicksand.”

And as Cohen and Rosenzweig affirm, the dream of preservation ad infinitum is indeed a dream: “The current reality, however, is closer to the reverse of that—we are rapidly losing the digital present that is being created because no one has worked out a means of preserving it. The flipside of the flexibility of digital data is its seeming lack of durability—a second hazard on the road to digital history nirvana.”
We should not take the storage power of the web for granted. We should therefore step up our efforts to archive the digital, not just for the sake of academic articles but mainly because so much of our life is indeed on the web.

But then a second question, to which I really don’t have an answer, came to my mind.
Do we have to record everything?
Should this also be a moral question? Is it right to record everything just because we can? If we draw a parallel between a conversation between two friends in a bar and a conversation between two friends on the FB wall of one of them, we can clearly see how something that was highly ephemeral becomes something that is very easy to capture and preserve. Maybe this is not the right example, because things change over time and we are talking about two different spaces. But the act of communication is the same.
Recently, in Europe there has been much discussion on the right to oblivion. What do you think about this?
As a historian, of course, I would love to have every possibility of reconstructing the past. But we have to acknowledge the importance of oblivion in societies (much historical research is indeed built around the effort to understand social amnesia and the role of memory in defining individual and collective identities). Should we leave this right to the vagaries of history, or should people have the right to rewrite their past? Should oblivion even be a right?

Access
I guess that this issue also involves a question of access. Much of this data, the personal information exchanged by people, is owned by private companies that use it to make more profit (see the well-established practice of Terms of Service. By the way, here is a wonderful tool for understanding the terms of service of various web services). In other words, it seems to me that when we talk about the web and the richness of its data, we too quickly assume easy access to that data. As Jeremy Rifkin wrote in The Age of Access, in the near future (which is already now, since the book was published in 2000) power will gravitate toward those who control access to information. (See also Cohen and Rosenzweig: “A more serious threat in digital media, which runs counter to its great virtues of accessibility and diversity, is the real potential for inaccessibility and monopoly.”)

Probably one effective answer is given by Creative Commons and open-source projects (Internet Archive, Open Culture, etc.), and I would suggest that we need an education in open access. We should praise open access projects more than the new designs of Apple (a hysterical obsession with pseudo-new products every six months) and Google (do we really need all this insistence on Google Glass? Another screen right in front of our eyes?).
What I’m trying to say is that maybe even in the digital world we need to buy less and share more in commonly owned virtual spaces.

These same issues are also very clearly discussed by Cohen and Rosenzweig: “open source should [be] the slogan of academic and popular historians”. In other words, this is a strong call for more digital preservation. The book was published in 2005. Has anything changed in the meantime? In the last few weeks we talked about the decreasing importance of digitization projects and a growing insistence on “more complex” DH projects. Why is that? Is it because digitization has reached an “acceptable” level since 2005, or because it has acquired a lower status in the academy, and in the emerging DH field, compared to more analytical DH projects?

Finally, here is what I consider a wonderful example of digital pedagogy in history and many other disciplines: https://www.youtube.com/watch?v=Yocja_N5s1I&list=PLBDA2E52FB1EF80C9

Best wishes,

Davide

DH in the news + Research tools

I think that the readings of these first weeks cover almost all aspects of the DH debate. However, in our discussions we have also pointed out how the interest in DH can be traced in newspaper articles, both online and in print. Indeed, if you look for major newspapers’ contributions on DH, you’ll find many articles discussing its fundamental issues.

I would really like to read an article reconstructing how DH has been perceived, presented and discussed in the news media. Thus, the reading I would like to add to the syllabus is something that maybe doesn’t exist yet (or perhaps I haven’t searched enough). I think that this kind of contribution could add interesting insights to the DH debate (especially because of DH’s insistence on openness as one of its fundamental values).

Regarding DH pedagogy, a possible addition to the syllabus could have been some readings focused on useful digital tools for organizing academic research (and I guess this applies not just to the humanities but to all researchers). For instance, in this workshop (https://historyprogram.commons.gc.cuny.edu/september-11-digital-tools-to-control-the-chaos/) we discussed how to structure the process of finding, reading and storing digital sources, and which tools we can use to organize our research practices.
In particular, we learned how to combine different software (such as Pocket, Evernote, Zotero, Dropbox, etc.) in order to develop a structured workflow.

This is a list of interesting articles on how to use Evernote for academic purposes: https://www.evernote.com/pub/raulpachecovega/evernoteforacademics#st=p&n=e5d8fbd0-c4cf-480c-a9a4-2aeca1308d9c