While searching for datasets to play with, and trying to understand how digital humanities (DH) can help in my job, I found this interesting collection on GitHub from the MoMA.
It would be a good file to play with for a project.
The Museum of Modern Art (MoMA) acquired its first artworks in 1929, the year it was established. Today, the Museum’s evolving collection contains almost 200,000 works from around the world spanning the last 150 years. The collection includes an ever-expanding range of visual expression, including painting, sculpture, printmaking, drawing, photography, architecture, design, film, and media and performance art.
MoMA is committed to helping everyone understand, enjoy, and use our collection. The Museum’s website features almost 60,000 artworks from nearly 10,000 artists. This research dataset contains more than 120,000 records, representing all of the works that have been accessioned into MoMA’s collection and cataloged in our database. It includes basic metadata for each work, including title, artist, date made, medium, dimensions, and date acquired by the Museum. Some of these records have incomplete information and are noted as “not Curator Approved.”
At this time, the data is available in CSV format, encoded in UTF-8. While UTF-8 is the standard for multilingual character encodings, it is not correctly interpreted by Excel on a Mac. Users of Excel on a Mac can convert the UTF-8 to UTF-16 so the file can be imported correctly.
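The UTF-8 to UTF-16 conversion mentioned above can be done in a few lines of Python. This is only a minimal sketch: the function name `convert_utf8_to_utf16` and the file names in the usage comment are my own placeholders, not something the MoMA repository provides, and I have not checked how every version of Excel on a Mac handles the result.

```python
def convert_utf8_to_utf16(src_path, dst_path, chunk_size=65536):
    """Re-encode a UTF-8 text file as UTF-16 (written with a byte order mark).

    Hypothetical helper for illustration; not part of the MoMA dataset.
    """
    # newline="" leaves the CSV's original line endings untouched.
    # "utf-8-sig" also accepts input that starts with a UTF-8 BOM.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src, \
         open(dst_path, "w", encoding="utf-16", newline="") as dst:
        # Copy in chunks so a large CSV is never held fully in memory.
        for chunk in iter(lambda: src.read(chunk_size), ""):
            dst.write(chunk)


# Example call (file names are assumptions):
# convert_utf8_to_utf16("Artworks.csv", "Artworks_utf16.csv")
```

Python's `utf-16` codec writes a byte order mark at the start of the file, which helps a program recognise the encoding when it opens the file.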
Here is the link to the page where you can download the CSV file.
I hope this is useful and interesting, and that it helps broaden your horizons.