April 17, 2019
Doe Library 180
9:30 - 11 AM
DH+LIB: BUILDING AND PRESERVING COLLECTIONS FOR DIGITAL HUMANITIES RESEARCH
Wednesday, April 17th, 9:30 - 11:00 AM
This session will feature panelists building collections and tools for local digital humanities projects. Kathryn Stine, manager for digital content development and strategy at the California Digital Library, will talk about building web archive collections through collaboration, preparing these collections for discovery and use, and tapping the research potential of the resulting captured content and data. Mary Elings, Head of Technical Services for The Bancroft Library, will talk about the role libraries can play in developing research-ready digital collections to facilitate emerging research methods. And Gisèle Tanasse, Film & Media Services Librarian at the Library, will discuss her role in Shakespeare’s Staging, a DH project to help digitize, preserve, and make accessible Shakespeare performances from UC Berkeley students.
Setting the Scene:
I found out about this event on a listserv that I joined that I think is mostly for staff librarians, etc. Held in the bottom of Doe Library. Majority white women in their 40s-50s in attendance. Many have been part of the founding of the Berkeley DH Fair, which started in 2012. During the Q&A, the D-Lab lady, the DH Fair co-founder lady, a Chinese woman, a North African man, and I asked questions.
The lady at D-Lab uses the Q&A to pitch herself and her project on online hate speech. She mentions tagging and labeling and other technical-sounding jargon.
I ask the last question and the panelists don’t have a good answer. “What are the risks, challenges and aspects of integrating data science into library and archival science that we should be wary of?”
Mary asks: Do you mean of DS itself or of using it?
ME: “you can take the question however you want and go with it.”
Mary: Still hesitant, unsure how to answer.
ME: There has been an increasing level of critique about algorithms and machine learning, esp. due to their lack of transparency and black-boxing. What do we need to be wary of as archivists and librarians? What are the risks we need to be thinking about?
Mary: We need to understand why we are using the algorithms and tools that we are and not just use them blindly. I have seen how you can have 4 different models and they each produce different results. So we need to find the one that gives us the truest picture and most reliable and credible results.
ME: Thought to myself but not said: In response to Mary’s answer - but how many librarians really have that training? What are the implications for how library sciences are being taught? For how data science is being taught?
Another panelist says: This is an exciting moment because we in the digital humanities and librarians and archivists are increasingly being influenced by data science.
But she doesn’t mention that DH and the humanities also need to influence data scientists. DS also has a lot to learn from DH, but this panel group didn’t seem to recognize that. As I left, I came to the conclusion that the people in this room (largely UC Berkeley affiliates, faculty, DH, etc.) are focused on the technical tools and on “catching up” the libraries to the opportunities that data science technologies offer to “get to scale”. They are not critically engaging with which tools should be used, if at all, or with the bigger question of what value libraries offer (other than “collections as data”, which was mentioned several times).
It was difficult for any of the presenters to talk about when the tech tools and platforms discussed might not be appropriate, or what inequalities they might foster, because it feels like this is more about pitching why library sciences need to use large-scale neural networks and machine learning.
The audience asked several questions about international collaboration and use of the platforms in international contexts, but that is clearly not a focus for the panel group.
ME: I worry, as particular contexts increase their online archiving, etc.: Who is not represented in these archives? Who should be the owners of such archives?
“Greater aggregation” was mentioned as part of the reason to use DS. But for me, the question is: do we really want greater data aggregation within Berkeley’s library, or is the other option to develop more disaggregation? What power comes from being the one who holds all of the data and archives? Have those who produced the content given consent to have you archive it? The CDL lady seemed to get it the most, but otherwise this was not mentioned much.