Can This Tool Actually Help Me? A discussion about information literacy and digital humanities

Hello! It’s me again, your trusty Digital Humanities Intern, Claire. Today, I come to you with a more theoretical blog post, which I hope I’ve made easy to understand. Too often, I’ve read articles about digital humanities (DH) that go WAY over my head about all sorts of theoretical stuff (and, let’s face it, not so theoretical stuff). Today I’m going to be discussing information literacy and how it applies to DH.

Before this semester, I had only heard the term information literacy in passing and had never stopped to explore its meaning. So, what is information literacy? After a quick Google search, I decided to define it this way: the ability to use the tools you have available to yield a result that you want. Here’s a great blog post that gives a more detailed definition. How about an example of information literacy in action? A student is asked to write a class research paper. The tools that they have access to are Google, their school library, their school’s online databases, and their teacher. Information literacy is the ability to use these sources, filter out the good and bad (or irrelevant) information, and write their paper. Let’s say the student is writing about clothing in Colonial Latin America and needs ten sources. She does a quick Google search and one item comes up. She goes to the library and picks out five books. She searches the databases, but nothing is forthcoming. Realizing that she needs more sources, she talks to the teacher and is able to find four more. Now, the student has enough sources, but she still must determine what is relevant and what is not. For instance, if the student is writing about the colonial period, she shouldn’t include information about the post-colonial period. (Does this sound like an obscure topic? I agree, it is. Yet somehow, I find myself writing an eight-page paper on the subject.)

In DH, information literacy works in essentially the same way. DH researchers have a wealth of tools at their fingertips (if they know how to use them; learning those tools takes a lot of time and brain power). However, we must know how to use those tools to get what we want or solve a problem. And sometimes, that’s harder than you may think.

Here’s a real-life example. This semester, I’ve been working on a project about the Women Accepted for Voluntary Emergency Service program (WAVES) in the 1940s. OSU was a training center for this program and the school has kept the 10,800 registration cards that the WAVES filled out when they arrived. These cards are chock-full of information about the women’s hometowns, education, and age. We weren’t terribly sure what we wanted to do with this information, but my supervisors saw it as a great way for me to learn Tableau Prep and Open Refine (tools that they use a lot and which I talked about in a previous blog post) and continue honing my skills with Tableau. After spending many hours cleaning the data, I put it through Tableau to see what would show up. I created a variety of visualizations with the data, but after a while I was stuck, and what the data showed were things I had already learned from other sources. My primary supervisor was absolutely sure that she could figure out a way to show the date these women arrived in Stillwater and when they left. After spending quite a bit of time on it, she hit a dead end as well. So, my supervisors and I had a small conversation about information literacy and asked the question, “Can this tool actually help me?”

To solve this issue, I decided to start asking questions: questions that would help me not run into brick walls. Knowing the answers to these questions allows us to use our time wisely. In no particular order, here are my questions: 

Data

  • What do we want to do with this data?
  • How can we visualize this data?
  • Can this data be visualized the way I want?
  • What is the end goal? Do we want to show something or just count it?
  • What do I think this data is saying? Can we find a way to show the data that proves or disproves my hypothesis?
  • Do we need a visualization to prove the hypothesis? 

Tool

  • Can this tool help me answer the questions about my data? 
  • What tool will give me the visualizations I need?
  • What are the limitations of this tool? Perhaps I’m not the intended audience for this tool; will that influence my ability to use this tool effectively?
  • Are we trying to make the data look like I want it to look like, or are we allowing the program to do its own thing? Are we using the right tool?

Data Questions: In a project, we are given our data, and it is up to the researcher to turn that data into a visualization that proves or disproves the hypothesis. In my WAVES project, we weren’t sure what we wanted, and the main goal was to teach me how to use those tools because they are useful tools for many projects. Looking at the data, I quickly saw a few visualizations that I could do, but overall, I didn’t feel like I needed a visualization to prove or disprove my hypothesis. However, maps and visualizations are incredibly useful for teaching others about your project.

Tool Questions: As I just discussed, I didn’t have many questions because I didn’t need to look at the visualizations to get an answer to my hypothesis. One pitfall we did have was trying to make the data do what we wanted. As I mentioned earlier, my primary supervisor wanted to see when people arrived in Stillwater and when they left. However, we quickly had to abandon the project as we were ‘trying to fit a square peg into a round hole.’ The data and the tool weren’t compatible for what we wanted to do. Because of the data, I’m not sure that there is a way to do that visualization, even with a different tool. Additionally, Tableau is meant for business people, not historians, so perhaps the tool wasn’t 100% perfect for us. 

There are some scholars who say that visualizations allow researchers to see things that aren’t overtly obvious just by looking at the data up close. This is called distant reading. Personally, I don’t think that this is a watertight theory, especially in light of this project. While these visualizations did show me that the second largest section of WAVES came from the North East, I already knew from other sources that the majority of WAVES came from California and New York. Regardless, these visualizations are extremely valuable for teaching others about this program and helping others learn, which is what DH is all about. So, all in all, I’d say this project was a roaring success. Keep an eye out for some blog posts talking about these visualizations in more detail in the next few months!

Tableau Prep, Open Refine, and the WAVES

Hello readers! Once again, we find ourselves in the throes of the semester, but take heart! Fall is well upon us and we are now able to walk to class without dying of heat stroke. Please celebrate with copious helpings of sweater-wearing and coffee-drinking. In the midst of celebrating the arrival of fall, I have been learning how to use two new tools called Tableau Prep (TP) and Open Refine (OR). These two tools do essentially the same thing: they allow users to clean messy data so that it can be run through various programs such as Tableau (a data visualization software not to be confused with Tableau Prep). While TP and OR achieve the same result, they are rather different in ease of use. For this project, I learned how to use both TP and OR to see the pitfalls and the benefits of each program. In this blog post, I’ll be discussing the project that I am working on, how I used these tools in that project, and how TP and OR compare side by side.

In the previous paragraph, I mentioned ‘messy’ data, which is an odd term to those who aren’t well-versed in using digital tools. ‘Messy’ data is data that is out of order or has been entered incorrectly. Running this messy data through software will yield a skewed result, which is of absolutely no use to anyone. Let me give an example. In the 1940s, OSU was a training center for the Women Accepted for Voluntary Emergency Service (WAVES) program. Upon arrival, each WAVE filled out a registration card, which OSU holds in its collection today. There are over 10,000 cards which have been recorded in an Excel sheet. While this is a great resource, I couldn’t just enter data into Tableau, because there were flaws in the way that the data had been entered into the Excel sheet. Each WAVE had written her home state, but the people who entered that information into the Excel sheet had written many of the state names incorrectly. When this messy data is run through Tableau, Tableau creates a unique category for each misspelling. I hope you can see the issues that this creates.
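To make the problem concrete, here is a minimal sketch in Python (not something I used in the project) of what happens when misspelled state names are counted as they are. The sample values are made up for illustration; they are not taken from the WAVES spreadsheet.

```python
from collections import Counter

# Hypothetical hand-entered home states, including typos and stray spacing.
home_states = ["Oklahoma", "Oklahoma", "Okalhoma", "Texas", "Texsa", "oklahoma "]

# Counting the raw values treats every distinct spelling as its own category,
# which is roughly what a visualization tool does with uncleaned text.
raw_counts = Counter(home_states)

print(len(raw_counts))         # number of categories the tool would see
print(raw_counts["Oklahoma"])  # only the exactly-spelled entries are counted
```

Six entries that really describe two states turn into five separate categories, and the count for Oklahoma misses half of its rows.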

To correct the state names, I uploaded that data into TP and OR. When cleaning data, it’s a big no-no to touch the original data source (in this case the Excel sheet) because it is too easy to mess up and change things that aren’t supposed to be changed. Moreover, the WAVES data had over 10,000 entries, so combing through each entry and cleaning it in Excel would be time-consuming and ineffective. Thankfully, TP and OR make it (somewhat) easy to do all the data cleaning that needs to get done! TP gathered all the state names and I corrected each individual spelling. I spent several hours on that before discovering that there was a way to group the misspelled states and correct them all at once. After a fair amount of banging my head against the table, I decided to chalk that one up to experience. In retrospect, I should have known that there was an easier way to rename the states, as these programs are all about making your life easier. I was not as ignorant when I cleaned the data in OR, and the process went more quickly.
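For readers who like to see the idea in code, here is a rough Python sketch of grouping misspellings and correcting them all at once. It is only a loose analogue of what I did and is not how TP or OR work internally: it uses the standard library’s difflib to match each entry against a list of correct spellings. The list VALID_STATES, the function clean_state, the cutoff value, and the sample entries are all my own assumptions for illustration.

```python
import difflib

# Hypothetical short list of correct spellings; a real project would list every state.
VALID_STATES = ["Oklahoma", "Texas", "Kansas", "New York", "California"]

def clean_state(raw, cutoff=0.75):
    """Return the closest valid state name, or the tidied input if nothing is close."""
    # Collapse extra spaces and normalize capitalization before comparing.
    tidy = " ".join(raw.split()).title()
    matches = difflib.get_close_matches(tidy, VALID_STATES, n=1, cutoff=cutoff)
    return matches[0] if matches else tidy

# The original list is never modified; the cleaned values go into a new list.
original = ["Oklahoma", "Okalhoma", "texas ", "Texsa", "New  York", "Californa", "Ohio"]
cleaned = [clean_state(s) for s in original]

print(cleaned)
```

Anything that does not resemble a name on the list (here, "Ohio") is left as it was entered, so it can be checked by hand instead of being silently changed.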

But data cleaning isn’t just about labeling everything correctly. Data cleaning is also about knowing what you want the end result to look like and changing the way the data is formatted to achieve that result. When starting the project, I knew that I wanted to be able to create a visualization of these women’s home states. My data included not just the states but also the towns that these women were from. These two pieces of data were connected, and if I ran the data as it was through Tableau, I would have gotten over 10,000 unique categories, which would not have been terribly practical to show or use. To achieve the result that I wanted, I needed to separate the town and state names from each other. The technical term for this is separating the fields. After separating the fields, I was able to load the data into Tableau and create some neat visualizations!
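Separating the fields can also be sketched in a few lines of Python. This assumes the combined value looks like "Town, State" with a comma between the two parts, which is my assumption for the example and may not match how the spreadsheet was actually laid out. The function name split_hometown and the sample hometowns are hypothetical.

```python
from collections import Counter

def split_hometown(value):
    """Split 'Town, State' at the last comma into a (town, state) pair."""
    town, sep, state = value.rpartition(",")
    if not sep:
        # No comma found: keep the whole value as the town and leave the state blank.
        return value.strip(), ""
    return town.strip(), state.strip()

# Hypothetical combined values, one per registration card.
hometowns = ["Tulsa, Oklahoma", "Albany, New York", "Dallas, Texas", "Enid, Oklahoma"]

towns_and_states = [split_hometown(h) for h in hometowns]

# With the state in its own field, entries can be counted per state
# instead of once per unique town-and-state combination.
state_counts = Counter(state for _, state in towns_and_states)
print(state_counts)
```

Four unique town-and-state strings become three states, which is the kind of grouping a map of home states needs.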

TP and OR served their purpose and helped me create the project that I wanted to. I began with TP, which comes with around an hour’s worth of training videos. I’ve heard before that using Tableau is easy…if you know how to use it. And therein lies the issue. TP is not terribly intuitive, making it difficult to use without re-watching the videos. When using TP, I operated on the most basic level. When correcting the data, I found TP slow; however, I am working on a Mac laptop, which is probably not the most compatible with that software. Uploading the data from TP to Tableau proved yet another challenge. Not once, but twice, I exported my cleaned data and uploaded it into Tableau, only to find that it had uploaded my messy data! You would think that exporting and uploading between these two programs would be rather easy, considering that they were created by the same team, but that was not the case. While I eventually uploaded the correct data, it involved quite a bit more banging of my head against my desk.

OR proved much easier to contend with. While you download both TP and OR, OR is run in the computer’s web browser. I did not see any training videos for OR, but it proved to be much more intuitive than TP and I was able to use it within a few minutes. When correcting state names, OR was efficient, fast-paced, and user friendly. Moreover, I had zero issues uploading my cleaned data to Tableau. Overall, OR was easy, intuitive, and effective. In future projects, I would much rather use this software.

To summarize:

Tableau Prep

  • Had to have training to be able to use
  • Slow
  • Difficult to use
  • More sophisticated
  • Did not load into Tableau easily

Open Refine

  • Did not need training
  • Faster than Tableau Prep
  • Easy to use/simple
  • Loaded into Tableau easily

I hope that this blog post has shown the benefits and drawbacks of each program and piqued your interest in how data is recorded and used. Till next time!

Claire

1930s Mapping Project

Hi, friends! Welcome back to Stillwater! I hope that your school year has gotten off to a great start and you are getting settled into your routine. With the start of school comes the start of work, and these past few weeks I’ve picked up a project that I began in December of 2018.

In previous blog posts, I’ve talked about working with the Oklahoma Agricultural and Mechanical College yearbooks. At the end of these books is an advertisement section which lists many drug stores, grocery stores, clothing cleaners, doctors, and dentists found in Stillwater. I was curious to see what could be learned about Stillwater in the 1930s through a visual representation of these locations.

I came up with this idea in December of 2018 and I began to list the advertisements in an Excel sheet. There were some discrepancies in the yearbooks. For instance, some of the yearbooks did not list an address for a specific business while the same business would list an address the following year. Through listing the advertisements from each year in an Excel sheet, I could correct these discrepancies.
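I did this by hand in Excel, but the same idea can be sketched in Python: once every year’s advertisements are in one list, a missing address can be filled in from another year in which the same business did list one. The business names, years, and addresses below are invented for the example and are not from the yearbooks.

```python
# Hypothetical rows: (yearbook year, business name, address or None if not listed).
ads = [
    (1931, "Example Drug Store", None),
    (1932, "Example Drug Store", "100 Sample St"),
    (1931, "Sample Cleaners", "200 Placeholder Ave"),
    (1933, "Sample Cleaners", None),
]

# First pass: remember the first address seen for each business.
known_addresses = {}
for year, business, address in ads:
    if address:
        known_addresses.setdefault(business, address)

# Second pass: build a new list, filling blanks from the lookup.
# The original rows are left unchanged.
filled = [
    (year, business, address or known_addresses.get(business))
    for year, business, address in ads
]

for row in filled:
    print(row)
```

This sketch assumes a business kept the same address across years; a business that moved would need to be checked against the yearbooks by hand.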

After taking a semester’s hiatus from this project, I began working on it again in July. Using a map of Stillwater from 1929, I located the addresses and marked them on the map with small sticky notes. The map was fairly detailed and I was able to place the businesses rather accurately. While this created a neat and colorful visual representation, it was not terribly practical for sharing with the world.

Enter My Google Maps (MGM). MGM is a lesser-known Google feature that allows you to drop points and draw lines on Google Maps. I added points onto the map to represent the addresses and drew the outlines of the physical Stillwater map onto the Google map. It was difficult to drop the points and draw the lines accurately because the Google map was more modern and lacked specific building names. Moreover, some roads in Stillwater had changed since 1929; in the 1929 map, some of the roads on the east side of Stillwater were listed as dirt roads.

This is an image of the map I created in MGM. Each numbered dot represents a business and the purple lines are where the 1929 map ends.

After completing this section of the project, my supervisor helped me to export the map to ArcGIS. ArcGIS will allow me to customize my map more and make it more user friendly. I hope to be able to share that soon!

-Claire Ringer

Amplified Oklahoma Podcast

During the Spring 2019 semester, I got to work on a podcast episode for Amplified Oklahoma, a podcast produced by the Oklahoma Oral History Research Program at Oklahoma State University’s Edmon Low Library. I decided to do a podcast that used the Dust, Drought, and Dreams Gone Dry oral history collection, in which women speak about their experience with the Oklahoma Dust Bowl. 


Yearbooks Collection

Hello again, everyone! The last few blog posts have been chronicling my role as the DH intern over this past school year. The last blog post talked about my work with the Oral History collections.

For the next step in my research, I turned to the 1930-1939 Oklahoma State University (then Oklahoma A&M) yearbooks. The questions I hoped to answer were:  

  • How did students get around town?
  • What did they eat?
  • How did they pay for school? 
  • Were there student jobs?
  • Where did the students come from?
  • Where did they live?

Creating a Research Project

In the summer of 2018, I interviewed for the position of Digital Humanities (DH) Intern. (Not sure what Digital Humanities is? Check out our “About Digital Humanities and this Internship” page!) I wasn’t sure that I would get the job, as I had relatively little idea what DH was and I’m not a computer person. Through some stroke of luck, I landed the job and began working in the Fall! When I began working, my supervisors and I had a meeting to discuss the research project that I would be doing. Building off some different research that I was doing, we decided to do a project about Oklahoma Agricultural and Mechanical College (Oklahoma State University’s previous name) during the Great Depression.


Learning Tableau

Hello, everyone! My name is Claire Ringer and I began working as the Digital Humanities Intern in Fall 2018. This is my inaugural blog post, which I’m happy to be sharing today!

For my summer project, my supervisors decided to have me focus on learning a new data visualization software called Tableau. This software allows the user to upload their data, analyze it, and create interactive graphs that are easily shared. With this software, I will be working on a project concerning Chilocco Indian Agricultural School, a non-reservation boarding school open from 1884 to 1980, which has a fraught but powerful history. There is a moving documentary about the school on the Chilocco website, which I have linked to below. 
