#Fleeting Support: Turning Numbers into Meaning

Anissa Santos
May 6, 2019


Background

We live in an age in which most members of society carry some form of technology at all times. There are now more people with mobile phones than with toilets (Forbes). It is therefore not surprising that most of our news reaches us through social media sites like Facebook, Twitter, and even Instagram. Over the past few years, the hashtag has become the main catalyst for social movements and for spreading information about events. Although the hashtag has had a relatively positive impact on raising awareness of movements and events such as the #metoo movement and the #WomensMarch, its power is ultimately fleeting.

Our Goal

For this project, my team and I wanted to use Python and Twitter’s API to show how Twitter’s fast-paced online culture allows news and movements to spread rapidly, but also lets coverage of and support for those events die out just as quickly. Our main objective was to compare the prevalence of tweets about multiple events and movements during certain time periods. By doing this, we hoped to gain some insight into how social media privileges certain movements over others, or how it collectively drops support for certain causes.

Visuals

Although we were working primarily with numbers and strings, we knew we had to find the best way to display the disconnect between stories that are blown up by hashtags and those that either gain little online support or lose it quickly. In short, we strove to develop ways to artistically visualize trends and tweet counts that the average user cannot see on the regular Twitter site. Most of my team, including myself, chose to represent our data using the Java-based IDE Processing 3.0. The great thing about Processing is that creators can design interactive experiences and bring their code to life. I wanted to design my visualization so that the data would be comprehensible to a wide audience and would also evoke an emotional response from viewers.

Process

After deciding on our concept, my team and I devised a simple flow chart to outline our workflow. Although the chart went through some revisions, the general process stayed the same in every version. The final chart shows how we planned to scrape for specific tweets using the Tweepy library, convert the scraped data into a CSV file, and then import that file into Processing for visualization. Although this seems like a fairly straightforward process, rest assured that it was very complicated to accomplish.

The team flow chart

Since we were all at different experience levels with coding, Remi Wedin and I took on the Python portion while our other members handled the visualization portion. Although neither Remi nor I had any prior experience with Python, we figured that all we needed to do was get over the learning curve for the language. I enlisted the help of a computer science major, Robert Steele, to teach me the basics of Python and to search for a suitable Twitter scraper. During this time, we found multiple promising code samples on GitHub and Medium, but none of them could accomplish exactly what my team needed.

The snippet of code below is an example from KennethRietz on GitHub that Robert and I worked with. This script actually uses the nltk and markovify libraries to “get_tweets” rather than Tweepy. In this example, we scraped tweets from President Donald Trump’s Twitter account that contained the phrase “fake news.” Unsurprisingly, there was a decent return of tweets from the past few months, and we were able to create a mock-up visualization, shown at the bottom, using matplotlib.pyplot. Although this looked promising at first, after some time experimenting with the library we could not find a function that would let us simply search for a general hashtag that is not linked to a specific user.

First test with parsing for Tweets using kennethrietz’s code with Robert
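The core of that first test, keeping only the tweets that contain a phrase and counting them over time, can be sketched roughly as follows. This is an illustration rather than the script shown in the screenshot: the sample records are made-up placeholders, and the function name `count_phrase_by_month` is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Made-up placeholder records standing in for scraped tweets: (timestamp, text).
SAMPLE_TWEETS = [
    (datetime(2019, 1, 5), "placeholder tweet mentioning Fake News"),
    (datetime(2019, 1, 20), "placeholder tweet about something else"),
    (datetime(2019, 2, 2), "another placeholder with FAKE NEWS in it"),
    (datetime(2019, 2, 14), "fake news, says this placeholder"),
]


def count_phrase_by_month(tweets, phrase):
    """Count tweets per (year, month) whose text contains `phrase`, ignoring case."""
    needle = phrase.lower()
    counts = Counter()
    for created_at, text in tweets:
        if needle in text.lower():
            counts[(created_at.year, created_at.month)] += 1
    # Sort by month so the result is ready to plot in chronological order.
    return dict(sorted(counts.items()))


monthly = count_phrase_by_month(SAMPLE_TWEETS, "fake news")
print(monthly)
```

The resulting month-to-count mapping is the kind of data a simple bar chart can be drawn from.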

From here on out I worked with the Tweepy library, and I found a great example from Mikael Brunila that would, in theory, parse Twitter data under a specific hashtag and return the following: the tweet, the user ID, the screen name, the number of tweets that user posted with that hashtag, and the user’s location. With this data, we planned to tally the number of tweets with a hashtag like #womensmarch, #metoo, or #blacklivesmatter that were posted in certain regions and compare those numbers with reported attendance at marches in the same areas. Unfortunately, both Remi and I had issues running the code, even after Remi acquired the proper access keys from Twitter. By the time I was able to get past some of the typos in the example, scrape, and finally gather the data into a JSON file, most of the fields we needed came back null, which was of no use to us.

Working with Mikael Brunila’s example
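To make the problem with null fields concrete, here is a small sketch of the kind of tally described above. It is not taken from the example in the screenshot: the records are invented, the field names `screen_name` and `location` are assumptions about how the scraped data might be keyed, and `summarize` is a hypothetical helper.

```python
from collections import Counter

# Invented records; None and "" stand for the missing values that made the data unusable.
SAMPLE_RECORDS = [
    {"screen_name": "user_a", "location": "Chicago, IL", "text": "#womensmarch today"},
    {"screen_name": "user_b", "location": None, "text": "#womensmarch"},
    {"screen_name": "user_a", "location": "Chicago, IL", "text": "marching #womensmarch"},
    {"screen_name": "user_c", "location": "", "text": "#womensmarch signs"},
]


def summarize(records):
    """Return (tweets per user, tweets per location, number of tweets with no location)."""
    per_user = Counter(r["screen_name"] for r in records)
    # Only records with a non-empty location can be assigned to a region.
    per_location = Counter(r["location"] for r in records if r.get("location"))
    missing = sum(1 for r in records if not r.get("location"))
    return dict(per_user), dict(per_location), missing


per_user, per_location, missing = summarize(SAMPLE_RECORDS)
print(per_user, per_location, missing)
```

When the `missing` count is close to the total, as it was in our case, a region-by-region comparison is not possible.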

Both frustrated and concerned, I knew that I had to find a solution quickly or else no one would be able to create their visualizations. I worried that I was holding the team up, so I was determined to get a working script running by the end of Easter weekend so that we would have something to show our colleagues at our next class period. Earlier, Remi had discovered a way to import data from a CSV file into Processing and create a plot. This discovery let us re-imagine our project. Rather than working with the number of tweets in specific locations and comparing that to physical attendance, our team could simply work with the density of tweets in general. That way, I would only need to scrape the text and the total number of tweets under certain hashtags and convert those into CSV files to be imported into Processing. We could then visualize differences in online support for different movements, and even changes in the amount of support over time. With this focus, our project turned into a statement about the unstable and unequal support for movements on social media rather than one about slacktivism. After a fairly extensive search, I finally found an example from a user named vickyqian on GitHub called Twitter-Crawler, which did exactly what I needed.

The working script that I titled “GoldenChild”
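The CSV step of a script like this can be sketched as below. This is a minimal illustration, not the GoldenChild script itself: it assumes each tweet object exposes `created_at` and `text` attributes, uses a made-up `Tweet` stand-in with placeholder text instead of calling Twitter, and the name `write_tweets_csv` is hypothetical. In practice the iterable would come from whatever search call the scraper provides.

```python
import csv
import os
import tempfile
from collections import namedtuple
from datetime import datetime

# Stand-in for a scraped tweet; real tweet objects are assumed to expose the same two attributes.
Tweet = namedtuple("Tweet", ["created_at", "text"])


def write_tweets_csv(tweets, path):
    """Write one row per tweet (timestamp, text) and return the number of tweets written."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["created_at", "text"])  # header row with column names
        for tweet in tweets:
            writer.writerow([tweet.created_at.isoformat(), tweet.text])
            count += 1
    return count


sample = [
    Tweet(datetime(2019, 4, 16, 9, 30), "Placeholder tweet one #NotreDame"),
    Tweet(datetime(2019, 4, 16, 10, 5), "Placeholder tweet two, with a comma #NotreDame"),
]
csv_path = os.path.join(tempfile.gettempdir(), "notredame_sample.csv")
total = write_tweets_csv(sample, csv_path)
print(total, "tweets written to", csv_path)
```

Using the `csv` module rather than joining strings by hand keeps tweets that contain commas or quotes in a single column, and the returned count gives the total number of tweets for the hashtag.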

The Diagrams

When choosing hashtags, I picked ones tied to specific movements or events that blew up on Twitter in particular. I collected data from the following hashtags: #WomensMarch, #Flint, #NotreDame, #TheyAreUs (which is related to the recent New Zealand shooting), and #SriLanka. I originally hoped that my team would be able to visualize the coverage of some of these events and draw interesting inferences from the comparison, for example the burning of a European church versus the shootings in POC churches.

Unfortunately, there was one fatal flaw in my code that prevented us from achieving what we wanted. Because we were not Premium Twitter developers (which we discovered would cost $100 a month), we could only access tweets from the 7 days before we ran the scripts. This meant that the comparisons we wanted to make would be restricted to the same time frame, rather than analyzing each hashtag during the period when it was most popular. Despite this, we still created visualizations based on what could be achieved if we had access to the full archive.
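The effect of the 7-day limit can be shown with a short date calculation. This is only a sketch: the run date is hypothetical, the event dates are the publicly reported dates of the Notre-Dame fire, the Sri Lanka bombings, and the New Zealand shooting rather than values from our data, and the function names are made up.

```python
from datetime import date, timedelta

SEARCH_WINDOW_DAYS = 7  # the limit described above for non-premium access


def earliest_searchable(run_date, window_days=SEARCH_WINDOW_DAYS):
    """Oldest date whose tweets can still be reached when scraping on `run_date`."""
    return run_date - timedelta(days=window_days)


def is_reachable(event_date, run_date, window_days=SEARCH_WINDOW_DAYS):
    """True if tweets posted on `event_date` fall inside the search window."""
    return earliest_searchable(run_date, window_days) <= event_date <= run_date


run_date = date(2019, 4, 22)  # hypothetical day the script is run
event_dates = {
    "#NotreDame": date(2019, 4, 15),
    "#SriLanka": date(2019, 4, 21),
    "#TheyAreUs": date(2019, 3, 15),
}
reachable = {tag: is_reachable(day, run_date) for tag, day in event_dates.items()}
print(reachable)
```

Under these assumptions, the peak day for #TheyAreUs is more than a month outside the window, so only the tail end of that conversation could be collected.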

Visual 1

This diagram shows the data we collected from all of the hashtags and divides the total number of tweets into one segment per hashtag for comparison.

Code Can Be Found Here
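The arithmetic behind a segmented diagram like this one is a share-of-total calculation, sketched here in Python for consistency with the scraping step. The totals are invented for illustration and are not our collected counts, and `segment_shares` is a hypothetical name.

```python
def segment_shares(totals):
    """Convert per-hashtag tweet totals into fractions of the grand total."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid dividing by zero when nothing was collected.
        return {tag: 0.0 for tag in totals}
    return {tag: count / grand_total for tag, count in totals.items()}


# Invented totals, chosen only to make the arithmetic easy to follow.
totals = {
    "#WomensMarch": 50,
    "#Flint": 150,
    "#NotreDame": 300,
    "#TheyAreUs": 100,
    "#SriLanka": 400,
}
shares = segment_shares(totals)
for tag, share in shares.items():
    print(f"{tag}: {share:.0%}")
```

Each fraction can then be scaled to whatever the drawing needs, such as a segment width in pixels or an arc angle.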

Visual 2

This diagram compares the number of tweets posted with the hashtag #NotreDame or #SriLanka during a specific time period.

Code Can Be Found Here
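A comparison like this depends on counting tweets per day for each hashtag within a chosen period. The sketch below shows one way that counting could be done before the data reaches the drawing code. The rows are invented, the (hashtag, ISO timestamp) layout is an assumption about how the CSV might be organized, and `daily_counts` is a hypothetical helper.

```python
from collections import Counter
from datetime import date, datetime

# Invented rows in an assumed (hashtag, ISO timestamp) layout.
ROWS = [
    ("#NotreDame", "2019-04-15T19:00:00"),
    ("#NotreDame", "2019-04-15T21:30:00"),
    ("#NotreDame", "2019-04-16T08:00:00"),
    ("#SriLanka", "2019-04-21T10:00:00"),
    ("#SriLanka", "2019-04-22T09:00:00"),
    ("#SriLanka", "2019-04-30T09:00:00"),  # falls outside the period used below
]


def daily_counts(rows, hashtag, start, end):
    """Count tweets per day for `hashtag` between `start` and `end`, inclusive."""
    counts = Counter()
    for tag, created_at in rows:
        day = datetime.fromisoformat(created_at).date()
        if tag == hashtag and start <= day <= end:
            counts[day.isoformat()] += 1
    return dict(sorted(counts.items()))


period = (date(2019, 4, 15), date(2019, 4, 22))
notre_dame = daily_counts(ROWS, "#NotreDame", *period)
sri_lanka = daily_counts(ROWS, "#SriLanka", *period)
print(notre_dame)
print(sri_lanka)
```

Plotting the two series on the same time axis makes it easy to see when attention to one event rises as attention to the other falls.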

Reflection

Although I wish we could have developed it further, I am still happy with the work my team and I produced in this project. Personally, I was glad that I was able to figure out how to work with Python and develop a basic understanding of the language. Because social media now plays such an integral role in both the exposure of stories and the support of social movements, our project aimed not only to visualize the discrepancy in online support for different events, but also to show how quickly support for one thing can die out with the emergence of another. Because social media moves so quickly, these changes are difficult to notice, so we hope that our visualizations show audiences just how powerful a part social media plays in determining where our attention is directed.

We live in a technologically infused world that greatly affects how we interact with and learn about one another. Because of this posthuman phenomenon, it is important to use the affordances of technology to, ironically, understand the impact of technology on news and on society’s awareness and support of movements. By visualizing the number of tweets certain movements gather on Twitter, we can start a discussion about what produces those numbers, learn how to attract greater attention for movements, and ask whether we can change how the current system of social media activism and news functions.

References

Worstall, Tim. (2013, March 23). More people have mobile phones than toilets. Forbes. Retrieved May 1, 2019, from https://www.forbes.com/sites/timworstall/2013/03/23/more-people-have-mobile-phones-than-toilets/#11fed9036569

Links and Websites Referenced or Used During the Creation of this Project

Hashtags and Movements

Code
