Data + Women’s Spaces

Media plays a large role in the lives of most people. It’s everywhere. Even if you don’t actively purchase magazines, you are exposed to the covers in daily life. They are at newsstands, in grocery stores, in waiting rooms, online and more. Intrigued by the messages embedded in magazine covers, Nathan Liang (psychology, statistics), Sandra Luksic (philosophy, political science) and Alexis Malone (statistics) sought out to understand how women are represented in media as a part of a research project in the Data+ program.

Data+ is one of the many summer research opportunities at Duke. It’s a 10-week program focused on data science that allows undergraduate students to explore different research topics using data-driven approaches. Students work collaboratively in small interdisciplinary teams and develop skills to marshal, analyze, and visualize data.

The team’s project, titled Women’s Spaces, focused on a primary research question: Which messages are pervasive in women’s and men’s magazines and how do these messages change over time, across magazines, and between different target audiences.

Together, the team analyzed 500+ magazine covers published between January 2010 and June 2018, from Cosmopolitan, Esquire, Essence, Good Housekeeping and Seventeen. They used image analysis, text analysis and sentiment analysis in order to understand how women are represented on the magazine covers.

To conduct image analysis the team used Microsoft Azure Face Detect with Python in order to identify cover models. This software accounted for perceived emotions, age and race. They also noted the race/ethnicity and hair length of the cover models. Their research revealed that excluding Essence, 85 percent of magazine covers were white and had below average body sizes. One specific thing they found was that men had a greater range of emotions while women seemed to always appear happy. Furthermore, there was less emotional variance among minorities and in general, no Asian men. However, they did note that there may have been a software bias in that Microsoft Azure may not have picked up as well on the emotions of minorities.

In order to conduct text analysis, the team had to self-type the text on the magazine covers because oftentimes the text on magazine covers was layered on top of images making it hard for software to detect. This reduced the number of magazines that they were able to analyze because it took up so much time. They then used a Term Frequency-Inverse Document Frequency (tf-idf) algorithm to determine both how often a term occurred on the cover how important a term was. Their results revealed several keywords associated with different magazines. Some of these include sex (Cosmopolitan),  curvy, beauty, and business (Essence), cooking, cleaning, and kitchen (GH), cute (Seventeen), and cars, America, and Barbeque (Esquire)

Tf-idf word cloud for all magazines

Lastly, they conducted a sentiment analysis. Sentiment analysis involved computationally identifying the opinions expressed in the magazine covers to determine their attitude on the topic being displayed. While sentiment libraries exist, there were not any that had magazine/advertising industry-specific sentiments and thus, were not usable for the research. As a result, the team created their own sentiment dictionary with categories like “positive,” “negative,” “sex,” “sell-words,” “appearance,” “home,” “professional,” “male” and “female.”

At the end of the summer, their main takeaway was that magazines tend to reinforce gender norms and stereotypes. The covers also backed up some of the established preconceived notions they had about magazines. However, they also discovered messages of empowerment. Interestingly, these were often connected to beauty as well as consumerism.

In a presentation, the team explained that one of the lessons they took away from the summer was that Data science is not objective, but biases are hard to spot. They noted that throughout the process they made sure to question their methodologies of analyzing data. It was particularly challenging to determine where the biases were coming into play: be it their questions, data sources or even understanding of feminism. Because of the interdisciplinary nature of the project, combining humanities with data science, the team was academically diverse. Luksic stated in the presentation that she, especially, came in skeptical of the idea that technology was assumed to be “objective”.

Luksic added, “It’s one thing to know, on a abstract level, that data science is not objective. It is another thing entirely to try to do or practice data science in a way that minimizes your subjectivities. Ultimately, we hope for a data science that can incorporate subjectivity in a way that emphasizes differences, such as between black-centered feminism and anti-black feminism.”

The discoveries made by the team play into a larger discussion about women’s roles in media and how that influences feminism and empowerment in relation to marketing and how that impacts women’s movements.

Luksic stated, “the versatility of data science allowed us to pursue multiple different paths with different conceptions of feminisms underlying them, which was exciting and empowering.”

By Anna Gotskind