Duke Research Blog

Following the people and events that make up the research community at Duke.

Category: Data

Style Recommendations From Data Scientists

A combination of data science and psychology is behind the recommendations for products we get when shopping online.

At the intersection of social psychology, data science and fashion is Amy Winecoff.

Amy Winecoff uses her background in psychology and neuroscience to improve recommender systems for shopping.

After earning a Ph.D. in psychology and neuroscience here at Duke, Winecoff spent time teaching before moving over to industry.

Today, Winecoff works as a senior data scientist at True Fit, a company that provides tools to retailers to help them decide what products they suggest to their customers.

True Fit’s software relies on collecting data about how clothes fit people who have bought them. With this data on size and type of clothing, True Fit can make size recommendations for a specific consumer looking to buy a certain product.    

In addition to recommendations on size, True Fit is behind many sites’ recommendations of products similar to those you are browsing or have bought.

While these recommender systems have been shown to work well for sites like Netflix, where you may have watched many different movies and shows in the recent past that can be used to make recommendations, Winecoff points out that this can be difficult for something like pants, which people don’t tend to buy in bulk.

To overcome this barrier, True Fit has engineered its system, called the Discovery engine, to parse a single piece of clothing into fifty different traits. With this much information, making recommendations for similar styles can be easier.
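To make the idea of comparing garments by their traits concrete, here is a minimal sketch. It is not True Fit's Discovery engine: the trait names, the catalog entries and the choice of cosine similarity are all assumptions made for illustration, and a real system would use many more traits (the post mentions fifty).

```python
from math import sqrt

# Hypothetical traits; 1.0 means the garment has the trait, 0.0 means it does not.
TRAITS = ["denim", "slim_fit", "high_rise", "dark_wash", "stretch"]

# Hypothetical catalog: one trait vector per garment, in the order of TRAITS.
CATALOG = {
    "jeans_a": [1.0, 1.0, 1.0, 1.0, 0.0],
    "jeans_b": [1.0, 1.0, 0.0, 1.0, 1.0],
    "chinos_c": [0.0, 1.0, 0.0, 0.0, 1.0],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two trait vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(item_id, catalog):
    """Return the other items in the catalog, most similar first."""
    target = catalog[item_id]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in catalog.items()
        if other != item_id
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored]


print(most_similar("jeans_a", CATALOG))
```

Because every garment is described by the same list of traits, a shopper who has bought only one pair of pants still gives the system something to compare against.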

However, Winecoff’s background in social psychology has led her to question how well these algorithms make predictions that are in line with human behavior. She argues that understanding how people form their preferences is an integral part of designing a system to make recommendations.

One way Winecoff tests how closely the predictions match human preferences is by running psychological studies that offer insight into how to fine-tune mathematically based recommendations.

To learn how people judge similarity between clothes, Winecoff designed an online study in which subjects are shown a piece of clothing and told the garment is out of stock. They are then presented with two options and must pick one to replace the out-of-stock item. By varying one aspect in each of the two choices, such as color, pattern, or skirt length, Winecoff and her colleagues can distinguish which traits are most salient to a person when determining similarity.
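One way such forced-choice responses could be tallied is sketched below. The trial data and the scoring rule are assumptions, not the study's actual analysis: the sketch treats a trait as more salient when shoppers tend to reject the option that changed it.

```python
from collections import Counter

# Hypothetical trials: (trait changed in option A, trait changed in option B, option chosen).
TRIALS = [
    ("color", "pattern", "A"),
    ("color", "length", "A"),
    ("pattern", "length", "B"),
    ("color", "pattern", "A"),
]


def salience_scores(trials):
    """For each trait, the fraction of its appearances in which the option
    that altered that trait was rejected."""
    rejected = Counter()
    shown = Counter()
    for trait_a, trait_b, choice in trials:
        shown[trait_a] += 1
        shown[trait_b] += 1
        # The trait changed in the rejected option counts toward salience.
        rejected[trait_b if choice == "A" else trait_a] += 1
    return {trait: rejected[trait] / shown[trait] for trait in shown}


print(salience_scores(TRIALS))
```

In this made-up data, shoppers always avoid the option with a different pattern, so pattern scores highest.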

Winecoff’s work illustrates the power of combining algorithmic recommendations with insights from social psychology, and it shows that science reaches into unexpected places, like influencing your shopping choices.

Post by undergraduate blogger Sarah Haurin

Magazine Covers Hew to Stereotypes, But Also Surprise

Data + Women’s Spaces

Media plays a large role in the lives of most people. It’s everywhere. Even if you don’t actively purchase magazines, you are exposed to the covers in daily life. They are at newsstands, in grocery stores, in waiting rooms, online and more. Intrigued by the messages embedded in magazine covers, Nathan Liang (psychology, statistics), Sandra Luksic (philosophy, political science) and Alexis Malone (statistics) set out to understand how women are represented in media as part of a research project in the Data+ program.

Data+ is one of the many summer research opportunities at Duke. It’s a 10-week program focused on data science that allows undergraduate students to explore different research topics using data-driven approaches. Students work collaboratively in small interdisciplinary teams and develop skills to marshal, analyze, and visualize data.

The team’s project, titled Women’s Spaces, focused on a primary research question: Which messages are pervasive in women’s and men’s magazines, and how do these messages change over time, across magazines, and between different target audiences?

Together, the team analyzed 500+ magazine covers published between January 2010 and June 2018, from Cosmopolitan, Esquire, Essence, Good Housekeeping and Seventeen. They used image analysis, text analysis and sentiment analysis in order to understand how women are represented on the magazine covers.

To conduct image analysis, the team used Microsoft Azure Face Detect with Python to identify cover models. This software accounted for perceived emotions, age and race. The team also noted the race/ethnicity and hair length of the cover models. Their research revealed that, excluding Essence, 85 percent of cover models were white and had below-average body sizes. One specific finding was that men showed a greater range of emotions, while women almost always appeared happy. Furthermore, there was less emotional variance among minorities and, in general, no Asian men appeared on the covers. However, they did note that there may have been a software bias, in that Microsoft Azure may not have picked up as well on the emotions of minorities.

To conduct text analysis, the team had to transcribe the text on the magazine covers by hand, because the text was often layered on top of images, making it hard for software to detect. This took so much time that it reduced the number of magazines they were able to analyze. They then used a term frequency-inverse document frequency (tf-idf) algorithm to determine both how often a term occurred on a cover and how important that term was. Their results revealed several keywords associated with different magazines. Some of these include sex (Cosmopolitan); curvy, beauty, and business (Essence); cooking, cleaning, and kitchen (Good Housekeeping); cute (Seventeen); and cars, America, and barbecue (Esquire).

Tf-idf word cloud for all magazines
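For readers curious how tf-idf picks out a magazine's signature words, here is a minimal sketch. The cover lines are invented and the weighting is the plain textbook form (term frequency times the log of inverse document frequency); the team's actual code and settings are not described in the post.

```python
import math
from collections import Counter

# Hypothetical cover text, one "document" per magazine.
COVERS = {
    "magazine_a": "best summer style and style tips",
    "magazine_b": "easy kitchen recipes and kitchen tips",
    "magazine_c": "summer cars and grilling",
}


def tf_idf(docs):
    """Score each term in each document: frequent here, rare elsewhere scores high."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for words in tokenized.values():
        doc_freq.update(set(words))

    scores = {}
    for name, words in tokenized.items():
        counts = Counter(words)
        scores[name] = {
            term: (count / len(words)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def top_term(scores, name):
    """The highest-scoring term for one document."""
    return max(scores[name], key=scores[name].get)


scores = tf_idf(COVERS)
print(top_term(scores, "magazine_a"), top_term(scores, "magazine_b"))
```

A word like "and" appears on every cover, so its inverse document frequency is zero and it drops out, while a word repeated on one magazine's covers and absent from the others rises to the top.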

Lastly, they conducted a sentiment analysis, which involved computationally identifying the opinions expressed on the magazine covers to determine their attitude toward the topic being displayed. While sentiment libraries exist, none had sentiments specific to the magazine and advertising industry, so they were not usable for the research. As a result, the team created their own sentiment dictionary with categories like “positive,” “negative,” “sex,” “sell-words,” “appearance,” “home,” “professional,” “male” and “female.”
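A dictionary-based approach like this can be sketched in a few lines. The category names below come from the team's list, but the word lists and the sample cover line are made up for illustration, and the team's real dictionary was presumably far larger.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon: categories from the post, word lists invented.
LEXICON = {
    "positive": {"best", "love", "happy"},
    "sell-words": {"free", "new", "win"},
    "appearance": {"hair", "skin", "beauty"},
    "home": {"kitchen", "cleaning", "cooking"},
}


def categorize(cover_text, lexicon):
    """Count how many words on a cover fall into each lexicon category."""
    words = re.findall(r"[a-z']+", cover_text.lower())
    counts = Counter()
    for word in words:
        for category, vocabulary in lexicon.items():
            if word in vocabulary:
                counts[category] += 1
    return dict(counts)


print(categorize("New! Best hair of your life: free beauty guide", LEXICON))
```

Choosing which words belong in which category is a judgment call, which is one place the subjectivity the team describes can enter the analysis.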

At the end of the summer, their main takeaway was that magazines tend to reinforce gender norms and stereotypes. The covers also backed up some of the preconceived notions they had about magazines. However, they also discovered messages of empowerment. Interestingly, these were often connected to beauty as well as consumerism.

In a presentation, the team explained that one of the lessons they took away from the summer was that data science is not objective, but biases are hard to spot. They noted that throughout the process they made sure to question their methodologies of analyzing data. It was particularly challenging to determine where the biases were coming into play: be it their questions, data sources or even understanding of feminism. Because of the interdisciplinary nature of the project, combining humanities with data science, the team was academically diverse. Luksic stated in the presentation that she, especially, came in skeptical of the assumption that technology is “objective.”

Luksic added, “It’s one thing to know, on an abstract level, that data science is not objective. It is another thing entirely to try to do or practice data science in a way that minimizes your subjectivities. Ultimately, we hope for a data science that can incorporate subjectivity in a way that emphasizes differences, such as between black-centered feminism and anti-black feminism.”

The team’s discoveries play into a larger discussion about women’s roles in media, how those roles shape feminism and empowerment in marketing, and how that in turn affects women’s movements.

Luksic stated, “the versatility of data science allowed us to pursue multiple different paths with different conceptions of feminisms underlying them, which was exciting and empowering.”

By Anna Gotskind

Math on the Basketball Court

Boston Celtics data analyst David Sparks, Ph.D., really knew his audience Thursday, November 8, when he gave a presentation centered around the two most important themes at Duke: basketball and academics. He gave the crowd hope that you don’t have to be a Marvin Bagley III to make a career out of basketball. In fact, you don’t have to be an athlete at all; you can be a mathematician.

David Sparks (photo from Duke Political Science)

Sparks loves basketball, and he spends every day watching games and practices for his job. What career fits this description, you might ask? After graduating from Duke in 2012 with a Ph.D. in Political Science, Sparks went to work for the Boston Celtics as the Director of Basketball Analytics. His job entails analyzing basketball data and building statistical models to help the team win.

The most important statistic when looking at basketball data is efficiency, both offensive and defensive, Sparks told the audience gathered for the “Data Dialogue” series hosted by the Information Initiative at Duke. Offensive efficiency translates to the number of points per possession, while defensive efficiency measures how poorly the team forced the other offense to perform. These are measured with four factors: effective field goal percentage (shots made per shot taken, with extra credit for three-pointers), turnover rate, successful rebound percentage, and foul rate. By looking at these four factors for both offensive and defensive efficiency, Sparks can figure out which areas are lacking and share with the coach where there is room for improvement. “We all agree that we want to win, and the way you win is through efficiency,” Sparks said.
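As a rough illustration of how these numbers can be computed from a box score, here is a sketch using widely used public formulas. Sparks did not share the Celtics' actual models, so the possession estimate (with its conventional 0.44 free-throw weight), the exact definition of each factor, and the sample stat line are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class BoxScore:
    """One team's totals for a game (hypothetical field names)."""
    points: int
    fgm: int      # field goals made
    fga: int      # field goals attempted
    fg3m: int     # three-pointers made
    fta: int      # free throws attempted
    tov: int      # turnovers
    orb: int      # offensive rebounds
    opp_drb: int  # opponent's defensive rebounds


def possessions(b):
    """Common box-score estimate of possessions; 0.44 is a conventional weight."""
    return b.fga + 0.44 * b.fta - b.orb + b.tov


def four_factors(b):
    """Shooting, turnovers, rebounding and free throws, each as a rate."""
    return {
        "efg_pct": (b.fgm + 0.5 * b.fg3m) / b.fga,   # threes count 1.5x
        "tov_rate": b.tov / possessions(b),           # turnovers per possession
        "orb_pct": b.orb / (b.orb + b.opp_drb),       # share of available offensive boards
        "ft_rate": b.fta / b.fga,                     # how often the team gets to the line
    }


def offensive_efficiency(b):
    """Points scored per estimated possession."""
    return b.points / possessions(b)


game = BoxScore(points=110, fgm=40, fga=80, fg3m=10, fta=25, tov=12, orb=10, opp_drb=30)
print(four_factors(game))
print(round(offensive_efficiency(game), 3))
```

Defensive efficiency is the same calculation applied to the opponent's box score.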

Since there is not a lot of room for improvement in the short windows between games during the regular season, a large component of Sparks’ job involves informing the draft and how the team should run practices during preseason.

David Sparks wins over his audience by showing Duke basketball clips to illustrate a point. Sparks spoke as part of the “Data Dialogue” series hosted by the Information Initiative at Duke.

Data collection these days is done by computer software. Synergy Sports Technology, the dominant data provider in professional basketball, has installed cameras in all 29 NBA arenas. These cameras are constantly watching and coding plays during games, tracking the locations of each player and the movements of the ball. They can count the number of times the ball was touched and determine how long it was possessed each time, or recognize screens and calculate the height at which rebounds are grabbed. This software has revolutionized basketball analytics, because the implication of computer coding is that data scientists like Sparks can go back and look for new things later.

The room leaned in eagerly as Sparks finished his presentation, intrigued by the profession that is interdisciplinary at its core — an unlikely combination of sports and applied math. If math explains basketball, maybe we can all find a way to connect our random passions in the professional sphere.

Meet Dr. Sandra K. Johnson, Engineering “Hidden Figure”

When Dr. Sandra K. Johnson first tried her hand at electrical engineering during a summer institute in high school, she knew that she was born to be an electrical engineer. Now, as the first African-American woman to receive a Ph.D. in computer engineering in the United States, Johnson visited Duke to share her story as a “hidden figure” and inspire not just black women, but all students not to be discouraged by obstacles they may face in pursuit of their passion.

Though she did discuss her achievements, Johnson’s talk also made it clear that more than successes, it was the opposition she faced that most motivated her to persevere in electrical engineering. While pursuing a Master’s degree at Stanford, she met Dr. William Shockley, who in his free time was conducting research he believed would prove that African Americans were intellectually inferior to other races. Johnson had originally been planning on just finishing her program with a Master’s and then going into the workforce, but after hearing what this man was trying to prove, she decided she would prove to him that she was capable of doing anything that the non-black students in the same program could do. She finished the program with a Ph.D. in electrical engineering. She continued to make this declaration to anyone who didn’t believe she was capable: “before I leave this place, I will make a believer out of you.”

Dr. Johnson is the founder, CTO and CEO of Global Mobile Finance, Inc., a finance and tech startup based in Research Triangle Park, NC. Photo from BlackComputeHER.

While mapping out her own path to pursuing her goals, Johnson also firmly believed in making the path easier for other black people pursuing advanced degrees. When asked what the current generation of students could be doing to help themselves, she said to find mentors and to mentor others.

Johnson shared an anecdote of sitting in a lab at Stanford waiting to begin an experiment when a man walked up to her and said she was in the wrong place. After talking to him for several minutes and showing him that she knew even more about the subject than he did and was in the right place, she told him that the next time someone who looked like her walked into the lab, not to be so sure of himself.

Johnson went on to become an IBM Fellow, an IEEE Fellow, and a member of the prestigious Academy of Electrical Engineers. At the end of her talk, Johnson discussed what she believes is the best way to expedite change: to have people of color as founders and CEOs of major corporations that have the power to increase minority representation in their workforce. This is what she intends to do with her own company, Global Mobile Finance, Inc. If her current track record is any indication, there is no doubt her company will become a major corporation in the years to come, opening more doors for black women and other minorities pursuing their passions.

Post by Victoria Priester

Cracking the Code on Credit Cards at Datathon 2018

Anyone who has ever tried to formulate and answer their own research question knows that it means entering uncharted waters. This past weekend the hundreds of students in Duke Datathon 2018 did just that, using only their computer science prowess and a splash of innovation.

Here’s how it worked: the students were provided three data sets by Credit Sesame, a free credit score estimator, and given eight hours to use their insight and computer science knowledge to interpret the data and create as much value for the company as they could. Along the way, Duke Undergraduate Machine Learning (DUML), the organization hosting the event, provided mentors and workshops to help the participants find direction and achieve their goals. 

Datathon participants attempting to derive meaning from the Credit Sesame Data

This year was the first such ‘Datathon’ event to take place at Duke. The event attracted big-name sponsors such as Google and Pinterest and was made possible by the DUML executive team, headed by co-presidents Rohith Kuditipudi and Shrey Gupta.

DUML faculty advisor Dr. Rebecca Steorts said that even the planning of the event transcended disciplines: one of her undergraduate students and co-president of DUML, Shrey Gupta, found a way to utilize statistics to predict how many people would be attending. “It’s all about finding computational ways of combining disciplines to solve the problem,” Steorts said, and it’s very apparent that her students have taken this to heart.

The winning team (Jie Cai, Catie Grasse, Feroze Mohideen) presenting on how they can best gauge which customers are most “valuable” to Credit Sesame

After more than an hour of deliberations, the eight top teams were selected and five finalists were asked to present their findings to the judges. The winning team (Jie Cai, Catie Grasse, Feroze Mohideen) proposed a way to gauge which customers who create trial accounts are most likely to be profitable, by using a computer filtering program to predict likely customer engagement based on customer-supplied data and their interaction with the free trial. Other top teams discussed similar topics with different variations on how Credit Sesame might best create this profile to determine who the “valuable” customers are likely to be.

DUML hosts other events throughout the year to engage students, such as its MLBytes Speaker Series and ECE Seminar Series.

by Rebecca Williamson


Coding: A Piece of Cake


Imagine a cake, your favorite cake. Has your interest been piqued?

“Start with Cake” has proved an effective teaching strategy for Mine Cetinkaya-Rundel in her introductory statistics classes. In her talk “Teaching Computing via Visualization,” she lays out her classroom approaches to helping students maintain an interest in coding despite its difficulty. Just like a cooking class, a taste of the final product can motivate students to master the process. Cetinkaya-Rundel, therefore, believes that instead of having students begin with the flour and sugar and milk, they should dive right into the sweet frosting. While bringing cake to the first day of class has a great success rate for increasing a class’s attention span (they’ll sugar crash in their next classes, no worries), what this statistics professor actually refers to is showing the final visualizations. When students are given large amounts of pre-written code and only one or two steps to complete during the first few class periods, they can immediately recognize coding’s potential. The possibilities become exciting and capture their attention so that fewer students attempt to vanish with the magic of drop/add period. For the student unsure about coding, immediately writing their own code can seem overwhelming and steal the joy of creating.

Example of a visualization Cetinkaya-Rundel uses in her classes

To accommodate students with less background in coding, Cetinkaya-Rundel believes that skipping the baby steps proves a better approach than slowing the pace. By jumping straight into larger projects, students can spend more time wrestling with their code and discovering the best strategies rather than memorizing the definition of a histogram. The idea is to give the students everything on day one, and then slowly remove the pre-written code until they are writing on their own. The traditional classroom approach, by contrast, involves teaching students line-by-line until they have enough to create the desired visualizations. While Cetinkaya-Rundel admits that her style may not suit every individual and creating the assignments does require more time, she stands by her eat-dessert-first perspective on teaching. Another way she helps students maintain their original curiosity is by making the most of day one with pre-installed packages, which allow students to start playing with visualizations and altering code right away.

Not only does Cetinkaya-Rundel give mouth-watering cakes as the end results for her students but she also sometimes shows them burnt and crumbling desserts. “People like to critique,” she explains as she lays out how to motivate students to begin writing original code. When she gives her students a sloppy graph and tells them to fix it, they are more likely to find creative solutions and explore how to make the graph most appealing to them. As the scaffolding falls away and students begin diverging from the style guides, Cetinkaya-Rundel has found that they have a greater understanding of and passion for coding. A spoonful of sugar really does help the medicine go down.  

Post by Lydia Goff

What Happens When Data Scientists Crunch Through Three Centuries of Robinson Crusoe?

Reading 1,400-plus editions of “Robinson Crusoe” in one summer is impossible. So one team of students tried to train computers to do it for them.


Since Daniel Defoe’s shipwreck tale “Robinson Crusoe” was first published nearly 300 years ago, thousands of editions and spinoff versions have been published, in hundreds of languages.

A research team led by Grant Glass, a Ph.D. student in English and comparative literature at the University of North Carolina at Chapel Hill, wanted to know how the story changed as it went through various editions, imitations and translations, and to see which parts stood the test of time.

Reading through them all at a pace of one a day would take years. Instead, the researchers are training computers to do it for them.

This summer, Glass’ team in the Data+ summer research program used computer algorithms and machine learning techniques to sift through 1,482 full-text versions of Robinson Crusoe, compiled from online archives.

“A lot of times we think of a book as set in stone,” Glass said. “But a project like this shows you it’s messy. There’s a lot of variance to it.”

“When you pick up a book it’s important to know what copy it is, because that can affect the way you think about the story,” Glass said.

Just getting the texts into a form that a computer could process proved half the battle, said undergraduate team member Orgil Batzaya, a Duke double major in math and computer science.

The books were already scanned and posted online, so the students used software to download the scans from the internet, via a process called “scraping.” But processing the scanned pages of old printed books, some of which had smudges, specks or worn type, and converting them to a machine-readable format proved trickier than they thought.

The software struggled to decode the strange spellings (“deliver’d,” “wish’d,” “perswasions,” “shore” versus “shoar”), different typefaces between editions, and other quirks.

Special characters unique to 18th century fonts, such as the curious f-shaped version of the letter “s,” make even humans read “diftance” and “poffible” with a mental lisp.

Their first attempts came up with gobbledygook. “The resulting optical character recognition was completely unusable,” said team member and Duke senior Gabriel Guedes.

At a Data+ poster session in August, Guedes, Batzaya and history and computer science double major Lucian Li presented their initial results: a collection of colorful scatter plots, maps, flowcharts and line graphs.

Guedes pointed to clusters of dots on a network graph. “Here, the red editions are American, the blue editions are from the U.K.,” Guedes said. “The network graph recognizes the similarity between all these editions and clumps them together.”

Once they turned the scanned pages into machine-readable texts, the team fed them into a machine learning algorithm that measures the similarity between documents.

The algorithm takes in chunks of texts — sentences, paragraphs, even entire novels — and converts them to high-dimensional vectors.

Creating this numeric representation of each book, Guedes said, made it possible to perform mathematical operations on them. They added up the vectors for each book to find their sum, calculated the mean, and looked to see which edition was closest to the “average” edition. It turned out to be a version of Robinson Crusoe published in Glasgow in 1875.
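The "average edition" step can be sketched as follows. The three-dimensional vectors and edition names are invented stand-ins (real document embeddings have hundreds of dimensions), and Euclidean distance is an assumed choice of closeness measure, since the post does not say which one the team used.

```python
from math import sqrt

# Hypothetical 3-dimensional embeddings, one per edition.
EDITIONS = {
    "edition_a": [0.9, 0.1, 0.0],
    "edition_b": [0.5, 0.5, 0.1],
    "edition_c": [0.1, 0.9, 0.2],
}


def mean_vector(vectors):
    """Add the vectors component by component and divide by how many there are."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def euclidean(u, v):
    """Straight-line distance between two vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_to_average(editions):
    """Name of the edition whose vector lies nearest the mean of all editions."""
    centroid = mean_vector(list(editions.values()))
    return min(editions, key=lambda name: euclidean(editions[name], centroid))


print(closest_to_average(EDITIONS))
```

The "average" vector usually does not correspond to any real book, which is why the last step looks for the nearest actual edition.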

They also analyzed the importance of specific plot points in determining a given edition’s closeness to the “average” edition: what about the moment when Crusoe spots a footprint in the sand and realizes that he’s not alone? Or the time when Crusoe and Friday, after leaving the island, battle hungry wolves in the Pyrenees?

The team’s results might be jarring to those unaccustomed to seeing 300 years of publishing reduced to a bar chart. But by using computers to compare thousands of books at a time, “digital humanities” scholars say it’s possible to trace large-scale patterns and trends that humans poring over individual books can’t.

“This is really something only a computer can do,” Guedes said, pointing to a time-lapse map showing how the Crusoe story spread across the globe, built from data on the place and date of publication for 15,000 editions.

“It’s a form of ‘distant reading’,” Guedes said. “You use this massive amount of information to help draw conclusions about publication history, the movement of ideas, and knowledge in general across time.”

This project was organized in collaboration with Charlotte Sussman (English) and Astrid Giugni (English, ISS). Check out the team’s results at https://orgilbatzaya.github.io/pirating-texts-site/

Data+ is sponsored by Bass Connections, the Information Initiative at Duke, the Social Science Research Institute, the departments of Mathematics and Statistical Science and MEDx. This project team was also supported by the Duke Office of Information Technology.

Other Duke sponsors include DTECH, Duke Health, Sanford School of Public Policy, Nicholas School of the Environment, Development and Alumni Affairs, Energy Initiative, Franklin Humanities Institute, Duke Forge, Duke Clinical Research, Office for Information Technology and the Office of the Provost, as well as the departments of Electrical & Computer Engineering, Computer Science, Biomedical Engineering, Biostatistics & Bioinformatics and Biology.

Government funding comes from the National Science Foundation.

Outside funding comes from Lenovo, Power for All and SAS.

Community partnerships, data and interesting problems come from the Durham Police and Sheriff’s Department, Glenn Elementary PTA, and the City of Durham.

Videos by Paschalia Nsato and Julian Santos; writing by Robin Smith

Can’t Decide What Clubs to Join Outside of Class? There’s a Web App for That

With 400-plus student organizations to choose from, Duke has more co-curriculars than you could ever hope to take advantage of in one college career. Navigating the sheer number of options can be overwhelming. So how do you go about finding your niche on campus?

Now there’s a Web app for that: the Duke CoCurricular Eadvisor. With just a few clicks it comes up with a personalized ranked list of student clubs and programs based on your interests and past participation compared to others.

“We want it to be like the activity fair, but online,” said Duke computer science major Dezmanique Martin, who was part of a team of Duke undergrads in the Data+ summer research program who developed the “recommendation engine.”

“The goal is to make a web app that recommends activities like Netflix recommends movies,” said team member Alec Ashforth.

The project is still in the testing stage, but you can try it out for yourself, or add your student organization to the database, at https://eadvisorduke.shinyapps.io/login/

A “co-curricular” can be just about any learning experience that takes place outside of class and doesn’t count for credit, be it a student magazine, Science Olympiad or community service. Research shows that students who get involved on campus are more likely to graduate and thrive in the workplace post-graduation.

For the pilot version, the team compiled a list of more than 150 student programs related to technology. Each program was tagged with certain attributes.

Students start by entering a Net ID, major, and expected graduation date. Then they enter all the programs they have participated in at Duke so far, submit their profile, and hit “recommend.”

The e-advisor algorithm generates a ranked list of activities recommended just for the user.

The e-advisor might recognize that a student who did DataFest and HackDuke in their first two years likes computer science, research, technology and competitions. Based on that, the Duke Robotics Club might be highly recommended, while the Refugee Health Initiative would be ranked lower.

A new student can just indicate general interests by selecting a set of keywords from a drop-down menu. Whether it’s literature and humanities, creativity, competition, or research opportunities, the student and her advisor won’t have to puzzle over the options — the e-advisor does it for them.

The tool comes up with its recommendations using a combination of approaches. One, called content-based filtering, finds activities you might like based on what you’ve done in the past. The other, collaborative filtering, looks for other students with similar histories and tastes, and recommends activities they tried.
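Here is a minimal sketch of how the two approaches might be blended. It is not the e-advisor's code: the tags attached to each activity, the student histories, the Jaccard overlap measure and the 50/50 weighting are all assumptions for illustration.

```python
# Hypothetical tags for each activity (invented for this example).
ACTIVITY_TAGS = {
    "DataFest": {"computer science", "research", "competition"},
    "HackDuke": {"computer science", "technology", "competition"},
    "Duke Robotics Club": {"computer science", "technology", "competition"},
    "Refugee Health Initiative": {"service", "health"},
}

# Hypothetical participation histories.
HISTORIES = {
    "student_1": {"DataFest", "HackDuke"},
    "student_2": {"DataFest", "HackDuke", "Duke Robotics Club"},
    "student_3": {"Refugee Health Initiative"},
}


def jaccard(a, b):
    """Overlap between two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_score(history, activity, tags):
    """Content-based: compare the activity's tags with tags from past activities."""
    profile = set().union(*(tags[past] for past in history))
    return jaccard(profile, tags[activity])


def collaborative_score(user, activity, histories):
    """Collaborative: credit from similar students who did this activity."""
    mine = histories[user]
    return sum(
        jaccard(mine, theirs)
        for other, theirs in histories.items()
        if other != user and activity in theirs
    )


def recommend(user, histories, tags, weight=0.5):
    """Rank activities the user has not tried by a weighted blend of both scores."""
    mine = histories[user]
    candidates = [activity for activity in tags if activity not in mine]

    def score(activity):
        return (weight * content_score(mine, activity, tags)
                + (1 - weight) * collaborative_score(user, activity, histories))

    return sorted(candidates, key=score, reverse=True)


print(recommend("student_1", HISTORIES, ACTIVITY_TAGS))
```

Blending the two helps with new students: someone with no history gets no collaborative signal, but keywords chosen from a menu can still feed the content-based score.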

This could be a useful tool for advisors, too, noted Vice Provost for Interdisciplinary Studies Edward Balleisen, while learning about the EAdvisor team at this year’s Data+ Poster Session.

“With sole reliance on the app, there could be a danger of some students sticking with well-trodden paths, at the expense of going outside their comfort zone or trying new things,” Balleisen said.

But thinking through app recommendations along with a knowledgeable advisor “might lead to more focused discussions, greater awareness about options, and better decision-making,” he said.

Led by statistics Ph.D. candidate Lindsay Berry, so far the team has collected data from more than 80 students. Moving forward they’d like to add more co-curriculars to the database, and incorporate more features, such as an upvote/downvote system.

“It will be important for the app to include inputs about whether students had positive, neutral, or negative experiences with extra-curricular activities,” Balleisen added.

The system also doesn’t take into account a student’s level of engagement. “If you put Duke machine learning, we don’t know if you’re president of the club, or just a member who goes to events once a year,” said team member Vincent Liu, a rising sophomore majoring in computer science and statistics.

Ultimately, the hope is to “make it a viable product so we can give it to freshmen who don’t really want to know what they want to do, or even sophomores or juniors who are looking for new things,” said Brooke Keene, rising junior majoring in computer science and electrical and computer engineering.

Video by Paschalia Nsato and Julian Santos; writing by Robin Smith


Becoming the First: Nick Carnes

Editor’s Note: In the “Becoming the First” series, first-generation college student and Rubenstein Scholar Lydia Goff explores the experiences of Duke researchers who were the first in their families to attend college.

A portrait of Duke Professor Nick Carnes

Nick Carnes

Should we care that we are governed by professionals and millionaires? This is one of the questions Nick Carnes, an assistant professor in the Sanford School of Public Policy, seeks to answer with his research. He explores unequal social class representation in the political process and how it affects policy making. But do any real differences even exist between politicians from lower socioeconomic classes and those from the upper classes? Carnes believes they do, not only because of his research but also because of his personal experiences.

When Carnes entered Princeton University as a political science graduate student, he was the only member of his cohort who had done restaurant, construction or factory work. While obtaining his undergraduate degree from the University of Tulsa, he worked twenty hours a week and during the summer clocked in at sixty to seventy hours a week between two jobs. He considered himself and his classmates “similar on paper,” just as politicians from a variety of socioeconomic classes can appear comparable. However, Carnes noticed that he approached some problems differently than his classmates and wondered why. He attributed his distinct approach to his working-class background and to growing up without established college graduates in his family (his mother did go to college while he was growing up), and from there he began developing his current research interests.

Carnes considers “challenging the negative stereotypes about working class people” the most important aspect of his research. When he entered college, his first meeting with his advisor was filled with confusion as he tried to decipher what a syllabus was. While his working-class background did limit his knowledge of college norms, he overcame these limitations. He is now a researcher, writer, and professor who considers his job “the best in the world” and whose own story shows that working-class individuals can succeed in positions more often held by the experienced. As Carnes states, “There’s no good reason to not have working class people in office.” His research seeks to reinforce that.

His biggest challenge is that the data he needs are not well documented, so much of his research involves gathering data before he can generate results. His published book, White-Collar Government: The Hidden Role of Class in Economic Policy Making, and his book coming out in September, The Cash Ceiling: Why Only the Rich Run for Office–and What We Can Do About It, contain the data and results he has produced. Presently, he is beginning a project on transnational governments because “cash ceilings exist in every advanced democracy.” Carnes’ research makes the case that we should care that professionals and millionaires run our government. Through his story, he shows that students who come from families without generations of college graduates can still succeed.

Post by Lydia Goff

Artificial Intelligence Knows How You Feel

Ever wondered how Siri works? Afraid that super smart robots might take over the world soon?

On April 3, researchers from Duke, NCSU and UNC came together for Triangle Machine Learning Day to pique everyone’s curiosity about the complex field that is Artificial Intelligence. A.I. is an overarching term for smart technologies, ranging from self-driving cars to targeted advertising. We can arrive at artificial intelligence through what’s known as “machine learning.” Instead of explicitly programming a machine with the basic capabilities we want it to have, we can make its code flexible so that it adapts based on the information it’s presented with. Its knowledge grows as we train it. In other words, we’re teaching a computer to learn.
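For readers who like to see the idea in code, here is a minimal sketch of the difference between explicit programming and learning from examples. Everything in it is an assumption made up for illustration (the brightness readings, the function names, the "is it daytime?" task), and real machine learning systems are far more elaborate. The point is only that the same learning code arrives at a different rule when it is shown different data.

```python
# Toy illustration only: the data and names below are hypothetical.

def explicit_rule(brightness):
    """Explicit programming: a person picks the cutoff by hand."""
    return brightness > 0.5


def learn_cutoff(examples):
    """Learning in miniature: choose the cutoff that makes the fewest
    mistakes on labeled examples of (brightness, is_daytime)."""
    values = sorted(b for b, _ in examples)
    # Candidate cutoffs sit halfway between neighboring training values.
    candidates = [(lo + hi) / 2 for lo, hi in zip(values, values[1:])]

    def mistakes(cutoff):
        # Count examples where "brighter than cutoff" disagrees with the label.
        return sum((b > cutoff) != label for b, label in examples)

    return min(candidates, key=mistakes)


def predict(brightness, cutoff):
    """Apply a learned cutoff to a new reading."""
    return brightness > cutoff


# Readings from a (hypothetical) sensitive camera.
bright_camera = [(0.10, False), (0.20, False), (0.30, False),
                 (0.70, True), (0.80, True), (0.90, True)]

# Readings from a (hypothetical) dimmer camera: daytime looks darker here.
dim_camera = [(0.05, False), (0.10, False), (0.20, True), (0.30, True)]

bright_cutoff = learn_cutoff(bright_camera)
dim_cutoff = learn_cutoff(dim_camera)
```

A reading of 0.25 would count as night under the cutoff learned from the first camera and as day under the cutoff learned from the second, even though nobody rewrote the program in between. The hand-written `explicit_rule` would need a person to go in and change the 0.5.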

Matthew Philips is working with Kitware to get computers to “see,” also known as “machine vision.” Given thousands and thousands of images, a computer with the right code can learn to make sense of what an image shows beyond different colored pixels.

Machine vision has numerous applications. An effective way to search satellite imagery for arbitrary objects could be huge for the advancement of space technology: a satellite could potentially identify obscure objects or potential lifeforms that stick out in those images. This is something we as humans can’t do ourselves because of the sheer amount of data there is to go through. Similarly, we could teach a machine to identify cancerous cells in an image, giving us a quick diagnosis if someone is at risk of developing a disease.

The problem is, how do you teach a computer to see? Machines don’t easily understand things like similarity, depth or orientation — things that we as humans do automatically without even thinking about. That’s exactly the type of problem Kitware has been tackling.

One hugely successful piece of Artificial Intelligence you may be familiar with is IBM’s Watson. Labeled as “A.I. for professionals,” Watson was featured on 60 Minutes and even played Jeopardy! on live television. Watson has visual recognition capabilities, can work as a translator, and can even understand things like tone, personality or emotional state. And obviously it can answer crazy hard questions. What’s even cooler is that it doesn’t matter how you ask the question – Watson will know what you mean. Watson is basically Siri on steroids, and the world got a taste of its power after watching it smoke its competitors on Jeopardy! However, Watson is not to be thought of as a physical supercomputer. It is a collection of technologies that can be used in many different ways, depending on how you train it. This is what makes Watson so astounding – through machine learning, its knowledge can adapt to the context it’s being used in.

Source: CBS News.

IBM has been able to develop such a powerful tool thanks to data. Stacy Joines from IBM noted, “Data has transformed every industry, profession, and domain.” From our smartphones to fitness devices, data is being collected about us as we speak (see: digital footprint). While it’s definitely pretty scary, the point is that a lot of data is out there. The more data you feed Watson, the smarter it gets. IBM has combined this abundance of data with machine learning to produce some of the most sophisticated AI out there.

Sure, it’s a little creepy how much data is being collected on us. Sure, there are tons of movies and theories out there about how intelligent robots in the future will outsmart humans and take over. But A.I. isn’t a thing to be scared of. It’s a beautiful creation whose capabilities surpass those of even the most advanced explicitly programmed model. It’s joining the health care system to save lives, advising businesses and could potentially find a new habitable planet. What we choose to do with A.I. is entirely up to us.

Post by Will Sheehan
