Duke Research Blog

Following the people and events that make up the research community at Duke.

Category: Statistics

Who Gets Sick and Why?

During his presentation as part of the Chautauqua lecture series, Duke sociologist Dr. Tyson Brown explained his research exploring the ways racial inequalities affect a person’s health later in life. His project mainly looks at the Baby Boomer generation, Americans born between 1946 and 1964.

With incredible increases in life expectancy, from 47 years in 1900 to 79 today, elderly people are beginning to form a larger percentage of the population. However, among black people, the average life expectancy is three and a half years shorter.

“Many of you probably do not think that three and half years is a lot,” Brown said. “But imagine how much less time that is with your family and loved ones. In the end, I think all of us agree we want those extra three and a half years.”

Not only does the black population in America have shorter lives on average but they also tend to have sicker lives with higher blood pressures, greater chances of stroke, and higher probability of diabetes. In total, the number of deaths that would be prevented if African-American people had the same life expectancy as white people is 880,000 over a nine-year span. Now, the question Brown has challenged himself with is “Why does this discrepancy occur?”

Brown said he first concluded that health habits and behaviors do not create this life expectancy gap because white and black people have similar rates of smoking, drinking, and illegal drug use. He then decided to explore socioeconomic status. He discovered that as education increases, mortality decreases. And as income increases, self-rated health increases. He said that for every dollar a white person makes, a black person makes 59 cents.

This inequality in income points to the possible cause for the racial inequality in health, he said. Additionally, in terms of wealth instead of income, a black person has 6 cents compared to the white person’s dollar. Possibly even more concerning than this inconsistency is the fact that it has gotten worse, not better, over time. Before the 2006 recession, blacks had 10-12 cents of wealth for every white person’s dollar.

Brown believes that this financial stress is one of many stressors in black lives, alongside chronic stressors, everyday discrimination, traumatic events, and neighborhood disorder, all of which affect health.

Over time, these stressors create something called physiological dysregulation, otherwise known as wear and tear, through repeated activation of the stress response, he said. Recognition of the prevalence of these stressors in black lives has led to Brown’s next focus: the extent of the effect of stressors on health. For his data, he uses the Health and Retirement Study and self-rated health (proven to predict mortality better than physician evaluations). For his methods, he employs structural equation modeling. Racial inequalities in socioeconomic resources, stressors and biomarkers of physiological dysregulation collectively explain 87% of the health gap, with any number of causes capable of accounting for the remaining 13%.

Brown said his next steps include using longitudinal and macro-level data on structural inequality to understand how social inequalities “get under the skin” over a person’s lifetime. He suggests that the next steps for society, organizations, and the government to decrease this racial discrepancy rest in changing economic policy, increasing wages, guaranteeing work, and reducing residential segregation.

Post by Lydia Goff

New Blogger Daniel Egitto: Freshman and Aspiring Journalist

Hi, I’m Daniel Egitto, a freshman at Duke with an intended major in English. I’m from Florida, and I spent the better part of my childhood growing up in some small, quiet suburbs surrounded by pretty much nothing but farms, rivers and untouched forest for acres and acres around. Out where I lived, it was nearly impossible to ever get more than a few miles from the wilderness that still covers a huge chunk of Florida today. Mazes of pine and oak forests made up my backyard, crisscrossed with bubbling springs and dotted with the occasional deer, coyote or alligator peeking out of the trees. It was there in those Florida woods, kayaking and hiking through some of America’s last wild places, that I first fell in love with the natural world and the conservationist issues facing our country today.


Incoming freshman Daniel Egitto is pursuing an English major for a future career in journalism.

Despite its treasure trove of both scientific and recreational gems, Florida has a truly terrible history of protecting natural heritage. Governor Rick Scott, for example, brought in a gag rule on the words “climate change” appearing in any state environmental document, while at the same time the well-being of those springs I came to know and love in my childhood has faced rising challenges due to unsustainable farming practices and water use policies. An unacceptable number of Americans are still unaware of both the struggles and opportunities this country’s biodiversity has always offered, and because of this I have come to develop a passion for both science education and topical journalism in general.

In high school my experiences led me to reach out into my community, engaging with children about basic scientific concepts at a local robotics camp and “Science Saturdays” series. I also became heavily involved with my school’s newly-founded newspaper, where I helped shift its focus onto important yet poorly-publicized struggles of both our society and our world as a whole.

As I enter my first year on Duke’s campus, I hope to work with the Duke Research Blog to further both my interests and my goals. I’m currently pursuing a future career in journalism, and by working with Duke Research I hope we can all help nurture a more informed and understanding world.

In addition to my work with this blog, I also intend to get involved with the Chronicle and Me Too Monologues on campus.

Pinpointing Where Durham’s Nicotine Addicts Get Their Fix

DURHAM, N.C. — It’s been five years since Durham expanded its smoking ban beyond bars and restaurants to include public parks, bus stops, even sidewalks.

While smoking in the state overall may be down, 19 percent of North Carolinians still light up, particularly the poor and those without a high school or college diploma.

Among North Carolina teens, consumption of electronic cigarettes in particular more than doubled between 2013 and 2015.

Now, new maps created by students in the Data+ summer research program show where nicotine addicts can get their fix.

Studies suggest that tobacco retailers are disproportionately located in low-income neighborhoods.

Living in a neighborhood with easy access to stores that sell tobacco makes it easier to start young and harder to quit.

The end result is that smoking, secondhand smoke exposure, and smoking-related diseases such as lung cancer, are concentrated among the most socially disadvantaged communities.


If you’re poor and lack a high school or college diploma, you’re more likely to live near a store that sells tobacco. Photo from Pixabay.

Where stores that sell tobacco are located matters for health, but for many states such data are hard to come by, said Duke statistics major James Wang.

Tobacco products bring in more than a third of in-store sales revenue at U.S. convenience stores — more than food, beverages, candy, snacks or beer. Despite big profits, more than a dozen states don’t require businesses to get a special license or permit to sell tobacco. North Carolina is one of them.

For these states, there is no convenient spreadsheet from the local licensing agency identifying all the businesses that sell tobacco, said Duke undergraduate Nikhil Pulimood. Previous attempts to collect such data in Virginia involved searching for tobacco retail stores by car.

“They had people physically drive across every single road in the state to collect the data. It took three years,” said team member and Duke undergraduate Felicia Chen.

Led by UNC PhD student in epidemiology Mike Dolan Fliss, the Duke team tried to come up with an easier way.

Instead of collecting data on the ground, they wrote an automated web-crawler program to extract the data from the Yellow Pages websites, using a technique called Web scraping.

By telling the software the type of business and location, they were able to create a database that included the names, addresses, phone numbers and other information for 266 potential tobacco retailers in Durham County and more than 15,500 statewide, including chains such as Family Fare, Circle K and others.
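The scraping step can be illustrated with a minimal sketch. It assumes a made-up page layout (the `listing`, `name`, `address` and `phone` class names are hypothetical, as are the two sample stores) and parses a string instead of fetching a live site; it is not the team’s actual crawler, and a real directory page would need its own selectors.

```python
from html.parser import HTMLParser

# Hypothetical markup and made-up stores, for illustration only.
SAMPLE_PAGE = """
<div class="listing">
  <span class="name">Corner Mart</span>
  <span class="address">101 Main St, Durham, NC</span>
  <span class="phone">919-555-0101</span>
</div>
<div class="listing">
  <span class="name">Quick Stop Fuel</span>
  <span class="address">202 Elm St, Durham, NC</span>
  <span class="phone">919-555-0102</span>
</div>
"""

FIELDS = {"name", "address", "phone"}


class ListingParser(HTMLParser):
    """Collects one dict per <div class="listing"> block."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls == "listing":
            self.records.append({})  # start a new business record
        elif tag == "span" and cls in FIELDS and self.records:
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside one of the known field spans.
        if self._field is not None:
            self.records[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape(page_html):
    """Return a list of {name, address, phone} dicts found in the page."""
    parser = ListingParser()
    parser.feed(page_html)
    return parser.records


records = scrape(SAMPLE_PAGE)
for record in records:
    print(record["name"], "|", record["address"], "|", record["phone"])
```

In practice the same parser would be fed one results page per business type and location, and the records would be appended to a single table.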

Map showing the locations of tobacco retail stores in Durham County, North Carolina.


When they compared their web-scraped data with a pre-existing dataset for Durham County, compiled by a nonprofit called Counter Tools, hundreds of previously hidden retailers emerged on the map.

To determine which stores actually sold tobacco, they fed a computer algorithm data from more than 19,000 businesses outside North Carolina so it could learn how to distinguish, say, convenience stores from grocery stores. When the algorithm received store names from North Carolina, it predicted tobacco retailers correctly 85 percent of the time.
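The post does not say which algorithm the team used. As one possible illustration of learning from store names, here is a small naive Bayes sketch over the words in a name; the training names and labels are invented for the example and are far smaller than the 19,000 businesses described above.

```python
import math
from collections import Counter

# Invented training data: (store name, sells tobacco?). Not the team's data.
TRAIN = [
    ("7-eleven #1024", True),
    ("circle k", True),
    ("quick stop tobacco", True),
    ("corner gas mart", True),
    ("fresh farm grocery", False),
    ("green leaf grocery", False),
    ("city hardware", False),
    ("sunny day bakery", False),
]


def tokens(name):
    """Lowercase a store name and split it into words."""
    return name.lower().split()


def train(examples):
    """Count how often each word appears in each class."""
    word_counts = {True: Counter(), False: Counter()}
    class_counts = Counter()
    for name, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokens(name))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, class_counts, vocab


def predict(model, name):
    """Return True if the name looks more like a tobacco retailer."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in (True, False):
        score = math.log(class_counts[label] / total)  # class prior
        # Add-one smoothing, with one extra slot for unseen words.
        denom = sum(word_counts[label].values()) + len(vocab) + 1
        for tok in tokens(name):
            score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return scores[True] > scores[False]


def accuracy(model, examples):
    """Fraction of held-out names whose label is predicted correctly."""
    hits = sum(predict(model, name) == label for name, label in examples)
    return hits / len(examples)


model = train(TRAIN)
HELD_OUT = [("7-Eleven #88", True), ("Fresh Grocery Outlet", False)]
print(accuracy(model, HELD_OUT))
```

The held-out check mirrors the setup in the post: train on names from one place, then measure how often predictions are right on names the model has not seen.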

“For example, we could predict that if a store has the word ‘7-Eleven’ in it, it probably sells tobacco,” Chen said.

As a final step, they also crosschecked their results by paying people a small fee to search for the stores online to verify that they exist, and call them to ask if they actually sell tobacco, using a crowdsourcing service called Amazon Mechanical Turk.

Ultimately, the team hopes their methods will help map the more than 336,000 tobacco retailers nationwide.

“With a complete dataset for tobacco retailers around the nation, public health experts will be able to see where tobacco retailers are located relative to parks and schools, and how store density changes from one neighborhood to another,” Wang said.

The team presented their work at the Data+ Final Symposium on July 28 in Gross Hall.

Data+ is sponsored by Bass Connections, the Information Initiative at Duke, the Social Science Research Institute, the departments of mathematics and statistical science and MEDx. This project team was also supported by Counter Tools, a non-profit based in Carrboro, NC.

Writing by Robin Smith; video by Lauren Mueller and Summer Dunsmore

Data Geeks Go Head to Head

For North Carolina college students, “big data” is becoming a big deal. The proof: signups for DataFest, a 48-hour number-crunching competition held at Duke last weekend, set a record for the third time in a row this year.

DataFest 2017

More than 350 data geeks swarmed Bostock Library this weekend for a 48-hour number-crunching competition called DataFest. Photo by Loreanne Oh, Duke University.

Expected turnout was so high that event organizer and Duke statistics professor Mine Cetinkaya-Rundel was even required by state fire code to sign up for “crowd manager” safety training — her certificate of completion is still proudly displayed on her Twitter feed.

Nearly 350 students from 10 schools across North Carolina, California and elsewhere flocked to Duke’s West Campus from Friday, March 31 to Sunday, April 2 to compete in the annual event.

Teams of two to five students worked around the clock over the weekend to make sense of a single real-world data set. “It’s an incredible opportunity to apply the modeling and computing skills we learn in class to actual business problems,” said Duke junior Angie Shen, who participated in DataFest for the second time this year.

The surprise dataset was revealed Friday night. Just taming it into a form that could be analyzed was a challenge. Containing millions of data points from an online booking site, it was too large to open in Excel. “It was bigger than anything I’ve worked with before,” said NC State statistics major Michael Burton.


The mystery data set was revealed Friday night in Gross Hall. Photo by Loreanne Oh.

Because of its size, even simple procedures took a long time to run. “The dataset was so large that we actually spent the first half of the competition fixing our crushed software and did not arrive at any concrete finding until late afternoon on Saturday,” said Duke junior Tianlin Duan.

The organizers of DataFest don’t specify research questions in advance. Participants are given free rein to analyze the data however they choose.

“We were overwhelmed with the possibilities. There was so much data and so little time,” said NCSU psychology major Chandani Kumar.

“While for the most part data analysis was decided by our teachers before now, this time we had to make all of the decisions ourselves,” said Kumar’s teammate Aleksey Fayuk, a statistics major at NCSU.

As a result, these budding data scientists don’t just write code. They form theories, find patterns, test hunches. Before the weekend is over they also visualize their findings, make recommendations and communicate them to stakeholders.

This year’s participants came from more than 10 schools, including Duke, UNC, NC State and North Carolina A&T. Students from UC Davis and UC Berkeley also made the trek. Photo by Loreanne Oh.

“The most memorable moment was when we finally got our model to start generating predictions,” said Duke neuroscience and computer science double major Luke Farrell. “It was really exciting to see all of our work come together a few hours before the presentations were due.”

Consultants were available throughout the weekend to help with any questions participants might have. Recruiters from both start-ups and well-established companies were also on site for participants looking to network or share their resumes.

“Even as late as 11 p.m. on Saturday we were still able to find a professor from the Duke statistics department at the Edge to help us,” said Duke junior Yuqi Yun, whose team presented their results in a winning interactive visualization. “The organizers treat the event not merely as a contest but more of a learning experience for everyone.”

Caffeine was critical. “By 3 a.m. on Sunday morning, we ended initial analysis with what we had, hoped for the best, and went for a five-hour sleep in the library,” said NCSU’s Fayuk, whose team DataWolves went on to win best use of outside data.

By Sunday afternoon, every surface of The Edge in Bostock Library was littered with coffee cups, laptops, nacho crumbs, pizza boxes and candy wrappers. White boards were covered in scribbles from late-night brainstorming sessions.

“My team encouraged everyone to contribute ideas. I loved how everyone was treated as a valuable team member,” said Duke computer science and political science major Pim Chuaylua. She decided to sign up when a friend asked if she wanted to join their team. “I was hesitant at first because I’m the only non-stats major in the team, but I encouraged myself to get out of my comfort zone,” Chuaylua said.

“I learned so much from everyone since we all have different expertise and skills that we contributed to the discussion,” said Shen, whose teammates were majors in statistics, computer science and engineering. Students majoring in math, economics and biology were also well represented.

At the end, each team was allowed four minutes and at most three slides to present their findings to a panel of judges. Prizes were awarded in several categories, including “best insight,” “best visualization” and “best use of outside data.”

Duke is among more than 30 schools hosting similar events this year, coordinated by the American Statistical Association (ASA). The winning presentations and mystery data source will be posted on the DataFest website in May after all events are over.

The registration deadline for the next Duke DataFest will be March 2018.


Bleary-eyed contestants pose for a group photo at Duke DataFest 2017. Photo by Loreanne Oh.


Post by Robin Smith

Young Scientists, Making the Rounds

“Can you make a photosynthetic human?!” an 8th grader enthusiastically asks me while staring at a tiny fern in a jar.

He’s not the only one who asked me that either — another student asked if Superman was a plant, since he gets his power from the sun.

These aren’t the normal questions I get about my research as a Biology PhD candidate studying how plants get nutrients, but they were perfect for the day’s activity: a science round-robin with Durham eighth-graders.

Biology grad student Leslie Slota showing Durham 8th graders some fun science.

After seeing a post under #scicomm on Twitter describing a public engagement activity for scientists, I put together a group of Duke graduate scientists to visit local middle schools and share our science with kids. We had students from biomedical engineering, physics, developmental biology, statistics, and many others — a pretty diverse range of sciences.

With help from David Stein at the Duke-Durham Neighborhood Partnership, we made connections with science teachers at the Durham School of the Arts and Lakewood Montessori school, and the event was in motion!

The outreach activity we developed works like speed dating, where people pair up, talk for 3-5 mins, and then rotate. We started out calling it “Science Speed Dating,” but for a middle school audience, we thought “Science Round-Robin” was more appropriate. Typically, a round-robin is a tournament where every team plays each of the other teams. So, every middle schooler got to meet each of us graduate students and talk to us about what we do.

The topics ranged from growing back limbs and mapping the brain, to using math to choose medicines and manipulating the different states of matter.

The kids were really excited for our visit, and kept asking their teachers for the inside scoop on what we did.

After much anticipation, and a little training and practice with Jory Weintraub from the Science & Society Initiative, two groups of 7-12 graduate students armed themselves with photos, animals, plants, and activities related to our work and went to visit these science classes full of eager students.

First-year MGM grad student Tulika Singh (top right) brought cardboard props to show students how antibodies match up with cell receptors.

“The kids really enjoyed it!” said Alex LeMay, middle- and high-school science teacher at the Durham School of the Arts. “They also mentioned that the grad students were really good at explaining ideas in a simple way, while still not talking down to them.”

That’s the ultimate trick with science communication: simplifying what we do, but not talking to people like they’re stupid.

I’m sure you’ve heard the old saying, “dumb it down.” But it really doesn’t work that way. These kids were bright, and often we found them asking questions we’re actively researching in our work. We don’t need to talk down to them, we just need to talk to them without all of the exclusive trappings of science. That was one thing the grad students picked up on too.

“It’s really useful to take a step back from the minutiae of our projects and look at the big picture,” said Shannon McNulty, a PhD candidate in Molecular Genetics and Microbiology.

The kids also loved the enthusiasm we showed for our work! That made a big difference in whether they were interested in learning more and asking questions. Take note, fellow scientists: share your enthusiasm for what you do, it’s contagious!

Another thing that worked really well was connecting with the students in a personal way. According to Ms. LeMay, “if the person seemed to like them, they wanted to learn more.” Several of the grad students would ask each student their names and what they were passionate about, or even talk about their own passions outside of their research, and these simple questions allowed the students to connect as people.

There was one girl who shared with me that she didn’t know what she wanted to do when she grew up, and I told her that’s exactly where I was when I was in 8th grade too. We then bonded over our mutual love of baking, and through that interaction she saw herself reflected in me a little bit, making a career in science seem like a possibility, which is especially important for a young girl with a growing interest in science.

Making the rounds in these science classrooms, we learned just as much from the students we spoke to as they did from us. Our lesson being: science outreach is a really rewarding way to spend our time, and who knows, maybe we’ll even spark someone who loves Superman to figure out how to make the first photosynthesizing super-person!

Guest post by Ariana Eily, PhD Candidate in Biology, shown sharing her floating ferns at left.

 

Would You Expect a 'Real Man' to Tweet "Cute" or Not?

There’s nothing cute about stereotypes, but as a species, we seem to struggle to live without them.

In a clever new study led by Jordan Carpenter, who is now a postdoctoral fellow at Duke, a University of Pennsylvania team of social psychologists and computer scientists figured out a way to test just how accurate our stereotypes about language use might be, using a huge collection of real tweets and a form of artificial intelligence called “natural language processing.”


Word clouds show the words in tweets that raters mistakenly attributed to Female authors (left) or Males (right). The larger the word appears, the more often the raters were fooled by it. Word color indicates the frequency of the word; gray is least frequent, then blue, and dark red is the most frequent. <url> means they used a link in their tweet.

Starting with a data set that included the 140-character bon mots of more than 67,000 Twitter users, they figured out the actual characteristics of 3,000 of the authors. Then they sorted the authors into piles using four criteria – male v. female; liberal v. conservative; younger v. older; and education (no college degree, college degree, advanced degree).

A random set of 100 tweets by each author over 12 months was loaded into the crowd-sourcing website Amazon Mechanical Turk. Intertubes users were then invited to come in and judge what they perceived about the author one characteristic at a time, like age, gender, or education, for 2 cents per rating. Some folks just did one set, others tried to make a day’s wage.

The raters were best at guessing politics, age and gender. “Everybody was better than chance,” Carpenter said. When guessing at education, however, they were worse than chance.

Jordan Carpenter is a newly-arrived Duke postdoc working with Walter Sinnott-Armstrong in philosophy and brain science.


“When they saw the word S*** [this is a family blog folks, work with us here] they most often thought the author didn’t have a college degree. But where they went wrong was they overestimated the importance of that word,” Carpenter said. Raters seemed to believe that a highly-educated person would never tweet the S-word or the F-word. Unfortunately, not true! “But it is a road to people thinking you’re not a Ph.D.,” Carpenter wisely counsels.

The raters were 75 percent correct on gender, by assuming women would be tweeting words like Love, Cute, Baby and My, interestingly enough. But they got tricked most often by assuming women would not be talking about News, Research or Ebola or that the guys would not be posting Love, Life or Wonderful.

Female authors were slightly more likely to be liberal in this sample of tweets, but not as much as the raters assumed. Conservatism was viewed by raters as a male trait. Again, generally true, but not as much as the raters believed.

Youthful authors were correctly perceived to be more likely to namedrop a @friend, or say Me and Like and a few variations on the F-bomb, but they could throw the raters for a loop by using Community, Our and Original.

And therein lies the social psychology takeaway from all this: “An accurate stereotype should be one with accurate social judgments of people,” but clearly every stereotype breaks down at some point, leading to “mistaken social judgement,” Carpenter said. Just how much stereotypes should be used or respected is a hot area of discussion within the field right now, he said.

The other value of the paper is that it developed an entirely new way to apply the tools of Big Data analysis to a social psychology question without having to invite a bunch of undergraduates into the lab with the lure of a Starbucks gift card. Using tweets stripped of their avatars or any other identifier ensured that the study was testing what people thought of just the words, nothing else, Carpenter said.

The paper is “Real Men Don’t Say ‘Cute’: Using Automatic Language Analysis To Isolate Inaccurate Aspects Of Stereotypes.” You can see the paper in Social Psychological and Personality Science, if you have a university IP address and your library subscribes to Sage journals. Otherwise, here’s a press release from the journal. (DOI: 10.1177/1948550616671998)

Post by Karl Leif Bates

Diabetes — and Privacy — Meet 'Big Data'

“Click here to consent forever.”

If consent to participate in medical research were that simple, Joanna Radin of Yale University would have to find a new focus for her research, and I would never have found the Trent Center for Bioethics, Humanities & History of Medicine.

Luckily for us both, this is not the case. Medical consent is a very complex issue that can, as Radin’s research attests, traverse generations.

Joanna Radin’s research focuses on the intersection of medical history, anthropology and ethics at Yale University. Source: Yale School of Medicine

Radin is an Associate Professor of Medical History at Yale, the perfect fit for the Humanities in Medicine Lecture Series taking place this month at the Trent Center. Her research nails the narrow intersection of medical history, anthropology, bioethics and data analytics. In fact, Radin’s appeal is so broad that her visit to Duke was sponsored by no fewer than six Duke departments, including the Departments of Computer Science, History, Electrical and Computer Engineering, Cultural Anthropology and Statistical Science.

Radin’s lecture homed in on a well-known case in the realm of bioethics and medical history: the Pima Native American tribe in Arizona, which is known for unusually high rates of diabetes and obesity. The Pima were the first Native American tribe to be granted a reservation in Arizona, of 30,000 acres, at the beginning of the California Gold Rush. In 1963, following nearly half a century of mass famine among the Pima, the National Institutes of Health (NIH) conducted a survey for rheumatoid arthritis in the Pima tribe, instead discovering a frighteningly high frequency of diabetes.

In 1965, the NIH initiated a long-term observational study of the Pima that continued for about 40 years, though it was meant to last no more than 10. The goal of the study was to learn about diabetes in the “natural laboratory” of sorts that the Pima reservation unwittingly provided. The data collected in this study came to be known as the Pima Indian Diabetes Data set (PIDD).

Machine learning enters the story around 1987, when David Aha and colleagues at the University of California, Irvine (UCI) created the UCI Machine Learning Repository, an archive containing thousands of data sets, databases and data generators. The repository is still active today, virtually a gold mine for researchers in machine learning to test their algorithms. The PIDD is one of the oldest data sets on file in the UCI archive, “a standard for testing data mining algorithms for accuracy in predicting diabetes,” according to Radin.


A Pima farmer in Pima, Arizona, circa 1900. Source: Wikimedia Commons

Generations’ worth of data on the Pima tribe have been publicly accessible in the UCI archive for over two decades, creating ethical controversy around the accessibility of information as personal as blood pressure, body mass index (BMI) and number of pregnancies of Pima Native Americans. Though the PIDD can help refine machine learning algorithms that could accurately predict—and prevent—diabetes, the privacy issues provoked by the publicness of the data are impossible to ignore.

This is where “eternal” medical consent enters the equation: no researcher can realistically inform a study participant of what their medical data will be used for 40 years in the future.

These are the interdisciplinary questions that Radin brought forth in her lecture, weaving together seemingly opposite fields of study in an engaging, thought-provoking presentation. No one who left that room will look at the Apple Terms & Conditions the same way again.

 

Post by Maya Iskandarani

Walla Scores Grand Prize at 17th Annual Start-Up Challenge

The finalists of Duke’s 17th Annual Start-Up Challenge have found time between classes, homework, and West Union runs to research and develop pitches aiming to solve real-world problems with entrepreneurship. The event, hosted last week at the Fuqua School of Business, featured a Trinity alum as the keynote speaker. Beating out the other seven start-up pitches for the $50,000 Grand Prize was Walla, an app founded by Judy Zhu, a Pratt senior.

Judy Zhu and the Walla team pose with their $50,000 check, which is giant in more ways than one.


Walla aims to create a social health platform for college students by addressing widespread loneliness and creating a more inclusive campus community. The app’s users post open invitations to activities, from study groups to pick-up sports, allowing students to connect over shared interests.

Walla is closely tied to Duke Medicine, providing data from user activity to medical researchers. User engagement is analyzed to give professionals valuable information on mental health in young adults. The app currently has 700 monthly active users, with 3,000 anticipated within the next month, and many more as the app opens to other North Carolina colleges.

Tatiana Birgisson returned to Duke to talk about her own experience creating a business as an undergrad, one that won the Start-Up Challenge in 2013. Birgisson’s venture, MATI energy drink, was born in her Central Campus dorm room and, with the support of Duke I&E resources, became the major energy drink contender it is today, a healthy alternative to Monster or Red Bull.

The $2,500 Audience Choice award went to Ebb, an app designed to empower women on their periods by keeping them informed of physical and emotional symptoms throughout the course of their cycles, and creating a community through which menstruating women can receive support from those they choose to share information with.

Tatiana Birgisson won the 2013 startup challenge with an energy drink brewed in her dorm room, now sold as MATI.


Other finalists included BioMetrix, a wearable platform for injury prevention; GoGlam, an application to connect working women with beauticians in Latin America; Grow With Nigeria, which provides engaging STEM experiences for students in Nigeria; MedServe; Tiba Health; and Teraphic.

This year’s Start-Up Challenge was a major success, with innovative entrepreneurs coming together to share their projects on changing the world. Be sure to come out next year; I’ll post an invite on Walla!

Post by Devin Nieusma

Students Mine Parking Data to Help You Find a Spot

No parking spot? No problem.

A group of students has teamed up with Duke Parking and Transportation to explore how data analysis and visualization can help make parking on campus a breeze.

As part of the Information Initiative’s Data+ program, students Mitchell Parekh (’19) and Morton Mo (’19), along with IIT student Nikhil Tank (’17), spent 10 weeks over the summer poring over parking data collected at 42 of Duke’s permitted lots.

Under the mentorship of graduate student Nicolas-Aldebrando Benelli, they identified common parking patterns across the campus, with the goal of creating a “redirection” tool that could help Duke students and employees figure out the best place to park if their preferred lot is full.

A map of parking patterns at Duke

To understand parking patterns at Duke, the team created “activity” maps, where each circle represents one of Duke’s parking lots. The size of the circle indicates the size of the lot, and the color of the circle indicates how many people entered and exited the lot within a given hour.

“We envision a mobile app where, before you head out for work, you could check your lot on your phone,” Mo said, speaking with Parekh at the Sept. 23 Visualization Friday Forum. “And if the lot is full, it would give you a pass for an alternate lot.”

Starting with parking data gathered in Fall 2013, which logged permit holders “swiping” in and out from each lot, they set out to map some basic parking habits at Duke, including how full each lot is, when people usually arrive, and how long they stay.

However, the data weren’t always very agreeable, Mo said.

“One of the things we got was a historical occupancy count, which is exactly what we wanted – the number of cars in the facility at a given time – but we were seeing negative numbers,” said Mo. “So we figured that table might not be as trustworthy as we expected it to be.”

Other unexpected features, such as “passback,” which occurs when two cars enter or exit under the same pass, also created challenges with interpreting the data.

However, with some careful approximations, the team was able to estimate the occupancy of lots on campus at different times throughout an average weekday.
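To picture the kind of approximation involved, here is a minimal Python sketch. It is not the team’s actual method. It tallies swipes per hour and keeps a running count that never drops below zero, since a negative total points to missed entries rather than a negative number of cars. The (hour, direction) record layout, the function name, and the sample swipes are all hypothetical.

```python
from collections import defaultdict


def hourly_occupancy(events, start_count=0):
    """Estimate cars in one lot at the end of each hour from swipe events.

    events: iterable of (hour, direction) pairs, direction is "in" or "out".
    This record layout is an assumption made for illustration.
    """
    net = defaultdict(int)
    for hour, direction in events:
        net[hour] += 1 if direction == "in" else -1

    occupancy = {}
    count = start_count
    for hour in sorted(net):
        # Clamp at zero: a negative running total means some entries were
        # never logged (for example, passback), not that cars are "owed".
        count = max(0, count + net[hour])
        occupancy[hour] = count
    return occupancy


# Hypothetical swipes: the 7 a.m. exit has no matching entry in the log.
swipes = [(7, "out"), (8, "in"), (8, "in"), (9, "in"), (9, "out"), (17, "out")]
estimate = hourly_occupancy(swipes)
print(estimate)
```

Clamping is only one possible correction. It hides the bad record instead of explaining it, so a real analysis would likely also flag the hours where it kicked in.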

They then built an interactive, MATLAB-based tool that would suggest up to three alternative parking locations based on the user’s location and travel time plus the utilization and physical capacity of each lot.
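The ranking step might look something like the sketch below. It is written in Python purely for illustration and is not the team’s code. The Lot fields, the 95 percent “full” threshold, the tie-breaking rule, and every lot name other than Blue are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Lot:
    name: str
    capacity: int          # physical number of spaces
    occupied: int          # estimated cars parked right now
    travel_minutes: float  # travel time from the user's location


def suggest_lots(lots, preferred, max_suggestions=3, full_threshold=0.95):
    """Return names of up to max_suggestions alternative lots, nearest first."""
    candidates = [
        lot for lot in lots
        if lot.name != preferred
        and lot.capacity > 0
        and lot.occupied / lot.capacity < full_threshold  # skip nearly full lots
    ]
    # Rank by travel time; break ties in favor of the emptier lot.
    candidates.sort(key=lambda lot: (lot.travel_minutes, lot.occupied / lot.capacity))
    return [lot.name for lot in candidates[:max_suggestions]]


# Hypothetical snapshot of lots at one moment in the day.
lots = [
    Lot("Blue", 100, 100, 2),
    Lot("A", 50, 20, 5),
    Lot("B", 80, 79, 3),
    Lot("C", 60, 30, 4),
    Lot("D", 40, 10, 9),
    Lot("E", 30, 5, 12),
]
suggestions = suggest_lots(lots, preferred="Blue")
print(suggestions)
```

Here lot B is the closest alternative but is filtered out for being almost full, which is the sort of trade-off between distance and utilization the tool has to make.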

“Duke Parking is really happy with the interface that we built, and they want us to keep working on it,” Parekh said.

“The data team worked hard on real world challenges, and provided thoughtful insights to those challenges,” said Kyle Cavanaugh, Vice President of Administration at Duke. “The team was terrific to work with and we look forward to future collaboration.”

Hectic class schedules allowing, the team hopes to continue developing their application into a more user-friendly tool. You can watch a recording of Mo and Parekh’s Sept. 23 presentation here.

The team's algorithm recommends up to three alternative lots if a commuter's preferred lot is full. In this video, suggested alternatives to the blue lot are updated throughout the day to reflect changing traffic and parking patterns. Video courtesy of Nikhil Tank.

Kara J. Manke, PhD

Post by Kara Manke

Is Durham's Revival Pricing Some Longtime Residents Out?

When a 2015 national report on gentrification released its results for the nation’s 50 largest cities, both Charlotte and Raleigh, North Carolina’s two biggest cities, made the list.

The result was a collection of maps and tables indicating whether various neighborhoods in each city had gentrified or not, based on changes in home values and other factors from 1990 to the present.

Soon Durham residents, business owners, policy wonks and others will have easy access to similar information about their neighborhoods too, thanks to planned updates to a web-based mapping tool called Durham Neighborhood Compass.

Two Duke students are part of the effort. For ten weeks this summer, undergraduates Anna Vivian and Vinai Oddiraju worked with Neighborhood Compass Project Manager John Killeen and Duke economics Ph.D. student Olga Kozlova to explore real-world data on Durham’s changing neighborhoods as part of a summer research program called Data+.

As a first step, they looked at recent trends in the housing market and business development.

Durham real estate and businesses are booming. A student mapping project aims to identify the neighborhoods at risk of pricing longtime residents out. Photo by Mark Moz.

Call it gentrification. Call it revitalization. Whatever you call it, there’s no denying that trendy restaurants, hotels and high-end coffee shops are popping up across Durham, and home values are on the rise.

Integrating data from the Secretary of State, the Home Mortgage Disclosure Act and local home sales, the team analyzed data for all houses sold in Durham between 2010 and 2015, including list and sale prices, days on the market, and owner demographics such as race and income.

They also looked at indicators of business development, such as the number of business openings and closings per square mile.

A senior double majoring in physics and art history, Vivian brought her GIS mapping skills to the project. Junior statistics major Oddiraju brought his know-how with computer programming languages.

To come up with averages for each neighborhood or Census block group, they first converted every street address in their dataset into latitude and longitude coordinates on a map, using a process called geocoding. The team then created city-wide maps of the data using GIS mapping software.
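As a rough illustration of the averaging step that follows geocoding, the Python sketch below assigns already-geocoded sales to areas and averages their prices. Everything in it is an assumption made for the example: the team used GIS software, real Census block groups are polygons rather than the rectangles used here, and the coordinates and prices are invented.

```python
from collections import defaultdict
from statistics import mean


def average_by_area(sales, areas):
    """Average price per area.

    sales: list of (lat, lon, price) tuples, already geocoded.
    areas: dict of area name -> (min_lat, max_lat, min_lon, max_lon).
    Rectangles stand in for real block-group polygons to keep this short.
    """
    prices = defaultdict(list)
    for lat, lon, price in sales:
        for name, (min_lat, max_lat, min_lon, max_lon) in areas.items():
            if min_lat <= lat < max_lat and min_lon <= lon < max_lon:
                prices[name].append(price)
                break  # each sale counts toward one area only
    return {name: mean(values) for name, values in prices.items()}


# Hypothetical areas and sales; the last sale falls outside both areas.
areas = {
    "north": (36.00, 36.05, -78.95, -78.90),
    "south": (35.95, 36.00, -78.95, -78.90),
}
sales = [
    (36.01, -78.92, 200000),
    (36.02, -78.93, 300000),
    (35.97, -78.91, 150000),
    (35.00, -78.91, 999999),
]
averages = average_by_area(sales, areas)
print(averages)
```

Sales that land in no area are silently dropped in this sketch; a real pipeline would probably want to count them, since failed or imprecise geocodes can skew neighborhood averages.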

One of their maps shows the average listing price of homes for sale between 2014 and 2015. In the area around Duke University’s East Campus, between Broad Street and Buchanan Boulevard, housing prices went up by $40,000 in a single year, the biggest spike in the city.

Their web app shows that more businesses opened in downtown and in south Durham than in other parts of the city.

Duke students are developing a web app that allows users to see the number of new businesses that have been opening across Durham. The data will appear in future updates to a web-based mapping tool called Durham Neighborhood Compass.

They also used a programming language called “R” to build an interactive web app that enables users to zoom in on specific neighborhoods and see the number of new businesses that opened, compare a given neighborhood to the average for Durham County as a whole, or toggle between years to see how things changed over time.

The Durham Neighborhood Compass launched in 2014. The tool uses data from local government, the Census Bureau and other state and federal agencies to monitor nearly 50 indicators related to quality of life and access to services.

When it comes to gentrification, users can already track neighborhood-by-neighborhood changes in race, household income, and the percentage of households that are paying 30 percent or more of their income for housing — more than many people can afford.

Vivian and Oddiraju expect the scripts and methods they developed will be implemented in future updates to the tool.

When they are, the team hopes users will be able to compare the average initial asking price to the final sale price to identify neighborhoods where bidding has been the highest, or see how fast properties sell once they go on the market. Both are good indicators of how hot a neighborhood is.
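These two measures are simple to compute once sales are grouped by neighborhood. The Python sketch below is a hypothetical illustration, not the team’s scripts: the field names, the choice of a mean ratio and a median for days on market, and the sample sales are all assumptions.

```python
from statistics import mean, median


def market_heat(sales):
    """Summarize one neighborhood's sales.

    sales: list of dicts with list_price, sale_price and days_on_market.
    Returns the mean sale-to-list price ratio and the median days on market.
    """
    ratios = [s["sale_price"] / s["list_price"] for s in sales]
    days = [s["days_on_market"] for s in sales]
    return {
        "sale_to_list": mean(ratios),       # above 1.0 means homes sell over asking
        "median_days_on_market": median(days),
    }


# Hypothetical sales in a single neighborhood.
neighborhood_sales = [
    {"list_price": 200000, "sale_price": 210000, "days_on_market": 5},
    {"list_price": 250000, "sale_price": 250000, "days_on_market": 12},
    {"list_price": 300000, "sale_price": 285000, "days_on_market": 40},
]
heat = market_heat(neighborhood_sales)
print(heat)
```

A median is used for days on market so that one slow-selling house does not dominate the summary; a mean would work too, with that caveat.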

Visitors will also be able to compare the median income of people buying into a neighborhood to that of the people who already live there. This will help identify neighborhoods that are at risk of pricing out residents, especially renters, who have called the city home.

Vivian and Oddiraju were among more than 60 students who shared preliminary results of their work at a poster session on Friday, July 29 in Gross Hall.

Vivian plans to continue working on the project this fall, when she hopes to comb through additional data sets they didn’t get to this summer.

“One that I’m excited about is the data on applications for renovation permits and historic tax credits,” Vivian said.

She also hopes to further develop the web app to make it possible to look at multiple variables at once. “If sale prices are rising in areas where people have also filed lots of remodeling permits, for example, that could mean that they’re flipping those houses,” Vivian said.

Data+ is sponsored by the Information Initiative at Duke, the Social Sciences Research Institute and Bass Connections. Additional funding was provided by the National Science Foundation via a grant to the departments of mathematics and statistical science.

Writing by Robin Smith; video by Sarah Spencer and Ashlyn Nuckols
