Following the people and events that make up the research community at Duke


The Black Wealth Gap in Modern Day America


“White Americans have been provided with up escalators they can ride to reach their goals without hurdles. Meanwhile, Black Americans have been forced onto down escalators which they must run up to reach their destination.”

The Samuel DuBois Cook Center on Social Equity at Duke University recently released a striking report on Black wealth in America, entitled “Still Running Up the Down Escalator: How Narratives Shape our Understanding of Racial Wealth Inequality.” The 36-page report, written by Natasha Hicks, Fenaba Addo, Anne Price, and William Darity, examines the stark inequalities in the economic situation of Black Americans.

The cover page of the 36-page, in-depth report, published earlier this fall.

“Despite a decade of philanthropic investment and renewed attention from progressive elected officials, policymakers, and advocates, we have yet to make discernible progress in ensuring Black families have the power and freedom wealth bestows,” the report says (page 1).

“The typical Black household’s wealth (in 2019) was $24,100; for White households, it was $188,200. This translates into the typical Black household holding about 12 cents for every dollar of wealth held by the typical White family – a disparity that has remained largely unchanged since 1989 (Kent and Ricketts, 2020).” (page 6)

Black families are disproportionately shut out of access to opportunities that would improve generational wealth, such as home loans, business loans/ownership, and financial assets. Because of the long history of these inequalities, Black wealth in America has improved little in the last 10 years.

The report continues by analyzing how Covid-19, one of the worst pandemics in U.S. history, has widened the wealth gap in America.

“Racial wealth inequality remains a persistent defining American issue, particularly in the wake of the COVID-19 pandemic’s disproportionate toll on the physical and financial health of Black people,” the report says. “The COVID-19 pandemic and the corresponding economic crisis have only exacerbated what was already a collective failing by policymakers and elected officials, who continue to invest in solutions focused on individual behavior instead of systems change.”

Covid-19 pushed over 114 million people into unemployment over the course of the pandemic, with Black Americans overrepresented in those figures. The figures below were published in the report to highlight the liquid assets and wealth available to White families versus Black families in 2019, just one year before the pandemic.

This figure taken from the report shows the median liquid assets by race and income. (figure 1, page 8)
This figure taken from the report shows the median wealth accumulated by race and wealth quintiles. (figure 2, page 8)

As illustrated by these figures, the average White family in America maintains a financial leg up through both income and assets, which is why Black Americans were the ones disproportionately affected when the pandemic hit. With no substantial wealth or liquid assets to lean on, Black families fell further behind, and the economic wealth gap in America grew even wider.

The next part of the report discusses how false narratives about economic inequality in America lead to unsuccessful attempts at correction. It is a common assumption in America that the problems faced by Black Americans are cultural or personal issues rather than systemic ones.

“Harmful narratives that characterize Black Americans as unintelligent, lazy, and criminal reinforce the notion that racial wealth disparities between Black and White households arise from differences in culture, values, skills, and behavior.” (page 10) Themes of anti-Blackness and personal responsibility, or a bootstrap mentality, were key systemic factors noted in the report. These factors impacted almost every aspect of Black America, including education, homeownership, entrepreneurship, family structure, and income and employment.

The report concludes by bringing up tangible solutions for these structural problems.

“The past year of crises is exposing the fact that we created systems, rules, and policies that actively and intentionally harm Black people. In order to truly address racial wealth inequality and the impact of the COVID-19 crisis, policymakers and funders must move away from solutions focused on behavioral changes and individual choices. Rather, they must take bold actions (backed by large scale financial investments) to shift dominant narratives and reimagine economic structures that support, uplift and protect Black people.” (page 23)

The authors make four broad proposals: shift harmful narratives, eliminate the racial wealth gap, dismantle extractive policies, and design programs to seed intergenerational wealth.

Economic disparities in America are a systemic issue, not a cultural or personal one. This report examines the interplay between this issue and the current pandemic, maintaining that the only way to create tangible change is through systemic solutions.

“America offers a false promise of equal opportunity and individual agency. For Black Americans, making all the right choices does not equal all the right outcomes. Just as wealth-building for White people in America was by design and government action, we need intentional and structural wealth-building strategies for Black Americans with investments compared to those given to White Americans. This requires a paradigm shift to truly tackle racial wealth inequality.” (page 36)

Written by Skylar Hughes, Class of 2025

Dr. Laura Richman is Defining Health by its Social Determinants


In 2010, the Affordable Care Act sparked a nationwide debate on the extent of responsibility the American government has over our healthcare. But Dr. Laura Richman has been asking that question since long before that. 

Richman is a health psychologist. “I examine psychosocial factors that have an impact on health behaviors and health outcomes,” she explains, sitting across from me at the Law School café. (Neither of us were wearing a cardigan. It was rather hot outside). 

Laura Richman Ph.D. is an associate professor in population health sciences. (image: Scholars@Duke)

Richman is an associate professor at Duke in the Population Health Sciences, an associate of the Duke Initiative for Science & Society, and, coincidentally, my professor in the Science & the Public FOCUS cluster. She co-teaches the course Science, Law, and Policy with Dr. Yousef Zafar, in which we examine the social determinants of health through the lens of cancer screening, diagnosis, and treatment.

After graduating from the University of Virginia in 1997 with a Ph.D. in social psychology, Richman worked at a sort of think-tank for health professionals collaborating on social issues. This inspired her to pursue health research through the lens of social determinants.

“There was a lot of work on substance use, on mental health, on behavioral disorders. That certainly contributed to my continued interest in factors that have an influence on these [health] outcomes,” she said. 

Continuing in this work, she became a research associate at the School of Public Health at Harvard University. Richman described her time at Harvard as “exciting,” which is not a word used by many to describe empirical research environments. “Certainly there’s that really robust relationship between low income, low education, low job status and poor health outcomes, but a lot of those pathways — like the ones we talk about in class, Olivia — had not been studied.”

She’s referring to the public health concept of ‘upstream’ and ‘downstream’ solutions. (The river parable goes as follows: when you observe a trend in people drowning in a certain river, you are presented with different ways of solving the problem. You can start pulling people out of the river and saving them one at a time, which is called a “downstream” solution in public health. You can also prevent people from falling into the river, which is called an “upstream” solution.)

(courtesy of SaludAmerica!)

Richman’s professional research explores another crucial social determinant of health we discussed in class: perceived versus actual discrimination. She asked whether marginalization — objective or subjective — can affect functioning, “both psychologically and cognitively. Like, how does it affect their thought processes? Their decision-making? Then, how does that affect their health?” You can read her study here.

One thing I noted immediately was Richman’s affinity for creative research design. In a lab she headed at Duke, she conducted an experiment with a student that tested the aforementioned effect of marginalization on health decisions. After watching a video of, reading a passage about, or imagining members of their community experiencing discrimination, subjects were offered a choice between healthy and unhealthy snack options.

In one study we read for Science, Law, and Policy, the stress effect of discrimination towards Arabic-named individuals after 9/11 was measured through the birth outcomes of Arabic-named mothers pregnant during that time. When I asked her about this, she said, “Particularly working with students, I think that they just bring so much energy and creativity to the research. Surveys serve their purpose — I think they’re really important, but I think there are just lots of opportunities to do more with research designs and research questions. I like trying to approach things from a different angle.” 

Richman is also working on a book. She is studying relational health: health as shaped by the opioid epidemic, the obesity crisis, and the social isolation associated with aging. She hopes her project will be used in classrooms (and by the interested layman), and that the value of social determinants of health will be reflected in increased funding, more people interested in health disparities, more focus in medical education on the screening and referral system, and stimulating dialogue among people in positions of power at the policy level.

Post by Olivia Ares, Class of 2025

New Blogger Shariar Vaez-Ghaemi: Arts and Artificial Intelligence


Hi! My name is Shariar. My friends usually pronounce that as Shaw-Ree-Awr, and my parents pronounce it as a Share-Ee-Awr, but feel free to mentally process my name as “Sher-Rye-Eer,” “Shor-yor-ior-ior-ior-ior,” or whatever phonetic concoction your heart desires. I always tell people that there’s no right way to interpret language, especially if you’re an AI (which you might be).

Speaking of AI, I’m excited to study statistics and mathematics at Duke! This dream was born out of my high school research internship with New York Times bestselling author Jonah Berger, through which I immersed myself in the applications of machine learning to the social sciences. Since Dr. Berger and I completed our ML-guided study of the social psychology of communicative language, I’ve injected statistical learning techniques into my investigations of political science, finance, and even fantasy football.

Unwinding in the orchestra room after a performance

When I’m not cramped behind a Jupyter Notebook or re-reading a particularly long research abstract for the fourth time, I’m often pursuing a completely different interest: the creative arts. I’m an orchestral clarinetist and quasi-jazz pianist by training, but my proudest artistic endeavours have involved cinema. During high school, I wrote and directed three short films, including a post-apocalyptic dystopian comedy and a silent rendition of the epic poem “Epopeya de la Gitana.”

I often get asked whether there’s any bridge between machine learning and the creative arts*, to which the answer is yes! In fact, as part of my entry project for Duke-based developer team Apollo Endeavours, I created a statistical language model that writes original poetry. Wandering Mind, as I call the system, is just one example of the many ways that artificial intelligence can do what we once considered exclusively human tasks. The program isn’t quite as talented as Frost or Dickinson, but it’s much better at writing poetry than I am.

In a movie production (I’m the one wearing a Totoro onesie)

I look forward to presenting invigorating research topics to blog readers for the next year or more. Though machine learning is my scientific expertise, my investigations could transcend all boundaries of discipline, so you may see me passionately explaining biology experiments, environmental studies, or even macroeconomic forecasts. Go Blue Devils!

(* In truth, I almost never get asked this question by real people unless I say, “You know, there’s actually a connection between machine learning and arts.”)

By Shariar Vaez-Ghaemi, Class of 2025

‘Anonymous Has Viewed Your Profile’: All Networks Lead to Re-Identification


For half an hour this rainy Wednesday, October 6th, I logged on to a LinkedIn Live series webinar with Dr. Jiaming Xu from the Fuqua School of Business. I sat inside the bridge between Perkins and Bostock, my laptop connected to DukeBlue wifi. I had Instagram open on my phone and was tapping through friends’ stories while I waited for the broadcast to start. I had Google Docs open in another tab to take notes. 

The title of the webinar was “Can Anyone Truly Be Anonymous Online?” 

Xu spoke about “network privacy,” which is “the intersection of network analysis and data privacy.” When you make an account, connect to wifi, share your location, search something online, or otherwise hint at your personal information, you are creating a “user profile”: a network of personal data that hints at your identity. 

You are probably familiar with how social media companies track your decisions to curate a more engaging experience for you (i.e. the reason I scroll through TikTok for 5 minutes, then 30 minutes, then… Oh no! Two hours have gone by). Other companies track other kinds of data — data that isn’t always just for algorithmic manipulation or creepy-accurate Amazon ads (i.e. “Hey! I was just thinking about buying cat litter. How did Mr. Bezos know?”). Your name, work history, date of birth, address, location, and other critical identifying factors can be collected even if you think your profile is scrubbed clean. In a rather on-the-nose anecdote for his LinkedIn audience on Wednesday, Xu explained that in April 2021, over 500 million LinkedIn user profiles were hacked. Valuable, “sensitive, work-related data,” he noted, was made vulnerable.

Image courtesy of Flickr

So, what do you have to worry about? I know I tend to not worry about my personal information online; letting companies collect my data benefits me. I can get targeted Google ads about things I’m interested in and cool filters on Snapchat. In a medical setting, Xu said, prediction algorithms may help patients’ health in the long run. But even anonymized and sanitized data can be traced back to you. For further reading: in an essay published in July 2021, philosophers Evan Selinger and Judy Rhee elaborate on the dangers of “normalizing surveillance.”

The meat of Xu’s talk was how your data can be traced back to you. Xu gave three examples. 

The first was a study conducted by researchers at the University of Texas at Austin attempting to identify users submitting “anonymous” reviews for movies on Netflix (keep in mind this was 2007, so picture the red Netflix logo on the DVD box accordingly). To achieve this, they cross-referenced the network of reviews published by Netflix with the network of individuals signed up on IMDB; they matched those who reviewed movies similarly on both platforms with their public profiles on IMDB. You can read more about that specific study here. (For those unafraid of the full research paper, click here.)
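The cross-referencing idea can be sketched in a few lines of Python. This is only a toy illustration with invented users and ratings; the actual 2007 study used a far more robust statistical algorithm that weights rarely rated movies and tolerates noisy dates and scores.

```python
# Toy sketch of the de-anonymization idea: match an "anonymous" reviewer
# in one dataset to a named user in another by comparing how similarly
# they rated the same movies. All names and ratings here are invented.

def similarity(ratings_a, ratings_b):
    """Fraction of commonly rated movies on which the two users agree."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    agree = sum(1 for m in common if ratings_a[m] == ratings_b[m])
    return agree / len(common)

def best_match(anon_ratings, public_profiles):
    """Return the public user whose ratings best explain the anonymous ones."""
    return max(public_profiles,
               key=lambda u: similarity(anon_ratings, public_profiles[u]))

# "Anonymous" Netflix-style record (movie -> star rating)
anon = {"Heat": 5, "Fargo": 4, "Alien": 2}

# Public IMDB-style profiles
public = {
    "alice": {"Heat": 5, "Fargo": 4, "Alien": 2, "Jaws": 3},
    "bob":   {"Heat": 1, "Fargo": 2, "Jaws": 5},
}

print(best_match(anon, public))  # alice: she agrees on every shared movie
```

Even this crude overlap score picks out the right profile; with millions of users and thousands of rarely watched titles, a handful of matching ratings is often enough to pin down a single person.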

Let’s take a pause to learn a new vocab word! “Signatures.” In this example, the signature was users’ movie ratings. See if you can name the signature in the other two examples.

The second example was conducted by the same researchers; to identify users on Twitter who shared their data anonymously, it was simply a matter of cross-referencing the network of Twitter users with Flickr users. If you know a guy who knows a guy who knows a guy who knows a guy, you and that group of people are likely to initiate that same chain of following each other on every social media platform you have (it may remind you of the theory that you are connected by “six degrees of separation” from every person on the planet, which, as it turns out, is also supported by social media data). The researchers were able to identify the correct users 30.8% of the time. 

Time for another vocab break! Those users who connect groups of people who know a guy who know a guy who know a guy are called “seeds.” Speaking of which, did you identify the signature in this example? 

Image courtesy of Flickr

The third and final example was my personal favorite because it was the funkiest and most creative. Facebook user data — also “scrubbed clean” before being sold to third-party advertisers — was overlain with LinkedIn user data to reveal a network of repeated connections. How did they match up those networks, you ask? First, the algorithm computed a popularity score for every user based on how many Facebook friends they have, and another for every user based on how many LinkedIn connections they have. Then, each user was assigned a list of integers based on their friends’ popularity scores. Bet you weren’t expecting that.

This method sort of improves upon the Twitter/Flickr example, but in addition to overlaying networks and chains of users, it better matches who is who. Since you are likely to know a guy who knows a guy who knows a guy, but you are also likely to know all of those guys down the line, following specific chains does not always accurately convey who is who. Unlike the seeds signature, the friends’ popularity signature was able to correctly re-identify users most of the time. 
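The friends’-popularity signature can be sketched as a toy example. Everything below (the two graphs, the names, the distance measure) is invented for illustration; the real studies operate on networks of millions of users with far more careful statistics.

```python
# Minimal sketch of the "friends' popularity" signature: each user's
# signature is the sorted list of their friends' friend counts. Users in
# two networks are matched by finding the closest signature. The graphs
# and names below are invented.

def degrees(graph):
    """Friend count (degree) of every user in the network."""
    return {u: len(nbrs) for u, nbrs in graph.items()}

def signature(graph, user):
    """Sorted list of the user's friends' popularity (degree) scores."""
    deg = degrees(graph)
    return sorted(deg[f] for f in graph[user])

def match(user, graph_a, graph_b):
    """Find the user in graph_b whose signature is closest to user's in graph_a."""
    sig = signature(graph_a, user)
    def distance(v):
        other = signature(graph_b, v)
        n = max(len(sig), len(other))
        a = sig + [0] * (n - len(sig))      # pad the shorter signature
        b = other + [0] * (n - len(other))  # so lists compare element-wise
        return sum(abs(x - y) for x, y in zip(a, b))
    return min(graph_b, key=distance)

# Facebook-style network (anonymized labels)
fb = {"u1": ["u2", "u3"], "u2": ["u1"], "u3": ["u1"]}
# LinkedIn-style network (real names), structurally identical
li = {"ann": ["ben", "cal"], "ben": ["ann"], "cal": ["ann"]}

print(match("u1", fb, li))  # ann: the only user with two one-friend contacts
```

Because the signature describes a user’s whole neighborhood rather than one chain of acquaintances, two structurally identical accounts line up even when their labels share nothing.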

Sitting in the bridge Wednesday, I was connected to many networks that I wouldn’t think could be used to identify me through my limited public data. Now, I’m not so sure.

So, what’s the lesson here? At the least, it was fun to learn about, even if the ultimate realization leaves us feeling powerless against big data analytics. Your data has monetary value, and it is not as secure as you think; it may be worth asking whether we even have the ability to protect our anonymity.

A New Algorithm for “In-Betweening” Images Applied to Covid, Aging and Continental Drift

Collaborating with a colleague in Shanghai, we recently published an article that explains the mathematical concept of ‘in-betweening’ in images – calculating intermediate stages of changes in appearance from one image to the next.

Our equilibrium-driven deformation algorithm (EDDA) was used to demonstrate three difficult ‘in-betweening’ tasks: facial aging, coronavirus spread in the lungs, and continental drift.

Part I. Understanding Pneumonia Invasion and Retreat in COVID-19

The pandemic has influenced the entire world and taken away nearly 3 million lives to date. If a person were unlucky enough to contract the virus and develop COVID-19, one way to diagnose them is to carry out CT scans of their lungs to visualize the damage caused by pneumonia.

However, it is impossible to monitor a patient continuously with CT scans. Thus, the invasion process is usually invisible to doctors and researchers.

To overcome this difficulty, we developed a mathematical algorithm that relies on only two CT scans to simulate the pneumonia invasion process caused by COVID-19.

We compared a series of CT scans of a Chinese patient taken at different times. This patient had severe pneumonia caused by COVID-19 but recovered after a successful treatment. Our simulation clearly revealed the pneumonia invasion process in the patient’s lungs and the fading away process after the treatment.

Our simulation results also identified several significant areas in which the patient’s lungs were more vulnerable to the virus, and other areas in which the lungs responded better to the treatment. Those areas were perfectly consistent with the medical analysis based on this patient’s actual, real-time CT scan images. The consistency of our results indicates the value of the method.

The COVID-19 pneumonia invading (upper panel) and fading away (lower panel) process from the data-driven simulations. Red circles indicate four significant areas in which the patient’s lungs were more vulnerable to the pneumonia and blue circles indicate two significant areas in which the patient’s lungs had better response to the treatment. (Image credit: Gao et al., 2021)
We also applied this algorithm to simulate human facial changes over time, in which the aging processes for different parts of a woman’s face were automatically created by the algorithm with high resolution. (Image credit: Gao et al., 2021. Video)

Part II. Solving the Puzzle of Continental Drift

How the continents we know evolved from the ancient single supercontinent, Pangaea, was long a mystery. German polar researcher Alfred Wegener proposed the continental drift hypothesis in the early 20th century. Although many geologists initially disputed his hypothesis, mounting evidence such as continental structures, fossils and the magnetic polarity of rocks has supported Wegener’s proposition.

Our data-driven algorithm has been applied to simulate the possible evolution of the continents from the Pangaea period.

The underlying forces driving continental drift were determined by the equilibrium state of the continents on the planet today. To describe the edges that divide the land and create the oceans, we proposed a delicate thresholding scheme.
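To make the role of thresholding concrete, here is a toy sketch with an invented 1-D “land mask.” It is not the EDDA itself (which evolves the image under Fokker-Planck dynamics toward the equilibrium state); it simply blends two binary masks linearly and then re-thresholds, showing how a threshold recovers sharp land/ocean edges from a smooth intermediate field.

```python
import numpy as np

# Toy illustration of "in-betweening" plus a thresholding step, on an
# invented 1-D "land mask" (1 = land, 0 = ocean). This is NOT the EDDA:
# the paper evolves images under Fokker-Planck dynamics, while here we
# merely blend the initial and final masks linearly. The thresholding
# step is what restores sharp land/ocean edges in each in-between frame.

def inbetween(start, end, t, threshold=0.5):
    """Binary intermediate frame at time t in [0, 1]."""
    blended = (1 - t) * start + t * end        # smooth grayscale blend
    return (blended >= threshold).astype(int)  # re-sharpen the coastline

# A "continent" drifting from the left edge to the right edge
start = np.array([1, 1, 1, 0, 0, 0])
end   = np.array([0, 0, 0, 1, 1, 1])

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, inbetween(start, end, t))
```

A linear blend is of course naive (the middle frames degenerate rather than drift); the point is only the thresholding step, which turns any smooth intermediate field back into a sharp coastline, and which is why an equilibrium-driven evolution is needed for realistic trajectories.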

The formation and deformation of the different continents is clearly revealed in our simulation. For example, the ‘drift’ of the Antarctic continent away from Africa can be seen happening. This exciting simulation gives geologists a quick, visual way to establish more possible lines of inquiry about how continents can drift from one state to another, based only on the initial and equilibrium continental states. Combined with other technological advances, this data-driven method may provide a path toward solving Wegener’s puzzle of continental drift.

The theory of continental drift reconciled similar fossil plants and animals now found on widely separated continents. The southern portion of Pangaea after the breakup (Gondwana) is shown here as evidence of Wegener’s theory. (Image credit: United States Geological Survey)
The continental drift process of the data-driven simulations. Black arrow indicates the formation of the Antarctic. (Image credit: Gao et al., 2021)

The study was supported by the Departments of Mathematics and Physics, Duke University.

CITATION: “Inbetweening auto-animation via Fokker-Planck dynamics and thresholding,” Yuan Gao, Guangzhen Jin & Jian-Guo Liu. Inverse Problems and Imaging, February, 2021, DOI: 10.3934/ipi.2021016. Online: http://www.aimsciences.org/article/doi/10.3934/ipi.2021016

Yuan Gao

Yuan Gao is the William W. Elliot Assistant Research Professor in the department of mathematics, Trinity College of Arts & Sciences.

Jian-Guo Liu is a Professor in the departments of mathematics and physics, Trinity College of Arts & Sciences.

Jian-Guo Liu

Using Data Science for Early Detection of Autism

Autism Spectrum Disorder can be detected as early as six to twelve months of age, and the American Academy of Pediatrics recommends all children be screened between twelve and eighteen months of age.

But most diagnoses happen after the age of 4, and later detection makes it more difficult and expensive to treat.

One in 40 children is diagnosed with Autism Spectrum Disorder and Duke currently serves about 3,000 ASD patients per year. To improve care for patients with ASD, Duke researchers have been working to develop a data science approach to early detection.

Geraldine Dawson, the William Cleland Distinguished Professor in the Department of Psychiatry & Behavioral Sciences and Director of the Duke Center for Autism and Brain Development, and Dr. Matthew Engelhard, a Conners Fellow in Digital Health in Psychiatry & Behavioral Sciences, recently presented on the advances being made to improve ASD detection and better understand symptoms.

The earlier ASD is detected, the easier and less expensive it is to treat. Children with ASD face challenges in learning and social environments.

ASD differs widely from case to case, however. For most people, ASD makes it difficult to navigate the social world, and those with the diagnosis often struggle to understand facial expressions, maintain eye contact, and develop strong peer relations.

However, ASD also has many positive traits associated with it and autistic children often show unique skills and talents. Receiving a diagnosis is important for those with ASD so that they can receive learning accommodations and ensure that their environment helps promote growth. 

Because early detection is so helpful, researchers began to ask:

“Can digital behavioral assessments improve our ability to screen for neurodevelopmental disorders and monitor treatment outcomes?”

Dr. Geraldine Dawson

The current approach to ASD detection is questionnaires given to parents. However, this method of detection has many issues, such as literacy and language barriers, and it requires caregivers to have some knowledge of child development. Recent studies have demonstrated that digital assessments could address these challenges by allowing direct observation of the child’s behavior, capturing the dynamic nature of behavior, and collecting more data surrounding autism.

“Our goal is to reduce disparities in access to screening and enable earlier detection of ASD by developing digital behavioral screening tools that are scalable, feasible, and more accurate than current paper-and-pencil questionnaires that are standard of care.”

Dr. Geraldine Dawson

Guillermo Sapiro, a James B. Duke Distinguished Professor of Electrical and Computer Engineering, and his team have developed an app to do just this.

On the app, videos shown to the child on an iPad or iPhone prompt the child’s reaction through various stimuli. These are the same games and stimuli typically used in ASD diagnostic evaluations in the clinic. As they watch and interact, the child’s behavior is measured with the iPhone or iPad’s selfie camera. Some behavioral symptoms can be detected as early as six months of age, such as not paying as much attention to people, reduced affective expression, early motor differences, and failure to orient to name.

In the proof-of-concept study, computers were programmed to detect a child’s response to hearing their name called. The child’s name was called out by the examiner three times while movies were shown. Toddlers with ASD demonstrated about a second of latency in their responses. 
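The latency measurement can be illustrated with a simple, hypothetical sketch. The timestamps below are invented, and the real app derives head-turn times from computer-vision analysis of the selfie-camera video rather than from hand-entered numbers.

```python
# Simplified sketch of the latency measurement described above: given
# timestamps (in seconds) for when the examiner called the child's name
# and when the child's head turned toward the speaker, compute the mean
# response latency. All numbers here are invented for illustration.

def mean_latency(name_calls, head_turns):
    """Average delay between each name call and the first head turn after it."""
    latencies = []
    for call in name_calls:
        after = [t for t in head_turns if t >= call]
        if after:  # the child may not respond to a given call at all
            latencies.append(min(after) - call)
    return sum(latencies) / len(latencies) if latencies else None

typical = mean_latency([10.0, 40.0, 70.0], [10.8, 41.0, 70.9])
asd     = mean_latency([10.0, 40.0, 70.0], [11.9, 42.0, 71.8])
print(round(asd - typical, 2))  # about a 1-second difference, as reported
```

Averaging over repeated name calls is what lets a roughly one-second difference stand out from the normal variability of any single response.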

Another study used gaze monitoring on an iPhone. Nearly a thousand toddlers were presented with a split screen where a person was on one side of the screen and toys were on the other. Typical toddlers shifted their gaze between the person and toy, whereas the autistic toddlers focused more on the toys. Forty of the toddlers involved in the study received an ASD diagnosis. Using eye gaze, researchers were also able to look at how toddlers responded to speech sounds as well as to observe early motor differences because toddlers with ASD frequently show postural sway (a type of head movement).

“The idea behind the app is to begin to combine all of these behaviors to develop a much more robust ASD algorithm. We do believe no one feature will allow us to detect ASD in developing children because there is so much variation.”

Dr. Geraldine Dawson

The app has multiple features and will allow ASD detection to be done in the home. Duke researchers are now one step away from launching an at-home study. Other benefits of this method include the ability to observe over time with parents collecting data once a month. In the future, this could be used in a treatment study to see if symptoms are improving.

Duke’s ASD researchers are also working to integrate information from the app with electronic health records (EHR) to see if information collected from routine medical care before age 1 can help with detection.

Post by Anna Gotskind

The SolarWinds Attack and the Future of Cybersecurity

Cybersecurity is the protection of computer systems and networks to prevent theft of or damage to their hardware, software, or electronic data. While cybersecurity has been around since the 1970s, its importance and relevance in mainstream media as well as in politics are growing as an increasing amount of information is stored electronically. In 1986, approximately 1% of the world’s information was stored in a digital format; by 2006, just twenty years later, this had increased to 94%.

Cyberhacking has also become more prominent with the advent of the Digital Revolution and the start of the Information Era, which began in the 1980s and grew rapidly in the early 2000s. Hacking became an effective form of political attack for acquiring confidential information from foreign countries.

In mid-December of 2020, it was revealed that several U.S. companies and even government agencies were victims of a cyberattack that began in September of 2019. 

The Sanford School of Public Policy hosted leading cybersecurity reporter Sean Lyngaas for a discussion of the national security implications of the SolarWinds hack with Sanford Professor David Hoffman as well as Visiting Scholar and Journalist Bob Sullivan. Lyngaas graduated from Duke in 2007 with a major in Public Policy at the Sanford School.

Lyngaas did not have a direct route into cybersecurity journalism. After completing his Master’s in International Relations at The Fletcher School of Law and Diplomacy at Tufts University, he moved to Washington, D.C. to pursue a career as a policy analyst. However, at night, when he was not applying for jobs, he began pitching stories to trade journals. Despite not being a “super technical guy,” Lyngaas became passionate about cybersecurity and reporting on the growing body of news surrounding the topic. Since 2012, Lyngaas has done extensive reporting on cybersecurity breaches, and he recently published several detailed reports on the SolarWinds incident.

Sean Lyngaas

The SolarWinds attack is considered one of the most impactful cybersecurity events in history as a result of its intricacy and the number of government and private-sector victims. Lyngaas explained that most people had not heard of SolarWinds until recently, but the company nevertheless provides software to a multitude of Fortune 500 companies and government agencies. One of the software products it sells is Orion, an IT performance monitoring platform that helps businesses manage and optimize their IT infrastructure. The hackers infiltrated Orion’s update mechanism and over several months sent out malicious updates to 18,000 companies and government agencies. Among the victims of this espionage campaign were the U.S. Justice Department and Microsoft. As a result of the campaign, countless email accounts were infiltrated and hacked.

“A perfect example of someone robbing a bank by knocking out the security guard and putting on his outfit to have access.” 

Bob Sullivan

Sullivan added that this hack is particularly concerning because the target was personal information, whereas previous large-scale hacks centered on breaching data. Additionally, SolarWinds’ core business is not cybersecurity; however, the company works with and provides software to many cybersecurity companies. The attack was revealed by FireEye, a cybersecurity company that announced it had been breached.

“FireEye got breached and they are the ones usually investigating the breaches”

Sean Lyngaas

This situation has prompted both those involved in the cybersecurity industry as well as the public to reconsider the scope of cyberhacking and what can be done to prevent it.

“Computer spying by nation states has been going on for decades, but we talk about it more openly now,” Lyngaas said.

Lyngaas added that the public now expects more transparency, especially when there are threats to their information, and he feels we need better standards for companies involved in cybersecurity. SolarWinds arguably was not following cybersecurity best practices and had recently made cost cuts, which may have contributed to its vulnerability. Hoffman explained that SolarWinds had been using an easy-to-guess password for its internal systems, which gave hackers access to the software update process as well as the ability to apply a legitimate digital signature.
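The signing issue is what makes supply-chain compromises so hard to catch downstream. As a hypothetical sketch (not SolarWinds' or any vendor's actual mechanism), a client that checks an update against a vendor-published digest catches tampering in transit, but not tampering that happens inside the vendor's own build pipeline:

```python
import hashlib

def verify_update(payload: bytes, published_sha256: str) -> bool:
    """Accept an update only if its SHA-256 digest matches the one
    the vendor published alongside the release."""
    return hashlib.sha256(payload).hexdigest() == published_sha256

# A download tampered with after release is caught...
good = b"orion-update-v1"
assert verify_update(good, hashlib.sha256(good).hexdigest())
assert not verify_update(b"tampered-build", hashlib.sha256(good).hexdigest())
# ...but if attackers compromise the build or signing pipeline itself,
# the vendor publishes the digest of the *malicious* build, and the
# check passes anyway.
```

This is why the panelists stress best practices inside the vendor's own systems: integrity checks at the customer's end assume the artifact was clean when it was signed.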

“We are not going to prevent these breaches; we are not going to prevent the Russians from cyber espionage,” Lyngaas said.

However, he believes that by following best practices we can uncover breaches earlier and react quickly enough to reduce the damage. He also thinks there needs to be a shift in the balance of government spending between cyber defense and offense. Historically, there has been little transparency around government cyber spending, but it is known that more has been spent on offense in the last several years.

Changes are starting to be made in the cybersecurity landscape that should help reduce attacks, or at least the severity of their impacts. California recently passed a law requiring that breaches be made public, which will increase transparency. The panelists added that the growing amount of news and information available to the public about cybersecurity is aiding efforts to understand and prevent it. President Biden has spoken openly about cybersecurity in relation to protecting elections from hackers and continues to treat it as an urgent issue, crucial to protecting confidential U.S. information.

As Lyngaas explained, it is practically impossible to prevent cyberattacks entirely, but through greater transparency and adherence to best practices, incidents like the SolarWinds hack will hopefully never again have effects of the same scale.

Post by Anna Gottskind

Increasing Access to Care with the Help of Big Data

Artificial intelligence (AI) and data science have the potential to revolutionize global health. But what exactly is AI and what hurdles stand in the way of more widespread integration of big data in global health? Duke’s Global Health Institute (DGHI) hosted a Think Global webinar Wednesday, February 17th to dive into these questions and more.  

The webinar’s panelists were Andy Tatem (Ph.D.), Joao Vissoci (Ph.D.), and Eric Laber (Ph.D.), moderated by Liz Turner (Ph.D.), director of DGHI’s Research Design and Analysis Core. Tatem is a professor of spatial demography and epidemiology at the University of Southampton and director of WorldPop. Vissoci is an assistant professor of surgery and global health at Duke University. Laber is a professor of statistical science and bioinformatics at Duke.

Panel moderator Liz Turner

Tatem, Vissoci, and Laber all use data science to address issues in the global health realm. Tatem’s work largely utilizes geospatial data sets to help inform global health decisions like vaccine distribution within a certain geographic area. Vissoci, who works with the GEMINI Lab at Duke (Global Emergency Medicine Innovation and Implementation Research), tries to leverage secondary data from health systems in order to understand issues of access to and distribution of care, as well as care delivery. Laber is interested in improving decision-making processes in healthcare spaces, attempting to help health professionals synthesize very complex data via AI.

All of their work is vital to modern biomedicine and healthcare, but, Turner said, “AI means a lot of different things to a lot of different people.” Laber defined AI in healthcare simply as using data to make healthcare better. “From a data science perspective,” Vissoci said, “[it is] synthesizing data … an automated way to give us back information.” This returned info is digestible trends and understandings derived from very big, very complex data sets. Tatem stated that AI has already “revolutionized what we can do” and said it is “powerful if it is directed in the right way.”

A screenshot from worldpop.org

We often get sucked into a science-fiction version of AI, Laber said, but in actuality it is not some dystopian future but a set of tools that maximizes what can be derived from data.

However, as Tatem stated, “[AI] is not a magic, press a button” scenario where you get automatic results. A huge part of work for researchers like Tatem, Vissoci, and Laber is the “harmonization” of working with data producers, understanding data quality, integrating data sets, cleaning data, and other “back-end” processes.

This comes with many caveats.

“Bias is a huge problem,” said Laber. Vissoci reinforced this, stating that the models built from AI and data science are going to represent what data sources they are able to access – bias included. “We need better work in getting better data,” Vissoci said.

Further, there must be more up-front listening to and communication with “end-users from the very start” of projects, Tatem outlined. By taking a step back and listening, tools created through AI and data science may be better met with actual uptake and less skepticism or distrust. Vissoci said that “direct engagement with the people on the ground” transforms data into meaningful information.

Better structures for navigating privacy issues must also be developed. “A major overhaul is still needed,” said Laber. This includes better consent processes so patients understand how their data are being used, although Tatem said this becomes “very complex” when integrating data sets.

Nonetheless, the future looks promising, and each panelist feels confident that the benefits will outweigh the difficulties yet to come in introducing big data to global health. One compelling example Vissoci gave of an ongoing project examines how environmental change driven by deforestation in the Brazilian Amazon affects Indigenous populations. Through work with “heavy multidimensional data,” Vissoci and his team have also been able to optimize scarce COVID-19 vaccine resources “to use in areas where they can have the most impact.”

Laber envisions a world with reduced or even no clinical trials if “randomization and experimentation” are integrated directly into healthcare systems. Tatem noted how he has seen extreme growth in the field in just the last 10 to 15 years, which seems only to be accelerating.

A lot of this work has to do with making better decisions about allocating resources, as Turner stated in the beginning of the panel. In an age of reassessment about equity and access, AI and data science could serve to bring both to the field of global health.

Post by Cydney Livingston

Student Team Quantifies Housing Discrimination in Durham

Home values and race have an intimate connection in Durham, NC. From 1940 to 2020, if mean home values in Black-majority Census tracts had appreciated at rates equal to those in white Census tracts, the mean home value for homes in Black tracts would be $94,642 higher than it is.

That’s the disappointing, but perhaps not shocking, finding of a Duke Data+ team.

Because housing accounts for the biggest portion of wealth for families that fall outside of the top 10% of wealth in the U.S., this figure on home values represents a pervasive racial divide in wealth.

What started as a Data+ project in the summer of 2020 has expanded into an ongoing exploration of how housing connects to persistent wealth disparities across racial lines. Omer Ali (Ph.D.), a postdoctoral associate with The Samuel Dubois Cook Center on Social Equity, is leading undergraduates Nicholas Datto and Pei Yi Zhuo in the continuation of their initial work. The trio presented an in-depth analysis of their work and methods Friday, February 5th during a Data Dialogue.

The team drew on multiple data sources for their analyses, including the 1940 Census, Durham County records, CoreLogic home-sales data, and NC voter registrations. Aside from the nearly $100,000 difference in mean home values between Black Census tracts (defined as >50% Black homeowners from 1940-2020) and white Census tracts (defined as >50% white homeowners from 1940-2020), Ali, Datto, and Zhuo also found that over the last 10 years, home values have risen in Black neighborhoods even as those neighborhoods have been losing Black residents. Within Census tracts, the team said, Black home-buyers in Durham occupy the least valuable homes.
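The counterfactual behind the headline figure comes down to compound appreciation at two different rates. The numbers below are hypothetical placeholders, not the team's data; the function just shows the shape of the calculation:

```python
def counterfactual_gap(base_value: float, actual_rate: float,
                       benchmark_rate: float, years: int) -> float:
    """How much higher a mean home value would be had it appreciated
    at the benchmark (white-tract) rate instead of its actual rate."""
    actual = base_value * (1 + actual_rate) ** years
    counterfactual = base_value * (1 + benchmark_rate) ** years
    return counterfactual - actual

# Hypothetical illustration: a $5,000 home in 1940, appreciating at
# 3.5% vs. 4.5% per year over the 80 years to 2020.
gap = counterfactual_gap(5_000, 0.035, 0.045, 80)
```

Even a one-percentage-point difference in annual appreciation, compounded over eight decades, opens a gap on the order of tens of thousands of dollars, which is why the team's $94,642 figure is plausible despite seemingly small year-to-year differences.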

Home Owners’ Loan Corporation data

Datto introduced the concept of redlining — systemic housing discrimination — and explained how this historic issue persists. In the 1930s, the Home Owners’ Loan Corporation (HOLC) and Federal Housing Administration (FHA) designated certain neighborhoods unsuitable for mortgage lending. Neighborhoods were given a desirability grade from A to D, with D being the lowest.

In 1940, no neighborhoods with Black residents were designated as either A or B districts. That meant areas with non-white residents were considered more risky and thus less likely to receive FHA-guaranteed mortgages.

Datto explained that these historic classifications persist: the team found significant differences in accumulated home value over time by neighborhood rating. We are “seeing long-lasting effects of these redlined maps on homeowners in Durham,” said Datto, with “significant differences between white [and non-white] homeowners, even in C and D neighborhoods.”

Zhuo explained the significance of tracking the changes in each Census tract – Black, white, or integrated – over the last 50 years. The “white-black disparity [in home value] has grown by 287%” in this period, he said. Homes of comparable structural design and apparent worth are worth much less simply for existing in Black neighborhoods and being owned by Black people. And the problem has only expanded.

Along with differences in home value, both Black and white neighborhoods have seen a decline in Black homeowners in the 21st century, pointing to a larger issue at hand. Though the work so far documents these trends rather than probing the underlying causes of the home-value disparity, the trends closely parallel those in other regions across the country affected by gentrification.

“Home values are going up in Black neighborhoods, but the number of Black people in those neighborhoods is going down,” said Datto.

Ali pointed out that appraisal practices include evaluation of the neighborhood “as opposed to the structural properties of the home.” When a house is appraised, he said, a home of similar structure owned by white homeowners would never be chosen as a comparator for a Latinx- or Black-owned home. This perpetuates historical disparities: because “minority neighborhoods have been historically undervalued,” it is a compounding, systemic cycle.

The team hopes to export their methodology to a much larger scale. Thus far, this has presented some back-end data and computing issues; however, “there is nothing in the analysis itself that couldn’t be [applied to other geographical locations],” they said.

Large socioeconomic racial disparities prevail in the U.S., from gaps in unemployment to infant mortality to incarceration rates to life expectancy itself. Though it should come as no surprise that home values represent another area of inequity, work like that of Ali, Datto, and Zhuo needs more traction, support, and expansion.

Post by Cydney Livingston

Cybersecurity for Autonomous Systems

Over the past few decades, we have adopted computers into virtually every aspect of our lives, but in doing so, we’ve made ourselves vulnerable to malicious interference, or hacking. I had the opportunity to talk about this with Miroslav Pajic, the Dickinson Family Associate Professor in Duke’s electrical and computer engineering department. He has worked on cybersecurity in self-driving cars, medical devices, and even U.S. Air Force hardware.

Miroslav Pajic is an electrical engineer

Pajic primarily works in “assured autonomy”: computer systems that do most things by themselves, with “high-level autonomy and low human control and oversight.” “You want to build systems with strong performance and safety guarantees every time, in all conditions,” Pajic said. Assured autonomy ensures security in “contested environments” where malicious interference can be expected. The stakes of this work are incredibly high. The danger of attacks on military equipment goes without saying, but cybersecurity on a civilian level can be just as dangerous. “Imagine,” he told me, “that you have a smart city coordinating traffic and that… all of (the traffic controls), at the same time, start doing weird things. There can be a significant impact if all cars stop, but imagine if all of them start speeding up.”

Pajic and some of his students with an autonomous car.

Since Pajic works with Ph.D. students and postdocs, I wanted to ask him how COVID-19 has affected his work. As if on cue, his Wi-Fi cut out, and he dropped from our Zoom call. “This is a perfect example of how fun it is to work remotely,” he said when he returned. “Imagine that you’re debugging a fleet of drones… and that happens.”

In all seriousness, though, there are simulators built for work on cybersecurity and assured autonomy. CARLA, for one, is an open-source simulator of self-driving vehicles developed by Intel. Even outside of a pandemic, these simulators are used extensively in the field. They have become very useful for producing accurate, inexpensive results without any real-world risk before researchers graduate to physical tests.

“If you’re going to fail,” Pajic says, “you want to fail quickly.”

Guest Post by Riley Richardson, Class of 2021, NC School of Science and Math

