Following the people and events that make up the research community at Duke


‘Anonymous Has Viewed Your Profile’: All Networks Lead to Re-Identification


For half an hour this rainy Wednesday, October 6th, I logged on to a LinkedIn Live series webinar with Dr. Jiaming Xu from the Fuqua School of Business. I sat inside the bridge between Perkins and Bostock, my laptop connected to DukeBlue wifi. I had Instagram open on my phone and was tapping through friends’ stories while I waited for the broadcast to start. I had Google Docs open in another tab to take notes. 

The title of the webinar was “Can Anyone Truly Be Anonymous Online?” 

Xu spoke about “network privacy,” which is “the intersection of network analysis and data privacy.” When you make an account, connect to wifi, share your location, search something online, or otherwise hint at your personal information, you are creating a “user profile”: a network of personal data that hints at your identity. 

You are probably familiar with how social media companies track your decisions to curate a more engaging experience for you (e.g., the reason I scroll through TikTok for 5 minutes, then 30 minutes, then… Oh no! Two hours have gone by). Other companies track other kinds of data, and that data isn't always just for algorithmic manipulation or creepy-accurate Amazon ads (e.g., "Hey! I was just thinking about buying cat litter. How did Mr. Bezos know?"). Your name, work history, date of birth, address, location, and other critical identifying factors can be collected even if you think your profile is scrubbed clean. In a rather on-the-nose anecdote to his LinkedIn audience on Wednesday, Xu explained that in April 2021, over 500 million user profiles on LinkedIn were hacked. Valuable, "sensitive, work-related data," he noted, was made vulnerable.

Image courtesy of Flickr

So, what do you have to worry about? I know I tend to not worry about my personal information online; letting companies collect my data benefits me. I can get targeted Google ads about things I’m interested in and cool filters on Snapchat. In a medical setting, Xu said, prediction algorithms may help patients’ health in the long run. But even anonymized and sanitized data can be traced back to you. For further reading: in an essay published in July 2021, philosophers Evan Selinger and Judy Rhee elaborate on the dangers of “normalizing surveillance.”

The meat of Xu’s talk was how your data can be traced back to you. Xu gave three examples. 

The first was a study by researchers at the University of Texas at Austin attempting to identify users submitting "anonymous" reviews for movies on Netflix (keep in mind this was 2007, so picture the red Netflix logo on the DVD box accordingly). To achieve this, they cross-referenced the network of reviews published by Netflix with the network of individuals signed up on IMDB, matching those who reviewed movies similarly on both platforms with their public profiles on IMDB. You can read more about that specific study here. (For those unafraid of the full research paper, click here.)

Let’s take a pause to learn a new vocab word! “Signatures.” In this example, the signature was users’ movie ratings. See if you can name the signature in the other two examples.

The second example was conducted by the same researchers. To identify users on Twitter who shared their data anonymously, it was simply a matter of cross-referencing the network of Twitter users with the network of Flickr users. If you know a guy who knows a guy who knows a guy, you and that group of people are likely to form the same chain of connections on every social media platform you use (it may remind you of the theory that you are connected by "six degrees of separation" to every person on the planet, which, as it turns out, is also supported by social media data). The researchers were able to identify the correct users 30.8% of the time.

Time for another vocab break! Those users who connect groups of people who know a guy who knows a guy who knows a guy are called "seeds." Speaking of which, did you identify the signature in this example?

Image courtesy of Flickr

The third and final example was my personal favorite because it was the funkiest and most creative. Facebook user data, also "scrubbed clean" before being sold to third-party advertisers, was overlain with LinkedIn user data to reveal a network of connections that repeats across both platforms. How did they match up those networks, you ask? First, the algorithm assigned every Facebook user a popularity score based on how many friends they have, and every LinkedIn user a score based on how many connections they have. Then, each user was assigned a list of integers based on their friends' popularity scores. Bet you weren't expecting that.

This method improves upon the Twitter/Flickr example: in addition to overlaying networks and chains of users, it does a better job of matching who is who. Since you are likely to know a guy who knows a guy who knows a guy, but also likely to know all of those guys down the line, following specific chains does not always accurately convey who is who. Unlike the seeds signature, the friends'-popularity signature was able to correctly re-identify users most of the time.
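To make the signature idea concrete, here is a toy sketch in Python (my own illustration, not the researchers' actual algorithm): each user in two anonymized friend networks gets a "popularity signature," the sorted list of their friends' friend-counts, and users are matched across networks by finding the closest signature. The tiny `facebook`/`linkedin` graphs below are invented for the example.

```python
# Toy sketch of the "friends' popularity" signature idea described above.
def popularity_signature(friends: dict[str, set[str]]) -> dict[str, list[int]]:
    """Map each user to the sorted friend-counts (popularity scores) of their friends."""
    degree = {user: len(fs) for user, fs in friends.items()}
    return {user: sorted(degree[f] for f in fs) for user, fs in friends.items()}

def match_users(net_a: dict[str, set[str]], net_b: dict[str, set[str]]) -> dict[str, str]:
    """Pair each user in net_a with the net_b user whose signature differs least."""
    sig_a, sig_b = popularity_signature(net_a), popularity_signature(net_b)

    def distance(s1: list[int], s2: list[int]) -> int:
        # Compare signatures position by position, padding the shorter with zeros.
        n = max(len(s1), len(s2))
        p1, p2 = s1 + [0] * (n - len(s1)), s2 + [0] * (n - len(s2))
        return sum(abs(x - y) for x, y in zip(p1, p2))

    return {a: min(sig_b, key=lambda b: distance(sig_a[a], sig_b[b])) for a in sig_a}

# Made-up "anonymized" Facebook graph (friendships stored in both directions).
facebook = {
    "fb_1": {"fb_2"},
    "fb_2": {"fb_1", "fb_3"},
    "fb_3": {"fb_2", "fb_4", "fb_6"},
    "fb_4": {"fb_3", "fb_5", "fb_6"},
    "fb_5": {"fb_4"},
    "fb_6": {"fb_3", "fb_4"},
}
# The "public" LinkedIn graph is the same structure under different usernames.
rename = {"fb_1": "li_a", "fb_2": "li_b", "fb_3": "li_c",
          "fb_4": "li_d", "fb_5": "li_e", "fb_6": "li_f"}
linkedin = {rename[u]: {rename[f] for f in fs} for u, fs in facebook.items()}

print(match_users(facebook, linkedin))  # structure alone recovers fb_1 -> li_a, fb_2 -> li_b, ...
```

In this toy case, the shape of the network alone is enough to line the anonymized accounts up with the public ones, which is the essence of the re-identification attacks Xu described.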

Sitting in the bridge Wednesday, I was connected to many networks that I wouldn’t think could be used to identify me through my limited public data. Now, I’m not so sure.

So, what's the lesson here? At the least, it was fun to learn about, even if the ultimate realization leaves us feeling powerless against big data analytics. Your data has monetary value, and it is not as secure as you think. It may be worth asking whether we even have the ability to protect our anonymity.

In Drawers of Old Bones, New Clues to the Genomes of Lost Giants

DNA extracted from a 1,475-year-old jawbone reveals genetic blueprint for one of the largest lemurs ever.

By teasing trace amounts of DNA from this partially fossilized jawbone, nearly 1,500 years after the creature’s death, scientists have managed to reconstruct the first giant lemur genome. Credit: University of Antananarivo and George Perry, Penn State

If you’ve been to the Duke Lemur Center, perhaps you’ve seen these cute mouse- to cat-sized primates leaping through the trees. Now imagine a lemur as big as a gorilla, lumbering its way through the forest as it munches on leaves.

It may sound like a scene from a science fiction thriller, but from skeletal remains we know that at least 17 supersized lemurs once roamed the African island of Madagascar. All of them were two to 20 times heftier than the average lemur living today, some weighing up to 350 pounds.

Then, sometime after humans arrived on the island, these creatures started disappearing.

The reasons for their extinction remain a mystery, but by 500 years ago all of them had vanished.

Coaxing molecular clues to their lives from the bones and teeth they left behind has proved a struggle, because after all this time their DNA is so degraded.

But now, thanks to advances in our ability to read ancient DNA, a giant lemur that may have fallen into a cave or sinkhole near the island’s southern coast nearly 1,500 years ago has had much of its DNA pieced together again. Researchers believe it was a slow-moving 200-pound vegetarian with a pig-like snout, long arms, and powerful grasping feet for hanging upside down from branches.

A single jawbone, stored at Madagascar’s University of Antananarivo, was all the researchers had. But that contained enough traces of DNA for a team led by George Perry and Stephanie Marciniak at Penn State to reconstruct the nuclear genome for one of the largest giant lemurs, Megaladapis edwardsi, a koala lemur from Madagascar.

Ancient DNA can tell stories about species that have long since vanished, such as how they lived and what they were related to. But sequencing DNA from partially fossilized remains is no small feat, because DNA breaks down over time. And because the DNA is no longer intact, researchers have to take these fragments and figure out their correct order, like the pieces of a mystery jigsaw puzzle with no image on the box.
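As a toy illustration of that jigsaw-puzzle idea, the sketch below greedily merges the pair of fragments with the longest overlap until one sequence remains. This is only a conceptual illustration; real ancient-DNA pipelines are far more sophisticated and typically align damaged reads against reference genomes of related species.

```python
# Toy fragment assembly: repeatedly merge the two pieces with the longest overlap.
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments: list[str]) -> str:
    frags = list(fragments)
    while len(frags) > 1:
        # Find the pair of fragments with the largest overlap and merge them.
        i, j, k = max(
            ((i, j, overlap(frags[i], frags[j]))
             for i in range(len(frags)) for j in range(len(frags)) if i != j),
            key=lambda t: t[2],
        )
        merged = frags[i] + frags[j][k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (i, j)] + [merged]
    return frags[0]

# Example: short made-up reads from the sequence "ATGCGTACGTTAG".
reads = ["ATGCGTA", "CGTACGT", "ACGTTAG"]
print(greedy_assemble(reads))  # -> "ATGCGTACGTTAG"
```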

Bones like these are all that’s left of Madagascar’s giant lemurs, the largest of which weighed in at 350 pounds — 20 times heftier than lemurs living today. Credit: Matt Borths, Curator of the Division of Fossil Primates at the Duke Lemur Center

Hard-won history lessons

The first genetic study of M. edwardsi, published in 2005 by Duke’s Anne Yoder, was based on DNA stored not in the nucleus — which houses most of our genes — but in another cellular compartment called the mitochondria that has its own genetic material. Mitochondria are plentiful in animal cells, which makes it easier to find their DNA.

At the time, ancient DNA researchers considered themselves lucky to get just a few hundred letters of an extinct animal’s genetic code. In the latest study they managed to tease out and reconstruct some one million of them.

“I never even dreamed that the day would come that we could produce whole genomes,” said Yoder, who has been studying ancient DNA in extinct lemurs for over 20 years and is a co-author of the current paper.

For the latest study, the researchers tried to extract DNA from hundreds of giant lemur specimens, but only one yielded enough useful material to reconstitute the whole genome.

Once the creature’s genome was sequenced, the team was able to compare it to the genomes of 47 other living vertebrate species, including five modern lemurs, to identify its closest living relatives. Its genetic similarities with other herbivores suggest it was well adapted for grazing on leaves.

Despite their nickname, koala lemurs weren’t even remotely related to koalas. Their DNA confirms that they belonged to the same evolutionary lineage as lemurs living today.

To Yoder it’s another piece of evidence that the ancestors of today’s lemurs colonized Madagascar in a single wave.

Since the first ancient DNA studies were published, in the 1980s, scientists have unveiled complete nuclear genomes for other long-lost species, including the woolly mammoth, the passenger pigeon, and even extinct human relatives such as Neanderthals.

Most of these species lived in cooler, drier climates where ancient DNA is better preserved. But this study extends the possibilities of ancient DNA research for our distant primate relatives that lived in the tropics, where exposure to heat, sunlight and humidity can cause DNA to break down faster.

“Tropical conditions are death to DNA,” Yoder said. “It’s so exciting to get a deeper glimpse into what these animals were doing and have that validated and verified.”

See them for yourself

Assembled in drawers and cabinet cases in the Duke Lemur Center's Division of Fossil Primates on Broad St. are the remains of at least eight species of giant lemurs that you can no longer find in the wild. If you live in Durham, you may drive by them every day and have no idea. It's the world's largest collection.

In one case are partially fossilized bits of jaws, skulls and leg bones from Madagascar’s extinct koala lemurs. Nearby are the remains of the monkey-like Archaeolemur edwardsi, which was once widespread across the island. There’s even a complete skeleton of a sloth lemur that would have weighed in at nearly 80 pounds, Palaeopropithecus kelyus, hanging upside down from a branch.

Most of these specimens were collected over 25 years between 1983 and 2008, when Duke Lemur Center teams went to Madagascar to collect fossils from caves and ancient swamps across the island.

“What is really exciting about getting better and better genetic data from the subfossils, is we may discover more genetically distinct species than only the fossil record can reveal,” said Duke paleontologist Matt Borths, who curates the collection. “That in turn may help us better understand how many species were lost in the recent past.”

They plan to return in 2022. “Hopefully there is more Megaladapis to discover,” Borths said.

A fossil site in Madagascar. Courtesy of Matt Borths, Duke Lemur Center Division of Fossil Primates

CITATION: “Evolutionary and Phylogenetic Insights From a Nuclear Genome Sequence of the Extinct, Giant, ‘Subfossil’ Koala Lemur Megaladapis Edwardsi,” Stephanie Marciniak, Mehreen R. Mughal, Laurie R. Godfrey, Richard J. Bankoff, Heritiana Randrianatoandro, Brooke E. Crowley, Christina M. Bergey, Kathleen M. Muldoon, Jeannot Randrianasy, Brigitte M. Raharivololona, Stephan C. Schuster, Ripan S. Malhi, Anne D. Yoder, Edward E. Louis Jr, Logan Kistler, and George H. Perry. PNAS, June 29, 2021. DOI: 10.1073/pnas.2022117118.

A New Algorithm for "In-Betweening" Images, Applied to COVID, Aging and Continental Drift

Collaborating with a colleague in Shanghai, we recently published an article that explains the mathematical concept of 'in-betweening' in images: calculating intermediate stages of changes in appearance from one image to the next.

Our equilibrium-driven deformation algorithm (EDDA) was used to demonstrate three difficult tasks of ‘in-betweening’ images: Facial aging, coronavirus spread in the lungs, and continental drift.

Part I. Understanding Pneumonia Invasion and Retreat in COVID-19

The pandemic has affected the entire world and taken nearly 3 million lives to date. If a person is unlucky enough to contract the virus and develop COVID-19, one way to diagnose them is to carry out CT scans of their lungs to visualize the damage caused by pneumonia.

However, it is impossible to monitor the patient continuously using CT scans. Thus, the invasion process is usually invisible to doctors and researchers.

To solve this difficulty, we developed a mathematical algorithm which relies on only two CT scans to simulate the pneumonia invasion process caused by COVID-19.

We compared a series of CT scans of a Chinese patient taken at different times. This patient had severe pneumonia caused by COVID-19 but recovered after a successful treatment. Our simulation clearly revealed the pneumonia invasion process in the patient’s lungs and the fading away process after the treatment.

Our simulation results also identified several significant areas in which the patient's lungs were more vulnerable to the virus and other areas in which the lungs responded better to the treatment. Those areas were perfectly consistent with the medical analysis based on this patient's actual, real-time CT scan images. The consistency of our results indicates the value of the method.

The COVID-19 pneumonia invading (upper panel) and fading away (lower panel) process from the data-driven simulations. Red circles indicate four significant areas in which the patient’s lungs were more vulnerable to the pneumonia and blue circles indicate two significant areas in which the patient’s lungs had better response to the treatment. (Image credit: Gao et al., 2021)
We also applied this algorithm to simulate human facial changes over time, in which the aging processes for different parts of a woman’s face were automatically created by the algorithm with high resolution. (Image credit: Gao et al., 2021. Video)

Part II. Solving the Puzzle of Continental Drift

It has long been mysterious how the continents we know today evolved from the ancient single supercontinent, Pangaea. German polar researcher Alfred Wegener proposed the continental drift hypothesis in the early 20th century. Although many geologists disputed his hypothesis initially, sound evidence such as matching continental structures, fossils and the magnetic polarity of rocks has since supported Wegener's proposition.

Our data-driven algorithm has been applied to simulate the possible evolution of the continents from the Pangaea period onward.

The underlying forces driving continental drift were determined by the equilibrium state of the continents on the present-day planet. To describe the edges that divide the land from the oceans, we proposed a delicate thresholding scheme.
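As a rough illustration of the general "diffuse, blend, then threshold" idea, the sketch below interpolates between two binary shapes by smoothing their indicator functions, blending them, and thresholding the result. This is a heavy simplification and is not the EDDA or the Fokker-Planck dynamics used in the paper; it only conveys the flavor of generating in-between frames from an initial and an equilibrium image.

```python
# Toy "diffuse, blend, threshold" in-betweening between two binary shapes.
import numpy as np
from scipy.ndimage import gaussian_filter

def inbetween_frames(img0, img1, n_frames=8, sigma=4.0, threshold=0.5):
    """Return binary in-between frames morphing img0 -> img1.

    img0, img1 : 2D arrays of 0/1 (e.g. segmented lung masks or continent maps).
    sigma      : amount of smoothing (diffusion) applied to each indicator function.
    """
    u0 = gaussian_filter(img0.astype(float), sigma)  # diffused initial shape
    u1 = gaussian_filter(img1.astype(float), sigma)  # diffused equilibrium shape
    frames = []
    for t in np.linspace(0.0, 1.0, n_frames):
        u_t = (1.0 - t) * u0 + t * u1                       # blend the smoothed fields
        frames.append((u_t > threshold).astype(np.uint8))   # threshold back to a shape
    return frames

# Example: two offset disks standing in for an "initial" and an "equilibrium" state.
yy, xx = np.mgrid[0:128, 0:128]
disk = lambda cx, cy, r: ((xx - cx) ** 2 + (yy - cy) ** 2 < r ** 2).astype(np.uint8)
frames = inbetween_frames(disk(40, 64, 20), disk(88, 64, 25))
print(len(frames), frames[0].shape)
```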

The formation and deformation of the different continents is clearly revealed in our simulation. For example, the 'drift' of the Antarctic continent away from Africa can be seen happening. This exciting simulation gives geologists a quick and intuitive way to establish more possible lines of inquiry about how continents can drift from one state to another, based only on the initial and equilibrium continental states. Combined with other technological advances, this data-driven method may provide a path toward solving Wegener's puzzle of continental drift.

The theory of continental drift reconciled similar fossil plants and animals now found on widely separated continents. The southern landmass formed after Pangaea broke apart (Gondwana) is shown here as evidence of Wegener's theory. (Image credit: United States Geological Survey)
The continental drift process of the data-driven simulations. Black arrow indicates the formation of the Antarctic. (Image credit: Gao et al., 2021)

The study was supported by the Departments of Mathematics and Physics, Duke University.

CITATION: “Inbetweening auto-animation via Fokker-Planck dynamics and thresholding,” Yuan Gao, Guangzhen Jin & Jian-Guo Liu. Inverse Problems and Imaging, February, 2021, DOI: 10.3934/ipi.2021016. Online: http://www.aimsciences.org/article/doi/10.3934/ipi.2021016

Yuan Gao

Yuan Gao is the William W. Elliot Assistant Research Professor in the Department of Mathematics, Trinity College of Arts & Sciences.

Jian-Guo Liu is a Professor in the Departments of Mathematics and Physics, Trinity College of Arts & Sciences.

Jian-Guo Liu

Using Data Science for Early Detection of Autism

Autism Spectrum Disorder can be detected as early as six to twelve months old, and the American Academy of Pediatrics recommends all children be screened between twelve and eighteen months of age.

But most diagnoses happen after the age of 4, and later detection makes it more difficult and expensive to treat.

One in 40 children is diagnosed with Autism Spectrum Disorder and Duke currently serves about 3,000 ASD patients per year. To improve care for patients with ASD, Duke researchers have been working to develop a data science approach to early detection.

Geraldine Dawson, the William Cleland Distinguished Professor in the Department of Psychiatry & Behavioral Sciences and Director of the Duke Center for Autism and Brain Development, and Dr. Matthew Engelhard, a Conners Fellow in Digital Health in Psychiatry & Behavioral Sciences, recently presented on the advances being made to improve ASD detection and better understand symptoms.

The earlier ASD is detected, the easier and less expensive it is to treat. Children with ASD face challenges in learning and social environments.

ASD differs widely from case to case, however. For most people, ASD makes it difficult to navigate the social world, and those with the diagnosis often struggle to understand facial expressions, maintain eye contact, and develop strong peer relations.

However, ASD also has many positive traits associated with it and autistic children often show unique skills and talents. Receiving a diagnosis is important for those with ASD so that they can receive learning accommodations and ensure that their environment helps promote growth. 

Because early detection is so helpful, researchers began to ask:

“Can digital behavioral assessments improve our ability to screen for neurodevelopmental disorders and monitor treatment outcomes?”

Dr. Geraldine Dawson

The current approach to ASD detection is questionnaires given to parents. However, this method has many issues, such as literacy and language barriers, and it requires caregivers to have some knowledge of child development. Recent studies have demonstrated that digital assessments could potentially address these challenges by allowing for direct observation of the child's behavior, capturing the dynamic nature of behavior, and collecting more data surrounding autism.

“Our goal is to reduce disparities in access to screening and enable earlier detection of ASD by developing digital behavioral screening tools that are scalable, feasible, and more accurate than current paper-and-pencil questionnaires that are standard of care.”

Dr. Geraldine Dawson

Guillermo Sapiro, a James B. Duke Distinguished Professor of Electrical and Computer Engineering, and his team have developed an app to do just this.

On the app, videos are shown to the child on an iPad or iPhone that prompt the child's reaction through various stimuli. These are the same games and stimuli typically used in ASD diagnostic evaluations in the clinic. As they watch and interact, the child's behavior is measured with the iPhone/iPad's selfie camera. Some behavioral symptoms can be detected as early as six months of age, such as not paying as much attention to people, reduced affective expression, early motor differences, and failure to orient to name.

In the proof-of-concept study, computers were programmed to detect a child’s response to hearing their name called. The child’s name was called out by the examiner three times while movies were shown. Toddlers with ASD demonstrated about a second of latency in their responses. 

Another study used gaze monitoring on an iPhone. Nearly a thousand toddlers were presented with a split screen where a person was on one side of the screen and toys were on the other. Typical toddlers shifted their gaze between the person and toy, whereas the autistic toddlers focused more on the toys. Forty of the toddlers involved in the study received an ASD diagnosis. Using eye gaze, researchers were also able to look at how toddlers responded to speech sounds as well as to observe early motor differences because toddlers with ASD frequently show postural sway (a type of head movement).
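A minimal sketch of the kind of summary such a study might compute is given below: given per-frame gaze estimates for a split-screen stimulus (person on one side, toys on the other), it reports how attention was divided and how often gaze shifted between sides. The data format and numbers are invented for illustration; this is not the Duke team's actual analysis code.

```python
# Toy summary of split-screen gaze data (person on the left, toys on the right).
from dataclasses import dataclass
from typing import List

@dataclass
class GazeSample:
    t: float   # seconds since the stimulus started
    x: float   # horizontal gaze position, normalized 0.0 (left) .. 1.0 (right)

def summarize_gaze(samples: List[GazeSample], midline: float = 0.5) -> dict:
    """Return the share of time spent on each half and the number of gaze shifts."""
    person_frames = sum(1 for s in samples if s.x < midline)   # left half = person
    toy_frames = len(samples) - person_frames                  # right half = toys
    shifts = sum(
        1 for a, b in zip(samples, samples[1:])
        if (a.x < midline) != (b.x < midline)                  # crossed the midline
    )
    total = max(len(samples), 1)
    return {
        "person_share": person_frames / total,
        "toy_share": toy_frames / total,
        "gaze_shifts": shifts,
    }

# Example: a simulated toddler who looks mostly at the toys and rarely shifts gaze.
samples = [GazeSample(t=i / 30, x=0.8 if i % 90 else 0.3) for i in range(300)]
print(summarize_gaze(samples))
```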

“The idea behind the app is to begin to combine all of these behaviors to develop a much more robust ASD algorithm. We do believe no one feature will allow us to detect ASD in developing children because there is so much variation.”

Dr. Geraldine Dawson

The app has multiple features and will allow ASD detection to be done in the home. Duke researchers are now one step away from launching an at-home study. Other benefits of this method include the ability to observe over time with parents collecting data once a month. In the future, this could be used in a treatment study to see if symptoms are improving.

Duke’s ASD researchers are also working to integrate information from the app with electronic health records (EHR) to see if information collected from routine medical care before age 1 can help with detection.

Post by Anna Gotskind

Meet a Duke Senior at the Intersection of Computation, Neuroscience and T-Pain

As Daniel Sprague ‘21 prepares to graduate from Duke this Spring with a double major in Computer Science and Neuroscience, I had the opportunity to interview him on his undergraduate research experience. In his final semester, Sprague reflects on what he accomplished and learned in the three research labs he was a part of over his four years at Duke.

Outside of the lab, Sprague is also active in the arts community at Duke. He has been a member of Hoof 'n' Horn since his freshman year and has performed in four student-run musical theater productions. He is also a part of Speak of the Devil, one of Duke's a cappella groups, which he led as president during his junior year. Recently, a video the group uploaded more than two years ago has picked up speed and acquired over 150,000 views on YouTube. I think it's fair to say Sprague is even more than a triple threat.

Sprague was interested in neuroscience and biology before he came to Duke and knew he wanted to participate in undergraduate research when he arrived. His first year, planning on pursuing pre-med, he joined Rima Fathi Kaddurah-Daouk's lab, where he worked in metabolomics, the large-scale study of small molecules within cells, biofluids, tissues, or organisms, as it relates to neuropsychiatric disorders. While he learned a lot and enjoyed working in this lab, Sprague was eager to explore more.

The summer after his first year, Sprague was accepted to the Huang Fellows Program run by Duke’s Science & Society initiative. 

Sprague described their focus as, “The way that research, science, communication, and medicine interact with social issues and ethics.”

As a part of the program, Sprague was matched and placed in Ornit Chiba-Falek’s lab. There he conducted work in genomics and neuroscience, centered around neurodegenerative diseases, specifically, Parkinson’s and Alzheimer’s. His job involved processing mouse brains to extract neurons for genomic sequencing. From there, the lab would conduct genome-wide association studies to correlate specific human or animal genotypes with genetic markers.

“We were trying to identify SNPs (single-nucleotide polymorphisms), which are single base pair variations in a genome, that correlated with Alzheimer's,” Sprague explained.

Along with working in a lab, Sprague also attended research seminars, learned about how science publishing works, and participated in a science symposium at the culmination of the summer experience.

Daniel Sprague presents his research at Duke Science and Society’s Huang Fellows Program Poster Session

“Research is a slow iterative process and it rarely ever works how you expect it to.”

Daniel Sprague

Sprague continued working in the Chiba-Falek lab through his sophomore year and contributed to the publication of two research papers: Shared genetic etiology underlying Alzheimer’s disease and major depressive disorder and Bioinformatics strategy to advance the interpretation of Alzheimer’s disease GWAS discoveries: The roads from association to causation. However, partway through the year, he realized he missed math and computational thinking. He began taking more math and computer science classes. After learning more, he realized he really wanted to find a lab doing research at the intersection of computation, math, and neuroscience.

Junior year brought Sprague to John Pearson's lab, where they build modeling and analysis tools for brain data.

He also began taking courses in machine learning which he brought into his lab work. His role involved working on the lab’s code base and aiding in the development of a software system for analyzing the brain. He was specifically looking at calcium imaging data. Sprague explained that there are a lot of different ways to do neuroimaging and visualize brain cell function. His work involved using fluorescent calcium.

“When brain cells spike, they release a fluorescent calcium trace that we can visualize with a camera to detect brain cell function with a high degree of temporal and spatial specificity,” Sprague said. “This allows us to accurately detect brain cell function on a millisecond and single cell scale.”

In many neuroscience studies, a stimulus is presented to an organism and the response is observed. The Pearson lab wants to be able to dynamically adjust which stimulus they present based on the intermediary results during the experiment.

“A big limitation in neuroscience research is it just has an absurd amount of data, even for a very small organism,” Sprague said. “Even a couple thousand brain cells will provide so much data that it can’t be visualized or analyzed quick enough to adjust the experiment in ways that would improve it.”

As a result of this limitation, they are trying to adapt conventional computational neuroscience methods to work in an "online" fashion, meaning the data are processed as they come in. Ultimately, they are developing methods that condense analyses that traditionally take hours of computation into milliseconds.
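The sketch below illustrates what "online" processing can look like in this spirit (it is not the Pearson lab's code): a running baseline and variance are updated for each cell as frames arrive, and frames that jump well above baseline are flagged as candidate spikes, so no full recording ever has to be stored and reprocessed.

```python
# Toy online spike flagging: update per-cell statistics one frame at a time.
import numpy as np

class OnlineSpikeFlagger:
    def __init__(self, n_cells: int, alpha: float = 0.01, z_thresh: float = 4.0):
        self.mean = np.zeros(n_cells)   # running baseline per cell
        self.var = np.ones(n_cells)     # running variance per cell
        self.alpha = alpha              # how quickly the baseline adapts
        self.z_thresh = z_thresh        # std-devs above baseline that count as a spike

    def update(self, frame: np.ndarray) -> np.ndarray:
        """frame: fluorescence values for all cells at one time point."""
        z = (frame - self.mean) / np.sqrt(self.var)
        spikes = z > self.z_thresh                        # decide before absorbing the frame
        delta = frame - self.mean
        self.mean += self.alpha * delta                   # exponentially weighted mean
        self.var += self.alpha * (delta ** 2 - self.var)  # exponentially weighted variance
        return spikes

# Example: 100 simulated cells, one of which "fires" at frame 50.
rng = np.random.default_rng(0)
flagger = OnlineSpikeFlagger(n_cells=100)
for t in range(100):
    frame = rng.normal(0.0, 1.0, size=100)
    if t == 50:
        frame[7] += 10.0                                  # simulated calcium transient
    spikes = flagger.update(frame)
    if spikes.any():
        print(t, np.flatnonzero(spikes))
```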

“There are a lot of similar problems that computer scientists work on, but they focus on theoretical analyses of types of functions and how mathematical functions work. What’s cool about this is that it’s very applied with the constraints of a biological system and also requires knowledge of multiple disciplines.”

Daniel Sprague

Sprague will continue to apply these skills as he begins working next year as an associate consultant at Bain & Company in San Francisco. He is very interested in the connection between science, tech, and society.

Additionally, he is hoping to learn more about how artificial intelligence and machine learning are used in industry as well as their future directions, ethical dilemmas, and legal considerations. Consulting is becoming an increasingly data-driven industry and Sprague hopes to continue developing his domain knowledge and work with these ideas in an applied setting.

As Sprague prepares to leave Duke he reflects on his time here and the research he has had the opportunity to participate in. 

“One thing I’m grateful for is having the chance to have different experiences but still settle into one lab for two years. Don’t be afraid to get involved early, and don’t feel like you have to stay in the same lab for four years.”

Daniel Sprague

Post by Anna Gotskind

The SolarWinds Attack and the Future of Cybersecurity

Cybersecurity is the protection of computer systems and networks in order to prevent theft of or damage to their hardware, software, or electronic data. While cybersecurity has been around since the 1970s, its importance and relevance in mainstream media as well as politics is growing as an increased amount of information is stored electronically. In 1986, approximately 1% of the world’s information was stored in a digital format; by 2006, just twenty years later, this had increased to 94%.

Cyber hacking has also become more prominent with the advent of the Digital Revolution and the start of the Information Era, which began in the 1980s and rapidly grew in the early 2000s. It became an effective political form of attack for acquiring confidential information from foreign countries.

In mid-December of 2020, it was revealed that several U.S. companies and even government agencies were victims of a cyberattack that began in September of 2019. 

The Sanford School of Public Policy hosted leading cybersecurity reporter Sean Lyngaas to lead a discussion on the national security implications of the SolarWinds hack with Sanford Professor David Hoffman as well as Visiting Scholar and Journalist Bob Sullivan. Lyngaas graduated from Duke in 2007 with a major in Public Policy at the Sanford School.

Lyngaas did not have a direct route into cybersecurity journalism. After completing his master's in international relations at The Fletcher School of Law and Diplomacy at Tufts University, he moved to Washington, D.C. to pursue a career as a policy analyst. However, at night, when he was not applying for jobs, he began pitching stories to trade journals. Despite not being a "super technical guy," Lyngaas ended up becoming passionate about cybersecurity and reporting on the increasing amount of news surrounding the growing topic. Since 2012, Lyngaas has done extensive reporting on cybersecurity breaches, and he recently published several detailed reports on the SolarWinds incident.

Sean Lyngaas

The SolarWinds attack is considered one of the most impactful cybersecurity events in history as a result of its intricacy and the number of government and private sector victims. Lyngaas explained that most people had not heard of SolarWinds until recently, but the company nevertheless provides software to a multitude of Fortune 500 companies and government agencies. One of the software products they sell is Orion, an IT performance monitoring platform that helps businesses manage and optimize their IT infrastructure. The hackers infiltrated Orion's update software and, over several months, sent out malicious updates to 18,000 companies and government agencies. Among the victims of this espionage campaign were the U.S. Justice Department and Microsoft. As a result of the campaign, countless email accounts were infiltrated and hacked.

“A perfect example of someone robbing a bank by knocking out the security guard and putting on his outfit to have access.” 

Bob Sullivan

Sullivan added that this hack is particularly concerning because the target was personal information, whereas previous large-scale hacks have been centered around breaching data. Additionally, SolarWinds' core business is not cybersecurity; however, they work with and provide software to many cybersecurity companies. The attack was revealed by FireEye, a cybersecurity company that announced it had been breached.

“FireEye got breached and they are the ones usually investigating the breaches”

Sean Lyngaas

This situation has prompted both those involved in the cybersecurity industry as well as the public to reconsider the scope of cyberhacking and what can be done to prevent it.

“Computer spying by nation states has been going on for decades, but we talk about it more openly now,” Lyngaas stated.

Lyngaas added that the public now expects more transparency, especially if there are threats to their information. He feels we need better standards for companies involved in cybersecurity. SolarWinds arguably was not using cybersecurity best practices and had recently made price cuts, which may have contributed to its vulnerability. Hoffman explained that SolarWinds had been using an easy-to-guess password for its internal systems, which allowed hackers access to the software update as well as the ability to sign a digital signature.

“We are not going to prevent these breaches; we are not going to prevent the Russians from cyber espionage,” Lyngaas stated.

However, he believes by using best practices we can uncover these breaches earlier and react in a timely manner to reduce damage. Additionally, he thinks there needs to be a shift in government spending in terms of the balance between cyber defense and offense. Historically, there has been a lack of transparency in government cyber spending, however, it is known that there has been more spent on offense in the last several years.

Changes are starting to be made in the cybersecurity landscape that should help reduce attacks, or at least the severity of their impacts. California recently created a law centered around publicizing breaches, which will increase transparency. The panelists added that the increasing amount of news and information available to the public about cybersecurity is aiding efforts to understand and prevent it. President Biden has spoken openly about cybersecurity in relation to protecting the election from hackers and continues to consider it an urgent issue, as it is crucial to protecting confidential U.S. information.

As Lyngaas explained, it is practically impossible to completely prevent cyber attacks, however, through increasing transparency and using best practices, incidents like the SolarWinds hack will hopefully not have effects of the same scale again.

Post by Anna Gotskind

Increasing Access to Care with the Help of Big Data

Artificial intelligence (AI) and data science have the potential to revolutionize global health. But what exactly is AI and what hurdles stand in the way of more widespread integration of big data in global health? Duke’s Global Health Institute (DGHI) hosted a Think Global webinar Wednesday, February 17th to dive into these questions and more.  

The webinar's panelists were Andy Tatem (Ph.D.), Joao Vissoci (Ph.D.), and Eric Laber (Ph.D.), moderated by DGHI's Director of the Research Design and Analysis Core, Liz Turner (Ph.D.). Tatem is a professor of spatial demography and epidemiology at the University of Southampton and director of WorldPop. Vissoci is an assistant professor of surgery and global health at Duke University. Laber is a professor of statistical science and bioinformatics at Duke.

Panel moderator Liz Turner

Tatem, Vissoci, and Laber all use data science to address issues in the global health realm. Tatem’s work largely utilizes geospatial data sets to help inform global health decisions like vaccine distribution within a certain geographic area. Vissoci, who works with the GEMINI Lab at Duke (Global Emergency Medicine Innovation and Implementation Research), tries to leverage secondary data from health systems in order to understand issues of access to and distribution of care, as well as care delivery. Laber is interested in improving decision-making processes in healthcare spaces, attempting to help health professionals synthesize very complex data via AI.

All of their work is vital to modern biomedicine and healthcare, but, Turner said, “AI means a lot of different things to a lot of different people.” Laber defined AI in healthcare simply as using data to make healthcare better. “From a data science perspective,” Vissoci said, “[it is] synthesizing data … an automated way to give us back information.” This returned info is digestible trends and understandings derived from very big, very complex data sets. Tatem stated that AI has already “revolutionized what we can do” and said it is “powerful if it is directed in the right way.”

A screenshot from worldpop.org

We often get sucked into a science-fiction version of AI, Laber said, but in actuality it is not some dystopian future but a set of tools that maximizes what can be derived from data.

However, as Tatem stated, “[AI] is not a magic, press a button” scenario where you get automatic results. A huge part of work for researchers like Tatem, Vissoci, and Laber is the “harmonization” of working with data producers, understanding data quality, integrating data sets, cleaning data, and other “back-end” processes.

This comes with many caveats.

“Bias is a huge problem,” said Laber. Vissoci reinforced this, stating that the models built from AI and data science are going to represent what data sources they are able to access – bias included. “We need better work in getting better data,” Vissoci said.

Further, there must be more up-front listening to and communication with “end-users from the very start” of projects, Tatem outlined. By taking a step back and listening, tools created through AI and data science may be better met with actual uptake and less skepticism or distrust. Vissoci said that “direct engagement with the people on the ground” transforms data into meaningful information.

Better structures for navigating privacy issues must also be developed. "A major overhaul is still needed," said Laber. This includes things like better consent processes so that patients understand how their data are being used, although Tatem said this becomes "very complex" when integrating data.

Nonetheless, the future looks promising, and each panelist feels confident that the benefits will outweigh the difficulties that are yet to come in introducing big data to global health. One cool example Vissoci gave of an ongoing project deals with the impact of environmental change, through deforestation in the Brazilian Amazon, on Indigenous populations. Through work with "heavy multidimensional data," Vissoci and his team have also been able to optimize scarce Covid vaccine resources "to use in areas where they can have the most impact."

Laber envisions a world with reduced or even no clinical trials if “randomization and experimentation” are integrated directly into healthcare systems. Tatem noted how he has seen extreme growth in the field in just the last 10 to 15 years, which seems only to be accelerating.

A lot of this work has to do with making better decisions about allocating resources, as Turner stated in the beginning of the panel. In an age of reassessment about equity and access, AI and data science could serve to bring both to the field of global health.

Post by Cydney Livingston

Invisible No More, the Cervix

How many people have seen their cervix? Obscured from view and stigmatized socially, the cervix is critical to women's, transgender men's, and non-binary folks' health — and potential reproductive health issues. A team formed through Duke's Center for Global Women's Health Technologies (GWHT) has created a device that not only holds immense medical potential but the potential to empower people with cervixes across the globe: It makes visible a previously invisible organ.

Nimmi Ramanujam (Ph.D.), founder of GWHT and Professor of Engineering at Duke University, heads the team. Mercy Asiedu (Ph.D.), Gita Suneja (M.D.), Wesley Hogan (Ph.D.), and Andrea Kim have all been integral members of the interdisciplinary collaboration. Suneja is an associate professor of radiation oncology at the University of Utah School of Medicine and a clinical researcher. Asiedu, a former Ph.D. student with Ramanujam and current postdoc at MIT, was integral to the development of the Callascope.

The Callascope allows women and others who have cervixes, along with health professionals, to perform cervical exams without the traditional examination tools, which are larger, cannot be used for self-examinations, and are often scary-looking.

When Wesley Hogan, director of Duke's Center for Documentary Studies and a research professor, heard about the idea, she was hooked. Andrea Kim graduated from Duke University in 2018. Her senior thesis was a 12-minute documentary focused on the Callascope and its potential uses. Following graduation, over the last two years, she expanded the film to a 50-minute piece titled "The (In)visible Organ" that was screened January 14, 2021. Kim moderated a panel with Ramanujam, Asiedu, Suneja and Hogan on January 28, 2021.

Callascope: A handheld device that can be used to conduct cervical screenings. All that’s needed is a smart phone.

The Callascope addresses a dire global health need for better women's reproductive health. Further, it empowers women as self-advocates of their own gynecological and reproductive health through the reinvention of the gynecological examination. Cervical cells have an "orderly progression," says Suneja; we have a "great idea" of how cells become cancerous over time, "with multiple places to intervene." Cervical examinations, however, are necessary for assessing cervical health and potential disease progression.

Originally from Ghana, Dr. Asiedu was interested in using her engineering skills to develop technology to “improve health outcomes,” particularly in countries like her own, which may lack adequate access to preventative healthcare and could benefit most from Callascope. Many women in underserved countries, as well as underserved areas of the United States, suffer disproportionately from cervical cancer — a preventable disease. 

Dr. Ramanujam, who served as a voluntary test-subject for Asiedu’s Callascope prototypes, says that it’s a really important tool “in actually changing [the cervix’s] narrative in a positive way” — it is an organ “that is indeed invisible.”

The hope is that with more awareness about and use of Callascope, cervical screenings, and vaginal health, cervixes may become more de-stigmatized and cultural norms surrounding them may shift to become more positive and open. Dr. Hogan stated that when Ramanujam pitched her the Callascope idea they were in a public restaurant. Hearing Ramanujam say words like “vagina” and “cervix” loud enough for others to hear made Hogan recognize her own embarrassment surrounding the topic and underscored the importance of the project. 

The project and the team serve as a wonderful example of intersectional work that bridges the sciences and humanities in effective, inspiring ways. One example was the Spring 2019 art exhibit, developed in conjunction with the team’s work, presented at the Nasher Museum which exposed the cervix through various mediums of art.

Multidisciplinary Bass Connections research teams contributed to this work and other interdisciplinary projects focused on the Callascope. Asiedu believes documentaries like Kim's are "really powerful ways to communicate global health issues." Kim, who directed and produced "The (In)visible Organ," hopes to continue exploring how "we can create more cultures of inclusion … when it comes to reproductive health."

A piece of artwork from the (In)visible Organ art exhibit at Duke’s Nasher Museum in the spring of 2019.

Ramanujam emphasized the need to shift biomedical engineering focus to create technologies that center on “the stakeholders for whom [they] really [matter].” It is multi-dimensional thinkers like Ramanujam, Asiedu, Hogan, and Kim who are providing integrative and inventive ways to address health disparities of the 21st century — both the obvious and the invisible. 

Post by Cydney Livingston

Cybersecurity for Autonomous Systems

Over the past decades, we have adopted computers into virtually every aspect of our lives, but in doing so, we’ve made ourselves vulnerable to malicious interference or hacking. I had the opportunity to talk about this with Miroslav Pajic, the Dickinson Family associate professor in Duke’s electrical and computer engineering department. He has worked on cybersecurity in self-driving cars, medical devices, and even US Air Force hardware.

Miroslav Pajic is an electrical engineer

Pajic primarily works in “assured autonomy,” computers that do most things by themselves with “high-level autonomy and low human control and oversight.” “You want to build systems with strong performance and safety guarantees every time, in all conditions,” Pajic said. Assured Autonomy ensures security in “contested environments” where malicious interference can be expected. The stakes of this work are incredibly high. The danger of attacks on military equipment goes without saying, but cybersecurity on a civilian level can be just as dangerous. “Imagine,” he told me, “that you have a smart city coordinating traffic and that… all of (the traffic controls), at the same time, start doing weird things. There can be a significant impact if all cars stop, but imagine if all of them start speeding up.”

Pajic and some of his students with an autonomous car.

Since Pajic works with Ph.D. students and postdocs, I wanted to ask him how COVID-19 has affected his work. As if on cue, his wifi cut out, and he dropped from our Zoom call. "This is a perfect example of how fun it is to work remotely," he said when he returned. "Imagine that you're debugging a fleet of drones… and that happens."

In all seriousness, though, there are simulators created for working on cybersecurity and assured autonomy. CARLA, for one, is an open-source simulator of self-driving vehicles made by Intel. Even outside of a pandemic, these simulators are used extensively in the field. They’ve become very useful in returning accurate and cheap results without any actual risk, before graduating to real tests.
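For a sense of what working with such a simulator looks like, here is a minimal sketch using CARLA's Python API. It assumes a CARLA server is already running on localhost:2000, and it only spawns a vehicle and lets the built-in autopilot drive; the actual assured-autonomy experiments described here would layer attack injection and monitoring on top of something like this.

```python
# Minimal CARLA sketch: connect, spawn one autopilot vehicle, watch it drive.
import random
import time
import carla

client = carla.Client("localhost", 2000)   # connect to the running simulator
client.set_timeout(10.0)
world = client.get_world()

# Pick a vehicle blueprint and a free spawn point, then let the autopilot drive.
blueprint = random.choice(world.get_blueprint_library().filter("vehicle.*"))
spawn_point = random.choice(world.get_map().get_spawn_points())
vehicle = world.spawn_actor(blueprint, spawn_point)
vehicle.set_autopilot(True)

try:
    # Observe for a short while; a real study would inject faults or attacks here
    # and check that the safety guarantees still hold.
    for _ in range(30):
        time.sleep(1.0)
        loc = vehicle.get_location()
        print(f"vehicle at x={loc.x:.1f}, y={loc.y:.1f}")
finally:
    vehicle.destroy()   # clean up the spawned actor
```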

“If you’re going to fail,” Pajic says, “you want to fail quickly.”

Guest Post by Riley Richardson, Class of 2021, NC School of Science and Math

A Computer Scientist Investigating the Source Code of Life

We are all born with defining physical characteristics. Whether it be piercing blue eyes or jet black hair, these traits distinguish us throughout our entire lives. However, there is something that all of our attributes have in common, a shared origin: genes.

Beyond dictating our individual features, genes instruct cells to create proteins that are essential for a variety of processes, from controlling muscle function to managing digestive systems. Despite their importance in the workings of our body, genes can also code for detrimental diseases, such as Huntington’s disease or Duchenne muscular dystrophy.

Raluca Gordân, Ph.D.

These types of diseases are exactly what Raluca Gordân, Ph.D. is battling through her research. She and her group are trying to figure out how to decode the non-coding genome, the DNA apart from protein-coding genes. They are deepening their understanding of the role non-coding areas of the genome play in the expression of the coding genes and the production of proteins.

Gordân, an associate professor in biostatistics and bioinformatics at Duke, said a majority of disease-causing genetic mutations derive from the genome outside of genes.

“That is a huge search space,” she says, chuckling. “Genes only make up about 2% of the genome. If we don’t understand what those non-coding regions are doing, it’s hard to make predictions about what the mutation in those regions would be doing and how to connect that to the development of a disease.”

Gordân recently published a paper, entitled “DNA mismatches reveal conformational penalties in protein–DNA recognition,” which focuses on transcription factors and their exceptional ability to bind to mispaired DNA, misspellings that occur as DNA is copied. During regular replication, nucleotide bases (the building blocks of our DNA) are paired correctly, where adenine pairs with thymine and cytosine goes with guanine. However, when an error occurs during replication, mispairs start to appear, as adenine may pair with guanine instead.
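As a small illustration of what a "mismatch" means at the sequence level (this is not code from Gordân's paper), the sketch below scans a DNA duplex for base pairs that violate the Watson-Crick pairing rules:

```python
# Toy check of a DNA duplex for mismatched (non-Watson-Crick) base pairs.
WATSON_CRICK = {"A": "T", "T": "A", "C": "G", "G": "C"}

def find_mismatches(strand: str, complement: str) -> list[tuple[int, str, str]]:
    """Return (position, base, paired_base) for every non-Watson-Crick pair.

    `strand` and `complement` are the two strands of a duplex, written so that
    position i on one strand is paired with position i on the other.
    """
    assert len(strand) == len(complement), "strands in a duplex must align"
    mismatches = []
    for i, (a, b) in enumerate(zip(strand.upper(), complement.upper())):
        if WATSON_CRICK.get(a) != b:
            mismatches.append((i, a, b))
    return mismatches

# Example: a single replication error, an A paired with G at position 3.
print(find_mismatches("ACGAAC", "TGCGTG"))   # -> [(3, 'A', 'G')]
```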

“Normally, those are mistakes that get repaired by specific mismatch repair pathways but that repair might not happen if one of these transcription factors sits on the replication error and doesn’t allow the repair mechanism to see it,” Gordân explains. “Normally, one would expect the transcription factors not to bind to those errors. But we found that they can bind way better than their actual genomic targets.”

Modeling of the binding between mismatched DNA and transcription factors.

To expand on her computational discovery, Gordân is now following up with a study of transcription factor binding to mismatches in living cells, observing whether they adopt their usual role of regulating gene expression or contribute to the development of mutations.

Gordân’s research is a product of her passion and desire to make change. It also can be attributed to a series of realizations she made during college and inspirational mentors who guided her along the way.

While pursuing her undergraduate degree, Gordân was a purely computer science major, concentrating on cryptography. However, as she was nearing the end of her four years of college, she soon found herself yearning for the opportunity to do more. She began looking into machine learning applications and enrolled in a course based around genetic algorithms which she credits for launching her career path.

At that point, she attained what she describes as her “first taste of genetics” and her interest in bioinformatics was irrevocably piqued. Thereafter, Gordân applied for a PhD at Duke, where she worked with advisor Alex Hartemink investigating transcription factor proteins in regulatory genomics. At Duke, her work was primarily computational.  But with her postdoctoral advisor Martha Bulyk of Harvard Medical School, Gordan was exposed to the more experimental aspects of biology.

Today, she recognizes these experiences as integral to her ongoing research, which requires her to frequently iterate between observational approaches and computational work.

Gordân is acclimating to the newly quarantined world. While she strives to continue her research during the pandemic, it has changed her routine.

“I think what was affected a lot since the pandemic started is the fact that we don’t meet in person,” she says. “A lot of the quick progress was being made when we were in the same physical space and were able to get feedback immediately, with students learning about each other’s results in the lab, in real time. That was replaced with Zoom meetings, where students get to see the other students’ results mainly at lab meetings, weeks or months later. Those continuous discussions that were going on in the lab all the time. We’re missing that.”

Gordân offered some thoughtful parting advice to aspiring computational biologists, like me.

“I was trained as a computer scientist, so I wasn’t really sure about experimental work. But after actually doing the experimental work, I realized how much value there is in doing both,” she said. “You have to pick what you’re strongest at, either the computational or experimental part, but you should not be afraid of the other side.”

Guest Post by Akshra Paimagam, Class of 2021, NC School of Science and Math

