What if there were a scientific way to come up with the most interesting sports headlines? With the development of computational journalism, this could be possible very soon.

Dr. Jun Yang is a database and data-intensive computing researcher and professor of Computer Science at Duke. One of his latest projects is computational journalism, in which he and other computer science researchers are considering how they can contribute to journalism with new technological advances and the ever-increasing availability of data.

An exciting and very relevant part of his project is based on raw data from Duke men’s basketball games. With computational journalism, Yang and his team of researchers have been able to generate diverse player or team factoids using the statistics of the games.

Grayson Allen headed for the hoop.

Grayson Allen headed for the hoop.

An example factoid might be that, in the first 8 games of this season, Duke has won 100% of its games when Grayson Allen has scored over 20 points. While this fact is obvious, since Duke is undefeated so far this season, Yang’s programs will also be able to generate very obscure factoids about each and every player that could lead to unique and unprecedented headlines.

While these statistics relating player and team success can only imply correlation, and not necessarily causation, they definitely have potential to be eye-catching sports headlines.

Extracting factoids hasn’t been a particularly challenging part of the project, but developing heuristics to choose which factoids are the most relevant and usable has been more difficult.

Developing these heuristics so far has involved developing scoring criteria based on what is intuitively impressive to the researcher. Another possible measure of evaluating the strength of a factoid is ranking the types of headlines that are most viewed. Using this method, heuristics could, in theory, be based on past successes and less on one researcher’s human intuition.

Something else to consider is which types of factoids are more powerful. For example, what’s better: a bolder claim in a shorter period of time, or a less bold claim but over many games or even seasons?

The ideal of this project is to continue to analyze data from the Duke men’s basketball team, generate interesting factoids, and put them on a public website about 10-15 minutes after the game.

Looking forward, computational journalism has huge potential for Duke men’s basketball, sports in general, and even for generating other news factoids. Even further, computational journalism and its scientific methodology might lead to the ability to quickly fact-check political claims.

Right now, however, it is fascinating to know that computer science has the potential to touch our lives in some pretty unexpected ways. As our current men’s basketball beginning-of-season winning streak continues, who knows what unprecedented factoids Jun Yang and his team are coming up with.

By Nina Cervantes