One topic I’ve been tracking for a while is addressed by Stern’s article “Statistical Issues in Forensic Science.” In the past decade there have been many challenges to accepted forensic science, fingerprint analysis among them. The author uses fingerprints to demonstrate how statistical methods could address these problems:
Another concern expressed in the 2009 NRC report is whether the forensic practitioner community has a full appreciation of the role of uncertainty in forensic examinations. For many years (through the early 2000s), it was common for latent print examiners to support a claimed identification by noting that the process they followed had zero error rate. Another popular claim was that the source of a print was identified to the exclusion of all other people that had ever lived or ever would live. Those who work in scientific disciplines and appreciate the role of uncertainty know that such claims are not credible. Recent studies have demonstrated a low but nonzero misidentification error rate for latent fingerprint examiners. In other forensic disciplines, it is common to have examiners testify to a “reasonable degree of scientific certainty.” This language was recently criticized by the NCFS because it does not have a standard definition and might confuse or mislead jurors (NCFS 2016).
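The point about “zero error rate” claims is easy to make concrete. The sketch below is my own illustration, not code from Stern’s article: it uses the exact one-sided binomial upper confidence bound (the familiar “rule of three”) to show that observing no errors in a finite number of proficiency tests is entirely consistent with a small but nonzero error rate.

```python
# Illustration only (not from Stern's article): seeing zero errors in n
# proficiency tests does not establish a zero error rate. Solving
# (1 - p)^n = alpha gives the exact one-sided upper bound p = 1 - alpha**(1/n),
# which is roughly 3/n at 95% confidence (the "rule of three").

def upper_bound_zero_errors(n_trials, confidence=0.95):
    """Exact one-sided upper confidence bound on the error rate
    when 0 errors are observed in n_trials independent comparisons."""
    alpha = 1 - confidence
    return 1 - alpha ** (1 / n_trials)

for n in (100, 1000, 10000):
    print(f"0 errors in {n:>6} tests -> error rate could still be "
          f"as high as {upper_bound_zero_errors(n):.4%}")
```

Even 10,000 error-free comparisons only bound the error rate near 0.03%, which is exactly the distinction between “we observed no errors” and “the process has a zero error rate.”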
Jeffrey T. Leek & Leah R. Jager tackle another much discussed topic in their article “Is Most Published Research Really False?” This has been a concern in many fields for the past several years, and because a lot of the discussion has involved some high-level statistics, I was very glad to find more information from this perspective. The introduction is especially good at laying out some of the possible concerns:
But this system was invented before modern computing, data generation, scientific software, email, the Internet, and social media. Each of these inventions has placed strain on the scientific publication infrastructure. These modern developments have happened during the careers of practicing scientists. Many laboratory leaders received their training before the explosion of cheap data generation, before the widespread use of statistics and computing, and before there was modern data analytic infrastructure. At the same time, there has been increasing pressure from review panels, hiring committees, and funding agencies to publish positive and surprising results in the scientific literature. These trends have left scientists with a nagging suspicion that some fraction of published results are at minimum exaggerated and at worst outright false.
Statistics articles ripped from the headlines? We have one! Dwork et al.’s article “Exposed! A Survey of Attacks on Private Data” offers an introduction to an interesting facet of the privacy discussion, namely how to make statistics from a sensitive dataset publicly useful without compromising the individuals in it:
We focus on the simple scenario in which there is a dataset x containing sensitive information, and the goal is to release statistics about the dataset to the public. These statistics may be fixed in advance or may be chosen by the analyst, who queries the dataset. Speaking intuitively (because we have not yet defined privacy), the goal in privacy-preserving data analysis is to protect the privacy of the individual records in the dataset, even if the analyst maliciously chooses queries according to an attack strategy designed to compromise privacy.
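The idea of maliciously chosen queries is simple to demonstrate. Below is a minimal sketch of a differencing attack; it is my own toy example with made-up names and fields, not code from the survey, but it shows how two innocuous-looking counting queries can combine to reveal one person’s sensitive attribute.

```python
# Toy differencing attack (illustrative only; names and data are made up).
# Each record is (name, has_condition). The curator answers counting
# queries exactly, with no privacy protection.
dataset = [
    ("Alice", True),
    ("Bob", False),
    ("Carol", True),
    ("Dave", False),
]

def count_query(data, predicate):
    """Answer a counting query over the dataset exactly."""
    return sum(1 for record in data if predicate(record))

# Query 1: how many people in the dataset have the condition?
q1 = count_query(dataset, lambda r: r[1])

# Query 2: how many people *other than Alice* have the condition?
q2 = count_query(dataset, lambda r: r[1] and r[0] != "Alice")

# Neither query mentions Alice's sensitive attribute directly,
# but their difference exposes it.
print("Alice has the condition:", (q1 - q2) == 1)
```

Attacks of this flavor are exactly why exact answers to arbitrary queries are dangerous, and the survey walks through far more sophisticated versions of the same idea.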
Suzanne K. Moses is Annual Reviews’ Senior Electronic Content Coordinator. For 15+ years, she has played a central role in the publication of Annual Reviews’ online articles. Not a single page is posted online without first being proofed and quality checked by Suzanne.