Last week, in the case of State v. Nieves, the New Jersey Supreme Court became the first in the country to ban prosecutors from introducing “shaken baby syndrome” or “abusive head trauma” (SBS/AHT) claims in criminal cases, at least where the only claim of abuse is the shaking itself (as opposed to, say, shaking alongside physical blows to the head). The case came close on the heels of the Texas Court of Criminal Appeals pointing to the state’s “junk science” law when it stayed the execution of Robert Roberson, who was convicted of murder in an SBS/AHT case in 2002.
The outcome in Nieves reflects a growing backlash against SBS/AHT claims in criminal cases. While New Jersey’s high court was the first state supreme court to exclude the evidence, the justices in Nieves note that courts in other states have become increasingly critical of such evidence (while also acknowledging that other courts continue to broadly approve it). And while numerous medical groups continue to insist that SBS/AHT is a valid medical diagnosis, a growing literature, especially in biomechanics, has cast doubt, arguing, for example, that the amount of force needed to cause the injuries associated with SBS/AHT would likely cause serious injuries to the neck first.
Given that the criminal legal system is supposed to err in favor of defendants, the skepticism shown by the court in Nieves toward SBS/AHT strikes me as the right way to balance things in light of the growing uncertainty about the claim. More importantly, the Nieves case provides a useful way to highlight at least two broader challenges posed by forensic evidence, and scientific evidence more broadly, in the legal system.
To start, the case is a good reminder that a tremendous amount of the forensic evidence used in criminal cases rests on empirical support that is thin to almost nonexistent.
In 2009, for example, the National Academy of Sciences’ National Research Council released a compelling report on forensic evidence that argued that entire swaths of forensic evidence — such as bite-mark matches, blood-spatter pattern analysis, handwriting comparisons and so on — had little to no empirical validation, no way to calculate things such as false positive or false negative rates and often rested primarily on instinct and intuition. It also found that even more established practices such as fingerprint evidence were likely far less reliable than conventional wisdom suggests.
Although that report did not look at eyewitness testimony, a later one did, and it documented widespread concerns with that evidence, too. According to the Innocence Project, mistaken eyewitness testimony played a role in roughly 70% of all convictions that have later been overturned by DNA evidence. And while DNA evidence itself may be fairly well validated in general, it is only as reliable as the labs that test it, which sadly are not always run well. Even video evidence now faces an existential threat in the form of deepfakes and other artificial intelligence advances.
But perhaps more important, the case highlights the challenge that empirical evidence poses for a legal system that has very few practitioners — lawyers or judges alike — with scientific or statistical training (perhaps under 10% of law students have a STEM background, and only a few law schools provide any statistical training to their students).
In many cases, it is not easy to get clean evidence about the issues the law cares about. It’s simply not possible to run a randomized clinical trial to see what happens when you shake a baby, so we have to rely on indirect approaches. For SBS/AHT, the seminal study wasn’t about humans at all but, rather, about the trauma suffered by sedated monkeys experiencing whiplash at 30 mph. Subsequent studies then extrapolated these results to humans, often looking at nonrandom samples of babies already suspected of being abused.
There are lots of scientific jumps being made here: Are sedated monkeys similar enough to human babies? How comparable is a 30 mph crash to shaking by an adult? Many such jumps may be justifiable, but all raise concerns, and those concerns can be hard for lay judges to parse.
To address this obvious limitation in judicial skill, many states have adopted a rule of evidence called the Frye standard, which calls on judges to generally defer to the consensus views of scientific professions. This reflects an understandable scientific humility on the part of judges but immediately introduces some serious problems.
First, judges have to figure out which groups count as the relevant ones, which was a debate at the heart of Nieves itself: Many medical groups defended SBS/AHT, but biomechanics groups did not, which led the majority to say it wasn’t supported by the professions. (While New Jersey no longer uses the Frye standard, it did when the Nieves case began, so the old rule got grandfathered in.)
There’s also the risk of capture. The National Research Council report on forensics, for example, noted that there was almost no empirical support for “forensic odontology” (bite-mark evidence) but that “a majority of forensic odontologists are satisfied” that the system works. Of course they are! What professional group is going to say “our studies are bunk, and you should stop hiring us”? Frye understandably defers to experts, but those experts have their own vested interests.
In response, the federal government and a majority of the states now operate under the Daubert standard, which empowers judges to take on a much more active gatekeeping role. This can help avoid the risk of capture, and in theory it allows courtroom science to adapt more quickly to cutting-edge research. But that means it also calls on scientifically untrained judges to make tough scientific decisions in high-pressure adversarial fights and allows them to inject their own ideology as well.
The challenge of complex scientific evidence in courts has been a growing problem for years, and the volume of empirical arguments in courts only seems to be accelerating. In Nieves, the New Jersey Supreme Court appears to have gotten things right. But lawyers, judges and law schools still seem far too complacent in adapting to our increasingly data- and quantitative-science-driven world.
John Pfaff
John Pfaff is a professor of law at the Fordham University School of Law. He is the author of "Locked In: The True Causes of Mass Incarceration and How to Achieve Real Reform."