Using Data Sets to Settle Cases

Daily Journal

July 6, 2021


By Daniel Garrie and Gail Andler

Suppose you have a consumer class action involving thousands, or millions, of consumers who ingested a tainted supplement. How do you settle a case of that scope? Or a data breach incident that exposed the personal identifying information of thousands of consumers? How can you calculate unpaid wages for hundreds or thousands of employees without examining the payroll records of each individual employee? And if you are not able to settle these cases, how do you present proof at trial drawn from these large data sets?

The discovery and use of large data sets for mediation and trial often take place through surveying, sampling and extrapolating. However, trial courts have been cautioned that statistical methods alone “cannot entirely substitute for common proof.” California Judges Benchbook: Civil Proceedings Before Trial Section 11.29. Extrapolating from existing data to produce new data is common in science and in law.

Extrapolation is the process by which information that is already known (the “sample”) is used to predict the outcome for a larger group. That is, a sample of data is used to make inferences about the larger, general group. For such inferences to be properly drawn from the known facts of the subset to the larger relevant population, it is essential that the sample be statistically valid. This means that the underlying methodology must be designed to yield a representative result. However, this does not necessarily mean that the sample itself has to be representative.
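As a rough illustration of the arithmetic behind extrapolation, the sketch below estimates a class-wide total from a simple random sample and attaches a margin of error. Everything in it is hypothetical: the function name `extrapolate_total`, the simulated weekly unpaid-overtime figures, the class size of 10,000, and the sample size of 400 are assumptions chosen for the example, not figures from any case. It assumes simple random sampling and a normal approximation for a 95% interval, which is only one of several approaches an expert might use.

```python
import math
import random
import statistics


def extrapolate_total(sample_values, population_size, z=1.96):
    """Estimate a population total from a simple random sample.

    Returns (estimated_total, margin_of_error). z=1.96 corresponds to a
    roughly 95% interval under a normal approximation.
    """
    n = len(sample_values)
    mean = statistics.mean(sample_values)
    sd = statistics.stdev(sample_values)  # sample standard deviation

    # Finite population correction: the estimate grows more precise as
    # the sample approaches the size of the whole population.
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    se_mean = sd / math.sqrt(n) * fpc

    estimated_total = population_size * mean
    margin = z * se_mean * population_size
    return estimated_total, margin


# Hypothetical class: weekly unpaid overtime hours for 10,000 employees.
rng = random.Random(0)
population = [max(0.0, rng.gauss(5.0, 2.0)) for _ in range(10_000)]

# Draw the sample at random so that every class member is equally
# likely to be selected.
sample = rng.sample(population, 400)

total, margin = extrapolate_total(sample, len(population))
print(f"Estimated total hours: {total:,.0f} +/- {margin:,.0f}")
```

The margin of error is what lets the opposing party, and the court, judge how far the extrapolated figure can be trusted. A sample that is too small or too variable produces a margin so wide that the estimate is of little use.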

The California Supreme Court discussed the concepts of sampling, extrapolation, and the need for sound methodology in Duran v. U.S. Bank Nat’l Assn., 59 Cal. 4th 1 (2014). In Duran, the trial court was found to have improperly extrapolated the amount of overtime pay from a sample to the class as a whole, where the sample was devised without expert input and the defendant was not given the opportunity to “impeach the model or otherwise show its liability is reduced.” Although the Supreme Court recognized that sampling and surveys can be appropriate means of proving liability and damages, it found problems with the methodology employed.

