This work designs and evaluates a machine learning pipeline for estimating battery capacity fade, a metric of battery health, on 179 cells cycled under various conditions. It also provides insights into the design of scalable data-driven models for battery state of health (SOH) estimation, emphasizing the value of confidence bounds around the prediction.