The Evidence a Jury Will See

Karen Pendergrass

Heavy-Metal Litigation After Palmquist, and What Defense Counsel Can Build Before the Complaint

Two rulings in the space of ten weeks reset the heavy-metal docket for infant and child food. In December 2025, the court overseeing the federal multidistrict litigation excluded five of the plaintiffs’ six general-causation experts under Rule 702. In February 2026, a unanimous Supreme Court vacated a manufacturer’s defense verdict and sent the case back to state court. The first ruling made the plaintiffs’ science harder to admit. The second made the defendant’s preferred forum harder to reach. Read together, they point to the same place: more of these cases will be decided by state juries, on the strength of the documented record each side can put in front of them. This briefing is written for the counsel who will try those cases. It explains what that record now has to look like, what the Heavy Metal Tested & Certified program was built to produce, and where the program defers to your judgment rather than substituting for it.

Two rulings, opposite directions

The defense had a good December. In In re: Baby Food Products Liability Litigation, MDL No. 3101 [1], the court held a multi-day Rule 702 hearing on general causation and then excluded five of the plaintiffs’ six experts [2]. The opinions, the court found, rested on exposure estimates built from litigation-driven hypothetical menus rather than documented real-world consumption, and no published study links baby-food consumption to autism or ADHD. The analytical gap the experts had to bridge, they did not bridge. For the proposition that diet-level heavy-metal exposure causes neurodevelopmental injury, the plaintiffs’ proof is, for now, badly weakened.

The defense had a worse February. In Hain Celestial Group, Inc. v. Palmquist [3], a unanimous Court, in an opinion by Justice Sotomayor, held that a district court’s erroneous dismissal of the in-state retailer did not cure the diversity defect that existed at removal, and so the federal judgment — a defense verdict the manufacturer had already won at trial — had to be vacated. The case returns to Texas state court to be tried again, against both the manufacturer and the retailer. The lesson is forum. The plaintiff’s move is durable: sue the manufacturer, join the in-state retailer that sold the product, and so long as the claims against the retailer are colorable, the case stays in state court. The traditional counter-move — remove, then drop the retailer as improperly joined — is now affirmatively hazardous, because a judgment entered after a wrongful dismissal can be vacated years later with the clock reset to zero. The retailer is no longer a party to be removed and forgotten; it is an anchor defendant, and a private-label program carries that exposure twice.

The synthesis is the part that matters for counsel. The contested ground has shifted. The fight is moving away from “can the plaintiff’s general-causation science come in at all” — where the defense is currently winning — and toward “what does each party’s own documented record show,” in front of a state jury, under state evidence rules, with the retailer in the room. That second contest is not won by an expert at trial. It is won, or lost, by the record the company built years before the complaint.

Why this lands on counsel’s desk before the complaint

Begin with the argument the plaintiff will actually make. It does not depend on proving that one lot was extraordinarily contaminated. It depends on a narrative with three beats: the metals were present, the company knew or should have known, and the company did not do enough. The congressional staff report that opened this litigation wave in 2021 supplied the template when it noted that one major manufacturer had not tested its finished products for heavy metals until 2019 [4]. “They didn’t test” is a sentence a jury understands without an expert’s help, and there is no cross-examination that makes it go away after the fact.

That is the structural problem. A defense to the “knew-or-should-have-known, did-nothing” narrative cannot be assembled after the complaint is served. Evidence created once litigation is in view looks like what it is. The defense has to already exist on the day the complaint arrives, or it does not exist at all. By the time the matter reaches counsel, the most consequential decisions — whether the company tested, how, how often, and who held the data — have already been made or already been neglected.

The evidence that answers it: contemporaneous, independent, continuous

A favorable testing record is not generically useful. It is useful when it has three specific properties, and most companies in these categories satisfy at most one. The table below sets each property against the plaintiff’s argument when it is missing and the defense posture when it is present.

Property	Plaintiff argues, when missing	Defense posture, when present
Contemporaneous	The company assembled this record after the complaint was filed. It is litigation-prepared evidence built to support a defense, not a quality-control record built to detect a problem.	Testing was generated routinely, before any litigation or regulatory inquiry, under a documented schedule. The record cannot have been built for the courtroom because the courtroom was not in view when it began.
Independent	Testing was commissioned, sampled, and selectively reported by the party that benefits from a clean result. A competent expert takes it apart in front of the jury, lot by lot, choice by choice.	Samples were drawn by a party independent of the brand’s production team and analyzed by ISO/IEC 17025-accredited laboratories [5]. Chain of custody is documented; a break in custody invalidates the result.
Continuous	This single passing result is a snapshot. The tested lot was not the lot the child consumed. The favorable result is the anomaly, not the rule.	Testing was repeated on a defined schedule, across lots and across time, against published limits. The product the child consumed sits inside a demonstrated pattern of compliance, not outside a single fortunate data point.

Most certification, and most internal quality testing, produces snapshots. The litigation Palmquist has pushed into state court will be won and lost on the difference between a snapshot and a record. The remainder of this briefing is about what it takes to hold a record of the third kind, and the trade-offs counsel should weigh before advising a client to build one.

What HMTc is, and what it produces

A word on the program itself, because the case above only holds if the certification behind it is real. Heavy Metal Tested & Certified is a voluntary, fee-based certification operated by the Paleo Foundation. It is kept deliberately separate from the Heavy Metal Index (heavymetalindex.com), the curated synthesis of the heavy-metals-in-food literature from which the program’s limits are derived: the Index reports what the literature says, and the certification program sets thresholds that reference it. That separation is structural, and it matters in litigation — the standard is not the certifier’s self-published justification but an independent evidence base the certifier happens to operate, with every claim traceable to a source.

What the program certifies is a finished infant or child food product, tested against per-category limits for eight priority metals through a ten-analyte panel (the eight totals, plus speciated inorganic arsenic, methylmercury, and hexavalent chromium where the product or a reflex trigger requires it). The metals are tiered by toxicology, and the tiering is the defensible part. Lead, cadmium, inorganic arsenic, and methylmercury are Tier 1 — carcinogenic, neurotoxic, or bioaccumulative with no established safe threshold, and subject to zero tolerance for exceedance. Nickel, tin, aluminum, and chromium are Tier 2, with established tolerable intakes and a limited transitional allowance. If a plaintiff’s expert asks why a nickel result received transitional treatment while a smaller lead exceedance triggered probation, the answer traces to the comparative toxicology of the two metals, not to administrative convenience. That traceability is engineered in on purpose.

A brand earns and keeps the mark through a lifecycle, not a single test. It establishes a three-lot baseline, receives a status — A (full compliance), B (transitional), C (probation), D (suspension), or E (revocation) — and then submits to ongoing risk-based surveillance testing. A confirmed exceedance triggers corrective and preventive action; the published limits ratchet tighter over time as a category cleans up, and never loosen. A brand whose baseline does not yet qualify can enter confidentially through the Confidential Remediation Track, discussed below. The limits themselves are set from the published-literature occurrence distribution for each product subcategory and then capped so they can never exceed the lowest applicable government maximum, with the methodology published and reproducible rather than decided case by case. Infant and child food is the first of a planned twenty-three-category framework; the 2026 Program Manual is its first bound volume [6].
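The capping and ratcheting described above can be read as a simple minimum rule. What follows is a minimal sketch under stated assumptions, not the program's published methodology: the function name, the parts-per-billion unit, and the numbers in the usage lines are hypothetical, and the step that derives a candidate limit from the literature occurrence distribution is taken as an input because the briefing does not specify it.

```python
from typing import Iterable, Optional


def capped_limit(literature_limit_ppb: float,
                 government_maxima_ppb: Iterable[float],
                 previous_limit_ppb: Optional[float] = None) -> float:
    """Return a limit that never exceeds the lowest applicable government
    maximum and never loosens relative to a previously published limit.

    All names and units here are illustrative assumptions.
    """
    # Start from the literature-derived candidate and every government maximum.
    candidates = [literature_limit_ppb, *government_maxima_ppb]
    # Ratchet: a previously published limit acts as a ceiling on the new one.
    if previous_limit_ppb is not None:
        candidates.append(previous_limit_ppb)
    return min(candidates)


# Hypothetical numbers, for illustration only.
print(capped_limit(12.0, [10.0, 20.0]))                  # capped at 10.0
print(capped_limit(6.0, [10.0], previous_limit_ppb=5.0))  # held at 5.0
```

Because the result is always a minimum over its inputs, a later revision can tighten a limit but cannot raise it above either the regulatory ceiling or the last published value.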

Now strip the certification framing away and look at the program as an evidence-generation protocol, because that is how it matters in a courtroom.

Testing is independent and continuous by design. Certified products are tested on a defined schedule — monthly for the highest-exposure categories such as infant formula and infant cereal, less frequently for lower-exposure categories — with frequency tied to how much of a product a child consumes relative to body weight. Samples are drawn by a party independent of the brand’s production team, analyzed by ISO/IEC 17025-accredited laboratories, and moved under documented chain of custody, where a break in custody invalidates the result. Results are submitted as structured, machine-readable data, so the testing history exists as an analyzable record rather than a drawer of unconnected PDFs [6].

The decision rules are written for cross-examination. A result is judged against the limit using the laboratory’s expanded measurement uncertainty: a lot passes only if the result plus uncertainty is at or below the limit, and fails only if the result minus uncertainty exceeds it; results in between are borderline and require confirmatory action before any status is assigned [5, 6]. The rule is biased neither toward the brand nor toward the regulator but toward analytical accuracy, which is precisely why it survives a hostile expert. Paired with it is a prohibition the defense bar will appreciate immediately: the program forbids “testing into compliance.” Undocumented resampling, lab-shopping, and selective reporting are data-integrity violations that cost a brand its status. The point of that rule is evidentiary: a defense attorney pointing to a certified SKU’s test history must be able to say the record reflects the program’s actual analytical findings, not a curated subset chosen to reach a predetermined number. The anti-retesting discipline is what makes the record admissible.
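The three-way decision rule stated above can be sketched in a few lines. This is an illustration of the rule as the briefing describes it, not the program's own implementation; the names Verdict and judge_result, and the sample values, are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    BORDERLINE = "borderline"  # confirmatory action needed before any status


def judge_result(result: float, expanded_uncertainty: float, limit: float) -> Verdict:
    """Classify one lot result against a limit using expanded uncertainty.

    All three arguments are assumed to be in the same unit (for example ppb).
    """
    if expanded_uncertainty < 0:
        raise ValueError("expanded uncertainty must be non-negative")
    # Pass only when the whole uncertainty interval sits at or below the limit.
    if result + expanded_uncertainty <= limit:
        return Verdict.PASS
    # Fail only when the whole uncertainty interval sits above the limit.
    if result - expanded_uncertainty > limit:
        return Verdict.FAIL
    # Otherwise the interval straddles the limit.
    return Verdict.BORDERLINE


# Hypothetical values against a limit of 5.0.
print(judge_result(4.0, 0.5, 5.0))  # Verdict.PASS
print(judge_result(4.8, 0.5, 5.0))  # Verdict.BORDERLINE
print(judge_result(6.0, 0.5, 5.0))  # Verdict.FAIL
```

The borderline band is the point of the design: a result whose uncertainty interval straddles the limit is neither credited to the brand nor counted against it until confirmatory testing resolves it.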

Why the limits sit below the FDA’s, and why that helps the defense

A reflexive objection to any private standard is that it is arbitrary — stricter than the government’s numbers for marketing reasons. HMTc’s answer is quantitative and, for counsel, useful. A single day’s consumption at the FDA action level across four common categories — ready-to-feed formula, dry infant cereal, a fruit purée, and a root-vegetable purée — yields roughly 10.4 µg of lead per day, about 4.7 times the FDA’s own Interim Reference Level of 2.2 µg/day for young children [6]. Full compliance with every single-product federal action level still produces aggregate infant exposure several times over the health-based benchmark, because a child eats across categories. That is why HMTc limits sit below the action levels rather than at them.
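The aggregate-exposure comparison is a single division, and a short check makes the multiple easy to reproduce. The sketch below uses only the two figures quoted above; the function name is hypothetical, and because the briefing gives the four-category total rather than the per-category contributions, the total is passed in as one value.

```python
from typing import Iterable

FDA_IRL_UG_PER_DAY = 2.2  # Interim Reference Level for young children, as quoted


def irl_multiple(daily_intakes_ug: Iterable[float],
                 irl_ug_per_day: float = FDA_IRL_UG_PER_DAY) -> float:
    """Sum daily lead intakes (µg/day) and express them as a multiple of the IRL."""
    return sum(daily_intakes_ug) / irl_ug_per_day


# The briefing's aggregate figure for four categories at the FDA action level.
multiple = irl_multiple([10.4])
print(f"{multiple:.1f}x the IRL")  # 4.7x the IRL
```

With per-category intakes in hand, the same function accepts them as separate list entries, which shows how much each category contributes to the total.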

For defense counsel the value is twofold. First, it rebuts the “they just rubber-stamped the regulatory floor” characterization of the certification. Second, the deviation is honestly labeled: where an HMTc limit is tighter than the literature or the regulatory floor, the program records the rationale as precautionary, market-ratcheting, feasibility-driven, or regulatory-alignment, rather than papering over the gap. A standard that states its own reasoning is a standard that holds up when an expert reads it aloud.

The Confidential Remediation Track, and its honest limits

The program’s most litigation-relevant feature is also the one that most requires your judgment. Most brands that need a structured improvement program will not join one that forces immediate public disclosure of elevated results, because public acknowledgment hands plaintiffs’ counsel a factual predicate before any improvement has occurred. The Confidential Remediation Track is the answer: a brand whose baseline does not yet qualify for certification can enroll confidentially — provided every tested lot still meets applicable legal limits, since a regulatory exceedance is a hard stop regardless of status — with enrollment and test data treated as confidential business information, the certification mark withheld until at least one product qualifies, and the whole engagement framed as a good-faith quality-improvement initiative. The intended effect is to convert the plaintiff’s strongest sentence. Instead of “the brand knew and did nothing,” the record reads “the brand knew, enrolled in an independent program before any inquiry, and has third-party-verified documentation of continuous improvement.”

Here is where the program is candid in a way counsel should hold it to, and where it stops short of your role. The Track does not create privilege. Its data is held by the program operator as a service provider and is potentially compellable under subpoena; the operator commits to notifying the licensee and cooperating with its defense, but the testing data carries no attorney-client privilege or work-product protection [6]. The record is admissible for the brand and discoverable by the plaintiff, both at once. The program’s own position is that brands and their counsel must weigh whether that evidentiary posture — contemporaneous, externally verified, good-faith due diligence, generated before litigation was foreseeable — is preferable to the realistic alternative, which for most brands is no documented program at all. It explicitly disclaims being a substitute for a counsel-directed quality program that may carry stronger privilege in a specific matter. That is the genuine strategic question this briefing exists to put in front of you, and it is not one the program can or should answer for your client.

The mark is disciplined against overclaiming

A certification mark is itself a representation, and a representation is something a plaintiff can attack as deceptive. The program is built against that exposure. A brand using the mark is prohibited from claiming “heavy metal free,” “no toxic metals,” “100% safe,” or any language implying the absence of metals or the elimination of risk; the mark certifies testing against defined action levels under ongoing surveillance, and the manual states plainly that it “does not certify safety.” Only fully or transitionally compliant products may bear it, and brands control whether their certified products appear in any public registry. The discipline matters for counsel because the misrepresentation theory in these cases feeds on absolute claims. A program that forbids the absolute claim removes the hook rather than supplying a new one.

The case beyond the courtroom

The litigation posture is the sharpest reason to act, but it is not the only one, and a colleague weighing this for a client will want the rest of the picture.

The first additional driver is regulatory, and it is no longer optional. California’s AB 899 already requires monthly lot testing of baby food and public disclosure of the results; Maryland’s Rudy’s Law requires testing and website disclosure with a sale prohibition above FDA limits; the FDA’s Closer to Zero action levels set the federal direction of travel [6, 7, 8]. A brand selling nationally faces a growing patchwork of testing-and-disclosure mandates. The point for counsel is that the testing record is going to exist regardless — the only question is whether it is generated inside one disciplined program that maps to these obligations, or assembled ad hoc, differently, in each jurisdiction. Certification consolidates a compliance burden the client is already carrying.

The second is commercial. Major retailers increasingly condition shelf space and marketplace listing on heavy-metal testing, and product-liability underwriters are repricing this exposure; a documented third-party program is the kind of thing that moves both a buying decision and a premium. For a client whose growth depends on a national retailer’s private-label or premium tier, channel access alone can dwarf the cost of certification.

The third is the cost of inaction, which is quantifiable and large. The program’s own modeling places the one-year expected loss from a single contamination-triggered event (a recall, a delisting, or a serious lawsuit cluster) at roughly $1.6M to $6.6M for a brand at $5M in revenue, and at $57M to $187M or more for a brand at $500M; defense costs in a major matter routinely exceed $2M, with discovery alone often over $600,000 [6]. Against numbers like these, the question a client should ask is not what certification costs, but what the absence of a documented program is already costing in risk it cannot see. That is the frame to leave a colleague with.

The objections you will raise

This program was built to be read by hostile reviewers, and the manual carries a program-level adversarial read — eighteen attack categories with the program’s responses — that is the right place to stress-test it. The short version of the objections most likely to occur to defense counsel:

Discoverability. Addressed above and conceded honestly: enrolling generates a record that can be compelled. The counter is that for baby food sold in California and Maryland, lot testing is already mandated and the record will exist regardless [7, 8]; the live choice is whether it is generated inside a disciplined program or ad hoc. For a client with no mandatory-testing footprint, this is a deliberate decision to make with counsel, not a default.
Independence / pay-to-play. The standard is set from the published literature, not from certified brands’ own data; the methodology is published and reproducible; and the evidence base (the Heavy Metal Index) is operated as a property architecturally separate from the certifier, so the limits are not the certifier’s self-published justification. The independence is engineered, in part, for exactly this challenge.
AI-assisted standard-setting. The limit values were produced using AI instruments under a named human principal investigator, with cross-vendor verification, and the manual discloses this openly. Whether that disclosure reads to a judge or jury as rigor or as a vulnerability is a question on which your view would be valuable; it is flagged here rather than glossed.
Antitrust / standard-setting. At current scale the risk is low, but as the mark approaches a retailer gating function it warrants the usual voluntary-standard hygiene. Worth naming so it is not a surprise later.
It is not immunity. The program says so itself. It is risk mitigation and an evidentiary posture, not a safe harbor and not a liability shield. Client expectations should be set there.

Raising it with a client, and with a colleague

The register that works with this audience is risk counseling, not salesmanship, and the relationship should be disclosed plainly: a lawyer flagging a program operated by a client or principal must say so, both as a matter of candor and because undisclosed interest is the fastest way to lose a peer’s trust. Guiding a client through certification and the testing-and-disclosure patchwork — California’s AB 899, Maryland’s Rudy’s Law, the FDA action levels, Proposition 65 — is advisory work that expands the relationship rather than shrinking the docket. The ask should be small: a short call, and an introduction.

The following are drafting starting points, to be adapted and sent under your own name and disclosure.

To a peer who defends food brands. [Name] — flagging this peer to peer; disclosure first, I work with the Paleo Foundation, which runs the Heavy Metal Tested & Certified program. After Palmquist pushed these cases into state court and the MDL’s Daubert ruling narrowed the causation fight, I’ve started thinking about third-party certification less as a marketing seal and more as an evidence-generation protocol: contemporaneous, independent, continuous testing that produces the kind of record a state jury actually weighs. It maps to the testing your infant-food clients already owe under AB 899 and Maryland’s law. If any of your matters touch this, happy to share what I’ve learned — fifteen minutes whenever suits.

To in-house or regulatory counsel at a brand. [Name] — disclosure up front: I work with the Paleo Foundation, which operates the HMTc program. Between AB 899, Maryland, and the FDA action levels you’re already building a testing-and-disclosure operation; HMTc standardizes it into one program, with limits tied to the published literature and capped at the regulatory ceiling, and produces an independent, contemporaneous record that supports a due-care narrative if you’re ever challenged. It is documentation and risk mitigation, not a liability shield, and I’d characterize it to you honestly on both counts. Worth a short call, and I can bring their team in.

A short referral. [Name] — flagging the HMTc program for [client]. (Disclosure: I work with the Paleo Foundation, which runs it.) Independent heavy-metal certification, literature-backed limits, useful for both the compliance patchwork and the evidentiary record post-Palmquist. Glad to make an introduction if it’s relevant.

The first step

The program offers a confidential evidentiary-posture review for a brand or retailer: does the current testing record meet the contemporaneous, independent, and continuous test; if it does, what is the path to certification; if it does not, what is the path through the Confidential Remediation Track. For counsel, the more useful framing is the one this briefing has tried to earn: the record that decides a post-Palmquist state-court trial is built before the complaint, or it is not built at all. The brands and retailers in the strongest position three years from now are the ones generating that record today, while it can still be the kind that helps.

Karen Pendergrass is the Standards Architect of the Heavy Metal Tested & Certified program at the Paleo Foundation. She can be reached at karen@paleofoundation.com. This briefing is informational and is not legal advice; the legal characterizations in it are the Foundation’s and are offered for counsel’s independent evaluation.

References

[1] In re: Baby Food Products Liability Litigation, MDL No. 3101, U.S. District Court for the Northern District of California.

[2] Rule 702 / general-causation ruling, In re: Baby Food Products Liability Litigation (N.D. Cal., Dec. 2025) (excluding five of plaintiffs’ six general-causation experts). Counsel should consult the order directly for its exact holdings and any post-ruling developments.

[3] Hain Celestial Group, Inc. v. Palmquist, 607 U.S. ___ (No. 24-724), decided Feb. 24, 2026 (Sotomayor, J.; unanimous; Thomas, J., concurring).

[4] U.S. House of Representatives, Committee on Oversight and Reform, Subcommittee on Economic and Consumer Policy, “Baby Foods Are Tainted with Dangerous Levels of Arsenic, Lead, Cadmium, and Mercury,” Staff Report, Feb. 4, 2021.

[5] International Organization for Standardization, ISO/IEC 17025:2017, General Requirements for the Competence of Testing and Calibration Laboratories; EURACHEM/CITAC and ILAC-G8 guidance on decision rules under measurement uncertainty.

[6] K. Pendergrass, HMTc Infant and Child Foods Program Manual, 2026 Edition, the Paleo Foundation, 2026. doi: 10.5281/zenodo.20270512. Confidential Remediation Track and discovery dynamics: Part 2.4; decision rules under measurement uncertainty: Part 2.6; aggregate-exposure framing and analyte tiering: Part 2.2; program-level adversarial read: Part 5.2.

[7] California Assembly Bill 899 (2023), amending the California Health and Safety Code (baby-food heavy-metal testing and public disclosure).

[8] Maryland “Rudy’s Law,” HB 97 / SB 723 (2024) (baby-food heavy-metal testing and labeling).
