A/B Testing at RadiantGraph

RadiantGraph Data Science Team

Why A/B Test?

Did the campaign work, or was it going to happen anyway?

Members schedule appointments, refill scripts, and complete screenings for all kinds of reasons. Without a comparison, there's no way to know what changes were driven by the campaign.

That's what A/B tests are for. Control members continue with the status quo, whether no outreach or standard messaging. Treatment members get the new approach. The difference between the two is the campaign's impact, separated from everything else driving behavior.

In practice:

  • On a voice outreach program, Treatment members get the call and Control members don't, so the gap in follow-through reveals what the call is worth.

  • On an email campaign, Treatment gets the personalized version and Control gets the standard one, so the gap shows whether personalization earns its keep.

The same framework behind clinical trials underpins every RadiantGraph campaign.


Key Steps in the Process

Step 1: Determine the Right Sample Size

An A/B test starts with the question: do we have enough people to get a reliable answer? Running a test with too few participants is like flipping a coin twice and concluding it isn't fair; the result simply isn't meaningful.

To estimate a sufficient sample size, RadiantGraph uses statistical power analysis: a standard technique that helps us determine how large a test needs to be in order to detect a real effect with confidence.

For a deeper dive, see this overview from Statsig or the Wikipedia article on statistical power.

Running a power analysis requires three inputs:

  • Total eligible population: How many people could potentially be reached?

  • Baseline event rate: What percentage of eligible contacts currently take the target action?

  • Expected improvement: How much might that rate improve under the new intervention?

With these three inputs, power analysis tells us the minimum number of participants needed in each group and helps us choose the right Treatment/Control split ratio.

While a 50/50 split is statistically optimal, practical considerations like cost or prior confidence can warrant an unequal split such as 70/30.

For more on sample size calculation, see this guide from Qualtrics.
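As a rough illustration, here is a minimal sketch of such a power analysis in Python using statsmodels. The baseline rate, expected improvement, and 70/30 split below are illustrative assumptions, not figures from a real campaign.

    # Minimal power-analysis sketch; all rates and the split are assumed.
    from statsmodels.stats.power import NormalIndPower
    from statsmodels.stats.proportion import proportion_effectsize

    baseline_rate = 0.08      # current rate of the target action (assumed)
    expected_rate = 0.10      # rate we hope the intervention achieves (assumed)
    ratio = 70 / 30           # Treatment-to-Control split, e.g. 70/30

    # Convert the two proportions into a standardized effect size (Cohen's h).
    effect_size = proportion_effectsize(expected_rate, baseline_rate)

    # Solve for the Control-group size needed for 80% power at alpha = 0.05;
    # the second group is scaled by `ratio`.
    analysis = NormalIndPower()
    n_control = analysis.solve_power(
        effect_size=effect_size,
        alpha=0.05,
        power=0.80,
        ratio=ratio,
        alternative="two-sided",
    )
    n_treatment = n_control * ratio

    print(f"Control: {n_control:.0f}, Treatment: {n_treatment:.0f}")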

Step 2: Assign Treatment and Control Groups

Once the target sample size and split ratio are confirmed, RadiantGraph assigns individuals to groups using one of two approaches:

  • Rule-based assignment: A simple, deterministic method. For example, using the last digit of a member ID or phone number to assign group membership. This is consistent and auditable: a given individual will always land in the same group unless the split ratio changes.

  • Algorithm-based assignment: For more complex scenarios, we can use a stratified approach. This ensures key characteristics (e.g., age or health status) are balanced across Treatment and Control groups, reducing the risk that a confounding variable is driving an outcome. (Both approaches are sketched below.)
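Below is a minimal sketch of both assignment approaches in Python. The member IDs, the age_band stratum, and the 70/30 split are illustrative assumptions.

    # Minimal assignment sketch; IDs, columns, and split are assumed.
    import pandas as pd

    members = pd.DataFrame({
        "member_id": [1004512, 1004513, 1004514, 1004515, 1004516, 1004517],
        "age_band":  ["18-39", "40-64", "65+", "18-39", "40-64", "65+"],
    })

    # Rule-based assignment: the last digit of the member ID decides the group.
    # Digits 0-6 -> Treatment (70%), digits 7-9 -> Control (30%); the same
    # member always lands in the same group unless the split rule changes.
    members["group_rule"] = members["member_id"].apply(
        lambda mid: "Treatment" if mid % 10 < 7 else "Control"
    )

    # Algorithm-based (stratified) assignment: randomize within each age band
    # so both groups end up with a similar age mix.
    def stratified_assign(df, stratum_col, treat_frac=0.7, seed=42):
        assigned = []
        for _, stratum in df.groupby(stratum_col):
            shuffled = stratum.sample(frac=1.0, random_state=seed)
            n_treat = int(round(len(shuffled) * treat_frac))
            labels = ["Treatment"] * n_treat + ["Control"] * (len(shuffled) - n_treat)
            assigned.append(pd.Series(labels, index=shuffled.index))
        return pd.concat(assigned)

    members["group_stratified"] = stratified_assign(members, "age_band")
    print(members)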

Step 3: Run the Test and Measure Results

With groups assigned, the test runs over a defined period. RadiantGraph evaluates performance along two dimensions: a statistical significance check, followed by the business impact metrics.

Statistical significance

We apply well-established statistical tests, such as the chi-squared test, Fisher's exact test, or the two-proportion Z-test, to confirm that observed differences reflect a real effect, not random chance.

These tests produce a p-value. A low p-value (typically below 0.05) tells us the result is unlikely to be a fluke. 
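As an illustration, here is a minimal sketch of those checks using SciPy and statsmodels; the conversion counts are assumed for the example.

    # Minimal significance-testing sketch; counts are illustrative assumptions.
    import numpy as np
    from scipy.stats import chi2_contingency, fisher_exact
    from statsmodels.stats.proportion import proportions_ztest

    # Outcome counts: [converted, did not convert] for each group.
    treatment = np.array([120, 880])   # 12.0% conversion (assumed)
    control = np.array([90, 910])      #  9.0% conversion (assumed)
    table = np.vstack([treatment, control])

    chi2, p_chi2, _, _ = chi2_contingency(table)
    _, p_fisher = fisher_exact(table)
    z_stat, p_z = proportions_ztest(
        count=[treatment[0], control[0]],
        nobs=[treatment.sum(), control.sum()],
    )

    print(f"chi-squared p = {p_chi2:.4f}")
    print(f"Fisher exact p = {p_fisher:.4f}")
    print(f"two-proportion z p = {p_z:.4f}")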

Statistical significance is a sanity check: an important filter, but not the final word on whether a campaign was worth running.

Business impact

The question that matters most is: did the intervention meaningfully move the needle? A result can pass the statistical test while producing an effect so small it has no practical value.

What we care about are concrete outcomes: did more members enroll? Did engagement improve? Did the campaign generate meaningful return? 

Tracking these metrics makes it easy to compare performance across campaigns and experiments over time, giving your team a consistent basis for prioritization.

  • Enrollments (Treatment): Contacts in the campaign who completed the target action (enrollment, registration, appointment, etc.)

  • Enrollments (Control): Contacts not in the campaign who completed the same action

  • Conversion Rate: Enrollments / total contacts

  • Lift (Treatment vs. Control): Difference in conversion rate between Treatment and Control groups

  • Relative Lift: Lift as a percentage of the Control rate

  • LTV per Conversion: Client-defined lifetime value per enrollment

  • Total LTV Generated: LTV per conversion × Treatment group enrollments

  • KPI Targets: Client-defined success thresholds (e.g., >1% conversion rate, >10% lift)
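For concreteness, here is a minimal sketch of the lift and LTV arithmetic behind these metrics; the contact counts and the LTV-per-conversion figure are illustrative assumptions.

    # Minimal metrics sketch; all counts and dollar values are assumed.
    treatment_contacts = 10_000
    control_contacts = 10_000
    treatment_enrollments = 1_200
    control_enrollments = 900
    ltv_per_conversion = 250.0    # client-defined dollar value (assumed)

    treatment_rate = treatment_enrollments / treatment_contacts   # 12.0%
    control_rate = control_enrollments / control_contacts         #  9.0%

    lift = treatment_rate - control_rate        # absolute lift: 3.0 points
    relative_lift = lift / control_rate         # ~33% over the Control rate
    total_ltv = ltv_per_conversion * treatment_enrollments

    print(f"Lift: {lift:.1%}, Relative lift: {relative_lift:.1%}, "
          f"Total LTV: ${total_ltv:,.0f}")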


From Learning to Action

Once the test has run long enough to generate reliable results, it's time to act on what we've learned. The next step depends on what the data shows. There are three scenarios to consider:

  • Treatment outperforms Control: We scale up, shifting more (or all) eligible contacts into the Treatment group to maximize impact.

  • Treatment underperforms Control: We pause or revise the intervention, an equally valuable outcome, since it protects resources and prevents a suboptimal experience from being deployed at scale.

  • Results are inconclusive: We keep the test running and define a clear timeline for re-evaluation, ensuring a decision is made based on sufficient data.


A Shared Commitment to Rigor

Clinical teams don't skip the control group, because guessing is too expensive. Healthcare outreach shouldn't either. Every untested campaign is a guess about what's actually working, and those guesses pile up across quarters.

RadiantGraph brings that same standard to outreach: tests big enough to trust, simple enough to act on, each one sharpening the next.
