Online education faces renewed scrutiny in 2024. The US Department of Education has targeted fully online programs as part of its negotiated rule-making process, while emerging research raises questions about their effectiveness across all types of institutions. For an excellent overview of the regulatory landscape, I recommend the EdTech Newsletter by analysts Phil Hill and Glenda Morgan.
At the center of the research debate is an important new study published in the American Educational Research Journal. The paper, "The Role and Influence of Exclusively Online Degree Programs in Higher Education" by Justin Ortagus, Rodney Hughes, and Hope Allchin (OHA), makes two distinctive contributions to the debate on online education. Unlike previous research that focuses on individual online courses, OHA examine complete online degree programs. More controversially, they conclude that all such programs require stricter oversight, calling for enhanced regulation from both the Department of Education and accrediting bodies.
Results, Recommendations, and Study Design
OHA's central finding is stark: students in fully online degree programs are less likely to complete their bachelor's degree compared to similar students taking at least some face-to-face courses. The researchers emphasize this applies specifically to entire degree programs conducted online, not individual online courses or hybrid programs. In short, OHA's verdict is that exclusively online degree programs are a bad deal for all students across all institution types.
Policy Recommendations
Based on these findings, OHA propose four major policy recommendations:
Mandate transparent reporting of costs and revenues for exclusively online programs
Implement regulations for non-profit institutions' use of online program managers (OPMs)
Require enhanced institutional support services for online students
Establish more rigorous accreditation standards for fully online programs
Study Design
In educational research, establishing true causation is challenging. Most studies identify correlations - for instance, showing that students who take online courses have different outcomes than those who don't. However, correlation alone doesn't tell us whether the online format caused these differences or if other factors are responsible.
Randomized Controlled Trials (RCTs) are considered the gold standard for determining causation. In an RCT, researchers would randomly assign students to either fully online or traditional programs. However, such experiments in education are often impractical, expensive, and potentially unethical - we typically can't randomly force students into different educational paths.
This is where quasi-experimental design comes in. As Joshua Angrist, 2021 Nobel Laureate in Economics, explains: "A good quasi- or natural experiment is the next best thing to a real experiment."1 In OHA's study, the researchers use statistical techniques to create comparable groups of online and in-person students, essentially mimicking what an RCT would do. They match students with similar backgrounds, academic preparations, and other characteristics, with the only major difference being their enrollment in fully online versus traditional programs.
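To make the matching idea concrete, here is a minimal sketch of propensity score matching on simulated data. The variable names, coefficients, and matching choices below are my own illustrative assumptions, not OHA's actual code or variables.

```python
# Minimal sketch of propensity score matching on simulated data.
# Column names (gpa, parent_income, pell_grant) are hypothetical, not OHA's variables.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
n = 5000

# Simulated student characteristics
df = pd.DataFrame({
    "gpa": rng.normal(3.0, 0.5, n),
    "parent_income": rng.normal(60, 25, n),   # in $1,000s
    "pell_grant": rng.integers(0, 2, n),
})

# Simulated "treatment": enrolling exclusively online, which depends on covariates
logit = -2 + 0.5 * df["pell_grant"] - 0.3 * (df["gpa"] - 3.0)
df["online"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# 1. Estimate propensity scores: P(online | covariates)
X = df[["gpa", "parent_income", "pell_grant"]]
ps_model = LogisticRegression(max_iter=1000).fit(X, df["online"])
df["pscore"] = ps_model.predict_proba(X)[:, 1]

# 2. Match each online student to the in-person student with the closest score
#    (nearest neighbor, with replacement)
treated = df[df["online"] == 1]
control = df[df["online"] == 0]
nn = NearestNeighbors(n_neighbors=1).fit(control[["pscore"]])
_, idx = nn.kneighbors(treated[["pscore"]])
matched_control = control.iloc[idx.ravel()]

# 3. After matching, outcomes (e.g., degree completion) would be compared between
#    `treated` and `matched_control`, which now have similar observed characteristics.
print(treated["pscore"].mean(), matched_control["pscore"].mean())
```

The key point of the technique: after matching, the two groups look alike on everything the researcher observed, so any remaining difference in outcomes is attributed to the treatment, provided nothing important was left unobserved.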
Quasi-experimental methods, while powerful, have limitations. Without access to OHA's underlying dataset, it's difficult to fully evaluate how well they controlled for various factors that might influence student success. Additionally, the specific type of quasi-experimental design they use has its own unique constraints and assumptions.
While I offer a “friendly critique” of their methodology later in this post, it's worth noting that quasi-experimental designs have become increasingly respected in education research. They represent one of the major innovations in social science research over recent decades, offering a practical way to study real-world educational interventions while maintaining scientific rigor.
My Approach
In reviewing this research, I follow two fundamental principles: one drawn from a pioneering statistician, the other from best practices in quasi-experimental research methodology. This approach prioritizes simplicity and directness over complexity.
The Detective’s Mindset
John Tukey, one of the 20th century's most influential statisticians, emphasized the importance of understanding data through simple, direct methods before employing complex statistical analyses. He introduced the concept of "Exploratory Data Analysis" (EDA), drawing a parallel to criminal investigation:
The processes of criminal justice are clearly divided between the search for the evidence — in Anglo-Saxon lands the responsibility of the police and other investigative forces — and the evaluation of the evidence's strength — a matter for juries and judges ... Exploratory data analysis is detective in character. Confirmatory data analysis is judicial or quasi-judicial in character.2
In the context of online education research, this means first examining basic patterns: How do completion rates actually differ descriptively? What student characteristics correlate with success or failure? Are there obvious patterns in the data that might explain the results? Just as a detective gathers evidence before forming theories, we need to understand these fundamental patterns before putting forward complex models, including causal explanations.
The Power of Simple Explanations
My second guiding principle comes from best practices in quasi-experimental research methodology. A seminal paper, “The Use and Interpretation of Quasi-Experimental Studies in Medical Informatics,” published in the Journal of the American Medical Informatics Association, emphasizes a crucial question when evaluating quasi-experimental studies:
"Are there alternative explanations for the apparent causal association?"
If alternative explanations exist, particularly simpler ones, they should be considered before accepting more complex causal relationships.
This principle is particularly relevant when analyzing online education outcomes. Before attributing differences in completion rates to the online format itself, we should consider straightforward alternative explanations: Do online programs attract different types of students? Are there fundamental differences in support services? Do time management challenges play a role?
By following these two principles - thorough initial exploration and consideration of simple explanations - we can better evaluate OHA's findings and their policy recommendations. This approach acknowledges that while complexity in research design can be necessary, complexity breeds errors. We should first exhaust simpler explanations before embracing more complex ones.
Asymmetrical Expectations and Accountability
Before examining OHA's findings, we must address a fundamental bias in how online education is evaluated. Robert Ubell, who led successful online programs at Stevens Institute of Technology and NYU's Tandon School of Engineering, makes a crucial observation in his book Going Online:
At the very start, the equation is asymmetrical, with online forced to stand up against the nearly universal acceptance of the schoolroom as the norm. Rarely does anyone ask, “Is the classroom any good?” But it’s the first question raised about digital education. The claims made by those who believe that the traditional classroom is the only proper place for teaching and learning are hardly ever supported by evidence.3
This double standard becomes particularly striking when we consider what learning science actually tells us about traditional instruction. The research on educational effectiveness by leading scholars like Nobel laureate Carl Wieman, Harvard's Eric Mazur, and Carnegie Mellon's Ken Koedinger consistently demonstrates a key finding: passive learning modalities, including traditional lectures, significantly underperform compared to active learning approaches. Their body of research, along with numerous other studies, challenges the assumption that traditional lecture-based instruction should be considered the default standard against which other methods are measured.
Yet this overwhelming body of research evidence rarely leads to calls for increased oversight of traditional programs.
Consider:
Traditional programs with low completion rates face little or no regulatory scrutiny
Lecture-based courses persist despite clear evidence of their limitations
Face-to-face programs rarely need to prove their effectiveness
Traditional institutions aren't required to demonstrate cost-effectiveness or provide “enhanced” student services
This asymmetry in accountability raises a crucial question: Why do we demand extensive evidence of online education's effectiveness while accepting traditional methods without similar scrutiny? As we examine OHA's research, we should keep this double standard in mind and ask whether their policy recommendations reflect an inherent bias toward traditional instruction.
Let’s turn now to the new research on online learning.
New Research: The Problem With Exclusively Online Degree Programs
OHA's research centers on a striking claim: students in fully online programs complete their degrees at notably lower rates than their peers. Their key finding is specific and quantitative:
Results from our analyses suggest that students who enrolled exclusively online were 8.3 percentage points less likely to complete bachelor’s degrees relative to students who did not enroll exclusively online.4
How do OHA demonstrate this? Mimicking an RCT, they set up two groups for comparison:
Group A (online): Students who enrolled in exclusively online degree programs
Group B (not online): Students who took at least some in-person courses
While this comparison raises methodological questions about Group B's composition (which we'll examine later), the central claim is clear: Group A students complete their degrees at much lower rates than Group B students.
Based on these findings, OHA offer candid advice to students:
When students have a choice between enrolling in at least some in-person courses or exclusively online degree programs, our analyses suggest that students should enroll in some in-person courses.5
As stated above, OHA also recommend tough policy changes, including enhanced regulatory oversight of online programs by the Department of Education and stricter accreditation standards for online degree programs.
Let's now examine the research methodology and findings.
Comparison of Student Groups: Online vs Mixed-Mode Enrollment
As stated earlier, OHA’s analysis is built on comparing two groups of students: Group A (online) and Group B (not-online). In this quasi-experimental framework, Group B is positioned as the “control” group, and Group A is positioned as the “treatment” group. The aim is to assess whether the “treatment” of an exclusively online degree impacts the likelihood of degree completion.
A Brief Note on Research Design
The designation of "online degree programs" as a single treatment variable raises an important methodological concern. Unlike a clinical trial where researchers can administer standardized doses of a drug, online education is far from uniform. The learning experience varies dramatically based on multiple factors:
Institutional Factors:
Quality of online learning platforms
Availability and depth of student support services
Program design and structure
Resource allocation and investment
Faculty training and support (e.g., many online courses are taught by adjuncts)
Student Characteristics:
Self-directed learning capabilities
Technology access and proficiency
Work and family commitments
Prior academic preparation
Consider two contrasting scenarios: A student at a well-resourced nonprofit university might access state-of-the-art learning platforms, comprehensive support services, and extensive engagement opportunities. Meanwhile, a student at an under-resourced institution might face outdated technology, limited academic support, and minimal engagement options.
By treating these diverse experiences as a single "treatment," the analysis is likely to mask crucial differences in program quality and institutional support that drive student outcomes. This simplification risks attributing differences in completion rates to the online format itself, rather than to the varying contexts in which online education is delivered.
Understanding the Data and Group Construction
OHA's analysis draws from the 2012-2017 Beginning Postsecondary Students Longitudinal Study (BPS), tracking 22,500 students who started college in 2011-2012. Their comparison of online versus traditional education reveals three additional methodological challenges. Let's examine each, starting with a fundamental concern about sample size.
1. The Sample Size Problem
The study's first major challenge lies in its dramatically uneven group sizes:
Group A (fully online students): 8.8% of sample
Group B (some face-to-face instruction): 91.2% of sample
This 10-to-1 ratio creates several statistical complications. In observational studies like this one, such imbalance can:
Reduce our ability to detect real effects in the smaller group
Lead to biased estimates
Make results overly sensitive to outliers in the smaller group
While researchers have developed methods to address such imbalances, these solutions aren't perfect. They depend heavily on having complete, high-quality data about student characteristics. If important factors are missing or poorly measured, these corrections may fall short. (Note: OHA, to their credit, employ propensity score matching and inverse probability weighting to address the imbalance.)
Moreover, the extreme size difference makes the analysis particularly vulnerable to assumption violations, which I discuss below. Even small departures from statistical assumptions can significantly distort results when groups are this unbalanced. While OHA acknowledge and attempt to address these challenges, the size imbalance casts a shadow over the study's conclusions.
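To make this fragility concrete, here is a minimal sketch, on simulated data rather than the BPS sample, of how inverse probability weights are built and how much effective information survives the weighting when one group is roughly ten times larger than the other. The score distribution and sample size are assumptions chosen only to mimic the 8.8% share reported above.

```python
# Sketch (simulated data, not the BPS sample) of inverse probability weighting
# and why a 10-to-1 group imbalance makes the estimate fragile.
import numpy as np

rng = np.random.default_rng(1)
n = 10000

# Stand-in for estimated propensity scores; mean ~0.09 mimics ~9% fully online enrollment
pscore = np.clip(rng.beta(2, 20, n), 0.01, 0.99)
online = rng.binomial(1, pscore)            # "treatment" assignment follows the score

# Inverse probability weights: 1/p for online students, 1/(1-p) for everyone else
weights = np.where(online == 1, 1 / pscore, 1 / (1 - pscore))

# Kish effective sample size: how much information survives the weighting
ess = weights.sum() ** 2 / (weights ** 2).sum()
print(f"Nominal n = {n:,}, effective n after weighting = {ess:,.0f}")

# A few online students with very low estimated scores receive very large weights,
# so a handful of observations can dominate the weighted comparison.
print("Largest weight among online students:", weights[online == 1].max())
```

The design choice the sketch highlights: the smaller and more unusual the treated group, the more the analysis leans on a few heavily weighted observations, which is exactly where violations of modeling assumptions bite hardest.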
2. The Control Group Problem
The second methodological challenge emerges from an apparent simplicity that masks a complex reality. At first glance, the comparison seems straightforward: Group A students take all their courses online, while Group B students have some in-person instruction. However, this surface-level distinction conceals a crucial problem: Group B (the control group) also encompasses an incredibly diverse range of educational experiences.
To understand this challenge, consider three students who would all be classified in Group B. Rachel, a working mother, takes just one evening class on campus each semester, completing the rest of her coursework online to accommodate her family responsibilities. Marcus, a traditional college student living in the dorms, attends all his classes in person and participates fully in campus life. Between these extremes is Jennifer, who splits her time evenly between online and in-person courses, balancing her preference for face-to-face learning with a part-time job.
These three students have fundamentally different educational experiences. Rachel's campus engagement is minimal, limited to those few hours each week in her evening class. Marcus enjoys full access to campus resources, regular face-to-face interactions with professors and peers, and impromptu study groups in the library. Jennifer navigates between these worlds, creating a hybrid experience that suits her circumstances. Yet in OHA's analysis, all three are lumped together in Group B, their distinct educational experiences treated as equivalent.
The problem becomes even more pronounced when we consider why students choose different combinations of online and in-person courses. These choices often reflect personal circumstances and student attributes that themselves influence academic success. A student with strong time-management skills and reliable transportation might opt for more face-to-face classes, while someone juggling multiple jobs might minimize campus visits. A student with robust family support might choose full-time campus attendance, while a single parent might need the flexibility of mostly online courses.
When we see that Group B students complete their degrees at higher rates than Group A students, what are we really measuring? Are we seeing the impact of in-person education, or are we observing the advantages enjoyed by students whose life circumstances allow them to attend traditional classes? The study's headline finding – an 8.3 percentage point difference in completion rates – may tell us less about the effectiveness of online education and more about the challenges faced by students who need its flexibility.
Statistically, the composition of Group B is likely to lead to a blurring or dilution effect. This occurs when the impact of a treatment is obscured because the control group includes cases that are partially treated. Recall that a student who takes all their courses online except for one in-person course is still classified in Group B. This also matters enormously for policy recommendations. If we attribute completion rate differences to online delivery when they actually stem from student circumstances, we risk implementing policies that address the wrong problem. Instead of focusing exclusively on regulating online programs, we might need to consider how to better support students whose life situations require educational flexibility.
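A toy simulation shows how this dilution plays out numerically. The distribution of online course shares and the completion probabilities below are invented solely for illustration; they are not estimates from the BPS data.

```python
# Sketch of the dilution concern: the "control" group contains students whose
# experience is almost fully online. All numbers are invented for illustration.
import numpy as np

rng = np.random.default_rng(2)
n = 20000

# Fraction of courses taken online, from 0.0 (all in person) to 1.0 (exclusively online)
online_share = rng.choice([0.0, 0.25, 0.5, 0.75, 0.9, 1.0],
                          size=n, p=[0.35, 0.15, 0.15, 0.15, 0.11, 0.09])

# Hypothetical completion probability that declines smoothly with online share
p_complete = 0.65 - 0.20 * online_share
completed = rng.binomial(1, p_complete)

# OHA-style grouping: Group A = exclusively online, Group B = everyone else
group_a = online_share == 1.0
pure_in_person = online_share == 0.0

gap_vs_group_b = completed[~group_a].mean() - completed[group_a].mean()
gap_vs_in_person = completed[pure_in_person].mean() - completed[group_a].mean()
print(f"Gap vs. diluted Group B:        {gap_vs_group_b:.3f}")
print(f"Gap vs. purely in-person peers: {gap_vs_in_person:.3f}")
# Rachel (90% online) is pooled with Marcus (fully in person), so the measured
# contrast is "exclusively online vs. everything else," not "online vs. in person."
```

In this made-up example the diluted Group B understates the contrast with a purely in-person group, and, more importantly, the headline comparison no longer answers the question readers think it answers.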
3. Enrollment Distributions of Group A vs. Group B
The third and perhaps most revealing characteristic of the data, surfaced with Exploratory Data Analysis, concerns institutional distribution. The analysis suggests that the type of institution a student attends, rather than their mode of learning, might explain much of the completion rate gap that OHA attribute to online education.
We can see in Table 2 that Group A (online) is heavily concentrated in for-profit four-year institutions.
This matters enormously because completion rates vary dramatically by institution type. See Table 3. Private four-year institutions achieve a 65% completion rate, while for-profit four-year institutions manage only 22% - roughly a third of their private counterparts.6
This institutional variability has direct implications for degree completion. When we calculate the expected completion rate for Group A based on their enrollment distribution across institution types, we arrive at 33%. This low rate isn't surprising given that more than half of Group A students attend for-profit institutions with their characteristically low completion rates. See Table 4 for the calculation.
But what about Group B? Here we face an interesting gap in the research. OHA don't provide the institutional distribution for Group B. But we can conduct a thought experiment. By constructing a plausible scenario where Group B students are more heavily concentrated in higher-performing institutions - while keeping the completion rates for each institution type constant - we arrive at an estimated 41% overall completion rate for Group B. See Tables 5, 6.
The result is striking: the hypothetical distribution generates an 8 percentage point difference in completion rates between groups - similar to OHA's reported findings. (Note: I constructed Group B’s distribution to illustrate how the same discrepancy can occur independently of the mode of instruction.) This discrepancy in outcomes raises a crucial question: Are we measuring the impact of online education, or are we simply observing the effect of institutional differences?
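For readers who want to see the mechanics of this back-of-the-envelope comparison, here is a tiny sketch. The 65% and 22% completion rates echo the figures cited above, but the public-sector rate and both sets of enrollment shares are placeholders I invented, so the output will not reproduce the 33% and 41% figures from my tables exactly.

```python
# Mechanics of the "expected completion rate" thought experiment.
# Only the 65% and 22% rates echo figures cited above; the public-sector rate
# and all enrollment shares are illustrative placeholders, not OHA's tables.
completion_rate = {"private_4yr": 0.65, "public_4yr": 0.50, "for_profit_4yr": 0.22}

group_a_shares = {"private_4yr": 0.15, "public_4yr": 0.30, "for_profit_4yr": 0.55}
group_b_shares = {"private_4yr": 0.25, "public_4yr": 0.40, "for_profit_4yr": 0.35}

def expected_completion(shares, rates):
    """Weighted-average completion rate implied by an enrollment distribution."""
    return sum(shares[k] * rates[k] for k in shares)

exp_a = expected_completion(group_a_shares, completion_rate)
exp_b = expected_completion(group_b_shares, completion_rate)
print(f"Group A expected completion: {exp_a:.1%}")
print(f"Group B expected completion: {exp_b:.1%}")
print(f"Gap attributable to institutional mix alone: {exp_b - exp_a:.1%}")
```

The calculation is nothing more than a weighted average, which is precisely the point: shifting where students enroll, while holding each institution type's completion rate fixed, is enough to open a multi-point gap.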
To be clear, this analysis doesn't invalidate OHA's research. However, it suggests an alternative explanation that deserves serious consideration. If the enrollment distribution across institution types accounts for most or all of the observed completion rate gap, then the mode of instruction (online vs. face-to-face) may play a much smaller role, or no role at all, than OHA's conclusions suggest.
This is precisely the kind of straightforward pattern analysis that Tukey's exploratory data analysis approach recommends. While anyone with access to the dataset could verify the actual Group B distribution, the broader point remains: before attributing outcomes to online delivery, we must first understand the underlying institutional landscape and student characteristics in which that delivery occurs.
In short, I have shown that a significant portion of the difference between Group A and Group B might be due to institution type, rather than mode of instruction. A more thorough EDA might uncover other alternative explanations.
The Institution Type Dilemma: A Key Methodological Gap
A critical limitation in OHA's analysis lies in their treatment of institutional type - whether students attend for-profit, public, or private institutions. While they acknowledge that online students are more likely to attend for-profit institutions, they don't fully explore how this factor shapes their findings. This oversight could significantly impact their conclusions.
Institution type can influence the analysis in at least two distinct ways:
As a Covariate: Institution type might simply be another factor affecting completion rates, alongside mode of instruction. If so, controlling for it would help isolate the true effect of online learning. Think of this like adjusting for student age or prior GPA - it helps create a clearer picture of how online education itself affects outcomes. Institution type does not appear as a covariate in the study.
As a Confounder: More problematically, institution type might be what statisticians call a confounder - a factor that influences both whether students choose online education AND their likelihood of graduating. For example, for-profit institutions might be more likely to offer online programs AND have lower completion rates for reasons unrelated to online delivery. In this case, controlling for institution type could actually be misleading - it would obscure a fundamental relationship that needs to be examined directly. The lower completion rates at these institutions may be inseparable from their tendency to offer online programs. Institution type also does not appear as a confounder in the study. Indeed, the study makes no explicit methodological distinction between covariates and confounders. This is a significant shortcoming: in causal analysis the distinction is fundamental, because covariates and confounders require different analytical approaches.
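A toy simulation illustrates why the confounder scenario matters. Here the online format has, by construction, no effect at all on completion, yet a naive comparison reproduces a large gap because institution type drives both who studies online and who graduates. Every parameter is invented for illustration.

```python
# Toy simulation of institution type as a confounder: the online format has NO
# effect on completion by construction, yet a naive comparison shows a big gap.
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 50000
for_profit = rng.binomial(1, 0.25, n)

# Institution type drives BOTH online enrollment and completion
p_online = np.where(for_profit == 1, 0.30, 0.05)
online = rng.binomial(1, p_online)
p_complete = np.where(for_profit == 1, 0.22, 0.60)   # no dependence on `online`
completed = rng.binomial(1, p_complete)

df = pd.DataFrame({"for_profit": for_profit, "online": online, "completed": completed})

print("Naive completion rates by mode:")
print(df.groupby("online")["completed"].mean())

print("\nCompletion rates within institution type:")
print(df.groupby(["for_profit", "online"])["completed"].mean())
# The naive gap is entirely an artifact of where online students enroll;
# within each institution type, online and non-online students complete
# at the same rate.
```

This is the scenario the covariate/confounder distinction is meant to catch: when the naive gap and the within-stratum gaps disagree, attributing the difference to the delivery mode is exactly the wrong conclusion.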
To their credit, OHA attempt to address potential hidden biases through sensitivity analysis - a statistical technique for testing how robust findings are to various assumptions. However, their approach has two key limitations:
While OHA conduct Rosenbaum's sensitivity analysis to test how robust their findings are to potential unmeasured confounders, this method examines the magnitude of bias needed to invalidate the results rather than directly addressing whether specific confounders like institution type are properly incorporated. If institution type is a major confounder that isn't properly specified in the main analysis, the sensitivity analysis can tell us how strong such unmeasured confounding would need to be to explain away the results, but it cannot tell us whether institution type specifically is that confounder. OHA’s characterization of the sensitivity analysis is also misleading: “Our sensitivity analysis checked all outcomes and model specifications, revealing that it would be unlikely (emphasis mine) that an observed treatment effect associated with enrolling in an exclusively online degree program was caused by an unmeasured or hidden confounder.” It is misleading because it implies the improbability of a hidden confounder, which is not what Rosenbaum’s sensitivity analysis can assess. Rosenbaum’s method is designed to measure how strong a potential unmeasured confounder would need to be to nullify or significantly weaken the observed treatment effect, not to estimate the likelihood of such a confounder existing. We can think of sensitivity analysis this way: it’s like knowing that only a very strong wind could blow down a house, but not knowing whether there’s actually a hurricane coming. Rosenbaum’s method essentially says, “Whatever hidden factor(s) might be out there would need to be quite powerful to explain away our results.” But it doesn’t tell us whether such factors exist; nor does it help us identify what the confounders might be; nor does it rule out the possibility that several confounders working together could have an equally strong effect.
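For readers unfamiliar with Rosenbaum bounds, the sketch below shows the kind of question the method answers for a matched binary outcome (the standard sign-test version). The pair counts and gamma values are hypothetical; this is not OHA's analysis or their data.

```python
# Sketch of Rosenbaum-style bounds for a matched binary outcome:
# "how strong would a hidden confounder need to be?"
# The pair counts below are invented; OHA do not report these numbers.
from scipy.stats import binom

def rosenbaum_upper_p(discordant_pairs, treated_worse, gamma):
    """Upper bound on the one-sided p-value when hidden bias is at most gamma.

    discordant_pairs: matched pairs where exactly one student failed to complete
    treated_worse:    pairs where the online (treated) student is the non-completer
    gamma:            factor by which an unobserved trait could multiply the odds
                      of being the online student within a matched pair
    """
    p_plus = gamma / (1 + gamma)      # worst-case probability under bias gamma
    # P(X >= treated_worse) with X ~ Binomial(discordant_pairs, p_plus)
    return binom.sf(treated_worse - 1, discordant_pairs, p_plus)

# Hypothetical example: 400 discordant pairs; in 240 of them the online student
# is the one who did not complete.
for gamma in [1.0, 1.2, 1.5, 2.0]:
    print(gamma, round(rosenbaum_upper_p(400, 240, gamma), 4))
# As gamma grows, the upper-bound p-value rises. The method reports how much
# hidden bias it would take to overturn the result -- not whether such bias exists.
```

In this toy example a hidden factor shifting within-pair odds by roughly 1.5 would already be enough to wash out significance, which is exactly the sort of statement the method can and cannot make: a magnitude, not a probability.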
The study relies on what statisticians call the "unconfoundedness assumption" - the idea that after controlling for observed variables, enrollment in online programs is independent of other factors affecting graduation. If institution type is indeed a confounder, this fundamental assumption may be violated. It's like studying whether private tutoring improves test scores without considering that families who can afford tutoring also tend to live in better-resourced school districts. The observed "tutoring effect" might actually capture the impact of overall educational resources rather than tutoring itself.
Understanding Unconfoundedness
The unconfoundedness assumption (also called conditional independence) states that potential outcomes are independent of treatment assignment, conditional on observed covariates. In other words, after controlling for observed variables, treatment assignment should be as good as random. This assumption is crucial for causal inference in observational studies.
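Formally, in the potential-outcomes notation commonly used for this assumption (my notation, not necessarily OHA's), it can be written as

$(Y_i(1), Y_i(0)) \perp D_i \mid X_i$,

where $Y_i(1)$ and $Y_i(0)$ are student $i$'s potential graduation outcomes with and without exclusively online enrollment, $D_i$ indicates exclusively online enrollment, and $X_i$ is the vector of observed covariates.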
This assumption fails spectacularly if there are unmeasured confounders that affect both treatment assignment (selection into online education) and outcomes (graduation). The systematic patterns we observe suggest this may be the case:
Institution type affects both who enrolls in online programs and graduation rates
Student characteristics influence both choice of institution and likelihood of graduation
Institutional resources affect both ability to offer online programs and student support
The Policy Trap
This methodological issue has profound implications for policy recommendations. OHA advocate for stricter regulation of online programs based on lower completion rates. But if institution type drives these differences, their proposed policies might:
Target the Wrong Problem
Increasing oversight of online delivery won't help if the real issue is institutional capacity and support
Resources spent on online program regulation might be better invested in institutional improvement
Create Unintended Consequences
Stricter regulation of online programs might disproportionately burden institutions serving non-traditional students
Higher compliance costs could force some institutions to eliminate online options, reducing access for students who need flexibility
Miss Critical Opportunities
Instead of broad online program regulations, policies could target specific institutional practices
Resources could focus on helping lower-performing institutions adopt successful practices from higher-performing peers
A More Nuanced Approach to Public Policy
Better policy would recognize the interplay between institutional characteristics and delivery mode. This might include:
Differentiated support based on institutional capacity and student needs
Investment in institutional infrastructure alongside online program quality
Focus on specific success factors within institutional contexts
Support for evidence-based practices that work across delivery modes
Without properly accounting for institution type as a confounder, we risk implementing policies that address symptoms rather than causes, potentially making it harder for non-traditional students to access higher education while failing to improve their chances of success.
Conclusion
OHA's research makes an important contribution by examining complete online degree programs rather than individual courses. Their large-scale analysis using national data and quasi-experimental methods raises important questions about student success in online education and provides valuable insights into completion rate differences.
However, their analysis, like many complex studies in education, faces methodological challenges. One key consideration is how to account for institution type - whether a school is for-profit, public, or private - in the analysis. This becomes particularly relevant given that over half of exclusively online students attend for-profit institutions, which have historically lower completion rates regardless of delivery mode.
My preliminary analysis suggests that institutional composition could explain a substantial portion of the completion rate gap that OHA attribute to online delivery. While their sensitivity analysis is thorough, institution type emerges as a potentially important factor that deserves further investigation.
These observations have important policy implications. While OHA's recommendations focus on increased regulation of online programs, a more nuanced approach might consider both delivery mode and institutional factors. Future research could build on OHA's work by examining how institutional characteristics and delivery mode interact to influence student success.
The goal is to extend and refine our understanding of what drives student outcomes in online education, leading to more targeted and effective policy interventions that support student success across all delivery modes.
1. Angrist, Joshua D. "Randomized Trials and Quasi-Experiments in Education Research." NBER Reporter Online, Summer 2003 (2003): 11-14.
2. Tukey, John W. Exploratory Data Analysis. Reading, MA: Addison-Wesley, 1977.
3. Ubell, Robert. Going Online: Perspectives on Digital Learning. Taylor & Francis, 2016.
4. Ortagus, Justin C., Rodney Hughes, and Hope Allchin. "The Problem with Exclusively Online Degree Programs." The Century Foundation. https://shorturl.at/BT4aZ
5. Ibid.
6. The degree completion rates come from references cited in OHA's paper.