
2. Key Ideas for Evaluating Scientifically-Based Approaches to Literacy Instruction

Kristen A. Munger


This chapter focuses on key ideas for evaluating scientifically-based literacy research, as well as ways these ideas can guide educators in making instructional decisions most likely to bring about benefits to students. The chapter includes an overview of how to recognize which literacy research is scientifically-based and where to find it, how to recognize other valuable forms of research, the special role of experimental design in scientific thinking, and a rationale for monitoring student progress to ensure that literacy approaches promote learning, not just within the context of research but also when implemented in classrooms.

Learning Objectives

After reading this chapter, readers will be able to

  1. describe the difference between empirical and non-empirical research;
  2. explain the role of scientifically-based research in evaluating the effectiveness of literacy-based programs and strategies;
  3. describe the role of all forms of literacy research for informing literacy practices, including how findings from different types of research build upon one another;
  4. discuss characteristics of experimental and non-experimental research, and evaluate how research designs impact the kinds of research questions researchers can address;
  5. discuss the rationale for monitoring the progress of students when a specific instructional approach is selected over another, along with how to use this rationale to make educational decisions.


In 1989, a teacher’s manual by Dennison and Dennison was released containing solutions for improving students’ learning. Essentially, teachers were instructed to show students how to perform certain movements to activate brain regions to promote learning. By following the manual’s instructions, teachers could improve students’ “symbolic recognition for the decoding of written language” (p. 5), “spatial awareness and visual discrimination” (p. 6), “binocular vision” (p. 15), and “creative writing” (p. 29), along with dozens of other skills important to learning. Perhaps you have heard of this program and may have even experienced it yourself when you were a student. It is called Brain Gym (Dennison & Dennison, 1989) and has been immensely popular with educators in many countries since its release.

The purpose of the above description is not to decide whether Brain Gym movements are a good or bad idea. Teaching students various postures or breathing techniques is not likely to cause much harm and may have some benefits, but the point is that many educators purchased and used the program without questioning the claims made by the authors. What evidence is there that “cross crawl sit-ups” enhance “the mechanics of spelling and writing” (p. 13) or that “brain buttons” promote the “correction of letter and number reversals” (p. 25)? There is none, yet even today information for teachers featuring these exact claims is easily located. The most recent edition of the Brain Gym teacher’s manual1 (Dennison & Dennison, 2010) available at is advertised as “now the standard in many schools around the world and is recommended by tutors & teachers & those looking for a more functional satisfying lifestyle & sense [of] well-being” (, n.d.).

At this juncture, you might be thinking something along the lines of, “What is the evidence for these claims about Brain Gym?” This question serves as a compelling segue to a central point of this chapter. Effectively evaluating the extent to which literacy approaches are backed by research is critical in an age of information excess and misleading advertising, as is determining how different kinds of research can inform literacy instruction. No matter how fervently program advertisers, educators, researchers, and the media assert that a program is effective and is backed by research, such claims still require some kind of evaluation of the strength of the evidence. This chapter provides a framework for teachers, administrators, and other educational professionals to better understand ways in which different levels of research support can be used to make decisions about literacy instruction.

Research and Scientific Thinking

The teaching profession has become increasingly aligned with the idea that literacy instruction and literacy materials must be “research-based” or “evidence-based” (No Child Left Behind Act, 2002). Although this alignment does not require that every decision teachers make be backed by research evidence, it does mean programs and approaches to literacy instruction that are linked to credible research findings at least have a strong track record of benefiting students. But locating research on literacy instruction is not always as straightforward as it seems. At the time of this writing, typing the phrase “research-based literacy instruction” into the Google search bar yields well over three million hits. Furthermore, since the terms “researched,” “research-based,” and “evidence-based” are often used interchangeably, even using different search terms does not narrow the field very much. In other words, typing “research-based” into search engines is not an ideal strategy for finding approaches to literacy with a meaningful research base.

In an attempt to clarify which literacy programs and strategies have a strong research base, various policy groups2 began using the term “scientifically-based research.” This term does carry a more precise meaning than “research-based”; however, to fully understand its importance, some background in scientific thinking is required. Although scientific thinking is important in guiding educational practices, it is not always appreciated or understood, partly because scientific thinking may be thought to apply only to hard sciences, such as chemistry, and not to fields like education. In addition, scientific thinking may be underused in education for reasons described by Stanovich and Stanovich (2003):

Educational practice has suffered greatly because its dominant model for resolving or adjudicating disputes has been more political (with its corresponding factions and interest groups) than scientific. The field’s failure to ground practice in the attitudes and values of science has made educators susceptible to the “authority syndrome” as well as fads and gimmicks that ignore evidence-based practice. (p. 5)

Based on these concerns, Stanovich and Stanovich recommend providing educators with background on how to think scientifically about literacy instruction so that the allure of gimmicks and fads is replaced by an understanding of how scientific thinking contributes to the selection and use of educational practices that are most likely to be effective.

Scientifically-Based Research

Scientifically-based research includes a broad array of research methods to answer questions about literacy. For evaluating research evidence specifically about the effectiveness of programs and strategies, it can be helpful to use certain criteria such as those identified in the No Child Left Behind Act (2002), as well as specified in other educational resources (e.g., IDEA, 2004; Stanovich & Stanovich, 2003). First, scientifically-based research is empirical, which means that it involves the systematic gathering and analysis of data, often using measures that are reliable (i.e., consistent) and valid (i.e., they test what they claim). Empirical research includes using observation to address problems in a field rather than exclusively relying on theories or using only logic. Often when people think of the word “research,” they think of the process of reviewing information in existing documents, like what is done to write a research paper in high school or college. While scientifically-based research does involve reviewing existing research, it also involves testing hypotheses using data, and this is one important way that scientifically-based research differs from the broader term “research.” Second, scientifically-based research uses research design strategies that control for factors that can interfere with evaluating a program or strategy’s effectiveness. Later in this chapter, this key component of scientifically-based research will be described in more detail when the characteristics of experimental design are discussed. Third, scientifically-based research is reported in enough detail so others know exactly what was done, and with whom. Last, scientifically-based research must undergo a review by independent experts in the field before being disseminated to research outlets. This scientific review is often called “peer review” and is intended to help maintain quality standards for research that is published in outlets such as academic journals. Despite this quality assurance component, standards for peer review may differ somewhat among academic journals, and therefore, scientifically-based research must not only make it through a peer-review process but also must meet each of the other criteria outlined above.

Where to Find Scientifically-Based Literacy Research

Panel reports

Fortunately, literacy approaches with a strong scientific research base are relatively easy to find because major literacy panels and clearinghouses regularly release summaries of findings, using criteria which are comparable to those outlined above. One of the most influential panel reports to date on school-aged children is the Report of the National Reading Panel (NRP; National Institute of Child Health and Human Development [NICHD], 2000), which serves as a suitable example for how findings from scientifically-based research studies can be combined and made readily accessible to educators. Panel reports are publicly released documents created by literacy experts, who synthesize findings from research designed to evaluate the effectiveness of instruction.

The NRP provided readers with research evidence regarding the effectiveness of five major instructional components of reading (i.e., phoneme awareness, phonics, fluency, vocabulary, and comprehension). Panel members searched available research on these components and analyzed the findings of studies that met the strict criteria of being scientifically-based. In a free summary report, panel members then shared instructional methods and strategies that were found to be most effective, along with suggestions for future research (NICHD, 2000).

Literacy panels typically analyze study findings using a process called meta-analysis. While the particulars of meta-analysis are not needed to understand its value, it is important to know a little about why it is used and how findings are reported. Meta-analysis involves taking the findings from many studies and combining them to draw conclusions. In the case of the NRP, a series of meta-analyses was conducted to draw conclusions about the most effective approaches to reading instruction. You might wonder how findings across multiple studies on an approach to reading can be combined when studies differ in so many ways. The way researchers do this is by converting study findings to a common metric called an “effect size.” When a group who receives an intervention significantly outperforms a group who did not receive it, an effect size reveals how much better one group did over the other. Panels can average effect sizes across studies and compare them to decide which approaches have a greater positive effect on learning. This kind of analysis reveals the strengths (or weaknesses) of instructional approaches across a large collection of scientifically-based studies, rather than basing conclusions on findings from a single study.
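To make the idea of a common metric concrete, the following Python sketch computes one widely used effect size, Cohen's d (the difference between two group means divided by their pooled standard deviation), and then averages effect sizes across studies. The scores and study values are invented for illustration, the function names are hypothetical, and the unweighted average is a simplification: published meta-analyses typically weight studies (for example, by sample size) and may use other effect size formulas.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(intervention_scores, control_scores):
    """Standardized mean difference between two groups, using a pooled SD."""
    n1, n2 = len(intervention_scores), len(control_scores)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(intervention_scores)
        + (n2 - 1) * variance(control_scores)
    ) / (n1 + n2 - 2)
    return (mean(intervention_scores) - mean(control_scores)) / sqrt(pooled_var)


def average_effect_size(effect_sizes):
    """Unweighted mean of effect sizes from several studies (a simplification)."""
    return mean(effect_sizes)


# Invented post-test scores for one small hypothetical study.
intervention = [2, 4, 6]
control = [1, 3, 5]
print(cohens_d(intervention, control))  # 0.5

# Invented effect sizes from three hypothetical studies of the same approach.
print(average_effect_size([0.25, 0.5, 0.75]))  # 0.5
```

Because every study is expressed in standard deviation units, studies that used different fluency or comprehension tests can still be placed on the same scale before they are combined.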

Many examples of meta-analyses are featured in the NRP in which effect sizes were compared across multiple studies. The first major finding in the report related to the value of teaching phonemic awareness3 to young children. Phonemic awareness, as defined in the NRP, is “the ability to focus on and manipulate phonemes in spoken words” (p. 2-1) and typically involves students being taught how to break apart and/or blend sounds in spoken words. When considering the collective findings from 52 studies, the panel found that teaching phonemic awareness to young students resulted in a strong, positive effect4 (d = .86) for improving phonemic awareness outcomes compared to students who were not taught it.

The more an intervention group outperforms a control group, the larger the effect size. When effect sizes are compared across studies to determine which instructional approaches were most and least effective, teachers are then able to know which approaches brought about the most benefits to students. See Table 1 for more information about how to interpret the strength of effect sizes.

Table 1. Interpretation of Intervention Effect Sizes (Cohen, 1988)

Effect Size (d)    Qualitative Descriptor
.20                Small Effect
.50                Medium Effect
.80                Large Effect
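As an illustration of how the benchmarks in Table 1 might be applied, the short Python sketch below maps an effect size to its qualitative descriptor. It assumes that each benchmark acts as a lower bound for its label, which is a common convention rather than something Table 1 states, and both the function name and the label for values under .20 are hypothetical.

```python
def describe_effect_size(d):
    """Map an effect size to the Cohen (1988) benchmark labels in Table 1."""
    size = abs(d)  # the sign shows which group did better, not the strength
    if size >= 0.80:
        return "Large Effect"
    if size >= 0.50:
        return "Medium Effect"
    if size >= 0.20:
        return "Small Effect"
    return "Below Small"  # hypothetical label; Table 1 has no row for this case


# The NRP phonemic awareness finding reported above (d = .86).
print(describe_effect_size(0.86))  # Large Effect
```

Benchmarks like these are rough guides, so a descriptor is best read alongside the effect size itself rather than in place of it.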

In the NRP report, effect sizes are reported for each of the major reading components, including phonemic awareness, phonics, fluency, vocabulary, and comprehension. The information is accessible to anyone who is interested in reading more about scientifically-based literacy practices and represents the most dependable findings available about reading up to the date of this publication.

Another panel report worth exploring is Developing Early Literacy: Report of the National Early Literacy Panel (NELP; National Center for Family Literacy, National Early Literacy Panel, & National Institute for Literacy, 2008). This report includes findings from scientifically-based research relevant to children ages birth to five, as well as useful recommendations regarding approaches to literacy which have the strongest and most reliable effects for very young children.

Sources of scientifically-based literacy research designed to evaluate the effectiveness of literacy programs and strategies, such as panel reports, are vital documents for educators to know about and review. Unfortunately, panel reports take a long time to compile, and consequently, are released only intermittently. After a few years, as new research emerges, the reports can become dated. More current meta-analyses can often be found in academic journals, which are helpful to mine for scientifically-based findings, though readers must carefully review the criteria used to include and exclude studies. These criteria may differ from those used by panel reviewers, and interpretation of meta-analytic findings hinges on the quality of the studies reviewed. Also, because journal article meta-analyses may be written in more technical terms than those provided in panel report summaries, readers of these meta-analyses may find it beneficial to consult with someone with expertise in both literacy and statistics.

Clearinghouse and website outlets

Another way to access scientifically-based research findings is by searching research-based clearinghouses and websites. Unlike many panel reports, findings on research-based websites tend to remain more current, since they are easily updated to include findings from the most recent research in the field. Presently, the What Works Clearinghouse website (WWC; U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, n.d.) is one of the most widely accessed websites containing scientifically-based literacy research findings. Clearinghouses use criteria similar to those used by literacy panelists to review available research on the effectiveness of instructional methods and strategies. WWC provides information about educational programs, policies, and strategies and can be helpful in making instructional decisions. Despite its popularity, some organizations, such as the National Institute for Direct Instruction (NIFDI, 2015), have criticized WWC for ignoring intervention programs that should have been reviewed, for ignoring and/or misinterpreting research findings, and for incorrectly classifying a program’s effectiveness. Other researchers, including Timothy Shanahan (2015), who serves as an advisor to WWC, continue to advocate for the value of WWC because it “serves as a kind of Good Housekeeping seal of approval” (para. 3) in that WWC reviewers have no financial stake in programs that they review. The same cannot always be said of publishers and some researchers. Information about how to navigate the WWC website is presented next; however, because there is not universal agreement across organizations regarding the worth of WWC, it is recommended that educators seek information from a variety of resources to make decisions about literacy programs and strategies they use or may consider using.

For users of the WWC site, topics can be explored by clicking on links to “topics in education,” “publications and reviews,” “find what works,” or other links to recent news or events in education. To “find what works,” users click this tab, which takes them to a menu of options allowing comparisons of interventions frequently used in schools. Users then indicate the domain of interventions of interest to them, in this case literacy, by clicking the “literacy” box. A general list of interventions appears, sorted by literacy outcomes measured. For example, studies that measured intervention effects for alphabetics (i.e., phonemic awareness and phonics) appear, along with an improvement index, an effectiveness rating (an estimation of the average impact of the intervention), and a descriptor of the extent of the evidence supporting the intervention (i.e., small, medium, large). Clicking on underlined terms opens pop-ups that explain what the terms mean and how to use them. Users can also search for a specific program to see if it has been reviewed or examine lists of existing programs for which the clearinghouse has created reports.
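The improvement index mentioned above can be thought of as an effect size re-expressed in percentile terms. The Python sketch below assumes the commonly described conversion, in which the index is the expected change in percentile rank for an average comparison-group student had that student received the intervention, with outcomes following a normal distribution. The function name is hypothetical, and the WWC's own documentation should be consulted for the exact procedure it applies.

```python
from statistics import NormalDist


def improvement_index(effect_size):
    """Expected percentile-point change for an average comparison student.

    Assumes normally distributed outcomes: the average comparison student
    sits at the 50th percentile, and the effect size shifts that position.
    """
    percentile_with_intervention = NormalDist().cdf(effect_size) * 100
    return percentile_with_intervention - 50


print(round(improvement_index(0.0)))   # 0
print(round(improvement_index(0.86)))  # 31
```

Read this way, an effect size of .86 suggests that a student at the 50th percentile would be expected to move to roughly the 81st percentile, under the stated assumptions.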

A search for Brain Gym on the WWC site does not yield any results. This is due, at least in part, to the fact that there are no studies that meet the WWC design standards providing evidence for its use, and because the original claims featured in the teacher’s manual were likely not sufficiently credible for researchers to invest time or money to evaluate them. Conversely, a search for “repeated reading” yields a number of findings of potential interest, including a report regarding the use of repeated reading with students with learning disabilities. According to WWC, there is a small amount of evidence that using repeated reading with students with learning disabilities can lead to improvements in reading comprehension. On the other hand, there is no consistent evidence provided for using repeated reading to improve alphabetic knowledge, reading fluency, or general reading achievement.

Although panel reports and research-based websites can be good options for accessing findings from scientifically-based research, the design standards required for studies to be included on some of the sites, such as WWC, are more rigorous than most academic journal standards, and according to some critics, are too rigorous. Educators will want to keep this criticism in mind, since studies of potential interest may be excluded from these sources because of technical, design, or statistical problems. In cases where research is missing from these sources, it may be useful to search academic journals for additional information about a program or contact publishers directly for research studies. Academic journals are periodicals released monthly or four times per year that include articles related to academic disciplines, such as education. Educational research published in academic journals is typically peer-reviewed (though not always), and these journals usually contain articles related to specific themes, such as literacy teaching and learning, language development, intervention, teacher preparation, educational assessment, and educational policy. College and university libraries provide students and faculty free access to most education journals, and those to which a library does not subscribe can be accessed through a library’s interlibrary loan system. Examples of some of the top academic journals related to literacy include Reading Research Quarterly and the Journal of Educational Psychology; however, there are numerous other high quality journals available internationally, as well as journals that focus on specialized areas of literacy, such as English language learners, reading disabilities, or curriculum.

When you access academic journal articles or information obtained from publishers to determine the research base of a literacy program or strategy, it is important to understand the purpose of the research. For example, if a report or article makes a claim that a program causes a change for one group over another, it is important to keep in mind the criteria for scientifically-based research. Knowing these criteria will help you be wary of advertisers’ claims based solely on opinions, testimonies, and anecdotes. If a report or article summarizes research that is not scientifically-based, the information may still be valuable, as long as researchers do not overstep the claims that can be made. For instance, if researchers were interested in whether maternal education and children’s vocabulary knowledge were correlated, conducting a study could potentially help researchers understand relationships among variables. It would be a problem, however, if the researchers claimed that maternal education causes children to have different levels of vocabulary knowledge.

Although the WWC site is known for its focus on literacy research, visitors to the site can also find reviews on other important topics such as dropout prevention, math and science instruction, and behavioral interventions. Table 2 provides links to selected clearinghouse summaries, panel reports and other guiding documents useful for locating the research base for many literacy approaches currently used in schools. The link titled Reviews of Collections of Programs, Curricula, Practices, Policies, and Tools: Evaluated According to Evidence, compiled by Smith-Davis (2009), is a collection of dozens of meta-analyses at all grade levels and is particularly beneficial to explore.

Because this chapter is centered on scientifically-based literacy research, it is this type of research that gets the most coverage, but there are many other types of literacy research that inform literacy education. The benefits of a wide variety of literacy research will be discussed a little later in this chapter, as well as throughout the chapters in this textbook, where a broad array of research methods are featured.

Table 2. Links to Selected Clearinghouse Summaries, Panel Reports, and Meta-Analyses

Reports
Reviews of Collections of Programs, Curricula, Practices, Policies, and Tools: Evaluated According to Evidence
What Works Clearinghouse
Report of the National Reading Panel: Teaching Children to Read (Reports of the Subgroups)
Developing Early Literacy Skills: A Meta-Analysis of Alphabet Learning and Instruction
Developing Early Literacy: Report of the National Early Literacy Panel
Writing to Read: Evidence for How Writing Can Improve Reading

The Importance of Experimental Design in Literacy Research

A close examination of literacy programs and strategies classified as scientifically-based shows a heavy reliance on experimental design when the goal of the research is to investigate effectiveness. Experimental design5 typically involves researchers randomly assigning participants to intervention and control groups. In schools, the word “intervention” often refers to services provided to students experiencing some sort of academic difficulty, but in experimental education studies, the term intervention is typically used more broadly to signify that a group of deliberately selected participants, who may or may not be at risk for literacy-related difficulties, will receive some new instructional component or strategy to test its effectiveness.

In experimental literacy research, participants in an intervention group receive an intervention, while participants in a control (or comparison) group receive another competing intervention or sometimes no intervention at all. Although the use of random assignment may sound like technical nonsense, appreciating its importance is critical for understanding differences among literacy studies.

To illustrate the advantage of a researcher using experimental design for evaluating the effectiveness of an intervention, consider this example. Suppose as a researcher, I want to determine whether a repeated reading intervention causes students to read more fluently. To answer this question, I take a group of 100 children, measure their reading fluency, add a repeated reading intervention to their instructional day, and then assess my outcome. I find that my participants’ reading fluency increased considerably. Can I conclude that my repeated reading intervention was effective because it caused the improvement? Unfortunately, no. My students were still receiving their regular ELA instruction—maybe this caused the gains I observed. Maybe the students’ reading fluency improved as a result of having more reading experiences everywhere—not just what they received during my intervention. These potential competing explanations could easily account for the increases in reading fluency I detected, and because my design did not have a comparison group that allowed me to rule out these competing explanations, my study is seriously flawed for assessing the effectiveness of repeated reading.

Knowing that I neglected to include a comparison group, I try again. This time, I provide repeated reading to half of the students, while the other half of the students read with a partner. Now I have two competing interventions, and if I deliver them with the same frequency and duration, I have definitely improved my design. Then, I test both groups after delivering the interventions and find the repeated reading group scored significantly higher than the partner reading group. Can I finally conclude that my repeated reading intervention caused the differences? Unfortunately, no (again). There is every chance that my groups started out with unequal reading fluency even before the interventions were delivered (and were probably unequal in other ways, too). Since I gathered no data on the students prior to intervention, an important competing explanation simply cannot be ruled out—my repeated reading group may have started out higher in reading fluency. So, I still cannot conclude much of anything about the effectiveness of my intervention.

Finally, I decide to randomly assign my participants to groups—repeated reading and reading with a partner. This time I decide to use experimental design, and if my groups are sufficiently large, I have likely eliminated all meaningful differences between groups before delivering my interventions. Although even random assignment does not guarantee with 100% certainty that groups will be equivalent,6 it is the best method available to create equivalent groups from a research design standpoint. By using random assignment, my groups of students should be statistically equal according to age, gender, preintervention reading fluency ability, cognitive abilities, and socioeconomic status. This time, if I find statistically significant differences in favor of the repeated reading group, I may be able to reasonably conclude that the repeated reading intervention plays a role in causing the differences, if other scientifically-based criteria are also met.
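The logic of random assignment in the example above can be sketched in a few lines of Python. The sketch below invents baseline fluency scores for 100 hypothetical students, shuffles the students, and splits them into two groups of 50. All names, scores, and seed values are assumptions made for illustration, not data from any study.

```python
import random
from statistics import mean


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Invented preintervention fluency scores for 100 hypothetical students.
score_rng = random.Random(0)
baseline = {f"student_{i:03d}": score_rng.gauss(90, 15) for i in range(100)}

repeated_reading, partner_reading = randomly_assign(baseline, seed=1)

# With random assignment, the two baseline means should usually be close,
# although equivalence is likely rather than guaranteed.
print(mean(baseline[s] for s in repeated_reading))
print(mean(baseline[s] for s in partner_reading))
```

Because chance alone decides who lands in each group, any characteristic of the students, measured or unmeasured, is expected to balance out across groups as the sample grows, which is what allows a later difference to be attributed to the intervention.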

It is also recognized that sometimes, intervention studies that do not use random assignment may be of sufficient quality to still be considered scientifically-based. Research that tests the effects of an intervention but does not include random assignment to groups is often referred to as “quasi-experimental” research and is used when it may not be possible, practical, or ethical to randomly assign students to groups. When researchers use quasi-experimental design to assess the impact of instructional strategies, they must also provide evidence that the groups are equivalent prior to intervention, or if the groups are not equivalent, that statistical adjustments were made to remove preintervention differences. These statistical controls are why some quasi-experimental designs may also be classified as scientifically-based. WWC includes some quasi-experimental studies in its reviews but classifies them as meeting design standards “with reservations.” This more tentative classification is used because research that does not include random assignment to groups does not meet the same standard as experimental design.

Another thought to keep in mind when reviewing scientifically-based studies is to whom the findings actually apply. This decision requires knowledge about the characteristics of the participants who were included in the studies, since an intervention found to be effective with one participant sample may or may not carry over (i.e., generalize) to other students. If the participants in the study were mostly White students from high income neighborhoods, will this “effective” intervention still result in significant gains with a more diverse population of students? Analogous to this situation is when in the not-too-distant past, medications that were tested only on men were commonly prescribed for women and children because it was assumed that positive effects were universal rather than sample specific. While findings from studies including only men may generalize to women and children, in reality, we cannot actually know a drug’s effectiveness for these populations until we have actual evidence from studies that include women and children. The same logic applies when reviewing findings on scientifically-based literacy approaches: always ask, “To whom do these findings actually apply?”

Additional Forms of Research

Unlike findings from experimental studies, results from other types of studies are not usually used to determine the effectiveness of an instructional approach to literacy, although they are often a first step in the search for effective approaches. For example, simple correlational studies, which investigate relationships between variables, can be helpful to researchers who make decisions about what variables to test for effectiveness, but correlational findings are especially difficult to interpret when trying to assess cause and effect. For example, although there is a measurable relationship between owning a passport and not being diagnosed with diabetes (Wade, 2011), it does not follow that owning a passport prevents diabetes. When other potential explanations are considered, such as the fact that people with higher incomes are less likely to be obese, can afford healthier food, can obtain better medical care, and can afford to travel abroad, it becomes clear that the relationship between passports and diabetes does not mean that one causes the other. In other words, even though a lack of passport ownership and having diabetes are statistically related, they are not causally related. Using this reasoning, simple correlational research does not do a good job of evaluating cause and effect, in part, because alternative causes of an effect are not well controlled.

On the other hand, more sophisticated correlational research where important variables are controlled can help researchers focus in on possible causes. For example, think about a researcher who wants to understand the nature of the relationship between the number of books children have in their home and school achievement. Is the relationship between books and achievement accounted for by how much families read with their children, or perhaps the number of books is simply related to family income? Is it possible that families with higher incomes not only can afford more books but also can afford a host of other learning opportunities for their children? Perhaps having more books in the home is simply an indicator of learning opportunities that lead to achievement rather than the books being the cause. What a researcher can do is look at the relationship between books in the home and student achievement, while removing the effects of income (i.e., controlling for this variable). If the books are not having any effect, then the relationship (i.e., correlation) between books and achievement will disappear. The researcher still does not know exactly what is influencing achievement but may be able to rule out a possible cause. Thus, a campaign to provide lower income families with lots of books may be valuable for a number of reasons; however, it may be off track if the goal is to increase student achievement, depending on the findings of the study. This example shows how more sophisticated correlational research can help inform literacy program and strategy decisions.
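One simple way to "remove the effects of income" in the example above is a partial correlation, which estimates the relationship between two variables after the linear influence of a third has been taken out of both. The Python sketch below uses the standard first-order partial correlation formula. The three correlation values are invented, and the function and variable names are hypothetical.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


# Invented correlations for illustration only.
r_books_achievement = 0.48
r_books_income = 0.80
r_achievement_income = 0.60

controlled = partial_correlation(
    r_books_achievement, r_books_income, r_achievement_income
)
# With these invented values the result is effectively zero.
print(round(controlled, 2))
```

In this made-up case, the apparent link between books and achievement vanishes once income is controlled, which is the pattern the paragraph describes. A partial correlation that stayed well above zero would instead leave books as a candidate worth testing experimentally.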

Other research methods, such as qualitative research, are also useful for informing literacy practices. Qualitative research frequently involves observing and interviewing participants to obtain detailed accounts of their personal experiences and perspectives. These methods meet the criteria for being classified as empirical research, since observations are gathered and analyzed, but they are not necessarily used when the goal is to evaluate the overall effectiveness of programs or strategies. In qualitative studies, researchers do not focus on testing hypotheses and typically do not try to exert strict control in their studies. These less rigid characteristics of qualitative research are desirable for exploring literacy processes that are unmeasurable or are very complex (e.g., exploring how students make meaning of texts based on their own cultural backgrounds and experiences). Like correlational research, qualitative research can provide valuable information toward the formation of new theories and can inform literacy instruction by exploring what may be effective, and additional research that is more controlled can help determine what is effective.

It is important to note that some people take issue with classifying research according to narrower definitions of scientifically-based research. If research is not recognized as scientifically based, it may be mistakenly thought of as not being valuable, especially by individuals who lack background in how different research strategies inform literacy teaching and learning. For example, some people may assume that quantitative research involving hypothesis testing is always more valuable than other forms of inquiry; however, since all forms of literacy research build upon the findings from other studies, all forms of research have potential value when they serve to refine future research to inform literacy practices. To avoid the trap of demeaning certain types of research, it is best to keep in mind the actual goal of the research when determining its value. To assess cause and effect, experimental studies are most useful, whereas to understand complex literacy perspectives and processes, a wider variety of methods is used. Both the quality and value of research must always be tied to questions the research was designed to answer, and the extent to which it answers them, rather than tied to a superficial sorting based solely on the methods used.

Approaches with No Research Base

After gaining an understanding of different types of literacy research, it makes sense to also reflect on literacy approaches that have no research base. While it may seem unacceptable to even consider using a literacy program or strategy that has no research supporting it, a lack of a research base may simply indicate an approach is too new to have allowed anyone to complete a well-designed study, or an approach may be too indistinct or too narrow to capture any researcher’s interests. For example, it is unlikely that a scientifically-based research study will ever compare whether one storybook is better than another for teaching children about penguins. Keep in mind that research designed to evaluate the general effectiveness of literacy programs and strategies is time consuming and expensive, so many decisions teachers make will not have this kind of research base supporting them.

It is also possible that an approach to literacy instruction is not worth the time and resources to evaluate because it is so unlikely to have a meaningful effect, based on what is already known about literacy from other research. In addition, certain approaches to literacy may appear not to have been researched, but this may be because the term someone is using to search for the approach is not the same term used by researchers. Each of these possibilities is explained in further detail below.

Approaches which are too new

An example of literacy instruction that may be too new to have garnered research evidence might involve a recently released intervention program. When the research base of educational claims is unclear, it is useful to refer to strategies in Daniel Willingham’s work, including his brief article Measured Approach or Magical Elixir? How to Tell Good Science from Bad (2012a), and his book When Can You Trust the Experts? How to Tell Good Science from Bad in Education (2012b). In both of these resources, Willingham recommends using a series of shortcuts to strip down claims and evaluate their credibility.

Consider this claim: “Over half of our children are not achieving literacy goals and need your help! Research shows that using Program X will not only help your students achieve but also help you become a more effective teacher. Act now to restore the love of learning to your students.” What Willingham (2012b) recommends first is to strip the claim down. In the above example, this involves removing any emotion from it. For example, the terms “help” and “effective teacher” are used to influence emotions, and emotional reasoning is not a strong foundation for making educational decisions. Who wants to help children and become a more effective teacher? All of us, but whether a program can succeed in meeting these goals is a matter of evidence, not emotion.

Second, Willingham recommends tracing the claim to the source to see if it is simply an appeal to an authority or if there is evidence for it. Is the claim associated with someone held in high status only or is there an actual basis for the claim in terms of credible evidence? Willingham discusses What Works Clearinghouse (WWC) as an “attempt to solve the problem of authority in education” (p. 181) by distributing information based on scientific evidence. Willingham also reinforces the point that the standards used to determine which studies are included are so strict for WWC that other valuable studies may be overlooked.

Willingham’s third step in evaluating educational claims involves analyzing the quality of evidence, using criteria related to scientific thinking, as well as the reader’s own personal experiences as reference points. Willingham’s fourth step involves bringing all of the evidence together to make a decision, while using other strategies to keep our personal belief systems from biasing our views. For example, we may view weak evidence as being strong because it is consistent with our beliefs, or we may dismiss strong evidence as weak because it goes against them. In addition to Willingham’s work, the publication by Stanovich and Stanovich (2003) discussed earlier in this chapter titled Using Research and Reason in Education is also foundational for developing scientific thinking related to research evidence. It may be useful to seek information from multiple sources, and reflect on the different examples and perspectives provided in each, to learn more about how to evaluate educational research.

Approaches which are too narrow

Some approaches to literacy instruction may not have been researched because they are too narrow to capture a researcher’s attention or are an isolated strategy that would be unlikely to have a detectable effect in itself. An example might be using a particular hand signal to get students’ attention during ELA instruction. This strategy would not necessarily be researched because it would be difficult to justify a full-blown scientific investigation into an isolated hand signal compared to the potential value in investigating a new reading intervention. The lack of formal research does not mean that the hand signal should not be used but simply that the teacher is tasked with determining whether it helps to meet his or her objectives within the classroom.

Approaches which lack validity

Still other approaches may lack research attention because claims attached to them are highly unlikely to be valid, based on what is known from previously published literacy research. For example, some of the claims associated with the early edition of Brain Gym fall into this category. It just does not make sense for researchers to invest time and resources determining whether movements like “brain buttons” promote the “correction of letter and number reversals” (Dennison & Dennison, 1989, p. 25), since there is no convincing reason this might be the case.

Approaches which use different terms

Finally, a literacy strategy may not appear to have a research base because key search terms are inconsistent. For example, a search for “whispa phones” (the familiar name used to describe bent pieces of plastic pipe students can hold up to their ear to hear themselves whisper read without disturbing others) yields some hits when entered into the Google search bar but no hits when entered into the Google Scholar search bar. On the other hand, a search for “whisper phones” reveals at least three dozen mentions on Google Scholar, though most of these mentions appear unrelated to any investigation of their effects. When a literacy strategy or program is merely mentioned in academic journal articles, this can sometimes mislead individuals into thinking it has been researched. Closer inspection may reveal that an article includes only commentary on the use of the strategy rather than evidence of its effects. Many articles published in academic journals or scholarly books on literacy teaching that are not empirical may still provide a synthesis of other empirical work or may offer theories that are valuable to test. Non-empirical articles are often extremely valuable for explaining the logic behind why an approach to literacy teaching should work, but in the absence of empirical evidence, even well-developed theories provide only limited support for a literacy approach and would definitely not be classified as a scientific research base.

When there is no apparent research evidence for a literacy approach, it is important to be cautious, flexible, and open to modifying or even discontinuing the approach if new and/or clarifying research emerges. This is a judgment call, and using scientific thinking can increase the likelihood of making decisions that will benefit students. Making instructional decisions based on research can be complicated, especially when publishers promote their products using language designed to provoke trust in claims (e.g., “Mrs. Smith found our program helped all of her students achieve more! So buy our research-based program today!”). It is important to be skeptical of any claims that do not have credible evidence to back them up.

In reality, many commercially available programs do not have peer-reviewed research backing them, and therefore, a search for them in academic journals, on clearinghouse sites, or in panel reports may not yield any results. For example, there is a much deeper research base for some areas of literacy teaching and learning compared to others, so educators seeking information are much more likely to find it on reading programs compared to writing programs. Even if specific programs have not been directly researched, many of the strategies found in commercially available programs have been researched and can be found in each of the sources listed at the beginning of this paragraph. For example, a particular phonemic awareness intervention program may not have been researched, but its scope, sequence, and delivery may be nearly identical to other programs that have been found to be highly effective in scientifically-based studies. In cases like this, findings from the researched program may also apply to the non-researched program when both programs are used under similar conditions (e.g., similar group sizes, with students in the same grade, etc.).

The Importance of Monitoring Student Progress

Although teachers and administrators who access panel and clearinghouse reports and who use criteria in Willingham (2012b) are well-positioned in the selection and/or delivery of scientifically-based literacy instruction, an additional step is still needed to make instructional decisions. It is essential to make sure that the instruction used is effective in practice, not just in research studies. To differentiate students who respond well to a literacy approach from those who do not, progress-monitoring strategies help with decisions related to whether to continue to use a literacy approach. For detailed information about progress-monitoring strategies, please see Chapter 5 in this textbook.

To illustrate the rationale for progress monitoring, consider this analogy. I am not feeling well, and the doctor I see determines that my blood pressure is elevated to the point where medical intervention is needed. She prescribes an anti-hypertensive drug, and when I inquire about research related to its effectiveness, she shares a brochure with me featuring findings from a number of well-designed scientific studies. The studies clearly demonstrated that the medication not only lowered participants’ blood pressure significantly more than a placebo (i.e., sugar pill) but also outperformed the two leading anti-hypertensive drugs currently available. Additionally, the participants in the drug trials were quite similar to me, making it reasonable to predict that the drug may, in fact, work for me, too.

The doctor then happily writes me a prescription for a three-month supply of the drug and lets me know that she will gladly refill the prescription over the phone, as needed. She assures me that I do not need to come back to check my blood pressure or for a physical examination, since the medication should be effective, based on the research. Suddenly, I feel uneasy, and this time, it is not my blood pressure that is causing it. While I understand the logic of her prescribing the drug, I do not understand her rationale for just letting me take it without checking in to see if it is actually working for me. Making an assumption that drug trial study findings automatically apply to me is dangerous.

Similarly, even if there is a strong, scientific foundation for an approach to literacy, this does not guarantee that the approach will work for all students. Even if a scientific study shows an intervention to be effective overall, it is likely that some students in the study who received the “effective” intervention did not respond well. These details are not usually reported in studies, not because the researchers are withholding vital information but simply because the motivation for an intervention study typically is to evaluate overall effectiveness, not effectiveness for individual students. So predictions about how individual students are likely to respond to a literacy approach are often rooted in scientific research, whereas determining how students actually respond to the approach is rooted in school-based progress-monitoring strategies.
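
The gap between overall effectiveness and individual response can be illustrated with a short sketch. Everything in it is hypothetical: the student labels and gain scores are invented for illustration, and the rule used to flag students (a gain below the comparison group's average gain) is an assumption made for this example, not a recommended progress-monitoring procedure. The effect size is computed as a Cohen's d with a pooled standard deviation.

```python
import math
import statistics

# Hypothetical gain scores (posttest minus pretest), invented for illustration.
intervention_gains = {
    "Student A": 12, "Student B": 15, "Student C": 9, "Student D": 14,
    "Student E": 1, "Student F": 11, "Student G": 13, "Student H": 0,
    "Student I": 10, "Student J": 15,
}
comparison_gains = [5, 7, 4, 6, 8, 3, 6, 5, 7, 4]


def cohens_d(treated, control):
    """Difference between group means, in pooled standard deviation units."""
    n1, n2 = len(treated), len(control)
    pooled_variance = (
        (n1 - 1) * statistics.variance(treated)
        + (n2 - 1) * statistics.variance(control)
    ) / (n1 + n2 - 2)
    return (statistics.mean(treated) - statistics.mean(control)) / math.sqrt(pooled_variance)


# Overall effectiveness: one number summarizing the whole group.
d = cohens_d(list(intervention_gains.values()), comparison_gains)

# Individual response: ASSUMED rule for this sketch only. Flag any student
# whose gain is below the average gain of students who got no intervention.
benchmark = statistics.mean(comparison_gains)
non_responders = [
    name for name, gain in intervention_gains.items() if gain < benchmark
]

print(f"Overall effect size (Cohen's d): {d:.2f}")
print("Students who may need a different approach:", non_responders)
```

With these invented numbers, the overall effect size is above the 0.8 that Cohen's conventions treat as large, yet two of the ten students who received the intervention gained less than the typical student who did not. A summary statistic alone would not reveal this, which is why student-level monitoring is still needed.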


Summary

Making decisions about approaches to literacy instruction can be challenging. Approaches shown to be effective in scientifically-based research can often be found in documents and websites such as clearinghouse and panel reports. On the other hand, many programs and strategies have not been formally researched, and therefore, it can be useful to employ other strategies, such as those offered by Willingham (2012a, 2012b), to think through claims made by promoters. It is important to understand how different forms of research can inform literacy instruction, including when decisions should be grounded in research designed to evaluate effectiveness (e.g., experimental research) and when they should be grounded in other types of research (e.g., correlational or qualitative research).

Not only are teachers tasked with making decisions about using literacy programs and strategies, but often these decisions are made by groups of individuals at all levels of the educational hierarchy. Literacy program decisions are also made by curriculum committees, district administrators, and even boards of education, in which members may or may not be aware of the different levels of research evidence backing the programs they choose to adopt. When teachers develop background knowledge to evaluate the quality of research evidence, they not only make better decisions themselves but can also serve as better informed advocates and advisors to these groups.

Although many educational approaches to literacy have not been well researched, when scientific support can be found, this can help narrow the field of choices. Decisions still require attention to the generalizability of study findings based on students who were included in studies. And even the strongest scientific evidence does not guarantee that an approach to literacy that should work will work. For this reason, learning must be closely monitored using procedures such as those described in further detail in Chapter 5 of this textbook. Monitoring student progress can be especially helpful when trying strategies that may not fall into policy-based definitions of scientifically-based research, since ultimately, the most important effect for teachers to measure is how well a strategy or program works for children in their classrooms. Information presented in this chapter provides a gateway to future opportunities for learning more about research that can maximize students’ literacy learning.

Questions and Activities

  1. Describe different ways that literacy topics are researched.
  2. According to the policy-based definition presented at the beginning of this chapter, explain what characterizes scientifically-based literacy research, how it is different from other forms of research, and why literacy panel reports and What Works Clearinghouse privilege studies that use experimental design.
  3. Describe what an effect size is and why it is important in evaluating the effectiveness of an intervention.
  4. Describe ways that non-experimental approaches to research, such as correlational and qualitative research, inform the field of literacy teaching and learning.
  5. Debate in what instances a qualitative approach to research would be better than using an experimental approach. In what instances would using an experimental approach be more applicable? When might correlational research be most useful?
  6. Explain what needs to be considered when no research is available on a literacy approach carrying the claim of being “highly effective.”
  7. Discuss why monitoring students’ progress in literacy is important, particularly when a new approach to literacy instruction is used.


References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Dennison, P. E., & Dennison, G. E. (1989). Brain Gym: Teacher’s Edition Revised. Ventura, CA: Edu-Kinesthetics, Inc.

Dennison, P. E., & Dennison, G. E. (2010). Brain Gym: Teacher’s Edition. Ventura, CA: Hearts at Play, Inc.

Individuals with Disabilities Education Improvement Act of 2004, Pub. L. 108–446, 118 Stat. 2647 (2004).

National Center for Family Literacy, National Early Literacy Panel, & National Institute for Literacy (U.S.). (2008). Developing early literacy: Report of the National Early Literacy Panel: A scientific synthesis of early literacy development and implications for intervention. Washington, DC: National Institute for Literacy. Retrieved from

National Institute of Child Health and Human Development. (2000). Report of the National Reading Panel. Teaching children to read: An evidence-based assessment of the scientific research literature on reading and its implications for reading instruction: Reports of the subgroups (NIH Publication No. 00-4754). Washington, DC: U.S. Government Printing Office. Retrieved from

National Institute for Direct Instruction. (2015). What works clearinghouse. Retrieved from 

No Child Left Behind Act of 2001, Pub. L. No. 107-110, 115 Stat. 1425 (2002).

Shanahan, T. (2015, April 5). Response to complaint about What Works Clearinghouse. Message posted to

Stanovich, P. J., & Stanovich, K. E. (2003). Using research and reason in education: How teachers can use scientifically based research to make curricular & instructional decisions. Washington, DC: National Institute of Child Health and Human Development. Retrieved from

U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance. (n.d.). What works clearinghouse. Retrieved from

Wade, L. (2011, March 18). Spurious relationship: Passport ownership and diabetes. [Web log posting]. Retrieved from

Willingham, D. T. (2012a, Fall). Measured approach or magical elixir? How to tell good science from bad. American Educator, 36, 4-12, 40.

Willingham, D. T. (2012b). When can you trust the experts? How to tell good science from bad in education. San Francisco, CA: Jossey-Bass.


1: Which, admittedly, contains softer claims, though some are still concerning.

2: No Child Left Behind Act (2002); Individuals with Disabilities Education Improvement Act (IDEA, 2004).

3: For detailed information on teaching phonemic awareness, please see Chapter 3 by Murray in this volume.

4: The effect size reported here is a Cohen’s d and shows how much better the students who received phonemic awareness instruction did compared to those who did not receive it, in standard deviations.

5: Some experimental studies may use a “within-subjects design” in which a single group of participants receives alternating treatments. Other experimental studies may use “single subject design” in which a single participant receives a treatment.

6: A researcher could also pretest students to make sure there are no differences between groups before intervention rather than assuming that the groups are equivalent because of random assignment.