As you help students make progress and teachers increase their skill, you want confidence that Heinemann's classroom resources follow the highest research standards. On this page, you'll explore the scholarly research that underlies Fountas & Pinnell Literacy™, Lucy Calkins' Units of Study, and Jennifer Serravallo's Complete Comprehension. You'll also discover customer case studies and efficacy studies from top education organizations that demonstrate the ability our resources have to improve the lives of students.

Fountas & Pinnell Literacy™

Fountas and Pinnell share a long history of writing books and materials that are research-based and practical for teachers to use. As a result, they are committed to the important role of research in the development and ongoing evaluation of all of their reading, writing, phonics, and classroom resources.

Visit the Fountas & Pinnell Literacy™ Research and Standards Page

Fountas & Pinnell Classroom™ (FPC) Research

The Research Base for Fountas & Pinnell Classroom™

Effective literacy instruction demands that research-based evidence is understood in concert with practice-based evidence and close attention to what an individual child can and cannot yet do. It also arises within the context of a careful examination of values and beliefs about literacy and what it takes to prepare children to be literate citizens of the world.

Like other effective comprehensive systems, Fountas & Pinnell Classroom™ rests on a thorough and thoughtful examination of existing research. Data gathered from implementation of FPC demonstrates positive evidence of gains.

What follows is an extensive review of the foundational research underpinning the development of FPC, along with recent research that supports and aligns with a comprehensive literacy system.

Read FPC Research Base

Read more about FPC Research and Standards

Leveled Literacy Intervention (LLI) Research

Research Base for Leveled Literacy Intervention, Grades K–2 (Levels A–N)

The development of LLI was driven by what prior research has established about how children learn to read, and what works best with struggling readers. Please refer to the research base for more information regarding the background research that provided the foundation for the development of this intervention system.

Research Base for Leveled Literacy Intervention, Grades 3–5+ (Levels L–W)

In this summary, Irene Fountas and Gay Su Pinnell review the research base for the Red, Gold, and Purple systems of Leveled Literacy Intervention, which is designed to lift the literacy achievement of students who are falling below grade level expectations in reading. The 15 principles on which the LLI Red, Gold, and Purple systems are based are discussed, along with a list of supporting research. The lesson framework for the extended systems of LLI rests on these principles.

Independent Organizational Research

The What Works Clearinghouse LLI Effectiveness Study

The What Works Clearinghouse and the National Center for Education Evaluation and Regional Assistance (NCEE) found Fountas & Pinnell Leveled Literacy Intervention to have a positive effect on general reading achievement and reading fluency based on a comprehensive review of available evidence.

In the General Reading Achievement domain, the research indicated strong evidence of a positive effect with no overriding contrary evidence. In both studies that reported findings, the estimated impact of LLI on outcomes in the general reading achievement domain was positive and statistically significant, and both studies meet WWC group design standards without reservations. The extent of the available evidence is medium to large and included 747 students in 22 schools.

In the Reading Fluency domain, the research indicated evidence of a positive effect with no overriding contrary evidence. In the one study that reported findings, the estimated impact of LLI on outcomes in the reading fluency domain was statistically significant and substantively important. This study included 281 students in nine schools.

Read the Report

Evidence for ESSA review of LLI K–2

Evidence for ESSA has reviewed the research on LLI, finding strong evidence of effectiveness for students in grades K–2. These findings are based on two independent, empirical studies conducted by The University of Memphis's Center for Research in Educational Policy (CREP).

Read the Report

Research connected with the Development of LLI

During the development of LLI, a field study was conducted at sites around the United States to assess the LLI framework. Please refer to the field study for more information about it and the research connected with the development of LLI.

Additionally, the student data from three of the sites that participated in the field study (Newark, OH; Boston, MA; and Manchester, NH) was analyzed for a pilot research project that examined student progress. Please refer to the pilot study for the results.

Read more about LLI Research, Standards and Efficacy

Benchmark Assessment System (BAS) Research

After the construction of the Benchmark Assessment System, an outside evaluation team conducted an independent study of the system's reliability and validity as a way of measuring reading progress against grade-level criteria. An independent agency reviewed the data. The first stage of the study provided valuable information for adjusting the difficulty of texts in detailed ways. The second stage provided data to assure that the texts provide a true gradient—that is, that each level is more difficult than the previous level and is easier than the next level. The study also provided information on internal consistency—that the fiction and nonfiction selections at each level are equivalent. The assessment was also correlated with the existing Reading Recovery® leveled assessment and a close fit was discovered. You can review either the Executive Summary or the Full Report.

The Benchmark Assessment System is new, but the F&P Text Level Gradient™ on which it is based has been developed over the last twenty years and used with high reliability to establish grade-level expectations. The F&P Text Level Gradient™, which was published in the 1990s, has been refined and developed over the years. You can now find over 50,000 books listed by level. This gradient was used as a standard by the New Standards Project® (Resnick & Hampton, 2009). New Standards is a joint project of the Learning Research and Development Center at the University of Pittsburgh (Pennsylvania) and The National Center on Education and the Economy (Washington, D.C.). Heading a consortium of 26 U.S. states and six school districts, New Standards developed performance standards in English language arts and other areas.

The F&P Text Level Gradient™ is a defined continuum of characteristics related to the level of support and challenge that a reader meets in a text. Terms such as easy and hard are always relative terms that refer to the individual reader's foundation of background knowledge. At each level (A to Z) texts are analyzed using ten characteristics: (1) genre/form; (2) text structure; (3) content; (4) themes and ideas; (5) language and literary features; (6) sentence complexity; (7) vocabulary; (8) word difficulty; (9) illustrations/graphics; and (10) book and print features.

Texts are leveled using a highly reliable process in which teams of trained teachers, working independently and then through consensus, assign a level to books after analyzing them according to the ten factors. They are then analyzed by Fountas and Pinnell. The Benchmark Assessment books were actually created to precisely match the F&P Text Level Gradient™, and they were independently analyzed using the same process.

Often, information from readability formulas like the Spache and Flesch-Kincaid is used as part of the text analysis process; however, those formulas measure a narrower range of factors, such as sentence length and number of syllables in words. The leveling system on which this assessment is based takes into account a more complex range of text factors (for example, literary features and abstractness of theme). In fact, it is well known that the grade levels revealed by different formulas vary widely according to what is being analyzed.
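To make concrete how narrow a readability formula's inputs are, here is a minimal sketch of the published Flesch-Kincaid grade-level formula. It sees only three counts (words, sentences, syllables) and nothing about genre, theme, or literary features. The function name and the sample counts are illustrative assumptions and are not part of the F&P leveling process.

```python
def flesch_kincaid_grade(total_words, total_sentences, total_syllables):
    """Return the Flesch-Kincaid grade level from three raw counts.

    Uses the standard published coefficients:
    0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical passage (made-up counts): 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(100, 5, 150)
print(round(grade, 2))  # prints 9.91
```

Two passages with identical counts receive the same score no matter how different their content or themes are, which is why a formula like this would not be expected to match a multi-factor leveling system exactly.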

So, we would not expect an exact correlation between those factors and this assessment system. The assessment's levels do, however, predict student performance on the kinds of texts and comprehension tasks students are expected to handle in school. In a small evaluation in a city in Ohio, data showed that students' proficiency at levels M or N strongly predicted proficiency on the Ohio Achievement Test in grade 3. More data are being collected.

The Benchmark Assessment System is appropriate for use in RTI. It does not provide national norms or percentiles; it is not intended for national achievement testing. However, it is based on widely used grade-level criteria (see the website for detailed documents). It enables the classroom teacher and specialist teacher to engage in diagnosis of a variety of subskills. This complex and comprehensive assessment system is designed to measure progress in each of the subskills in a way that informs instruction. It is linked to a detailed continuum of observable behaviors to assess and teach for at every level (see The Literacy Continuum). Included in every BAS, this continuum offers a very specific bridge to instruction.

Resnick, L. B., & Hampton, S. (2009). Reading and writing grade by grade. Newark, DE: International Reading Association.

Field Study of Reliability and Validity

A formative evaluation of the Fountas & Pinnell Benchmark Assessment System was conducted to ensure that (1) the leveling of the texts is reliable and (2) the reading scores are valid and accurately identify each student's reading level. The purpose of the study was twofold. The first was to examine every book, at every level, for the reliability of its designated level within a broader literacy framework and across corresponding fiction and nonfiction genres, i.e., is the readability of the books consistent across the fiction and nonfiction domains? For example, are the level G fiction and nonfiction books typical level G books, and do corresponding fiction and nonfiction books at this level have the same degree of readability? The second purpose of the evaluation was to determine the correlation between the Fountas & Pinnell Benchmark Assessment System and other reading assessments, i.e., to what extent is the Fountas & Pinnell Benchmark Assessment System associated with other valid reading assessments?

In order to determine the reliability and validity of the Fountas & Pinnell Benchmark Assessment System, the following three research questions guided the formative evaluation:

Research Question 1

  • How reliable is the Fountas & Pinnell Benchmark Assessment System? That is, how consistent and stable is the information derived from the reading books?
  • Does each book of the Fountas & Pinnell Benchmark Assessment System consistently occupy the same position on the gradient of readability, based on multiple readings by age-appropriate students? That is, does each book, levels A–Z, represent a degree of increased difficulty that is consistent with other Fountas and Pinnell leveled texts?

Research Question 2

  • To what extent are the gradients of difficulty for fiction and nonfiction books aligned within the Fountas & Pinnell Benchmark Assessment System? Do fiction and nonfiction books represent similar levels of difficulty within similar levels of reading?

Research Question 3

  • To what extent is the Fountas & Pinnell Benchmark Assessment System associated with other established reading assessments?
    • What is the convergent validity between the System 1 and Reading Recovery® assessment texts?
    • What is the convergent validity between the System 2 and the Slosson Oral Reading Test—Revised (SORT-R3) and the Degrees of Reading Power® (DRP)?
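Convergent validity of the kind asked about in Research Question 3 is commonly quantified as a correlation between two sets of scores for the same students. The following is a minimal sketch, assuming Pearson's r as the statistic (the text above does not say which coefficient the study used). The function name and the sample numbers are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / sqrt(ss_x * ss_y)


# Made-up example: text levels coded as numbers (A=1, B=2, ...) for six
# students, paired with their scores on a second, hypothetical assessment.
benchmark_levels = [3, 5, 6, 8, 10, 12]
other_assessment = [14, 19, 22, 27, 30, 38]
print(round(pearson_r(benchmark_levels, other_assessment), 3))
```

A value near 1 would indicate that the two assessments rank students similarly; the actual coefficients are reported in the Executive Summary and Full Report.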

Read the Executive Summary

Read the Full Report

Read the Research Base for BAS

Read more about BAS Research and Standards

Sistema de evaluación de la lectura (SEL) Research

Field Study of Validity and Reliability: Sistema de evaluación de la lectura (SEL)

The Sistema de Evaluación de la Lectura (Sistema), Grados K–2, Niveles A–N is a formative assessment of reading in Spanish composed of 28 high-quality original titles, or books, divided evenly between fiction and nonfiction. The Sistema measures decoding, fluency, vocabulary, and comprehension skills in kindergarten through third grade (mid-third grade). The set of books, recording forms, and other materials is an assessment tool for teachers, literacy specialists, and clinicians to use in determining students' developmental Spanish reading levels for the purpose of informing instruction and documenting reading progress. The Sistema is the Spanish counterpart to the Fountas & Pinnell Benchmark Assessment System in English, published in 2007 (revised in 2010) to critical acclaim.

A formative evaluation of the Sistema was conducted to ensure that (1) the leveling of the texts is reliable, and (2) the reading scores are valid and accurately identify each student's reading level. Click on the links below to review the results.

Read Executive Summary

Read Full Report

Read more about SEL Research and the Advisory Panel

Phonics, Spelling, and Word Study System (PWS) Research

Research Base: Phonics, Spelling and Word Study Systems (K-6)

Twelve Compelling Principles from the Research on Effective Phonics Instruction

In this document we will explore the important findings—twelve compelling principles—from a large body of research. These principles rest on decades of research on literacy instruction and how literacy and language develop in children over time, as well as on more than thirty years of our own extensive experience in classrooms across the country. A high-quality phonics design is based on what we know about how children learn to read, and it continuously expands their knowledge about words and how they work. (An essential foundation for the implementation of such a design is the teacher's understanding of the content to be taught—the complexity and structure of language.)

Read the research behind effective phonics

The Fountas & Pinnell Phonics, Spelling, and Word Study Systems: Explicit, Systematic, and Grounded in the Twelve Principles from Research

The Phonics, Spelling, and Word Study Systems are grounded in twelve principles from research. In this document we will identify how the Phonics, Spelling, and Word Study Systems directly align to the twelve principles supported by research evidence.

Read how PWS is directly aligned to the research

Phonics Lessons © 2003

Phonics Lessons are grounded in a wide base of academic research, including all the areas examined by The National Reading Panel, and reflect its recommendations for phonemic awareness, phonics, fluency, vocabulary, and comprehension. In addition, the lessons reflect practical, classroom-based research in how children learn, practices that have been reconfirmed by many teachers as they have field-tested Phonics Lessons and Word Study.

Review the former edition of Phonics Lessons: The Research Base

Visit the PWS Research Page

Units of Study

Visit the Units of Study Research & Efficacy Page

American Institutes for Research Study

The American Institutes for Research (AIR), a not-for-profit, independent research firm based in the greater Washington, D.C., area, has completed the first objective, rigorous, quasi-experimental study of the Teachers College Reading and Writing Project's reading and writing workshop and Units of Study curriculum.

The study used publicly available aggregate state English Language Arts (ELA) data, spanning up to 10 years for some schools, and showed that beginning in Year 2 of program use, TCRWP implementation was associated with statistically significant positive effects on state ELA test scores. In addition, this difference became larger as time passed, suggesting positive cumulative effects of use of the TCRWP approach.

Read the AIR Report

Heinemann had the opportunity to visit some of the New York City treatment schools that participated in the study. After just four months of implementation, we met school educators and observed classrooms, and we documented some of that experience in this video.

Units of Study Research Base

The Teachers College Reading and Writing Project is a think tank that has long built its ideas and practices on established and new research. Download the Research Base document for a summary of some of TCRWP's key beliefs and practices as well as some of the research that informs those beliefs and practices.

Read the Research Base

Units of Study Data Reports

The Teachers College Reading and Writing Project works with thousands of schools and districts around the nation and around the world. Periodically the Project publishes reports documenting the performance of these schools. The following Data Reports provide information on student performance in schools that work with the TCRWP.

These state-specific reports look at testing data in TCRWP schools in a variety of ways, including tracing growth over time, investigating the performance of English Language Learners and economically disadvantaged students, and comparing TCRWP schools with other schools overall. Note: The TCRWP is currently analyzing data for a number of specific locations. Check back for additional reports coming soon.

Read the Connecticut 2018 Report

Read the California 2018 Report

Read the New York 2018 Report

Read the New York 2017 Report

Units of Study Case Studies

In these documents, the Teachers College Reading and Writing Project focuses a spotlight on a selection of New York City and Wisconsin schools that have benefited from their relationship with the Project. See their journeys—the early challenges and careful tending required to ensure all students are growing as readers and writers, regardless of background or circumstance. As one principal put it, "You have to believe in the work. If you can't put your heart completely into it, don't bother."

Read the Case Studies: New York

Read Case Studies: Wisconsin

Complete Comprehension by Jennifer Serravallo

Research Summary for Complete Comprehension

With Complete Comprehension, Jennifer Serravallo operationalizes top-quality, peer-reviewed research, simplifying instruction and increasing clarity for classroom teachers. This research summary describes six key findings from more than 30 research studies and meta-analyses that informed Serravallo's creation of Complete Comprehension's instructional model.

Read the Research Summary