Few would argue that advanced teacher training does not make a difference in student achievement. In fact, Professor William Sanders of the University of Tennessee argues persuasively that "the single most dominant factor affecting student academic gain is teacher effect."1 However, little statistical research is available for evaluating which type of training and teaching degree has the best effect on student achievement. As the demand for higher academic achievement and accountability in public education grows, it is important to determine whether teachers who hold advanced degrees in education as a general field are more effective than those who have degrees in specific subjects like English or math.
Currently, to teach elementary (K-8) education, most public school teachers must have a bachelor's degree and related teaching credentials or certification. College students who want to become teachers usually fulfill this requirement in one of two ways: They obtain a degree in a traditional academic discipline such as English, mathematics, geography, or history; or they seek a general degree in education or education management.2
Because the relative effectiveness of the education degree versus a subject degree is a topic of much conjecture but little empirical research, this report has attempted to fill the gap by studying the test scores of fourth and eighth grade students who took the 1998 National Assessment of Educational Progress (NAEP) reading test and the 1996 NAEP math test. This analysis compared students who were taught by teachers holding advanced degrees in education with those whose teachers did not. The data showed that:
In math, eighth grade students of teachers who hold advanced degrees in education perform worse on the NAEP exam than those whose teachers hold any degree in math or science (bachelor's or advanced degrees).
Among fourth grade students, there is no significant difference in achievement between those whose teachers hold a bachelor's degree in reading or math and those whose teachers have advanced degrees in education.
A teacher's education may be less important for achievement than the parents' education. This research indicates that both math and reading scores rise if at least one parent holds a bachelor's or postgraduate degree.
Most Americans base their support for education spending on the belief that better teachers and teaching practices lead to enhanced student achievement.3 The debate over teacher quality usually focuses on coursework at the colleges and universities that train today's professional teachers. As critics of America's public school system note, "U.S. schools aren't producing satisfactory results, and this problem is not likely to be solved until U.S. classrooms are filled with excellent teachers."4 And although enclaves of good teachers can be found in districts all across the nation, they are not necessarily the norm.5
Those who seek to understand this issue should ask: Does a teacher's choice of undergraduate or graduate major affect his or her students' academic performance? Is teacher education the most important element of student achievement? Academic and professional literature in the education field dispenses much rhetoric on this subject, yet hard data on the effects of teacher education are limited, at best.
Public elementary schoolteachers are required to have a bachelor's degree, additional postgraduate work related to educational practices, and--to teach in nearly 80 percent of the states--student teaching experience.6 Although the postgraduate work is often administered through a university's school of education, the initial degree can be obtained in a major other than general education, such as math or science.
Some critics argue that education classes can better prepare the individual teacher for the classroom,7 but others contend that many of these general education and pedagogy classes are so ideologically driven that they expose future teachers to theories that may not be well grounded in empirical research.8
For example, in Better Teachers, Better Schools,9 Thomas B. Fordham Foundation president Chester E. Finn, Jr., notes that "Every additional requirement for prospective teachers--every additional pedagogical course, every new hoop or hurdle--will have a predictable and inexorable effect: it will limit the potential supply of teachers by narrowing the pipeline while having no bearing whatever on the quality or effectiveness of those in the pipeline." One critic of this assessment claimed the analysis in Better Teachers was based on "pseudo research, rumor, and innuendo that virtually ignore historical and demographic facts and/or rely on extraordinarily suspect methods of data collection and analysis."10 Such outbursts, whether well-founded or not, demand that more solid data analysis be conducted to assess the effectiveness of teachers in the classroom. Their education, experience, and long-term development should be compared with the academic success of their students (as measured by standardized test scores) to see if a trend toward greater achievement exists.
Some argue that as the general population becomes better educated, so will teachers, and because the average number of years of education has increased substantially,11 a master's degree should be more desirable. It is true that a larger proportion of Americans over the age of 25 now hold college degrees: In 1910, only about 2.5 percent of Americans graduated from college; by 1998, that number had grown to nearly 25 percent.12 Moreover, an increasing proportion of elementary and secondary schoolteachers, because of job market demands, remuneration, and other factors, hold master's degrees. For example, in the 1993-1994 school year, over 42 percent of teachers reported having a master's or doctorate degree.13
Yet, how this increased education translates into improved student performance and higher academic achievement is not clear. The analysis here of students' scores in math and reading on the National Assessment of Educational Progress exams sheds some light on this concern.
To analyze the influence of teacher education on student achievement, this analysis considered the results of the 1996 and 1998 National Assessment of Educational Progress tests for fourth and eighth grade students in math and reading, respectively.14
The NAEP, first administered in 1969, measures academic achievement in a variety of fields, including reading, writing, mathematics, science, geography, civics, and the arts. Currently, the NAEP is administered to fourth, eighth, and twelfth grade students, and the tests for math and reading are given alternately every two years. In 1998, for example, students took the NAEP reading test; math was assessed in 1996 and 2000.15
The NAEP actually involves two tests: a national test and state-administered tests. About 40 states participate in the separate state samples that are used to gauge achievement within individual jurisdictions. For the purposes of this study, the 1996 and 1998 national data were used.
The most significant benefit of using the NAEP data is that, in addition to test scores in a subject area, the assessment asks an assortment of background questions of the students taking the exam, their main subject-area teacher, and their school administrator. Responses from the teachers and school administrators are linked to the student's information, which yields a rich database of information. These questions concern:
How to Interpret These Findings
This report contains the results of statistical analyses of students' National Assessment of Educational Progress scores in reading and math. These statistical tests isolate the independent effects of a number of factors on test scores in order to determine the effect of advanced teaching degrees alone. Because the statistical model includes socioeconomic characteristics and factors such as parents' education and number of reading materials available at home, it controls for the effect of each variable on the test scores. Thus, the findings about teacher education and NAEP scores apply as much to upper-income as to lower-income students, to blacks as to whites, to girls as to boys, and so forth, because the model isolates the effect of each.
However, even though there is a statistical relationship between each factor and student achievement, these independent factors do not necessarily cause differences in academic achievement. The model does not include everything that might have an effect on academic achievement, such as the methods used to teach reading or math. Thus, some variables also may be measuring the effect of an unobservable factor. For example, this model does not suggest that children from poor families will do worse on the NAEP because they are poor. Rather, poor families may have some unobservable characteristics or challenges that make it more difficult for their children to succeed in school. Similarly, controls for race may measure characteristics correlated with race that make it more difficult for students to score well on the tests.
Moreover, some variables, such as participation in the federal Free and Reduced-Price Lunch program, are proxies for other unobserved factors. Eligibility for this federal program, for example, is determined by income; only children from low-income families may participate. Although not all low-income children will participate in it, many will. Such information may be used to analyze the effect of different characteristics on achievement.
Finally, a finding of "statistically insignificant" indicates that the effect of the variable or factor cannot be distinguished statistically from zero. For example, if the relationship between teacher education and academic achievement is statistically insignificant, students who have teachers with subject degrees do no better, in any statistically measurable sense, than students who have teachers with education degrees.
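As a rough illustration of how such a significance finding is reached, the sketch below tests whether a coefficient differs from zero. The coefficient and standard error values are hypothetical, and the use of a normal approximation is an assumption made for this example; the report does not state which reference distribution was used.

```python
import math


def two_sided_p_value(coefficient, std_error):
    """Two-sided p-value for H0: coefficient == 0 (normal approximation)."""
    z = abs(coefficient / std_error)
    # Standard normal CDF written in terms of the error function.
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


# Hypothetical coefficients and standard errors, in test score points.
p_large = two_sided_p_value(6.0, 2.0)  # z = 3.0
p_small = two_sided_p_value(1.0, 2.0)  # z = 0.5

# Compare against the conventional 5 percent and 10 percent thresholds.
significant_at_5_percent = p_large < 0.05
insignificant_at_10_percent = p_small >= 0.10
```

In this sketch the first coefficient would be reported as statistically significant, while the second would be treated as indistinguishable from zero.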
The effect of each of these factors on test scores can be isolated using a regression analysis. The Heritage model employs a jackknifed ordinary least squares model16 and examines the effects of each factor on the NAEP 1996 math and 1998 reading tests' nationwide sample of public school children.17
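The idea of isolating each factor's effect can be sketched with a small ordinary least squares fit on indicator variables. Everything in this example is hypothetical: the toy data, the variable names, the noise-free scores, and the calculation of percent change as the coefficient divided by the base case score. It is not the Heritage model, and it omits the weighting and jackknife steps.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical toy data. Columns: intercept, teacher holds a subject
# degree (0/1), at least one parent holds a college degree (0/1).
students = [
    (1, 0, 0), (1, 1, 0), (1, 0, 1),
    (1, 1, 1), (1, 0, 1), (1, 1, 0),
]
# Noise-free scores built as 260 + 6 * subject_degree + 10 * parent_college.
scores = [260 + 6 * s + 10 * p for _, s, p in students]

base, subject_effect, parent_effect = fit_ols(students, scores)

# Assumed definition: effect expressed as a percent of the base case score.
pct_change = 100 * subject_effect / base
```

Because each indicator enters the model separately, the coefficient on the teacher's degree is estimated while holding the parent's education fixed, which is the sense in which a regression "isolates" an effect.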
Teacher Education. The effect of teacher education can only be adequately assessed if the teacher's undergraduate or graduate major and the highest degree achieved are both considered. The combination of these two factors will yield six teacher education scenarios:
Race and Ethnicity. Many studies and reports have shown that, over time, students from predominantly African-American and Hispanic communities tend to perform more poorly on standardized tests than do students from predominantly white communities (although the gap generally has narrowed over the past 25 years).18 There are a number of possible socioeconomic explanations for this trend, among which are poverty, peer pressure that discourages academic achievement, and crime.19 Because strong differences in academic achievement exist among races, variables of race and ethnicity are included in the analysis.
Parents' Education. Many researchers have noted that the educational attainment of a child's parents is a good predictor of that child's academic achievement. Parents who, for instance, are college educated may be better equipped to help their children with homework and understanding concepts than are those who have less than a high school education, other things being equal. Because the education level of one parent is often highly correlated with that of the other, only a single variable is included in the analysis.
Number of Reading Materials in the Home. The presence of books, magazines, an encyclopedia, and newspapers generally indicates a dedication to learning. Researchers have determined that these reading materials are important aspects of the home environment.20 This analysis includes a variable controlling for the number of these four types of reading materials in the home.
Free/Reduced-Price Lunch Participation. Income can be a key predictor of academic achievement because low-income families seldom have the financial resources to purchase extra study materials or tutorial classes to help their children perform better in school. Although the NAEP does not collect data on household income, it does collect data on participation in the federal Free and Reduced-Price Lunch program, which are used here.21
Gender. Research indicates that girls tend to perform better on reading and writing tests, while boys perform better in the more analytical subjects of math and science.22 Many authors have expounded on this idea,23 yet the data on the male-female achievement gaps can often lead researchers to inconsistent observations. For example, in 1998, young men scored higher than young women on both the verbal and quantitative sections of the Scholastic Aptitude Test (SAT). Some writers suggest that this may be due to a fundamental bias against females in America's educational system.24 Another explanation, however, is that the test results reflect a selection bias in which more "at-risk" females opt to take the SAT relative to males.25 In order to account for this difference, the analysis includes a variable for gender.
Omitted variables. Previous Heritage research26 on education-related issues included additional family background variables in the model specification. In the 1998 NAEP database, the only information available on children's parents is educational attainment. The NAEP does not ask whether the child lives with both parents (or parental figures), one parent, or no parents (in a group home). Future administrations of the NAEP test should include this type of question since a great deal of research is finding that having both parents in the home can improve a child's academic achievement.
For this analysis, the six variables listed above were entered into a statistical model,27 which was then applied to the NAEP's 1996 and 1998 nationwide samples of public school children who took the math and reading tests, respectively. Chart 1 and Chart 2 show the percent change in fourth and eighth grade reading scores attributable to these factors, compared with the base case; Chart 3 and Chart 4 report the percent change in math scores.28
Table 1 reports the average, or hypothetical, base case scores on the reading and math NAEP fourth and eighth grade tests. If the student were black, Hispanic, male, or poor, the score would drop, on average; if the home had more than two reading materials or the parents had taken college-level courses, the score would increase.
The analysis found that fourth grade students of teachers who have any degrees in English or math do not score higher on the reading or math exam (respectively) than fourth graders taught by teachers with advanced degrees in education. (See Chart 1 for the fourth grade reading scores and Chart 3 for the fourth grade math results.)29 This is not surprising, considering the type of coursework children are taught in the lower grades. Since the material is less rigorous in the early grades as children learn basic to intermediate concepts, teachers may not realize much additional value by obtaining another degree in a subject. By the eighth grade, though, measurable differences appear; an advanced degree in the subject improves student achievement significantly more than an advanced degree in education.
Compared with the students of teachers who hold advanced degrees in education, the eighth grade students of teachers who possess advanced degrees in English or literature scored 2.7 percent higher on the reading NAEP exam (see Chart 2). Eighth grade students whose teachers held just a bachelor's degree in education scored statistically the same as their peers whose teachers held an advanced degree in education.
There are even more noteworthy results in math (see Chart 4). Eighth grade students taught by a teacher who has a bachelor's degree in math or science scored 2.2 percent higher than their peers who were taught by a teacher who holds an advanced degree in education. That percentage increases to 3.4 percent if the teacher holds an advanced degree in math or science. These results indicate that teachers who are more qualified in a subject transmit the more advanced concepts in junior high school math better (on average). This result suggests that eighth grade teachers with the most basic math subject education have students who do better than those taught by the best educated teachers with degrees from university-level departments of education.
Both fourth and eighth grade girls score slightly higher than boys on the NAEP reading exam, and statistically the same as boys on the NAEP math test. These facts bolster recent evidence on gender differences in academic achievement. As American Enterprise Institute W. H. Brady Fellow Christina Hoff Sommers notes, girls on average "get better grades, are more engaged academically, and are now the majority sex in higher education."30 The results here support the contention that schools are not shortchanging girls, contrary to some recent claims.31
Public elementary school administrators have an interest in hiring the best teachers for their schools, especially since Americans increasingly demand results and accountability for public education spending. As the findings of this analysis indicate, hiring teachers who hold subject degrees in math or English, rather than education degrees, is more likely to result in higher math or reading achievement among older (eighth grade) students.
Kirk A. Johnson, Ph.D., is a Policy Analyst in The Center for Data Analysis at The Heritage Foundation.
The results of the fourth and eighth grade models for 1998 NAEP reading data and 1996 NAEP math data, respectively, are shown in Table 2 and Table 3. The data show that the teacher education variable is statistically significant for eighth grade students only.32
In this analysis, two statistical issues must be considered. First, the NAEP exam is a long test, and it is therefore not administered in its entirety to all children. Rather, different parts are given to different children. Certain students will do better on certain portions of the test than others. Consequently, a "true" score must be estimated, or imputed, from the incomplete information. NAEP estimates five plausible composite reading scores and recommends that researchers use all five in any analysis. The Heritage model here follows the guidelines specified by the Educational Testing Service (which works closely with the National Center for Education Statistics in developing the file) to incorporate all five reading scores into the analysis.33
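The plausible values procedure can be sketched as fitting the same model once per plausible score and averaging the resulting coefficients. The data below are hypothetical, and the single-regressor model is a deliberate simplification of the multivariate regression used in the report.

```python
def slope(x, y):
    """OLS coefficient of y on a single regressor x (intercept included)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical indicator: 1 if the teacher holds a subject degree, else 0.
subject_degree = [0, 0, 1, 1]

# Five hypothetical plausible values per student (one list per plausible value).
plausible_values = [
    [250, 254, 258, 262],
    [251, 253, 259, 263],
    [249, 255, 257, 261],
    [252, 252, 260, 264],
    [248, 256, 256, 260],
]

# Fit the same model once per plausible value, then average the coefficients.
coefs = [slope(subject_degree, pv) for pv in plausible_values]
combined = sum(coefs) / len(coefs)
```

Averaging across the five fits, rather than averaging the five scores first and fitting once, keeps the variation among plausible values visible in the individual coefficients.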
Second, the NAEP utilizes a complex sample design, which oversamples children with certain characteristics.34 Each child is assigned a unique weight calculated from the probability of being selected out of the population at large (in this case, from the U.S. population of fourth or eighth graders in public schools). The NAEP sample design requires a complex modeling technique, which the Heritage model has employed.35
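A jackknife with replicate weights can be sketched as follows: estimate the coefficient with the full-sample weights, re-estimate it with each set of replicate weights, and combine the squared deviations. This example makes several assumptions that are not taken from the report: the toy data, the use of only four replicates instead of the NAEP's 62, the way the replicate weights are constructed (dropping one unit of a pair and doubling its partner), and the variance formula (a plain sum of squared deviations, as in a paired jackknife).

```python
def weighted_slope(x, y, w):
    """Weighted OLS coefficient of y on one regressor x (intercept included)."""
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def replicate_weights(weights, pair_index):
    """Hypothetical replicate: zero out one unit of a pair, double its partner."""
    rep = list(weights)
    first = 2 * pair_index
    rep[first] = 0.0
    rep[first + 1] = 2.0 * weights[first + 1]
    return rep


# Hypothetical toy sample arranged as four consecutive pairs.
x = [0, 1, 0, 1, 0, 1, 0, 1]  # 1 if the teacher holds a subject degree
y = [250, 258, 252, 262, 248, 257, 254, 259]  # one plausible value per student
full_weight = [1.0] * 8

estimate = weighted_slope(x, y, full_weight)
replicates = [
    weighted_slope(x, y, replicate_weights(full_weight, r)) for r in range(4)
]

# Assumed paired-jackknife variance: sum of squared deviations from the estimate.
variance = sum((rep - estimate) ** 2 for rep in replicates)
std_error = variance ** 0.5

# One reading of the model count: 62 replicate fits plus one full-sample
# fit, repeated for each of the five plausible values.
n_models = (62 + 1) * 5
```

The sketch covers a single plausible value. Under the procedure described in this appendix, the same steps would be repeated for each of the five plausible values and the resulting variances averaged.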
1. William L. Sanders and June C. Rivers, "Cumulative and Residual Effects of Teachers on Future Student Academic Achievement," Research Progress Report, University of Tennessee Value-Added Research and Assessment Center, Knoxville, Tennessee, November 1996, p. i.
2. Various degrees under the general rubric of "education" are available, such as a degree in elementary education, education administration, and special or bilingual education. These majors tend to take a generalist approach to qualify graduates to teach the variety of subjects demanded in K-8 classes. The general discussion of degrees in this study includes these groups.
3. Arthur E. Wise and Donna M. Gollnick, Performance-Based Accreditation for the New Millennium (Washington, D.C.: National Council for Accreditation of Teacher Education, 2000), at http://www.ncate.org/newsbrfs/nb-0200.htm.
4. Marci Kanstoroom and Chester E. Finn, Jr., "The Teachers We Need and How to Get More of Them: A Manifesto," in Marci Kanstoroom and Chester E. Finn, Jr., eds., Better Teachers, Better Schools (Washington, D.C.: The Thomas B. Fordham Foundation, 1999), p. 1.
6. National Association of State Directors of Teacher Education and Certification, Manual on the Preparation and Certification of Educational Personnel, 1998-99 (Dubuque, Ia.: Kendall/Hunt Publishing Company, 1998).
8. For a broad discussion of these issues, see Dale Ballou and Michael Podgursky, "Teacher Training and Licensure: A Layman's Guide," in Kanstoroom and Finn, Better Teachers, Better Schools, pp. 31-82.
12. National Center for Education Statistics, Digest of Educational Statistics (Washington, D.C.: U.S. Government Printing Office, 1999), Table 8. See http://nces.ed.gov/pubs2000/digest99/d99t008.html.
13. Ibid., Table 69. See http://nces.ed.gov/pubs2000/digest99/d99t069.html.
14. Twelfth grade students were excluded because the background questionnaire that accompanies the fourth and eighth grade tests is not given to twelfth graders. Since this questionnaire is critical to the analysis, only the fourth and eighth grade data are used.
16. The ordinary least squares model is a general statistical regression technique often employed by researchers. See Michael Lewis-Beck, Applied Regression: An Introduction (Beverly Hills, Cal.: Sage Publications, 1980); from Sage Publications' Quantitative Applications in the Social Sciences, Series No. 07-022. A jackknife is a complex resampling technique designed to estimate statistical significance accurately from data in surveys such as the NAEP that employ a complex sampling methodology. See the Appendix for the results and a more complete discussion of the jackknifed ordinary least squares model.
18. For an analysis of the long-term achievement gap, see U.S. Department of Education, Report in Brief: NAEP 1996 Trends in Academic Progress (Washington, D.C.: U.S. Government Printing Office, 1997), Figure 2, p. 14.
23. For a brief discussion, see Thomas Hancock et al., "Gender and Developmental Differences in the Academic Study Behaviors of Elementary School Children," Journal of Experimental Education, Vol. 65 (1996), pp. 18-39.
26. See, for example, Kirk A. Johnson, "Comparing Math Scores of Black Students in D.C.'s Public and Catholic Schools," Heritage Foundation Center for Data Analysis Report No. CDA99-08, October 7, 1999.
32. Statistical significance is usually pegged at a 5 percent or 10 percent level. See Michael Lewis-Beck, Applied Regression: An Introduction (Beverly Hills, Cal.: Sage Publications, 1980); from Sage Publications' Quantitative Applications in the Social Sciences, Series No. 07-022. If a variable is not statistically significant, there is no statistically discernible difference between its coefficient value and zero, so no effect can be inferred.
33. From a multivariate regression perspective, the model below must be replicated five times, once with each of the plausible values, and the resulting coefficients averaged to yield the final model results. In technical terms, this process corrects for measurement error in the reading score variable, since the test administrators do not actually observe the score a student would receive if the exam were taken in its entirety.
35. A procedure called a jackknife must be employed to correctly assess the variance of each variable's coefficient, and the NAEP database has a series of 62 "replicate weights" to aid in this task. These 62 jackknife replicates must be applied and the variances of each coefficient averaged for each of the five plausible test score models above (yielding a total of 315 models compiled for the purpose of this research). The WesVar Complex Samples software (produced by SPSS, Inc.) did much of this replication work. Using the jackknife results with the five plausible values models allows for a variance correction mechanism. The purpose of the jackknife is to estimate a true sampling error. Correcting for the two types of error (measurement and sampling) allows for the most accurate estimates possible. See Bradley Efron, The Jackknife, the Bootstrap, and Other Resampling Plans (Philadelphia: Society for Industrial and Applied Mathematics, 1982); and Jun Shao and Dongsheng Tu, The Jackknife and Bootstrap (New York: Springer Verlag, 1995) for a more complete discussion of how this jackknife technique works.