"The whole people must take upon themselves the education of the whole people and be willing to bear the expenses of it. There should not be a district of one mile square, without a school in it, not founded by a charitable individual, but maintained at the public expense of the people themselves." -- John Adams

"No money shall be drawn from the treasury, for the benefit of any religious or theological institution." -- Indiana Constitution Article 1, Section 6.

"...no man shall be compelled to frequent or support any religious worship, place, or ministry whatsoever, nor shall be enforced, restrained, molested, or burthened in his body or goods, nor shall otherwise suffer on account of his religious opinions or belief; but that all men shall be free to profess, and by argument to maintain, their opinion in matters of religion, and that the same shall in no wise diminish enlarge, or affect their civil capacities." – Thomas Jefferson

Saturday, June 22, 2013

It's Not Valid to Begin With!


Yesterday representatives from CTB/McGraw-Hill reported to the Indiana legislature about the technical problems with the state test, the ISTEP+...

ISTEP+ vendor apologizes, admits errors
CTB has agreed to pay for a third-party validity study and that the company’s $95 million, four-year contract with the state allows for penalties and fines.

...The Indiana Department of Education has since hired an outside consultant to review the validity of scores for tens of thousands of students.

Depending on the results, all or some of those tests could be thrown out.

ISTEP+ scores are used in part to determine teacher performance and compensation. And they determine each school’s A-to-F accountability grade. The accountability grade can be used to eventually close failing schools or allow more students to take vouchers without first attending public school.
Before the "outside consultant" can determine if the tests are valid, let's look at what "valid" actually means in the assessment world. Here is the definition of validity (click the quote to read about reliability).

[Note: in assessment there is more than one kind of validity: content validity, face validity, criterion-related validity (or predictive validity), construct validity, factorial validity, concurrent validity, convergent validity and divergent (or discriminant) validity. The definitions below are generalized. Furthermore, to be valid an assessment must also be reliable, though reliability is not sufficient to make an assessment valid. Clear?]
...validity refers to the extent we are measuring what we hope to measure (and what we think we are measuring).
What, then, is the ISTEP+ supposed to measure? The following is from the 2012-2013 Indiana Assessment Program Manual.

ISTEP+ Grades 3-8
The purpose of the Indiana Statewide Testing for Educational Progress-Plus (ISTEP+) program is to measure student achievement in the subject areas of English/language arts, mathematics, science, and social studies. In particular, ISTEP+ reports student achievement levels according to the Indiana Academic Standards that were adopted by the Indiana State Board of Education.
Appendix H of the program manual reports on the reliability and validity of the test. Unless you're trained and interested in tests and measurements, you're not likely to care much about the discussion in this section of the Program Manual. However, for those who understand the statistics involved and are interested, this appendix explains how the state has determined that the test is reliable and valid.

The outside consultant hired by the state will determine whether the validity of the test has been compromised by the testing irregularities caused by the technical glitches.


The ISTEP+ purports to be a valid measure of student achievement with respect to the Indiana standards. Good testing practice dictates that it should be used only for determining student achievement. Other uses have not been validated and variables which would influence the test's validity in other areas have not been taken into account. Therefore...

1. It's not a valid measure of teacher effectiveness. It has never been validated for that purpose. (It's also not a reliable measure of teacher effectiveness since reliability has never been determined either.)

2. It's not a valid measure with which to "grade" schools ("A" to "F").

All that's ever been provided for the ISTEP+ is its (supposed) validity as a measure of student achievement. Using it for any other purpose is not valid. Period.


Back in 2000, Alfie Kohn wrote an article, Standardized Testing and Its Victims, in which he listed reasons why standardized tests are not just inadequate for evaluating students (and schools), but downright harmful. He lists some facts (in the original, all facts are explained in more detail).
Fact 1. Our children are tested to an extent that is unprecedented in our history and unparalleled anywhere else in the world...Few countries use standardized tests for children below high school age—or multiple-choice tests for students of any age.

Fact 2. Noninstructional factors explain most of the variance among test scores when schools or districts are compared. A study of math results on the 1992 National Assessment of Educational Progress found that the combination of four such variables...accounted for a whopping 89 percent of the differences in state scores.

Fact 3. Norm-referenced tests were never intended to measure the quality of learning or teaching.

Fact 4. Standardized-test scores often measure superficial thinking...it appears that standardized-test results are positively correlated with a shallow approach to learning.

Fact 5. Virtually all specialists condemn the practice of giving standardized tests to children younger than 8 or 9 years old.

Fact 6. Virtually all relevant experts and organizations condemn the practice of basing important decisions, such as graduation or promotion, on the results of a single test. The National Research Council takes this position, as do most other professional groups (such as the American Educational Research Association and the American Psychological Association), the generally pro-testing American Federation of Teachers, and even the companies that manufacture and sell the exams. Yet just such high-stakes testing is currently taking place, or scheduled to be introduced soon, in more than half the states.

Fact 7. The time, energy, and money that are being devoted to preparing students for standardized tests have to come from somewhere. Schools across the country are cutting back or even eliminating programs in the arts, recess for young children, electives for high schoolers, class meetings...discussions about current events...the use of literature in the early grades...and entire subject areas such as science...

Fact 8. Many educators are leaving the field because of what is being done to schools in the name of "accountability" and "tougher standards."
[NOTE: Remember, Kohn's article was written in 2000, before No Child Left Behind became law! ISTEP+ is a criterion-referenced test, not a norm-referenced test. Criterion-referenced tests are "intended to measure how well a person has learned a specific body of knowledge and skills." Furthermore, the ISTEP+ is a particular variation of a criterion-referenced test known as a "standards-referenced test" or "standards based assessment" because it measures the accumulation of knowledge of the Indiana Standards.

Nevertheless, Fact 3 can correctly be rewritten as: Criterion-referenced tests were never intended to measure the quality of learning or teaching.]

The main point of Kohn's article is not simply to suggest that standardized testing is inappropriate as a high stakes measure, but to emphasize that those children who need the most help -- children who come to school with fewer skills, i.e. children of poverty -- are hurt the most by the emphasis on testing. He writes:
*The quality of instruction declines most for those who have least. Standardized tests tend to measure the temporary acquisition of facts and skills, including the skill of test-taking itself, more than genuine understanding. To that extent, the fact that such tests are more likely to be used and emphasized in schools with higher percentages of minority students (a fact that has been empirically verified) predictably results in poorer-quality teaching in such schools. The use of a high-stakes strategy only underscores the preoccupation with these tests and, as a result, accelerates a reliance on direct-instruction techniques and endless practice tests. "Skills-based instruction, the type to which most children of color are subjected, tends to foster low-level uniformity and subvert academic potential," as Dorothy Strickland, an African-American professor at Rutgers University, has remarked...

*Standards aren't the main ingredient that's in low supply. Anyone who is serious about addressing the inequities of American education would naturally want to investigate differences in available resources. A good argument could be made that the fairest allocation strategy, which is only common sense in some countries, is to provide not merely equal amounts across schools and districts, but more for the most challenging student populations. This does happen in some states—by no means all—but, even when it does, the money is commonly offered as a short-term grant (hardly sufficient to compensate for years of inadequate funding) and is often earmarked for test preparation rather than for higher-quality teaching. Worse, high-stakes testing systems may provide more money to those already successful (for example, in the form of bonuses for good scores) and less to those whose need is greatest.

Many public officials, along with like-minded journalists and other observers, are apt to minimize the matter of resources and assume that everything deficient about education for poor and minority children can be remedied by more forceful demands that we "raise the bar." The implication here would seem to be that teachers and students could be doing a better job but have, for some reason, chosen not to do so and need only be bribed or threatened into improvement. (In fact, this is the tacit assumption behind all incentive systems.) The focus among policymakers has been on standards of outcome rather than standards of opportunity.

To make matters worse, some supporters of high-stakes testing have not just ignored, but contemptuously dismissed, the relevance of barriers to achievement in certain neighborhoods. Explanations about very real obstacles such as racism, poverty, fear of crime, low teacher salaries, inadequate facilities, and language barriers are sometimes written off as mere "excuses." This is at once naive and callous, and, like any other example of minimizing the relevance of structural constraints, ultimately serves the interests of those fortunate enough not to face them.
Finally, testing in Indiana, as in most other places around the country, has become the "end" of education, not just a method of measuring learning. For the state, scoring well on the test is the goal. This forces schools to emphasize the test or be punished (as opposed to being offered more support). In the conclusion to an article titled The Limits of Standardized Tests for Diagnosing and Assisting Student Learning, the authors at Fairtest.org wrote,
When standardized tests are the primary factor in accountability, the temptation is to use the tests to define curriculum and focus instruction. What is not tested is not taught, and what is taught does not include higher-order learning. How the subject is tested becomes a model for how to teach the subject. At the extreme, school becomes a test prep program – and this extreme already exists.

It is of course possible to use a standardized test and not let its limits control curriculum and instruction. However, this can result in a school putting itself at risk for producing lower test scores. It also means parents and the community are not informed systematically about the non-tested areas, unless the school or district makes a great effort.

To improve learning and provide meaningful accountability, schools and districts cannot rely solely on standardized tests. The inherent limits of the instruments allow them only to generate information that is inadequate in both breadth and depth. Thus, states, districts and schools must find ways to strengthen classroom assessments and to use the information that comes from these richer measures to inform the public.
In its ignorance and arrogance the State of Indiana has elevated the state assessments, including ISTEP+, as the prime measure with which to judge students, schools, teachers, administrators, and school systems.

The current uproar over the technical glitch debacle during the last ISTEP+ administration window is just a distraction from the real issue of our overuse and misuse of testing. It has become an argument over how best to misuse testing in our obsessive quest for data.

The whole discussion about the technical glitch during the ISTEP+ is irrelevant.

For more about testing see The Case Against High Stakes Testing


All who envision a more just, progressive and fair society cannot ignore the battle for our nation’s educational future. Principals fighting for better schools, teachers fighting for better classrooms, students fighting for greater opportunities, parents fighting for a future worthy of their child’s promise: their fight is our fight. We must all join in.


Stop the Testing Insanity!

