Tuesday, December 19, 2023

Gambler's Fallacy In Teacher Evaluation: Reprint of my 2012 blog post, "NYC Teacher Evaluation Scores Show The Irrelevancy of Teachers On Student Test Performance."

Reprint of my 2012 blog post, "NYC Teacher Evaluation Scores Show The Irrelevancy of Teachers On Student Test Performance."


Monday, February 27, 2012

NYC Teacher Evaluation Scores Show The Irrelevancy of Teachers On Student Test Performance

Two articles in The New York Times (mentioned below) about the recently released NYC teacher evaluations indicate that teacher quality is not an important factor in student performance. Parents, teachers, media and many educational researchers overemphasize the importance of teacher quality on student learning and student standardized test performance.

Of course, there is more to a good teacher's characteristics than just the ability to teach effectively. In lower grades, a nurturing, reassuring and encouraging personality is probably a trait that most parents and children would like. The ability to maintain discipline and safety in the classroom, without being overly harsh, is another trait parents and students would probably like to see in teachers at all grade levels. Similarly, teachers who instill curiosity or an interest in a particular subject or extra-curricular activity are valuable beyond their ability to improve reading or math scores.

As for student performance on reading and math tests, research indicates that teacher quality is not that important. For example, research shows that teacher credentials, such as certification, advanced degrees, and years of experience in the classroom, do not affect student educational performance in reading and math. See as an example my post, "Teacher Credentials Unrelated To Student Achievement." Also, see results from Teach for America and similar programs, which place eager recent college graduates who are not trained in education into poorly performing schools. These graduates are very effective in the limited time they are in the schools.

Gambler's Fallacy

Many educational researchers, because of their own prior beliefs in effective teaching, fall into the gambler's fallacy of believing in "hot hands": the belief that certain decks of cards, roulette wheels or slot machines are paying out more than others, when what the gambler is seeing is a momentary random streak of winning hands.

Studies that evaluate teacher impact on student performance over time generally find wide variation in a teacher's effectiveness from one year to the next. If some teachers are more effective than others and teacher quality is the important factor, then why do measurements of student performance vary so much from year to year? (For example, a recent Harvard study that found that better teachers in the lower grades had an impact on students' adult lives and earnings stated "teacher value-added is not in fact a time-invariant characteristic" and "if true VA [value-added] is mean reverting, deselecting teachers based on their current VA [value-added] will yield smaller gains in subsequent years, because some of the low VA [value-added] teachers improve over time.") These studies find some teachers are better than others because, over the time frame measured, on average, they are more effective than other teachers. It is the same as believing in winning slot machines. Over a short time frame, with wins and losses, the researchers see that some teachers have a higher winning average than other teachers, confusing a random streak for a measure of the effectiveness of the gambler. Additionally, researchers have found, and attempt to control for in their studies, student sorting in the schools, where some teachers consistently get more of the higher-scoring, better students than other teachers.
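
The streak argument above can be illustrated with a small simulation. This is only a sketch under a deliberately extreme assumption: every teacher's yearly rating is pure random noise with no underlying skill at all. The function names, the 1,000 teachers, and the top 20 percent cutoff are all hypothetical choices made for illustration; none of them come from the NYC data or the Harvard study. Even under that assumption, one year of ratings still produces a group of "top" teachers. The question the code asks is how many of them are still on top the following year, and what their average rating looks like then.

```python
import random


def simulate_ratings(num_teachers=1000, num_years=2, seed=0):
    """Give every teacher a purely random rating each year (no skill at all)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(num_years)]
        for _ in range(num_teachers)
    ]


def top_group(ratings, year, fraction=0.2):
    """Return the set of teacher indexes in the top `fraction` for one year."""
    ranked = sorted(
        range(len(ratings)),
        key=lambda teacher: ratings[teacher][year],
        reverse=True,
    )
    cutoff = int(len(ratings) * fraction)
    return set(ranked[:cutoff])


def repeat_rate(ratings, fraction=0.2):
    """Share of year-one top teachers who are also top teachers in year two."""
    first = top_group(ratings, 0, fraction)
    second = top_group(ratings, 1, fraction)
    return len(first & second) / len(first)


def next_year_average(ratings, fraction=0.2):
    """Average year-two rating of the teachers who were on top in year one."""
    first = top_group(ratings, 0, fraction)
    return sum(ratings[teacher][1] for teacher in first) / len(first)


if __name__ == "__main__":
    ratings = simulate_ratings()
    print("Top teachers who repeat next year:", repeat_rate(ratings))
    print("Their average rating next year:", next_year_average(ratings))
```

If ratings were nothing but chance, the repeat rate would sit near the cutoff fraction itself and the following year's average would fall back toward zero. How far real ratings depart from that baseline is the empirical question; the sketch shows only what the pure-chance case looks like.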

NYC Evaluation Results

The data from the NYC teacher evaluations show that high, average and low scoring teachers are about equally scattered across all the NYC schools, both the high and the low performing ones.

From The New York Times, "Teacher Quality Widely Diffused, Ratings Indicate" by Fernanda Santos and Robert Gebeloff:
The controversial ratings of roughly 18,000 New York City teachers released on Friday showed that teachers who were most and least successful in improving their students’ test scores could be found all around — in the poorest corners of the Bronx, like Tremont and Soundview, and in middle-class neighborhoods of Queens, like Bayside and Forest Hills.
***
They were in similar proportions in successful and struggling schools, and they were just as likely to have taught the most challenging of students and the most accomplished.

From The New York Times, "In Teacher Ratings, Good Test Scores Are Sometimes Not Good Enough" by Sharon Otterman and Robert Gebeloff:
At Public School 234 in TriBeCa, where children routinely alight for school from luxury cars, roughly one-third of the teachers’ ratings were above average, one-third average and one-third below average.

At Public School 87 on the Upper West Side, where waiting lists for kindergarten spots stretch to stomach-turning lengths, just over half the ratings were above average. The other half were average or below average on measure, based on student test scores.
***
The principal cause of the wide variation within schools is the methodology of the ratings, which compares teachers with similar student demographics and scores. For teachers in schools with high-achieving students, good test results are often not good enough, at least by the standards set by the formula.

The NYC evaluations control and adjust for different student demographics, family income, etc. Students from better off families perform better on standardized tests and teachers should not get credit for the higher test scores of these students, especially since the higher scores are due to socio-economics and not teacher effectiveness.

The article mentions that some of the higher income parents were surprised that teachers whom they regarded as effective did not produce test results beyond the expectation for the families' demographic group.

On average, independent of teacher quality, students from higher income families do better on standardized tests including SATs, graduate from high school at higher rates, and go to and graduate from college at higher rates than students from poorer families.

Changes that have an effect on student performance in school tend to be outside of the classroom, such as an increase in parental expectations of student educational performance, an increase in parental literacy, a decrease in teenage pregnancy rates, a decrease in bullying in school and other factors which are unrelated to teacher effectiveness.

Friday, December 15, 2023

New Regulations Really Do Not Fix Problems: Reprint Of A 2009 Blog Post

Reprint of my March 2009 blog post, "New Regulations Really Do Not Fix Problems:"

Wednesday, March 18, 2009

New Regulations Really Do Not Fix Problems

It is unclear that new regulations fix a problem. The causative events of the problem often cease to exist before regulations are proposed. Additionally, the problems that require government intervention to protect the public are usually those that receive a lot of public media attention.

The public and the entities involved modify their behaviors prior to any regulatory effects. For example, peanut butter sales are down due to the recent salmonella problem [NBC News] and the responsible company closed [CDC]. Peanut butter companies across the US have modified or are modifying their production processes to prevent a recurrence, and parents are choosing other foods for their children.

Undoubtedly, the government will issue new food production regulations and take the undeserved credit for "fixing" the problem. The reality is that regulations are often parallel to the corrective change in behavior, but not the cause.

Since there will be an industry and consumer change in behavior after a negative event prior to regulations, the concern about regulations becomes whether they match (codify) the natural reaction of the public and the industry or whether they distort the natural reaction and cause new problems. In addition, sometimes other industries use similar methods or inputs for different purposes but must modify their behavior and cost structure to comply without any of the benefit.

As for the current financial crisis, the first cause is not yet determined, despite what the public media and politicians say. Most mortgage defaults and foreclosures are limited to a few states: California, Florida, Arizona, and Nevada. Yet house price declines are a national problem, even in areas of the US with below historical average defaults and foreclosures, such as the Northeast. Supposedly, we were in a housing bubble, yet the areas of the US with the greatest appreciation were the areas with the greatest increases in new housing stock. Since when does economics allow for price increases when there is an increase in supply that is more than enough to meet demand?

Similarly, studies of subprime mortgages (see St. Louis Fed) show that at the end of three years, eighty percent of these instruments cease to exist through refinancing, repayment, etc. Due to their high loan to value ratios, when house prices declined subprime defaults dramatically increased because the homeowner could not refinance the mortgage or repay the mortgage through a sale of the home. In other words, house price declines happened before the defaults happened and were in fact a cause of the increase in subprime defaults.

If defaults did not cause the decline in house prices, what did? What structural changes were occurring in the US economy to make homes worth less across the US and not just in areas of overbuilding and high mortgage defaults?

Bear Stearns collapsed about a year ago for liquidity reasons. It was unable to continue to post collateral to fund its revolving debt. The market value of Bear's mortgage collateral declined substantially. It no longer had sufficient collateral to continue its operations. The mystery is that on a cash flow basis, at that time and currently, the collateral is worth much more than the market price. What other factors besides mortgage defaults and foreclosures depressed and continue to depress the price of mortgage securities?
[See my Oct 21, 2008 blog post, "Home Values Were Not In A Bubble."]

Until the underlying causes are determined, any regulatory response "fixing" the financial system has an excellent chance of missing the mark and causing significant future structural problems for the US economy.

Tuesday, December 5, 2023

Gross Domestic Product by State, 2nd Quarter 2023: US Bureau of Economic Analysis Map

 

From Bureau of Economic Analysis, "Gross Domestic Product by State, 2nd Quarter 2023 and Comprehensive Update:"
Real gross domestic product (GDP) increased in 44 states and the District of Columbia in the second quarter of 2023, with the percent change ranging from 8.7 percent in Wyoming to –1.9 percent in Vermont, according to statistics released today by the U.S. Bureau of Economic Analysis (BEA). 
Current-dollar GDP increased in 46 states and the District of Columbia in the second quarter, with the percent change ranging from 8.3 percent in Kansas to –4.3 percent in North Dakota.

Source: Bureau of Economic Analysis

Friday, December 1, 2023

US Supreme Court Never Decided A Legal Case About Falsely Shouting Fire In A Crowded Theater: Prohibited Speech Must Be Directed To And Likely To Incite Or Produce Imminent Lawless Action

 

From City Journal, "The “Shouting Fire” Pretext: A new book torches an old censorship canard." by Corbin K. Barthold:
“Fire in a crowded theater” does not come from a legal case involving fires or theaters. It comes, rather, from Schenck v. United States (1919), a case about a socialist arrested during World War I for peacefully protesting the military draft. The Supreme Court upheld his conviction. Along the way, Justice Oliver Wendell Holmes Jr. wrote that “the most stringent protection of free speech would not protect a man in falsely shouting fire in a theatre and causing a panic.” This imaginary situation was used to show that the First Amendment retreats before a “clear and present danger” of speech “bring[ing] about the substantive evils that Congress has a right to prevent.”

The Court later tightened this test substantially, ruling in Brandenburg v. Ohio (1969) that, to lose First Amendment protection, dangerous speech must both aim at producing, and be likely to produce, imminent lawless action. Yet the flimsy “fire in a crowded theater” metaphor lives on. As [Jeff] Kosseff observes, the line “is used as a placeholder justification for regulating any speech that someone believes is harmful or objectionable.” For the would-be censor, the concept is wonderfully plastic: almost any speech can stand in for the shout in the theater.