This is easy to say but hard to do. "Based on excellence" is the mantra I've heard from software development circles for years and yet...
1. How many of us have lost productivity because some other developer ran rough-shod through code caring not for maintainability, readability, or even extensibility? Which one was praised and which one rewarded? Doesn't apply to teachers? Ever have a dud who couldn't do simple math in charge of the algebra class? Guess what the next teacher has to deal with.
2. How do "normal" people in managerial roles judge competency between two people in such roles? When looking at our own industry we see how difficult it is for "normal" people to make those sorts of judgement calls. Some schools require only 2 years teaching experience before moving on to superintendent roles. Some states more. With such divergence what can we expect?
3. How will we measure such excellence? What metric? What new system will evolve to capitalize on this metric despite the spirit of the metric? Unintended consequences?
There is no bureaucratic rule that can be devised that can separate good teachers from bad ones. Every such rule can and will be gamed. It's like the difference between art and porn - the difference is obvious by inspection, but impossible to codify.
With a free market school system, people will know who the good teachers are and will pay for them (supply and demand).
With a government school system, this simply will never work, because the people paying the salaries will never be the ones judging the teachers.
Unless the teachers are going to be individually contracted by parents, don't private-sector schools have the same evaluation problem? They still have to separate good and bad teachers via some sort of process, probably overseen by management (aka bureaucracy). The similarity of the agency problems is one reason corporations internally often look so much like governments.
Private schools can fire teachers subjectively. A good private school principal can use gut feel to evaluate a teacher. But that doesn't scale, unless we find a way to identify and recruit very smart, very trustworthy principals on a large scale.
People are emotionally averse to firing others they may have been working with for years, and there's not a good enough reason to fire someone in a private organization unless they affect the bottom line, and outside of exceptional cases I don't see how the bottom line changes much based on individual teachers. You're probably not going to quit private school based on any one single teacher.
You need a strong economic incentive for the organization in order for the "fire more easily" idea to work. For instance, if a private school got paid according to the competitiveness of the college the student is accepted into.
I remember reading an article (linked from HN I believe) that was a response to the "Waiting for Superman" movie which cited numbers showing that private schools didn't perform any better than public schools. That would suggest that either the ability to fire teachers subjectively doesn't have an impact on student performance, or that private schools don't fire enough teachers or the metrics the private schools use to fire teachers are not in line with what would improve student achievement.
To some extent, sure, the larger the organization is the more bureaucratic and inefficient it gets at about everything. But there's still one fundamental difference - with public schools, one is forced to attend (truancy laws) and forced to pay for it.
Overall I would agree that free market bureaucracies are more efficient and better run than government bureaucracies.
However, the fact that a bureaucracy is not run by the government does not mean that it knows what level of contribution individual employees are making.
Calculating such contributions is infeasible for all but the smallest groups (like startups, as PG has written about).
The people who are best at selecting good teachers are pupils. If classes were optional we would quickly find out which were worth attending by watching children vote with their feet.
However, children will never be allowed to do this in schools. A school that took children seriously and respected them as humans (or even just as customers), would quickly cease to resemble anything that we would recognise as a school.
Despite all this, education is in the process of being reformed. This is possible because reform is happening outside schools, for example in the unschooling movement.
Right, and in a free market system, those who can pay for the best teachers will get them, and those who cannot pay will be left with the worst teachers. Isn't that unfair?
That is also exactly the opposite of what Gates is trying to accomplish with his plan. He wants to incentivize teachers to work with the poorest students at the most disadvantaged schools. By definition, the free market alone CANNOT solve this problem.
The problem we're discussing is how does one determine who the "best" teachers are? Gates did not propose a solution to this problem. I submit that, being a government school system, such a solution will involve a bureaucratic rule that will inevitably be gamed.
Note that the current system of determining the best as being a bureaucratic rule-based combination of seniority and educational level is a recognized failure. I guarantee you that all the alternative rules proposed in this thread can be easily gamed as well.
This is in spite of the fact that anyone working in a school (including the students) knows who the best and worst teachers are, and doesn't need any rule to tell them.
Generally people seem to support either performance improvement measured by exams (i.e. how much students improved over the course of the term) or some sort of peer-review system.
Both have their own problems, but both seem to be much better than the arbitrary system of seniority. There is evidence to show teachers get better with experience, but almost all of that improvement happens within the first few years of a teacher's career.
It's actually not obvious that a testing-based teacher measurement regime would be better than the seniority system. The seniority system does lock in mediocre teachers, but it has minimal knock-on effects to the actual curriculum. Test-based measurement has already made some curricula toxic to students, and those test results have minimal impact to the teachers themselves.
How much worse would it be if the teach-to-the-test incentive was "afford a new car"?
I'm guessing less worse than the incentive to do the minimum amount possible knowing you're going to get a pay raise regardless.
We can change the test if it's not measuring the right thing (we should do that anyway, regardless of its use for teacher measurement). Test scores are what parents, universities, etc. measure the performance of children by; we shouldn't have a double standard, with one metric for children and another for those who teach them.
Imo, universities measuring applicants mainly by test scores is a bad thing, done only because they can't find a better way to do it, but I'd hardly want to subordinate all of education to that. K-12 education should teach kids useful stuff, meta-learning habits, civic information, attempt to inspire them, etc., not solely aim to maximize their SAT scores. To the extent I found K-12 education useful at all, it was only the parts that weren't narrowly aiming at maximizing my test scores, so it'd be a shame to declare those irrelevant and ditch them.
Plus, universities do take some other things into account. I got significant credit on my university applications for an open-source project I worked on in high school, and a website I maintained (in the late 90s websites weren't exactly rare, but it wasn't yet to the "everyone has a blog" level of pervasiveness). Will a teacher who successfully inspires kids to do that sort of extracurricular stuff get credit in these evaluation schemes?
Alternatively, if you consider preparing kids for college to be the sole purpose of K-12 education, why not just measure it directly? For 2nd-grade teachers it would be difficult to measure due to the large time lag, but it'd be possible for high-school teachers. Instead of the state inventing its own tests, just use the % of students who go on to college as the metric, and then do some statistical analysis over a 5-year window to estimate the effect a particular teacher had on that rate. There are a lot of variables and each student is influenced by many teachers, but over a window you can make an estimate of whether a particular teacher had any influence (positive or negative) on college-admission rates, compared to a hypothetical average teacher of the same subject and of a hypothetically equivalent set of students (according to whatever demographic/etc. variables). A teacher version of http://en.wikipedia.org/wiki/Value_over_replacement_player, basically.
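A rough sketch of that "value over replacement teacher" idea, assuming made-up records and using the subject-wide college rate as a crude stand-in for the "hypothetical average teacher". The teacher names, the record layout, and the `rate` helper are all hypothetical, and the demographic adjustment described above is left out entirely:

```python
from collections import defaultdict

# Hypothetical records pooled over a multi-year window:
# (teacher, subject, student_went_to_college)
records = [
    ("Adams", "math", True), ("Adams", "math", True),
    ("Adams", "math", False), ("Adams", "math", True),
    ("Baker", "math", False), ("Baker", "math", True),
    ("Baker", "math", False), ("Baker", "math", False),
]

def rate(outcomes):
    """Fraction of students who went on to college."""
    return sum(outcomes) / len(outcomes)

by_teacher = defaultdict(list)
by_subject = defaultdict(list)
teacher_subject = {}
for teacher, subject, went in records:
    by_teacher[teacher].append(went)
    by_subject[subject].append(went)
    teacher_subject[teacher] = subject

# Teacher's college rate minus the rate for all students in the same subject.
# A real analysis would also control for student demographics; this does not.
over_baseline = {
    t: rate(outcomes) - rate(by_subject[teacher_subject[t]])
    for t, outcomes in by_teacher.items()
}
```

With only a handful of students per teacher the differences are mostly noise, which is why the window would need to span several years.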
> K-12 education should teach kids useful stuff, meta-learning habits, civic information, attempt to inspire them, etc., not solely aim to maximize their SAT scores
I agree AND I'm willing to pay for that to happen.
However, I'm not willing to pay for teachers who don't teach that stuff. Are you?
So, how do you propose determining whether we're getting what we're paying for?
Note - if your answer is "trust us", my response is "not with my money".
This might sound trite, but I think the solution would be to couple standardized testing with performance based pay. Unlike code, it's hard to aim for solely short-term gains. As long as teachers aren't cheating, I can't imagine many ways to teach students that aren't inherently beneficial to the student. Teaching better study habits, etc might not be directly related to their material, but it still helps the students.
Also, I think there are fairly generalizable development timelines for children. While standardized testing can never be a gold standard in determining progress, I think we can both agree such testing does measure a worthwhile metric. I would definitely agree that there are edge cases, both in terms of individual students and in terms of school districts, but linking pay with progress seems on face value to be a good idea. That, plus independent school districts, is system-based evolution. Such selection depends on creating the right relationship between effort and reward.
The argument against standardized test scores as the metric is that they cover only two basic competencies, reading and mathematics. Those are of course fundamental, but there are edge cases: people who have no interest in those fields and would prefer foreign languages, arts, etc. Teachers in the arts can inspire and help students, but have no standardized evaluation method.
Plus, for younger students, where one teacher teaches all subjects, the focus on mathematics and reading will increase at the expense of the arts. Arts education was significantly reduced under No Child Left Behind because of its requirements.
You can obviously argue that competency in the basics should be achieved before the arts, though. Just pointing out an issue.
Foreign language yes, but art? I could memorize a bunch of facts about art, styles, etc. But that's not the same as doing well in the class. I could be a brilliant artist who learned a lot in the class because the teacher motivated me, yet have no idea what the difference is between penciling and stenciling. (I have no knowledge of art; I just picked that example as something that sounds similar but could be very different.)
Most education in art focuses on the technical and contextual (history, styles, etc) aspects. The contextual aspects lend themselves quite well to standardized testing.
The technical aspects don't lend themselves very well to a "fill in the bubbles" test, but that doesn't mean they are subjective. Testing them would be more difficult, but it is routinely done.
>I think we can both agree such testing does measure a worthwhile metric.
I don't know about that. I've come across plenty of critiques of standardized tests, especially back during the debate over Bush's 'No Child Left Behind' initiative.
That kind of 'education' may produce good little worker drones, but I'm not sure it's something we want to base the entire education system of the Republic on.
In many cases, you will get better short-term results by drilling recipes or factoids than pursuing deep understanding, though the latter is far more valuable in the long run (and in early schooling, long is the only run that matters). There's too much trivia-memorizing in schools as it is.
I'm OK with paying a drill sergeant more than someone who sits around the teacher's lounge eating donuts all day.
I'm not saying this approach is a panacea to all of education's problems, I'm just saying this is a step in the right direction. I think the question this approach answers is the disconnect between effort and reward in education.
Regarding 2), you do it via Value Added Modeling. You build a statistical predictor of student performance (based on standardized test scores) then measure Performance = Avg(Actual - predicted).
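A minimal sketch of that formula, assuming the predicted scores already exist. In practice they would come from some statistical model of prior test scores; here the numbers, the dict layout, and the `value_added` name are made up for illustration:

```python
from statistics import mean

def value_added(students):
    """Performance = Avg(actual - predicted) over one teacher's students."""
    return mean(s["actual"] - s["predicted"] for s in students)

# Hypothetical classes. The predictions are invented, not model output.
teacher_a = [
    {"predicted": 25, "actual": 35},
    {"predicted": 30, "actual": 36},
    {"predicted": 20, "actual": 28},
]
teacher_b = [
    {"predicted": 75, "actual": 65},
    {"predicted": 80, "actual": 74},
    {"predicted": 70, "actual": 62},
]

score_a = value_added(teacher_a)  # students beat their predictions
score_b = value_added(teacher_b)  # students fell short of theirs
```

Teacher B's students score far higher in absolute terms, but Teacher A comes out ahead because only the gap to the prediction counts.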
> You build a statistical predictor of student performance (based on standardized test scores) then measure Performance = Avg(Actual - predicted).
And then you watch student performance get gamed by the teachers. There are all kinds of ways: turn a blind eye to cheating, move poor performers to other schools (including by getting them to graduate when they shouldn't), alter their tests (this is already being done under NCLB according to several news reports), ...
You're better off by having skilled teachers rate the other teachers. People aren't as easily gamed as systems of rules are.
Easy solutions to your problems:
> turn a blind eye to cheating
> alter their tests
Tests are given by independent proctors and designed by an independent agency. In much the same way, the DOD rates Halliburton's performance - it's not great, but it's better than Halliburton doing their own performance ratings.
> move poor performers to other schools (including by getting them to graduate when they shouldn't)
Won't help with VAM. If you get rid of the poor performers, the predicted value of the people remaining will increase. Thus, you just made your job harder - the bar has been raised.
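A toy illustration of that point, using invented (predicted, actual) pairs in which every student beats the prediction by the same margin. The data and the `summarize` helper are hypothetical. The result depends on the predictor being about as accurate for the students who leave as for those who stay:

```python
from statistics import mean

# Hypothetical class: (predicted, actual) test scores per student.
classroom = [(30, 32), (40, 42), (50, 52), (60, 62), (70, 72)]

def summarize(students):
    """Return (mean predicted, mean actual, mean of actual - predicted)."""
    predicted = mean(p for p, _ in students)
    actual = mean(a for _, a in students)
    return predicted, actual, mean(a - p for p, a in students)

before = summarize(classroom)

# "Move out" the two lowest-scoring students.
remaining = sorted(classroom, key=lambda s: s[1])[2:]
after = summarize(remaining)
```

The class average rises, but so does the predicted average, so the value-added figure stays where it was.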
> People aren't as easily gamed as systems of rules are.
Your solution to fix education is to do things the way Halliburton & the DOD do?
Yes, independent ratings can help, but at some point, if you can't trust anyone inside the system, you're screwed. And your results are exactly as good (or bad) as your ability to do ratings. Fancy math won't fix a flawed premise, for example.
We've been going for complex systems and simple people. Having good people and relatively simple systems, however, is the optimum from what I can see.
I'm not opposed to the idea that accountability is needed, but the idea that you can go up to a troubled institution, make everyone's job harder, and then expect improvement is unrealistic in any industry. I'm not a teacher, but I've seen the effects of management like that first hand.
Always taking the easy solution when managing something is not effective.
>> People aren't as easily gamed as systems of rules are.
> I don't even know how to respond to this.
Obviously, I thought it implicit that I was talking about competent and experienced folks. Yes, if you get people who know nothing and put them in charge, you will have terrible results. They won't improve even if you give them a long list of rules to follow, which is how bureaucracy usually ends up.
> Your solution to fix education is to do things the way Halliburton & the DOD do?
Would you prefer to do things the way the educational system does? Throw money at Halliburton and hope for the best? No outside auditors or performance measurements?
I thought I just said that I would prefer to hire highly competent teachers to do the evaluations. Not third parties, not politicians, and certainly not Halliburton or the DOD.
Those who are competent ought to be able to recognize each other.
This makes sense but suggests that we need to simultaneously invest in better standardized tests. The most common standardized tests we have today make for poor year-round curricula.
I'm always in favor of better standardized tests -- who isn't? -- but one handy thing about value-added modeling is that if you use the same flawed tests, you should get similar outcomes: if a kid is in the 10th percentile of his class when he starts his sophomore year and the 20th when he ends, as measured by similar tests, then he's progressed.
Objection. The kid may have improved specifically in his ability to score well on a specific standardized test, and in no other way. And meanwhile, the process that delivered that superficial improvement could have immeasurably harmed other students who might have otherwise genuinely excelled.
If your point is that standardized tests have problems, then I agree. But I don't see any other way of gathering some kind of data, and, as far as I know, no proponents of using standardized tests -- including Gates -- want to use them solely to evaluate teachers.
But if you had a kid, and you had two teachers, both of whom regularly got classes in the 50th percentile, and one teacher regularly had kids leave in the 40th percentile, the other in the 60th, which would you want? In fact, L.A. is already now in effect conducting the experiment: http://www.latimes.com/news/local/la-me-teachers-value-20100... and parents are responding accordingly -- for good reason.
Your concerns are valid, but they are mostly misguided if the alternative to them is "doing what we're doing now." See here: http://www.marginalrevolution.com/marginalrevolution/2010/08... for more; "We cannot simultaneously claim, however, that teachers are vitally important for the future of our children and also that their effectiveness should not be measured." Also see the list of education-related articles I compiled here: http://jseliger.com/2009/11/12/susan-engel-doesnt-get
If your argument is that we should pursue some kind of standardized metric for teaching effectiveness, and that it might take the form of a standardized test, I'm with you.
If your argument is "nothing could be worse than the way it is today", I strongly disagree. More than 70% of Americans are satisfied (or better) with their own local school. High-stakes testing can easily make schools worse by disrupting curricula and damaging incentives for teachers. If we're going to focus our efforts on the school systems that are in crisis, there are probably better interventions.
Measuring teacher effectiveness is a real problem that needs real work, but in the interim, primum non nocere.
Is there a study that shows that scoring well on standardized tests does NOT reflect real learning? Perhaps there is, but... There is a lot of rhetoric to the effect that "scoring well on tests just means you are good at taking tests", but (1) there is nothing wrong with getting good at taking tests, and (2) are we sure?
Also, the process MAY or MAY not have harmed other students -- there is no reason to believe it would, that I can tell -- again, please respond if there is.
Yes. Numerous ones. In particular, "high-stakes" middle school standardized tests (of the NCLB variety) appear to correlate negatively with ACT/SAT performance (a test that actually matters).
Actually, percentiles are not such a great way to measure progress; they're just relative. What we should measure is something more like earned value: did he progress more or less than one year of material forward? That way, kids who are two years behind and two years ahead all get measured on how fast they are moving, not where they started.
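As a sketch of that kind of measure, assuming each student's level can be expressed in grade-equivalents of material (itself a big assumption), with a hypothetical `growth_rate` helper and invented numbers:

```python
def growth_rate(start_level, end_level, years_elapsed=1.0):
    """Grade levels of material gained per year of instruction."""
    return (end_level - start_level) / years_elapsed

# Hypothetical students, levels in grade-equivalents.
behind = growth_rate(start_level=3.0, end_level=4.5)  # started behind grade level
ahead = growth_rate(start_level=7.0, end_level=7.5)   # started ahead of grade level
```

Here the student who started behind moved faster than one year of material per year, and the one who started ahead moved slower, regardless of where either of them ranks in the class.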
Statistical forward looking models proposed by a guy who works in finance?
How could that ever go wrong :)
EDIT: More seriously, if it's already impossible to measure, I fail to see how more complicated metrics would do anything but do an even worse job of measuring. There might be exceptions to this rule in a few cases but it's generally the way things work.
The VAM metric (which is not very complicated) measures teacher performance by excluding student quality as a causative factor.
If you have low quality students, who are only expected to score at the 25% level, and you boost them up to 35%, you win. If you have high quality students expected to reach 75%, but they only achieve 65%, you lose. Do you believe it's really that hard to predict that Asian kids in 2 parent $250k/year+ households will score higher than black kids in 1 grandparent $10-20k/year households?
If you truly believe the effects of education are immeasurable, perhaps we should stop wasting money on it. We don't spend $859 billion on homeopathy for much the same reason.
Well, I'm pretty sure they're already doing something similar to that. And you consistently hear the best teachers complaining that by teaching to the test they're doing a worse job of teaching.
Developer productivity is really hard to measure with universal statistics -- every attempt to do so has failed miserably and usually made things worse. Does that mean that nobody should spend money on development?
You should not hire developers if they have no measurable effect on your software quality/feature set/P&L.
Incidentally, measuring developers is done routinely. Every bank (except maybe Citi) has a set of procedures and guidelines for outsourced development. These procedures make the process fairly predictable and they reduce costs/dependencies.
Much like outsourced IT, teaching is repetitive grunt work. It can and should be defined and measured.
Yeah, but the procedures are basically a whole bunch of subjective stuff added up. "Ok we're dividing this task into these subtasks, we're gonna put a deadline on each one and if you miss too many deadlines then we need to talk". That's way more fuzzy and subjective than a standardized test. You might sum up the subjective judgments in a quantitative way, but ultimately it's built on subjective assessments. Unless they're actually doing kloc or something.