Wednesday, July 25, 2012

Judgment calls and philosophy of science


Lately I've done several philosophy-of-science posts on macro, complaining about what I call "judgment calls" (see here and here). I've been getting a lot of comments about philosophy of science, and I thought I'd take some time to step back and lay out how I think about the subject. So here's a post on Noah's General Philosophy of Science.

Preliminaries:

1. Language. I don't believe in demanding that terms be precisely defined, or in demanding that definitions be perfectly consistent. Reason: Defining terms precisely is incredibly hard. I find that philosophical conversations that focus too much on definitions and jargon quickly get lost in the weeds of "What do you really mean by X?". Sometimes you actually do need to drill down and figure out exactly how you're using a term, or how your usage is different from somebody else's, or how the usage depends on the situation. But I think that sort of intense focus on terminology should only be used sparingly, in times of great need.

2. Past Philosophers. I only care a little bit about what Popper, or Kuhn, or Feyerabend said. Not zero, because those guys were smart, and they read a lot of history and a lot of other people's ideas. But I only care a little, because I really want to figure these things out for myself instead of taking the word of an "expert", and I believe that I am intellectually equipped to do so (note: I am not trying to push the boundaries of philosophy-of-science scholarship here). So if I say something that conflicts with what Kuhn thought...well, as the Japanese say, sho ga nai ne ("it can't be helped")! So to anyone who reads this and says "You are such an ignorant amateur, why don't you go read what some real experts say?!", I preemptively reply: "Why don't you try thinking for yourself instead of parroting an authority figure?" I see absolutely no reason not to reinvent the wheel occasionally. (If you're interested, my main sources of ideas were probably Charles Marcus, Robert Laughlin, Richard Feynman, Lee Smolin, Robert Waldmann, and Steven Smith, as well as some of those well-known philosophers. But some parts I think I just made up.)

3. Ontology. Ontology is the philosophy of what existence means. My ontology is basically what I think of as "pragmatist"...we believe things because it's useful to believe them. If you disbelieve in the existence of a wall, you're going to stub your toe and it's going to be unpleasant. Or maybe not...try disbelieving in the wall and let me know the result. I'll be over here with a beer, getting your experiment on video. That's basically my philosophy of what "existence" means. One result of this outlook is that I think of "detectability" (or "observability") as the same thing as "existence"...if you can't in some way, however indirectly, stub your toe on something, it might as well not exist. I don't know if that's what other people mean when they say "pragmatism", but see Point 1 about language.

4. Epistemology. Epistemology is the philosophy of how you can know things. When it comes to science, there are limitations on how much you can know. Tomorrow, things might all start falling up. You don't know that they won't! This is, I believe, called the "problem of induction". "Laws" of the Universe might change tomorrow. If we're lucky, they won't change. So far, things still fall down. Whew! Also, tomorrow there might cease to be any sort of "laws" at all. See here and here for ideas about what might happen in that case. Suffice to say that it would be a weird, weird day.

OK, now that those are out of the way, on to My General Philosophy of Science:

5. The Goal of Science. The main goal of science, as I see it, is to increase humankind's power over the Universe. Where did I get this goal? Simple; I made it up...where else does anyone come up with goals? Anyway, combining this goal with Point 3 about ontology, I think the aim of science should be to give humankind the ability to accomplish pragmatic things, like predicting future phenomena, or making technology, etc. A secondary goal of science would be for its direct pleasure benefit for non-scientists - e.g., people can read about infinite inflationary cosmology and go "Wow, that is neat-o!" I do not view the pleasure of scientists themselves as a goal of science.

6. Scientific Models. George Box famously said that "All models are wrong, but some are useful." To me, this is like saying "No house is 100% big, but some are big." It's just a silly statement. No model describes all of reality. Most or all models fail to perfectly describe even the set of phenomena that they purport to describe. And if some model - say, quantum mechanics, or general relativity - did perfectly describe its chosen set of phenomena, the fact that it fails to describe everything else would hardly be worth mentioning. Combining Points 3, 4, and 5, I think that "useful" and "right" are the same thing when it comes to scientific models. Perfect usefulness ("rightness"?) is only one measure-zero point on a multidimensional continuum of rightness/usefulness.

7. Techniques of Science. It seems to me that all scientific endeavors involve three basic processes: A) Logic, B) Evidence, and C) Judgment.

7a. Logic. Logic, to me, means following some sort of rules for your arguments. I basically think you should always use some sort of logic when you make your arguments, since it seems to help convince people of stuff in a repeated and consistent manner. Other methods of convincing - appeals to emotion, for instance, or tribal affiliation - do also seem to work sometimes, but more sporadically. So I think scientists should always use logic when they can. But logic is like a rule for constructing a chain...it doesn't tell you where to start the chain. You need some sort of premise or starting point.

7b. Evidence. Evidence, to me, means that how well a theory matches past or existing data is an indication of how right/useful the theory is. If science is going to work, then tomorrow is going to have to be something like today. By Point 4, this means that there have to be some sort of "laws", over some sort of time horizon. Maybe the laws only hold for a short time, but they have to hold for longer than it takes you to figure them out. This means that "induction" is going to have to work to some degree. In other words, how well a model or theory describes data today must be some sort of indication of how well it describes data tomorrow, or else science is useless.

Now here we get to an interesting side question: In what way is a theory's descriptive power an indication of its predictive power? You can say "If a theory doesn't match the data, it's not useful/right." This, I think, is what people call "falsification". Alternatively, you can say the converse: "If a theory does match the data, it is useful/right." I don't know a name for that. Actually, I think both these statements are too extreme. Rigid insistence on pure falsification is not always a good idea, since the cutoff for saying a theory "matches data" is pretty arbitrary, like a confidence interval in statistics. And the converse of falsification - "It looks right so it must be right" - seems to lead to overfitting. Also, ranking theories based on how well they fit the data has its own pitfalls, since theories that make big mistakes can sometimes lead to future theories that make fewer mistakes than the best presently existing theories - as an example, Copernicus' initial heliocentric model was less good at predicting eclipses than Ptolemy's geocentric model with epicycles, but it led to the creation of Kepler's theory of elliptical orbits, which predicted both eclipses and planetary phases better than Ptolemy's model.
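
The overfitting danger is easy to see in a toy sketch (the data, the "true law" y = 2x, and both models below are invented for illustration, not taken from any real study): a degree-9 polynomial that threads every observed point exactly "matches the data" perfectly, yet a humble straight-line fit predicts fresh observations far better.

```python
# Toy overfitting demo. The "true law" is y = 2x; the fixed wiggles below
# stand in for measurement noise. All numbers are made up for illustration.
noise  = [0.4, -0.3, 0.5, -0.4, 0.3, -0.5, 0.4, -0.2, 0.5, -0.3]
noise2 = [-0.2, 0.3, -0.4, 0.1, 0.5, -0.3, 0.2, -0.5, 0.3, 0.4]
train = [(x, 2 * x + noise[x]) for x in range(10)]            # "past data"
test  = [(x + 0.5, 2 * (x + 0.5) + noise2[x]) for x in range(10)]  # "tomorrow"

def lagrange(points):
    """Degree-9 polynomial that passes through every training point exactly."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

def least_squares_line(points):
    """Ordinary least-squares straight line (two parameters)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    b = (sum((x - mx) * (y - my) for x, y in points)
         / sum((x - mx) ** 2 for x, _ in points))
    a = my - b * mx
    return lambda x: a + b * x

def mse(model, points):
    """Mean squared prediction error on a set of (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)

wiggly = lagrange(train)
line = least_squares_line(train)
print(mse(wiggly, train), mse(line, train))  # wiggly "matches the data" exactly
print(mse(wiggly, test), mse(line, test))    # but the line predicts new data better
```

The interpolating polynomial has zero error on the training sample and large error on the test sample; the line has modest error on both. "It looks right" in-sample is exactly what the polynomial achieves, and it is not the same thing as useful.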

So in selecting your criteria for matching theories to evidence, you inevitably need to use some judgment.

7c. Judgment. Judgment, to me, basically means using your intuition or instinct to tell you things about the world. Some of this is always present in science, for the reason specified above (deciding how to match theory to evidence). Also, there's at least one other reason to use judgment, since formally there are infinite possible hypotheses to test, and infinite models that fit any set of phenomena. (This is pointed out by Robert Pirsig in Zen and the Art of Motorcycle Maintenance, which is mostly a first-hand account of psychological disorder, but is also a cool philosophy-of-science book.)

So you always need some judgment in science. There are many different ways to use judgment, and the degree to which you use judgment in each of those ways can vary. A few examples include: What do you set as the null and alternative hypotheses? What kind of confidence intervals do you set for your regressions? Do you toss outliers, and if so, which? How do you penalize the addition of parameters to the model? And, most importantly, there is The Big Judgment Call: How do you use the match between theory and evidence to evaluate the usefulness of a model?
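
To make one of those judgment calls concrete: a common way to formalize "penalizing the addition of parameters" is an information criterion such as the AIC, which rewards fit but charges a fixed price per parameter. The sketch below uses invented data that hovers around a constant; the two-parameter line fits the sample slightly better than the one-parameter mean, but the penalty makes AIC prefer the simpler (true) model.

```python
# One formalization of the parameter-penalty judgment call: the Akaike
# Information Criterion, AIC = 2k + n*ln(SSE/n) (Gaussian errors, lower is
# better). The data are made up: the true "law" is just y = 5 plus noise.
import math

xs = list(range(10))
ys = [4.8, 5.3, 4.6, 5.4, 5.0, 4.7, 5.2, 5.1, 4.9, 5.0]
n = len(xs)

def sse(preds):
    """Sum of squared errors of a list of predictions against ys."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys))

def aic(k, sse_val):
    """AIC with k parameters; the 2k term is the price of complexity."""
    return 2 * k + n * math.log(sse_val / n)

# Model 1: a constant (one parameter).
mean_y = sum(ys) / n
sse1 = sse([mean_y] * n)

# Model 2: a straight line (two parameters), ordinary least squares.
mx = sum(xs) / n
slope = (sum((x - mx) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = mean_y - slope * mx
sse2 = sse([intercept + slope * x for x in xs])

# The extra parameter always fits the sample at least as well...
print(sse2 < sse1)                    # True
# ...but the penalty term makes AIC prefer the simpler model here.
print(aic(1, sse1) < aic(2, sse2))    # True
```

Note that the choice of penalty (AIC vs. BIC vs. something else) is itself a judgment call; the criterion formalizes the trade-off, it doesn't eliminate the judgment.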

Basically, it seems to me that if you don't have logic, evidence, and judgment, you're not doing science.

8. Different "Scientific" Disciplines. There are many different disciplines that purport to inform us about the world - physics, history, biology, economics, etc. Some of these call themselves "sciences", some call themselves "social sciences" with an emphasis on the "social", and some don't call themselves "sciences". The ways in which these disciplines use evidence and judgment are different. Furthermore, the way that each discipline uses evidence and judgment may change in time (witness the changes in physics between Aristotle and Feynman!). These changes are probably evolutionary, based on trial-and-error - basically, kick your foot into a way of doing science, and see if you stub your toe or not. Aristotle's way of doing physics wasn't useful for producing models that allowed gunners to hit targets accurately from long distances; Newton and Galileo's was. Aristotle's way of doing physics mostly died out, Newton and Galileo's survived and evolved.

Evaluating the usefulness of an approach toward science involves a sort of meta-science, involving evidence about judgment, judgment about judgment, evidence about evidence, and judgment about evidence. Physics seems to have a lot more replicability than history, which is probably why physics has usually relied a lot more on evidence, and history a lot more on judgment. Nowadays, with the controversy over string theory, we see a big debate over whether physics should evolve toward a method in which evidence is less important. In history, with the efforts of people like Daron Acemoglu and Jared Diamond, we see people arguing that history can rely more on evidence than in the past. (Yes, I am using the terms "less evidence" and "more evidence" VERY loosely, see Point 1 about language.)

So each science has its own "scientific method", and this method evolves in time. We just have to figure out what seems to be working and what doesn't seem to be working, and adjust accordingly. It's not always obvious, and it's certainly not a rapid process. We can only hope that the marketplace of ideas promptly produces the best scientific method for each discipline given the available technologies of investigation. But it seems to me, witnessing the failures of science to blossom in the Roman Empire, Abbasid Caliphate, and Sung Dynasty, that opportunities to improve science are often missed.

9. A Few Unhelpful Ideas About Science. I'm not sure if anybody says exactly these things, but they loosely conform to ideas that some people seem to entertain (i.e. they are straw men), so I might as well list them...

"Since all disciplines use judgment in some way, all uses of judgment are equally appropriate." Actually, different uses of judgment sometimes seem to produce radically different results within a discipline, as with the oft-discussed shift in physics, chemistry, and biology in the 1600s and 1700s.

"Science does not always need evidence; sometimes, we can start from judgment and proceed by logic to a conclusion, and then accept that conclusion without checking it against evidence." This is like when the evil wizard tries to win by turning himself into a snake...it never works.

"We only make theories in order to check the internal consistency of our ideas." Who cares? Remember, scientists having fun is not a goal of science (according to my arbitrary value system). We do not pay you $200k/yr to play video games; why should we pay you $200k/yr to satisfy yourself of the internal consistency of your ideas?

"It takes a theory to beat a theory." This may be true sociologically, as a description of how science evolves in practice, but I don't think this ought to be the case. I am fine with doubt and ignorance. I think that bad ideas can block and delay the development of good ideas. And I think that a false sense of certainty can lead to mistakes (e.g. by policymakers).


And, finally, we come to the application of this General Philosophy of Science to macroeconomics. There is not much new in this section; it's a summary of things I've said before, and I just included it here so that you could see how I map from my philosophy of science to my tendency to complain about macro. Since it's such a rehash, this will be my last big complaint session about macro for quite some time.

10. Judgment Calls in Modern Macroeconomics.

I occasionally complain about certain uses of judgment in modern macro. These mostly revolve around one complaint: The macro field does not, in my opinion, use sufficiently stringent criteria for rejecting theories. Multiple theories are simultaneously judged "good" by explaining the same stylized facts (for example, producing simulated economic fluctuations in GDP that match the variance of observed GDP)...These mechanisms can't all be accounting for 100% of the same phenomenon at the same time! Also, macroeconomic models are rarely if ever tossed out because of the results of some statistical test (statistics being the only way we have of matching macro theories to data, since experiments are unavailable). Additionally, the microfoundations used in macro theories are not required to match the microfoundations observed by microeconomists. It thus seems to me that there is "too much" judgment involved in modern macro, and "not enough" evidence. Yes, "too much" and "not enough" are loose terms. 

I suspect that the reason for this, historically/evolutionarily speaking, is the poor quality of macro data. Macroeconomics has better data than history, but not a lot better! You're still dealing with time series that may or may not be ergodic (in other words, macroeconomic history may have just been "one damn thing after another", with no stable "shock process" or "adjustment process"). The time series may not be stationary (unit root tests have low power). Cross-country comparisons are notoriously difficult.
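
A quick simulation illustrates why the possible non-ergodicity is so troubling (this is purely synthetic data, not a claim about any actual macro series): for white noise, the time average computed from one history is close to the time average from any other history, so a single long sample teaches you the "law"; for a random walk (a unit root), different histories give wildly different time averages, so one realization tells you very little.

```python
# Toy ergodicity sketch: compare time averages across independent simulated
# "histories" of (a) i.i.d. white noise and (b) a random walk. All synthetic.
import random

def time_average(series):
    return sum(series) / len(series)

def white_noise(n, rng):
    """Stationary, ergodic: each period is a fresh N(0,1) draw."""
    return [rng.gauss(0, 1) for _ in range(n)]

def random_walk(n, rng):
    """Unit root: shocks accumulate forever, so the level never settles."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

rng = random.Random(42)
n, histories = 2000, 8
noise_means = [time_average(white_noise(n, rng)) for _ in range(histories)]
walk_means = [time_average(random_walk(n, rng)) for _ in range(histories)]

def spread(means):
    return max(means) - min(means)

print(spread(noise_means))  # small: every history tells the same story
print(spread(walk_means))   # large: "one damn thing after another"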

So if you require macroeconomics to hew to the same standards of empirical verification/falsification as, say, tax economics or financial economics, you will be left scratching your head most of the time and saying "Well, we just really don't know what the heck is going on!" So, historically, macroeconomists had to settle for less ambitious goals. They had to behave more like historians, writing "literary" tomes vaguely describing what they thought was going on. After World War 2, this changed, and macroeconomists started to describe their ideas in the language of mathematics. For a while, people thought macro could work a bit like physics, but the Lucas Critique and some notorious policy mistakes seem to have dashed that ambition. Now, macroeconomists seem to be back to "telling stories" (a phrase they themselves often use), though they've retained the language of math.

One common response is that macroeconomics produces a bunch of different models that tell a bunch of different stories, and that judgment should be used to select which stories apply at which times. And I am OK with that in principle! Maybe evidence, rather than judgment, can be used to tell whether, for example, the Diamond-Dybvig model of bank runs is about to come into effect (people like Markus Brunnermeier and Hyun Song Shin are trying to do things like that, which is just one reason why I am a big fan of theirs). But the set of possible stories is essentially infinite; it is a certainty that some of these models are bad ones - not as good as they could be - and that evidence could be used to show this and to construct better models. I feel - and this is just the sense I get from talking to people and going to talks and reading papers - that not very much model rejection is being done.

In other words, although I am not a strict "falsification-ist", I think that rejecting models is almost certain to be an essential feature of using evidence to select the best set of models to use in practice.

And what I suspect is that macroeconomics went so long without any hope of matching any data that it developed bad habits. Internal consistency and the collective intuition of macroeconomists were overemphasized, and what little data there was was often ignored. Theoretical tolerance became the norm, and models that were essentially never useful remained prominent in the toolkits of economists and policymakers alike. And the large reliance on judgment seems (unsurprisingly) to have allowed some political bias to seep into the profession.


So to sum up: I don't complain about macro methodology because I have a rigid idea of what "science" ought to be, and I demand that all disciplines either live up to the standards of physics or admit radical ignorance. I simply judge that macro has too much judgment in too many places, that there are popular models out there that could and should be rejected by what little evidence exists, and that many macroeconomists should admit more doubt about our understanding of the "business cycle".

Maybe I'm wrong, and if so, I'm prepared to revise my thinking...
