Thanks to Graham Toal for pointing this out!
As others have frequently noted, correlation absolutely does (usually) imply causation, just not necessarily the simplistic X → Y model that immediately forms in our heads after reading “X is correlated with Y.” The problem has always been that correlation by itself is never enough to tell us where the causation comes from. There are too many possibilities, such as:
- X → Y
- Y → X
- X + Z → Y
- X + Z – (H + L) / J + (K + S)TU → Y (I mean it can be really complicated; you get the picture…)
Apparently, however, in relatively simple two-variable systems, the direction of causality can be identified correctly about 80% of the time from purely observational data. A recent paper from a team of German, Swiss, and Dutch researchers reports the findings. Using a variety of known and simulated cause-effect situations, and relying only on the observational aspects of the data (no experimentally manipulated situations, for example), the researchers report a very high success rate at figuring out whether X caused Y or vice versa by analyzing asymmetries in the “noise” (or error variance) associated with X and Y. The approach is called the “additive noise method,” and it rests on an asymmetry: as it turns out, noise in the causal variable can influence the noise in the effect, but not the other way around.
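To make the idea concrete for myself, here is a minimal sketch of one common way the additive noise idea gets described: regress each variable on the other, then ask in which direction the leftover residuals look independent of the predictor. Everything specific here is my own assumption for illustration and not necessarily what the paper's authors do: the toy data (Y is a cubic function of X plus Gaussian noise), the degree-3 polynomial regression, the HSIC dependence score with a Gaussian kernel, and the function names `hsic`, `residual_dependence`, and `infer_direction`.

```python
import numpy as np


def _gram(v):
    # Gaussian-kernel Gram matrix; bandwidth from the median squared distance
    # (a common heuristic, chosen here purely for convenience).
    d2 = (v[:, None] - v[None, :]) ** 2
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * sigma2))


def hsic(a, b):
    # Biased HSIC estimate: near zero when a and b are independent,
    # larger when they are statistically dependent.
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(_gram(a) @ H @ _gram(b) @ H) / n**2


def residual_dependence(cause, effect, degree=3):
    # Fit effect = f(cause) + residual with a polynomial (an assumption of
    # this sketch), then score how strongly the residual depends on the cause.
    coeffs = np.polyfit(cause, effect, degree)
    residuals = effect - np.polyval(coeffs, cause)
    return hsic(cause, residuals)


def infer_direction(x, y):
    # Standardize so both directions are scored on a comparable scale.
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    forward = residual_dependence(x, y)   # hypothesis: X -> Y
    backward = residual_dependence(y, x)  # hypothesis: Y -> X
    label = "X -> Y" if forward < backward else "Y -> X"
    return label, forward, backward


# Toy data in which X really does cause Y (hypothetical, simulated).
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 500)
y = x**3 + rng.normal(0, 1, 500)

direction, fwd, bwd = infer_direction(x, y)
print(direction, fwd, bwd)
```

The direction with the smaller score is the one whose residuals look more like plain noise, so that is the one the sketch picks. A real analysis would want a more flexible regression and a proper significance test on the dependence score rather than a bare comparison of two numbers.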
I suppose it’s still possible that this is bad science or bad reporting or something, but to my not-very-astute eye it looks legit. If so, I’m sure it will be developed quickly for other applications. If it is even moderately effective with the very noisy variables many of us in the behavioral sciences deal with, I think it will get integrated into structural equation modeling (SEM) methodology and its many descendants and cousins. This would be a huge step forward, as SEM models are currently open to the criticism that they almost certainly mis-specify cause and effect much of the time, and this method would reduce that problem considerably. It would probably also reduce the number of SEM analyses appearing in behavioral sciences journals, since it would become extremely difficult for one’s model to fit both the covariance structure of the data and the patterns of error variance suggesting which variables come earlier in the causal chain and which later.
I am excited 🙂