Why are Distributions Non-Gaussian?
Most recent answer: 11/05/2013
- Anonymous
In a way, that's like asking "when you see a non-elephant animal, what is it?" OK, it's not that extreme, because there is a fairly common class of problems (observables which are the sum of many independent contributions) that gives Gaussian distributions. There are also some other special cases that have Gaussian distributions. The generic case, however, is non-Gaussian. Since for some decades I mainly made a living by studying a variety of non-Gaussian statistical variables in physics, you probably don't want to get me started telling their many diverse stories.
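To make the "sum of many independent contributions" case concrete, here is a minimal sketch using only Python's standard library. The choice of exponentially distributed contributions, the sample sizes, and the helper names are illustrative assumptions, not anything from the original question. It compares the skewness of a single skewed contribution with the skewness of a sum of 50 such contributions.

```python
import random
import statistics


def skewness(samples):
    """Sample skewness: mean of the cubed standardized deviations."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    return sum(((x - mean) / std) ** 3 for x in samples) / len(samples)


def sum_of_contributions(n_terms, n_samples, rng):
    """Each sample is a sum of n_terms independent exponential draws (assumed toy model)."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n_terms))
        for _ in range(n_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
single = sum_of_contributions(1, 10000, rng)
many = sum_of_contributions(50, 10000, rng)

skew_single = skewness(single)
skew_many = skewness(many)
print("skewness of one contribution:", round(skew_single, 2))
print("skewness of a sum of 50:     ", round(skew_many, 2))
```

For this toy model the exact skewness of a sum of n exponential terms is 2/sqrt(n), so the single contribution is strongly skewed while the sum of 50 is already much closer to the symmetric Gaussian shape.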
Here's an example. Maybe you're looking at resistance in a small sample of a material with an insulating and a conducting phase. Disorder in the sample breaks it up into little domains. Right near the phase transition a small number of domains fluctuate randomly between the two phases. The distribution of resistances may be bi-modal, or 4-modal, etc. The explanation is obvious: each combination of domain states gives its own resistance value.
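A toy count of those modes, under loudly simplified assumptions: each fluctuating domain is taken to add a fixed amount to the resistance when it sits in the insulating phase, and the contributions simply add. The numbers and the function name are hypothetical, and a real sample need not combine domains this way.

```python
import itertools


def resistance_levels(base_resistance, domain_steps):
    """List the distinct resistance values reachable when each domain is
    independently either conducting (adds 0) or insulating (adds its step).
    Additive contributions are an assumption of this sketch."""
    levels = set()
    for states in itertools.product((0, 1), repeat=len(domain_steps)):
        extra = sum(step for step, on in zip(domain_steps, states) if on)
        levels.add(round(base_resistance + extra, 9))
    return sorted(levels)


one_domain = resistance_levels(100.0, [5.0])
two_domains = resistance_levels(100.0, [5.0, 2.0])
print(one_domain)   # two levels: a bi-modal distribution
print(two_domains)  # four levels: a 4-modal distribution
```

With a little measurement noise each level broadens into a peak, so one fluctuating domain gives two peaks and two domains of different size give four.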
With regard to the skill distribution, things are more subtle than for simple physical variables. For starters, if you define skill as some sort of positive number, the distribution couldn't be exactly Gaussian, because Gaussian distributions have tails extending infinitely far in both directions. There's another basic problem in describing a "skill distribution". There's no natural measure. Say A can use only a hammer, B can also use a screwdriver, and C can use Allen wrenches as well. What's the ratio of the skill increment from A to B to that from B to C? The numbers you assign are an arbitrary choice. The same problem comes up in designing, for example, IQ tests. There the test designers chose to define the differences to make the distribution Gaussian, just for convenience.
So let's pick some variable where the facts tell us what the distribution is, not where we tell the facts what distribution we'll choose to describe them. Income is a nice variable with a fairly clear measure. It's highly non-Gaussian. In order to survive, some minimum income is needed, especially if the dollar value of assistance is included. So the distribution not only stays positive but doesn't go down to zero. It's highly skewed upward. Sometimes people claim that many factors independently affect income multiplicatively. That would give a Gaussian distribution not of incomes but of logarithms of income. Actual distributions have much bigger tails at the high end than even that skewed distribution. Social scientists discuss why that happens, often with ideas along the lines of "for whosoever hath, to him shall be given".
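The multiplicative-factors claim can be checked with a small simulation. This is a sketch of that idealized model only, not of real income data: the number of factors (30), their range (uniform between 0.5 and 1.5), and the function names are made-up assumptions.

```python
import math
import random
import statistics


def skewness(samples):
    """Sample skewness: mean of the cubed standardized deviations."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    return sum(((x - mean) / std) ** 3 for x in samples) / len(samples)


def toy_income(n_factors, rng):
    """Product of independent positive random factors (hypothetical model)."""
    income = 1.0
    for _ in range(n_factors):
        income *= rng.uniform(0.5, 1.5)
    return income


rng = random.Random(1)  # fixed seed so the run is repeatable
incomes = [toy_income(30, rng) for _ in range(10000)]
log_incomes = [math.log(x) for x in incomes]

skew_income = skewness(incomes)
skew_log = skewness(log_incomes)
print("skewness of toy incomes:     ", round(skew_income, 2))
print("skewness of log toy incomes: ", round(skew_log, 2))
```

The logarithm turns the product into a sum of independent terms, so the logs come out nearly symmetric while the toy incomes themselves are strongly skewed upward. As noted above, real income distributions have even heavier high-end tails than this model produces.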
Mike W.
(published on 11/05/2013)
Follow-Up #1: non-Gaussian distributions in physics and income
- Anonymous
1) There are countless examples to pick among, but here's one of the simplest. Say you're counting radioactive decays from a big block of weakly radioactive material. The decays are independent of each other. The distribution of counts in a given time interval is Poissonian. After enough time to expect N counts, the chance of getting n is $P(n) = N^n e^{-N}/n!$. That's skewed upward. Only after N becomes very large does the meat of that distribution start to look Gaussian.
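How quickly the skew fades can be read off the formula directly. The sketch below evaluates P(n) = N^n e^{-N}/n! numerically and computes the skewness from it; the function names and the cutoff used to truncate the sum are choices made for this illustration.

```python
import math


def poisson_pmf(n, mean):
    """P(n) = mean**n * exp(-mean) / n!, evaluated via logs to avoid overflow."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def poisson_skewness(mean):
    """Skewness computed by summing over the distribution.
    The sum is cut off far out in the upper tail (an assumed cutoff)."""
    n_max = int(mean + 20 * math.sqrt(mean) + 50)
    probs = [poisson_pmf(n, mean) for n in range(n_max + 1)]
    mu = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mu) ** 2 * p for n, p in enumerate(probs))
    third = sum((n - mu) ** 3 * p for n, p in enumerate(probs))
    return third / var ** 1.5


for expected_counts in (1, 4, 100, 10000):
    print(expected_counts, poisson_skewness(expected_counts))
```

The skewness of a Poisson distribution is 1/sqrt(N), so it is always positive and shrinks only slowly: N = 100 still leaves a skewness of 0.1, and it takes N = 10000 to get down to 0.01.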
What about one skewed toward the low side? There must be something more obvious, but here's the first that crosses my mind. Look at the fluctuating EMFs as you magnetize a piece of iron with some impurities in it. (These fluctuations are called Barkhausen noise.) The magnetization changes are all of the same sign, so you can take their logs. The distribution of the logs is skewed downward. There's a pretty good theory describing this process. (Google "Barkhausen ABBM".)
2) I've never heard of any negatively skewed income distribution on a national scale. Usually one keeps track of the width of the distribution of logarithms of incomes, which is related to but not the same as the skewness of the income distribution. These widths vary a lot between countries and over time, even in my lifetime. My personal opinion, which I believe is supported by a lot of data, is that most indices of health (life expectancy, etc.) tend to be higher in societies with narrow distributions. You could scrounge around among social science research for good data. (Some key words: "OECD", Saez, Gini index.)
These distributions are changed by many, many factors, including election results. Perhaps the most important historical event, however, was the development of agriculture, which allowed stored surpluses and large inequalities.
I've already been blabbing beyond my expertise, so I won't address the role of education. The OECD keeps good data on changing inequality over time in many countries, and you could try to see if that reflects changes in educational systems.
Mike W.
(published on 11/08/2013)