Saturday, August 18, 2012

The Statistics of Climate Change - A Case Analysis of Dr James Hansen's 2012 Paper

Climate change is such a hotly debated topic that almost everyone has an opinion on it. Even having no opinion ("I just don't know") is a valid position, as many believe that we still do not have enough data to swing the decision in either side's favour.

The phenomenally controversial head of NASA's Goddard Institute for Space Studies and noted Anthropogenic Global Warming proponent, Dr James Hansen, has published a paper called Perception of Climate Change in the scientific journal Proceedings of the National Academy of Sciences. In it he gives statistical data for temperature changes over the last six decades in the Northern Hemisphere. Using this data, Dr Hansen concludes that the Earth has been getting much warmer and that the evidence is incontrovertible. He further claims that extreme weather situations have become much more common than they used to be. The Economist has carried a story on this study, which is where I first read about it.





The data graph of the study given in The Economist






Description of the Graph:

  • The data plotted pertains to temperature readings for the months of June to August over six decades, from the 1950s to 2011, for the Northern Hemisphere.
  • The data plotted is relative data, not absolute data. This has been done to facilitate better comparisons.
  • The reference curve (given in Dark Brown) comprises the average temperature values from 1951 to 1980; this builds a base with which to compare the temperature variations in each of the six decades. 1951-80 has been taken as a reference because this time period is long enough to build a normalized data range.
  • The 0 marked on the X axis is the average temperature for the reference curve (1951-1980). The axis measures departures from that average in standard deviations, so the average itself sits at 0.
  • The data for each decade is Normally Distributed and hence the peak frequency value for each decade coincides with the Mean (Average), Median and Mode temperature of that decade. Further, the 68-95-99.7 rule holds: 68% of the temperature values for any decade fall within 1 standard deviation on either side of the Mean, 95% fall within 2 standard deviations and 99.7% fall within 3 standard deviations.
  • Dr Hansen describes an "Extreme Condition" as a temperature value which falls more than 3 standard deviations from the Mean of the reference period, i.e. outside the range that holds 99.7% of the reference values. This obviously has to be an absolute limit because uncomfortable or harmful weather is mostly an absolute figure in the short run. In the long run, life could perhaps adapt to permanent changes, but a few decades cannot be considered enough for adapting to these extreme conditions. These Extreme Conditions are perhaps the most important part of the data, from Dr Hansen's point of view.
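
The 68-95-99.7 rule quoted above can be checked with a few lines of Python. This is only a minimal sketch of the textbook Normal Distribution and uses no data from the paper; the function name share_within is my own label.

```python
import math

def share_within(k: float) -> float:
    """Share of a Normal Distribution lying within k standard deviations of its Mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")

# Everything outside 3 standard deviations, hot and cold sides combined.
beyond_three = 1 - share_within(3)
print(f"beyond 3 standard deviations: {beyond_three:.2%}")
```

The last figure comes to a little under 0.3%, split evenly between the hot and the cold tail, which is why values beyond 3 standard deviations are rare enough in the reference period to be called extreme.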
Conclusions from the Graph:
  • The data plotted clearly shows an increase in the average Summer temperature with each new decade. This can be observed from a right shift, along the X axis, of the Mean temperature value for each successive decade, as shown by the peak of the Normal Curve for that decade. The Mean moves from 0 for the Reference Curve to about 1 standard deviation for 2001-2011.
  • The data also shows a much broader range, and hence variation, of temperatures for each successive decade. This can be seen from the increasingly flattening Normal Distribution. As one moves from one decade to the next, the frequency for the temperature starts to spread out over a wider range, which shows up as a lower and flatter curve than the one for the previous decade. This means that since 1951 temperatures have been fluctuating at an ever increasing pace. This large variation translates into a less and less equable climate and less reliable weather, and could mean greater stress on crops and businesses and more discomfort for people, animals, plants and vegetation in general. The frequency at the Mean temperature drops from about 0.4 in the Reference Curve to about 0.3 in the 2001-2011 curve.
  • The cases of Extreme Conditions in weather, as defined by Dr Hansen, were about 0.3% or less for the Reference Curve, but for the 2001-2011 decade they were 6-8% or even more on the hot right side of the curve (though not so on the cold left side). This can be inferred from the rightmost part of the 2001-2011 Normal Curve, representing perhaps 1.5 standard deviations, which has crept into the zone of Extreme Heat. Hence 6-8% of the temperature values for the decade of 2001-11 fall into the category of Extreme Heat. Hot weather extremes are clearly on the rise as well.
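
As a rough cross-check of the 6-8% figure, here is a sketch that treats both curves as exact Normal Distributions and uses only the values I read off the graph by eye: a peak of about 0.4 at 0 for the Reference Curve, and a peak of about 0.3 at about +1 for 2001-2011. These inputs are my assumptions, not numbers taken from the paper, and the helper names are hypothetical.

```python
import math

def normal_cdf(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Share of a Normal Distribution lying at or below x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def sd_from_peak(peak: float) -> float:
    """A Normal Curve peaks at 1 / (sd * sqrt(2*pi)), so a lower peak means a wider curve."""
    return 1 / (peak * math.sqrt(2 * math.pi))

# Eyeballed from the graph (assumptions, not data from the paper).
REFERENCE_PEAK, REFERENCE_MEAN = 0.4, 0.0
RECENT_PEAK, RECENT_MEAN = 0.3, 1.0
EXTREME = 3.0  # threshold in reference standard deviations

reference_sd = sd_from_peak(REFERENCE_PEAK)
recent_sd = sd_from_peak(RECENT_PEAK)

reference_hot = 1 - normal_cdf(EXTREME, REFERENCE_MEAN, reference_sd)
recent_hot = 1 - normal_cdf(EXTREME, RECENT_MEAN, recent_sd)
recent_cold = normal_cdf(-EXTREME, RECENT_MEAN, recent_sd)

print(f"spread of the 2001-2011 curve: {recent_sd:.2f} reference standard deviations")
print(f"Extreme Heat, Reference Curve: {reference_hot:.2%}")
print(f"Extreme Heat, 2001-2011:       {recent_hot:.2%}")
print(f"Extreme Cold, 2001-2011:       {recent_cold:.2%}")
```

Under these eyeballed inputs the Extreme Heat share for 2001-2011 lands between 6% and 7%, while the cold side stays well under 1%. That agrees with the reading of the graph above, though it is only as good as the assumption that the curves are truly Normal.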



Positive Attributes of the Study:


  • The GISS data is all-encompassing. Data for the entire Earth has been collected. This is not a sample survey; it is a reading of the entire data universe (the entire Earth in this case). Climate is always a global phenomenon (as opposed to weather) and that is why climate change can only be talked about on a global scale. Rising temperatures in one or many parts of the globe will always be concomitant with other related phenomena in other parts of the globe. This universality of data is one of the two big advantages of this study. On the flip side, however, only the data for the Summer months in the Northern Hemisphere has actually been plotted in the Normal Curves (see the Doubts section below for more details).


  • The data plotted is actual recorded data, not future projected data. There are no modelling results used here, no assumptions to validate or disprove. This is all genuine, historic data, hence the readings, at least, are unassailable. This absolute independence from projected data is, to my mind, the other big advantage of this study.
  • This data analysis has been accepted by a leading scientific journal, Proceedings of the National Academy of Sciences, and this gives it some serious credibility, though it is no guarantee of the water tightness of the analysis.



Doubts over the Study

Statistics have to earn our trust. We can't give it away to them for free. They have to be able to withstand our scrutiny. And this applies even more to statistics we want to be true, the ones which prove something we hold dear. Or else, suddenly, we might find ourselves betrayed by our own hasty acceptance of data we want to believe in.
Data Doubts



  • Is a period of 60 years enough to provide data for climate change? What time span does the scientific community accept for talking about climate rather than weather, especially when discussing long temperature series? In fact, is there any consensus at all about the time span? What results do we get when we analyse the data for the last 100 years (provided we have reliable global temperature data going back that far)?


  • Why plot the data only for the Northern Hemisphere and not for the Southern Hemisphere? Perhaps one explanation is that this has been done to facilitate a true comparison. Summer in the Northern Hemisphere extends from June to August, when the Southern Hemisphere experiences Winter. Hence, in order to compare apples with apples, one needs to select the Summer months' data for each hemisphere, i.e. June-August data for each year for the Northern Hemisphere and December-February data for the Southern Hemisphere. The temperature data for the Summer months in the Southern Hemisphere should be plotted separately as a Normal Distribution for a truly global analysis. If there is indeed global warming taking place, then the data for the Southern Summer months will further support this conclusion because, again, climate is a global phenomenon.


Correlation and Causation

Although Dr Hansen has not made any claims in this scientific paper about the causes of this clear rise in global temperatures, he is a vociferous advocate of Anthropogenic Global Warming (read human-induced global warming) and will place this data analysis in that context.

I think the single most important caveat for anyone looking at statistics, be they weekly milk prices for one's household budget or climate change figures, is not to confuse correlation with causation. Two things moving in sympathy does not imply that one depends on the other. Recently Krishnamurthy V Subramanian of the Indian School of Business wrote a very lucid and meaningful article on the difference between correlation and causation. It can be accessed here.
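
To make the caveat concrete, here is a small sketch with two made-up series that have nothing to do with each other except that both drift upwards over time. Every number in it is invented for illustration; none of it is climate data.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def changes(series):
    """Period-to-period changes, which strips out most of a steady drift."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

rng = random.Random(0)  # fixed seed so the run is repeatable
periods = range(200)

# Two invented series: each is a steady upward drift plus its own independent noise.
series_a = [1.0 * t + rng.gauss(0, 5) for t in periods]
series_b = [2.0 * t + rng.gauss(0, 5) for t in periods]

r_levels = pearson(series_a, series_b)
r_changes = pearson(changes(series_a), changes(series_b))

print(f"correlation of the raw series:       {r_levels:.2f}")
print(f"correlation of the period-to-period changes: {r_changes:.2f}")
```

The raw series correlate almost perfectly simply because both rise with time, while their period-to-period changes show little or no relationship. A strong correlation between two rising quantities is therefore a starting point for questions, not an answer.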

I believe that climate change is being brought about by human actions. I don't know this, but I believe it. I am constantly trying to find ways to reduce my carbon footprint. However, I do not wish to be slotted into either of the two opposing camps.

Instead of going into why I believe this, I would like to dwell on some questions that come to my mind on the basis of reasoning alone. The answers to these questions will need more data and analysis, but once answered they will perhaps help convert my belief into knowledge, and will certainly aid policy makers and businesses in coming to terms with the reality of our economic activities.

These questions directly address the correlation and causation problem for this study on climate change. Answering them could help convert any correlation between greenhouse emissions and climate change into a causal relationship, and even establish the direction of that relationship.

  • How are we certain that human action is causing climate change? Evidence seems to suggest that for all of the previous climate changes in the 4.5-odd-billion-year history of our planet, humans have not even been around to witness them, let alone influence them.
  • Other causes need to be eliminated (though I personally do not give much credence to most of them), such as increased Solar activity, fundamental geo-changes, inner core dynamics, et cetera.
  • Perhaps the time that we are living in is the inflection point of a very long climate cycle, millions of years long, and hence this sudden acceleration might be a regular thing before such a cycle enters its next phase.
  • Could it be that in the absence of greenhouse gas emissions temperatures would drop rapidly? Is human activity somehow forestalling the next Ice Age, and is this delaying of the Ice Age somehow better (but at what cost)? Perhaps there are some human activities which are masking the effect of excess CO2 (such as smoke emission, which cools, as opposed to CO2 emission, which warms).


The data above clearly points to rising temperatures in the Northern Hemisphere since the 1950s. Further, heat extremes are also rising, fast. But I do need to clear my reservations, as given above in the Data Doubts section, before I can wholeheartedly embrace the analysis. Perhaps within the next few years even more conclusive and exhaustive statistical studies on climate change will be carried out. I eagerly await that time.


To know more about:

The Economist's story on the study published by Dr Hansen

Climate change Pro Anthropogenic Global Warming
New Scientist
NASA on climate change
Earth Observatory's global temperature data going back 2500 years
And this page which gives human greenhouse gas emission data alongside

Anti-Anthropogenic Global Warming
Telegraph
Skeptical Science
Wikipedia


Normal Distribution in Statistics
Cliff Notes
Stat Trek


Meanwhile, here in Delhi.......it's getting hotter and drier and more erratic........please ignore, as this is anecdotal evidence and does not make for good statistics ;)

3 comments:

  1. Normal Climate signal is calculated from 30 years of data. Currently all scientists use a 30 yr cycle to assess any climate change, but yes, any longer cycle will surely help to understand whether the climate change signal has some anthropogenic cause or not.

    Since NH has more measurements as compared to SH (due to more land mass), that's why most of the studies which utilize observed data are from NH.

    Since the study has a global scale and is plotted for the entire NH, there will always be variation at regional scale.

    In the end, I as a climate modeller don't reply only on statistics, since climate is a dynamic phenomenon and pure statistical analysis is not robust.

    Replies
    1. Read reply as rely

    2. Saurabh, your comments are very informative and, given your position in one of the leading climate research institutions of the world, also highly authoritative. Thanks for writing in.

      So compared to 30 years which is usually taken for climate analysis, 60 years is a very good time scale. If I remember correctly the 1930s-40s are considered much hotter than usual. If this is indeed the case, then the rise of temperature post 1970s would not be a sharp departure from the 100 year trend.


      I had not thought about NH simply having more observation posts than SH. That is also a reasonable explanation.

      Thank you for shedding light on some of the questions. Let's hope more robust studies on climate change come out soon.
