Not all CMIP5 models can be correct

Ed Hawkins has an interesting new paper which discusses why models use temperature anomalies and how the chosen normalisation period can affect the results. The main reason why anomalies are needed in the first place is that the models have a large spread in Global Surface Temperature (GST) values. The paper makes this clear.

It may be a surprise to some readers that an accurate simulation of global mean temperature is not necessarily an essential pre-requisite for accurate global temperature projections.

CMIP5 global surface temperatures taken from the paper. The coloured curves are meteorological reanalyses based on weather forecasts and represent the best estimate of ‘observed’ GST.

Calculated GST values can differ by up to 3C, which is nearly four times larger than all the observed global warming since 1850. Ed argues that this does not matter, since the trends are all rather similar, so that by calculating anomalies such offsets are subtracted out. The use of anomalies for sparse weather-station data is common practice, but the same argument does not obviously apply to models, which are typically run on a roughly 1-degree grid covering the surface. This all sounds a bit fishy, so I decided to look in more detail at where such large temperature differences occur, and whether this is not simply papering over other systematic problems with the models. I selected GISS-E2-R, which has a low climate sensitivity, and GFDL-CM3, which has a high sensitivity. The plot below shows the net temperature difference by 2100 between the ‘hot’-running GFDL-CM3 and the ‘cooler’ GISS-E2-R. The former has a temperature increase of 5.6C by 2100 and the latter of 3C.
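The offset argument can be checked with a toy calculation (made-up numbers, not actual CMIP5 output): two series that share a trend but differ by a constant 3C have identical anomalies once each is referenced to its own baseline.

```python
import numpy as np

# Two hypothetical model GST series with the same trend but a constant
# 3C offset (the kind of spread seen across CMIP5 absolute values).
years = np.arange(1961, 1991)
trend = 0.02 * (years - 1961)          # common warming trend, C per year
model_a = 13.5 + trend                 # "cool"-running model
model_b = 16.5 + trend                 # "hot"-running model, +3C offset

# Anomalies relative to each model's own 1961-1990 mean remove the offset.
anom_a = model_a - model_a.mean()
anom_b = model_b - model_b.mean()

print(np.max(np.abs(model_a - model_b)))   # absolute values differ by 3C
print(np.max(np.abs(anom_a - anom_b)))     # anomalies agree to rounding error
```

This is exactly why anomaly plots make very different models look alike: any constant bias, however large, vanishes by construction.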


Plot showing the difference in temperature between the two models, averaged over the decade 2090-2099.

The largest differences are in the Arctic, the Himalayas, the Andes and central Africa. This pattern remains consistent throughout the decadal projections since 1861, and by 2100 the Arctic discrepancy of over 5C has spread across Russia and into Europe and Canada.

We can also compare how the Arctic sea-ice cover evolves in both models in five steps: 1861, 2015, 2035, 2061 and 2099. This is shown below.


GISS-E2-R minimum summer ice coverage from 1861 to 2099


GFDL-CM3 minimum summer ice coverage from 1861 to 2099

Summer ice essentially disappears about 40 years earlier in GFDL-CM3 than in GISS-E2-R.

There seems to be a gentleman’s agreement among modelling groups not to criticise other members of the CMIP5 ensemble. I understand that huge amounts of effort are put into developing each GCM, but they simply can’t all be correct. Physics has always been based on developing theoretical models to describe nature. These models make predictions which can then be tested by experiment. If the results of these experiments disagree with the predictions, then either the model is updated to explain the new data, or else it is discarded. Why should climate science be different?

Surely we are already at the stage that we can distinguish between models based on measurements. Why is this not being done?

So let’s just compare global temperature anomalies directly.

Comparison of CMIP5 model temperature anomalies with Hadcrut4 (black points). The blue curve is the GISS-E2-R model. The red curve is a fit to Hadcrut4 extended to 2100.

Detailed comparison of Hadcrut4 and Models.

It would appear that the data are already pointing one way. They certainly favour the low-sensitivity model GISS-E2-R and essentially exclude the high-sensitivity GFDL-CM3. RCP8.5 represents more than a quadrupling of CO2 forcing by 2100 (about 2.2 effective doublings, so warming of roughly 2.2 × TCR), so TCR looks likely to be ~1.4C. A value of 2.5C is ruled out.
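The arithmetic behind that estimate can be sketched in a few lines. The 3.7 W/m2 per CO2 doubling is the standard figure; dividing the 8.5 W/m2 RCP8.5 forcing by it gives about 2.3 effective doublings, close to the 2.2 used above. The warming values are the model numbers quoted in the post.

```python
# Back-of-envelope TCR check.
F_2x = 3.7                    # W/m2 per CO2 doubling (standard value)
F_rcp85 = 8.5                 # W/m2 RCP8.5 forcing by 2100
doublings = F_rcp85 / F_2x    # ~2.3 effective doublings

tcr_giss = 3.0 / doublings    # implied TCR if warming by 2100 is 3C
tcr_gfdl = 5.6 / doublings    # implied TCR if warming by 2100 is 5.6C
print(round(doublings, 2), round(tcr_giss, 2), round(tcr_gfdl, 2))
```

So a 3C rise under RCP8.5 implies a TCR near 1.3-1.4C, while 5.6C implies roughly 2.4C.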

About Clive Best

PhD in High Energy Physics. Worked at CERN, Rutherford Lab, JET, JRC, OSVision.
This entry was posted in AGW, Climate Change, climate science, GCM, Science. Bookmark the permalink.

20 Responses to Not all CMIP5 models can be correct

  1. Ron Graf says:

    Clive, great post. Nice animations of arctic ice. You have a typo on fourth line “amkes” makes.

    “Surely we are already at the stage that we can distinguish between models based on measurements. Why is this not being done?”

    Three reasons: First is the same reason the sports fan will not leave the game, hoping his team will score three goals in a row even though there are only 10 seconds on the clock. In climate the clock never runs out. Second, climate science’s importance is primarily in the political arena, and any admission of failure can have high perceived political costs. Third, who decides what the criteria for failure are? How would it be decided who decides?

  2. I suggest you consider a comparison using RCP6 or similar, instead of using RCP8.5. The fossil fuel resources just aren’t sufficient for that maximum emissions case.

    • Clive Best says:

      I agree. RCP8.5 only reaches such a high radiative forcing (8.5 W/m2) because it assumes that the carbon cycle saturates and that an ever-increasing percentage of human emissions remains in the atmosphere essentially forever.

      Right now I only have the RCP8.5 model runs because they are the ones used to scare the public.

  3. DrO says:

    Once and for all, there is NO possibility whatsoever to produce ANY meaningful climate forecast, at least not in the IPCC et al sense. The mathematics is absolutely crystal clear on this and, as far as the maths are concerned … non-negotiable … at least until someone figures out how to “outsmart the cosmos” and comes up with a technique for dealing with aperiodic/bifurcating/singular/non-stationary etc systems.

    Even then, we just don’t have the data, and cannot possibly have the data for at least many decades to follow.

    … and then we must figure out how to incorporate a huge number of real world forces currently (“conveniently”) omitted from the IPCC et al models.

    Amongst other things:

    1) Ron Graf is absolutely correct, though I would make his second point, the first and foremost point. This is a political matter, not a scientific matter. No amount of science or math’s will help here.

    All the (little) data we have repeatedly crushes the models. Crucially, what little data we have is massively abused to arrive at “politically correct” results.

    … if you can’t even agree on temperature, you are in deep doo-doo.

    … when things get so desperate that you have to “cook” the data to this extent (they have been cooking the data for some time; just listen to Hansen) … it is really well and truly outside of science.

    2) Even more important compared to forecasting the “climate”, is the forecast of whether the change in climate is “good, bad, or otherwise” for mankind.

    This is absolutely categorically impossible (I compare a temperature forecast to a difficulty of “countable infinity”, while impact-on-man forecasts are of “uncountable infinity” difficulty).

    … the only experience we really have with this is the last 150 years or so, which, while there has been global warming, has also been the absolutely greatest period of increase in the standard of living for the planet/mankind in recorded history. If that’s what global warming brings, then “bring it”.

    3) Ensemble averaging is one of the most evil abuses going. I can model a submarine, and model an aeroplane; may I then “ensemble” those to imagine that I have modelled a “car” … I don’t think so. While there is one tiny portion of the modelling where a tiny amount of ensemble averaging might make sense, it is an entirely and wholly misappropriated method used to achieve politically correct ends, not science.

    The real (political) reason for ensemble averaging is to give the impression that there is a “stable” forecast, and to rely on the nonsense that “errors cancel” … bollocks: the entire point of aperiodic systems is to capture the fluctuations, not average them away.

    Moreover, by routinely showing plots of ensembles, one can trick the reader into imagining that the giant and erroneous variance in the models is a possible outcome for the planet … very sneaky indeed.

    ASIDE: this is a bit like the “iconic” pictures used by fanatics showing giant columns of what appears to be “white pollution” from “big fat towers”. In fact, almost all of those images are “cooling towers”, and the billowing exhaust is “water vapour” … NOT combustion by products.

    Not only are those images “lies”, but it is also perversely ironic that, while water vapour is 4-15 times more powerful a GHG, the IPCC et al “pooh-pooh” it to spin their CO2 story … baffling.

    4) The very clear and increasing reliance on “trends” or otherwise statistical methods (in my part of the universe referred to as maximum likelihood estimation) is the final death-knell for climate modelling, at least in the IPCC context. Here is a little test, looking, for example, at the temperature history etc on page 41 of my Note 1:

    a) “Fit” your models for the period 1700-1800, and see what they forecast for the 1800’s.

    b) “Fit” your models for the period 1750-1850, and see what they forecast for the next 100 years. This period has a nice “reversal” in it, so let’s see how the models handle that.

    c) “Fit” your models for the period 1800-1900, and see what they predict for the 1900s.

    … now repeat all this for various other choices, such as “fit” 1700-1900, and forecast, etc.

    CRUCIALLY, be sure NOT TO CHEAT, for example, by “manually” telling your models about volcanoes, solar variation, etc … i.e. all the things that forecasts can’t possibly know. Also, do not cheat by calibrating your “atmosphere’s” transparency to a “scrubbed” period (i.e. maximising incoming solar that would otherwise be reflected by aerosols etc) … and so on, and so on.
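The test being proposed is ordinary out-of-sample validation, and its generic failure mode can be sketched in a few lines. The “record” below is synthetic (an invented oscillation plus noise, standing in for real data): a cubic fitted to one century matches it closely yet goes badly wrong over the next.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "temperature record": a slow oscillation plus noise.
# Invented data, not a real series.
t = np.arange(1700, 2001)
x = t - 1750.0                      # centre the abscissa for a stable fit
y = 0.3 * np.sin(2 * np.pi * (t - 1700) / 120.0) \
    + 0.05 * rng.standard_normal(t.size)

# "Fit" a cubic to 1700-1800 only, then "forecast" 1800-1900.
fit_mask = t <= 1800
coeffs = np.polyfit(x[fit_mask], y[fit_mask], 3)
curve = np.polyval(coeffs, x)

out_mask = (t > 1800) & (t <= 1900)
in_err = np.sqrt(np.mean((curve[fit_mask] - y[fit_mask]) ** 2))
out_err = np.sqrt(np.mean((curve[out_mask] - y[out_mask]) ** 2))
print(in_err, out_err)              # out-of-sample error is far larger
```

The point is not the specific numbers but the shape of the result: any flexible curve can “look good” over the fitted range while its extrapolation diverges from a series that reverses.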

    Indeed, even in the charts shown above, if you remove just a few of the volcanoes (e.g. Krakatoa etc etc), the entire “ensemble” would be shifted way up (maybe 1.5 – 3 C, or more) and have little to do with the actual measured results.

    Crucially, while they include volcanoes in the “fitting” portion (otherwise the models would go ballistic already in the early 1900’s), they remove them completely from the “future” to massively exaggerate the “up tick”.

    … not to mention the dozens of other crucial dynamics missing from the models.

    5) Since the 1970’s, many billions have been spent on climate modelling just by the IPCC et al, on the currently dozens of IPCC et al modelling groups and their mountains of supercomputers. The results have not only not improved, they are ever more routinely contradicted by data as we slowly start to get at least the beginnings of proper data.

    This is exactly the result expected when modelling what we call fundamentally unstable phenomena. It makes no difference how many math geeks or supercomputers you throw at these types of problems … they defy predictability as a kind of raison d’être … as is woven into the fabric of the cosmos.

    It is very well documented, e.g. by the hundreds of scientists who have resigned from the IPCC in disgust, that the IPCC/UN et al insist on manipulating results/conclusions to promote their political agenda, regardless of, or in spite of, the (scientific) results.

    … I am way behind in my efforts to produce a couple of new monographs that explain “transport phenomena” and modelling basics. Those demonstrate in greater detail why climate modelling (in the IPCC context) is a “pipe-dream” or “religion/belief”, not science. I’ll try to finish those ASAP.

    • Clive Best says:


      I agree that no model can really calculate the details of climate. Much of the complex stuff, like clouds, aerosols and humidity, is handled by simple assumptions or simple parameterizations.

      1. Yes the political bandwagon is more important than the science. It would be a brave researcher who now argues that climate sensitivity is so small that we have nothing to worry about.

      2. This is another interesting point. Richard Tol has published evidence that 2C warming is of economic benefit to mankind. This is anathema to the green lobby and he gets pilloried as a result. Another trick is the use of temperature anomalies rather than surface temperatures. 2 degrees C sounds a lot in Manila, where it is already hot, but is essentially nothing in Antarctica, where average temperatures can be -40C. The models show much more warming where it is already very cold and much less where it is already hot.

      3. Agree. The ensemble average is meaningless. It is an attempt to pretend that different models are like different instruments making independent measurements. This is nonsense. All models are wrong, and the only way to improve them is to compare their results individually to measurements. It should be a competitive process and not an old boys’ network.

      4. Yes, volcanoes and the fine tuning of aerosols are the way models manage to ‘hindcast’. So the hiatus in warming will probably be explained away this way.

      5. I hope that the more extreme predictions will hit the buffer soon – within the next 5 to 10 years. Let’s hope that the satellite data eventually forces the IPCC to eat humble pie. The west is wasting billions of dollars a year in self-flagellation, while Asia burns all the cheap coal.

      Good luck with your book!

  4. Hans Erren says:

    Now, if we apply a 1.3C TCR to the SRES scenarios and RCP8.5, then it is Apocalypse No.

  5. DrO says:

    Not sure what Hans had in mind, but Fick’s Law forms a basic PDE/model for diffusion, and is sort of the equivalent of the “heat equation” for the “diffusion” (i.e. conduction) of heat, except applied to, for example, molecular diffusion (it also applies to sugar dissolving in coffee, etc). Each of the three basic elements of “transport phenomena” (heat, mass and momentum balances/flows) has its equivalent of this basic type of “diffusion”. Indeed, this generalises to many other types of “diffusion”, such as entropy, and also the “diffusion of information” (and thus uncertainty) as used in the Black-Scholes-Merton option pricing models etc.

    • DrO says:

      Sorry about the “placing” of this comment, my original “reply” Re “Fick’s Law” should have been part of the thread above.

      To clarify the terminology in general use for the modelling of transport phenomenon, all such models tend to be based on conservation principles (conservation of energy and mass). In all such cases the essential expression takes the form:

      in – out = accumulation = d/dt of the contents

      just as the in/out of your bank account determines its “contents/balance”.

      More generally,

      in – out +/- sinks/sources = d/dt content

      where, usually, sinks/sources are things that “occur in the body” of the process.

      For example, holding a candle to the end of a steel bar will cause heat to flow in at one end, and out at various other points (say, just the other end if the bar is insulated). Thus the temperature profile is determined by the “heat/Laplace” equation, and takes the simpler form above.

      If the bar is made of radioactive uranium, then the candle is not the only source of heat. There is heat generated, say, uniformly, throughout the volume due to the radioactive process. In this case the second form of the balance is required.

      A very simple form of the model in one dimension might take the form

      d (a dT/dx)/dx + q = b dT/dt

      where q is the “sink/source”. In this uranium example, a source.

      Your bank balance also has sinks/sources, such as fees, interest paid on your deposit, etc.

      An equation essentially identical in form (but with the obvious difference in physical interpretation) would describe a cube of sugar dissolving in a cup of (still) coffee.
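The one-dimensional balance above can be sketched numerically with an explicit finite-difference scheme, for constant a and b and a uniform source q (the “radioactive bar” case; all numbers illustrative, with both ends held at zero):

```python
import numpy as np

# Explicit finite-difference sketch of  d(a dT/dx)/dx + q = b dT/dt
a, b, q = 1.0, 1.0, 2.0
L, n = 1.0, 51
x = np.linspace(0.0, L, n)
dx = x[1] - x[0]
dt = 0.4 * b * dx**2 / a            # within the explicit stability limit

T = np.zeros(n)
for _ in range(20000):               # march to steady state
    lap = (T[2:] - 2.0 * T[1:-1] + T[:-2]) / dx**2
    T[1:-1] += dt * (a * lap + q) / b
    # the fixed boundary values T[0] = T[-1] = 0 are never updated

# At steady state a T'' + q = 0 gives the parabola T(x) = q x (L - x) / 2a
exact = q * x * (L - x) / (2.0 * a)
print(np.max(np.abs(T - exact)))     # tiny discretization/iteration error
```

With the source term set to zero this reduces to the plain heat equation for the candle-and-bar case.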

      ASIDE: The note by Hans referring to “coffee filtering” does not seem correct. Diffusion/Fick’s Law is not about filtering. That is a different problem, unless we extend the definition of “filtering” to something very much wider.

      I have not followed Hans’s references, so I cannot be certain what exactly he is referring to. However, modelling the atmosphere requires, amongst very many other things, accounting for water vapour, CO2, etc. Those molecules are transported by convection and diffusion.

      However, there are also (many) sinks/sources. For just one example, phytoplankton consume CO2 during photosynthesis, reducing the CO2 content of the ocean. Then, the concentration gradient might cause atmospheric CO2 to be absorbed by the ocean … thus creating a “sink” for the atmospheric CO2. When the phytoplankton die, they sink to the bottom, and thus “sequester” some carbon … though in reality, this is a much more complicated “boundary” issue in a multiphase environment.

      Similarly, water vapour condensing as rain would be a sink, etc.

      • Clive Best says:

        I have never really properly understood the Bern model. It proposes 3 or 4 sinks acting independently with different time constants. Yet it seems to say that each term has a different amplitude, so that the fastest sink sucks its CO2 molecules out and then simply stops; then the second does the same, and then the third, etc. How can that be? How can each sink divide up the pool of available CO2 molecules? Why isn’t there a single time constant for the concentration decay?

        • re Multiple time constants

          I’ve written about it in the link above but also google Christian Beck and the concept of “superstatistics”. It’s essentially placing a spread on a measure, such as via Maximum Entropy.

          I always assumed the BERN model is simply approximating either this spread or, of course, the diffusional exp(-x^2/(Dt)) fat tail.

          Note that the time is in the denominator, not the numerator as it would be for first-order decay. First-order decay does not occur for diffusional sequestration.

          So the BERN model is really an empirical approximation of a full diffusional slab calculation. You see the same time profile for doping in a semiconductor wafer; that is a form of diffusional sequestration, as you are trying to incorporate dopants into the bulk. One could use the equivalent of a BERN profile there, but engineers aren’t going to use an approximation when the actual solution of the diffusion equation works better.
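Clive’s question about a single time constant can be illustrated numerically. The Bern-style coefficients below are illustrative round numbers in the spirit of published fits, not the official values; the point is only that a constant-plus-exponentials response has a fat tail that no single first-order decay can mimic, while a diffusive response has one naturally.

```python
import numpy as np

t = np.linspace(1.0, 500.0, 500)          # years after a CO2 pulse

# Bern-style impulse response: a constant plus a sum of exponentials
# (illustrative coefficients, not the official Bern values)
bern = (0.22 + 0.26 * np.exp(-t / 173.0)
             + 0.34 * np.exp(-t / 19.0)
             + 0.18 * np.exp(-t / 1.2))

# A single first-order decay with one time constant
single = np.exp(-t / 50.0)

# Diffusive uptake into a semi-infinite slab decays like t**-0.5
diffusive = 1.0 / np.sqrt(1.0 + t / 10.0)

# After 500 years the single exponential is essentially zero, while the
# Bern and diffusive responses both retain a substantial airborne fraction.
print(bern[-1], single[-1], diffusive[-1])
```

On this reading the several amplitudes are not separate pools fighting over molecules; they are just the terms of an exponential-series fit to a slow, diffusion-like tail.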

          This is a preprint of a paper I wrote with some colleagues:

        • DrO says:

          I am not an expert on the Bern Model (Bern Cycle), though there are various discussions of its origins and properties (such as here and here).

          From my perspective, it is an attempt to (way) oversimplify the carbon cycle, and in some cases to rely solely on that to arrive at a global average temperature via the (as far as I am concerned) completely ridiculous ln(CO2/CO2orig)-type approach.

          Omitting spatial dynamics (and a vast number of other matters) massively reduces the complexity of the equations (indeed Bern has no PDEs etc), allows you to “pop out” some numbers “more easily”, and reduces the “proper and complex modelling” of sinks/sources to oversimplified elements … but you get what you pay for, or as I like to quote:

          “For every complicated problem, there is always a simple solution, but usually it’s wrong.”

          Moreover, it is really just a variation on a form of MLE approach with an attempt to provide “physical connections” to the regression coefficients (as the many blogs, and you, are trying to come to terms with).

          … in short, it’s nonsense, since, as always, one can fit any time series to almost anything they like … and end up with apparently “good looking”, but meaningless “models”. That is, over some known data range, MLE et al can fit a vast number of different “models”, where each “look good” over the fitted range … but so what.

          A necessary (albeit not sufficient) first test is to do the “fit”, and then test the model on data that was not part of the fit (a bit like my comments above about applying the CMIPs to the 1700’s etc).

          … what a surprise, you don’t see any of that in the IPCC et al. Why do you suppose even basic necessary (partial) verification of that sort is omitted?

          The second equation there illustrates this beautifully. The Bern model, as derived there, results in nothing more than a generalised nth-order power-series regression.

          If you start this “model” in “reverse”, i.e. just start with a pure nth-order geometric regression with no regard for “physics”, then clearly you have made no attempt to give any physical interpretation to the model/elements, and need not do so … if all you are interested in is a numerical approximation/fit. That is, it is not required to make “any physical sense” in that context.

          … crudely speaking, that is the entire point of statistics, a method to use when you don’t understand the “physics” (I’ll comment on stochastic process another time).

          So why not just start in “reverse” with, say, a Fourier series, or a Taylor series, or various such permutations, etc? You will be able to create a “fit”, each of which may “look good”, but I guarantee each of those will predict a very different future, almost surely … and crucially, none of them will be correct.

          Once you move to MLE methods, almost surely, you can throw conservation principles out the window, along with a vast number of other crucial dynamics.

          For example, how much CO2 is there in the “residual” or “error” term, and given the hoopla over just a few ppm, does that now also destroy the model in the IPCC et al context?

          Also, with traditional MLE methods, there is NO possibility to capture certain types of crucial dynamics. For example, phytoplankton (which produce 50-95% of the planet’s oxygen, and thus must have a comparable scale of impact on CO2) may need to be modelled by something along the lines of “predator-prey” models (the logistic equation being the simplest example). Those tend to demonstrate aperiodic (i.e. chaotic) dynamics, which are necessarily destroyed by the MLE processing. Aperiodic linkages of that sort may be exactly what connects slow and fast “CO2 sinks”.
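The claim that averaging destroys aperiodic dynamics is easy to demonstrate with the logistic map just mentioned (a toy sketch, not a climate model): individual chaotic trajectories keep fluctuating, while their ensemble mean settles to a nearly constant value that no single trajectory resembles.

```python
import numpy as np

# Logistic map x -> r x (1 - x) in its chaotic regime
r = 3.9
x = 0.5 + 0.001 * np.arange(100)        # 100 nearly identical initial states
traj = np.empty((200, 100))
for i in range(200):
    x = r * x * (1.0 - x)
    traj[i] = x

ensemble_mean = traj.mean(axis=1)
# A single trajectory swings wildly; after the members decorrelate,
# the ensemble mean barely moves.
print(traj[:, 0].std(), ensemble_mean[50:].std())
```

The ensemble mean looks “stable” precisely because the averaging has thrown the fluctuations away, which is the objection being made here.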

          Then, to add insult to injury, the Bern models appear to be used primarily for “pulse response” (or “perturbation”) analyses. This can be a useful thing sometimes, but from where I sit it is used in a highly abusive manner, since the actual predictive power of Bern (and the IPCC et al) is nonsense; using perturbations allows you to hide the failed predictions and pretend that something “useful” has come of it … a standard trick used throughout the IPCC et al assessments.

          … indeed, the entire “ensemble horror”, which is used in connection with perturbation analysis is a profound abuse of mathematics and modelling … and I guess they hope/rely on the notion that the readers (i.e. the voters) don’t understand stability analyses.

          I am not sure if that answers your questions, but I guess what I am saying is that some authors attempt to assign physical meaning to something that is better treated as a pure (say, MLE) numerical approximation … and you can drive yourself crazy trying to assign (meaningful) physical interpretations to those (and even crazier if you attempt to rely on predictions thereof :-).

  6. DrO says:

    In response to WebHubTelescope: I tried the link provided but get an “unable to connect” error.

  7. opluso says:

    Why do so many teams keep churning out inaccurate model projections? Follow the money.

  8. Pingback: Zeg Urgenda, er komt helemaal geen klimaatkatastrofe! | Klimaathype
