Marotzke & Forster Revisited

Marotzke & Forster (2015) found that 60 year trends in global surface temperatures are dominated by underlying climate physics. However, the data show that climate models overestimate such 60 year decadal trends after 1940.

Comparison of 60y trends in observations and models (see text for details).


The recent paper in Nature by Jochem Marotzke & Piers Forster, ‘Forcing, feedback and internal variability in global temperature trends’, has gained much attention because it claims that climate models are just fine and do not overestimate warming despite the observed 17 year hiatus since 1998. They attempt to show this by demonstrating that 15y trends in the Hadcrut4 data can be expected in CMIP5 models through quasi-random internal variability, whereas any 60y trends are deterministic (anthropogenic). They identify the ‘deterministic’ and ‘internal variability’ components in the models through a multiple regression analysis with the models’ known forcings as input.

\Delta{T} = \frac{\Delta{F}}{(\alpha + \kappa)} + \epsilon

where \Delta{F} is the forcing, \alpha is the climate feedback parameter, \kappa is the ocean heat uptake efficiency and \epsilon is random variation.
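As a rough numerical illustration of the relation above, the following Python sketch evaluates only the forced part of a temperature change and ignores the random term. The function name and the sample values of the forcing, \alpha and \kappa are illustrative assumptions, not numbers taken from M&F.

```python
def forced_trend(delta_f, alpha, kappa):
    """Forced temperature change (C) from dT = dF / (alpha + kappa).

    The random term epsilon is deliberately left out.
    delta_f : forcing change in W/m^2
    alpha   : climate feedback parameter in W/m^2/C
    kappa   : ocean heat uptake term in W/m^2/C
    """
    if alpha + kappa <= 0:
        raise ValueError("alpha + kappa must be positive")
    return delta_f / (alpha + kappa)


# Assumed, purely illustrative inputs: for the same forcing, a smaller
# feedback parameter alpha gives a larger forced response.
for alpha in (0.6, 1.2, 1.8):
    print(alpha, round(forced_trend(1.0, alpha, 0.7), 3))
```

The point of the sketch is only that, in this relation, the forced response scales inversely with \alpha + \kappa, which is why a comparison of forced trends with observations bears on the feedback strength.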

This procedure was criticised by Nic Lewis and generated an endless discussion on Climate Audit and Climate-Lab about whether it made statistical sense. However, for the most part I think this is irrelevant, as it is an analysis of the differences between models and not of observational data.

Firstly, the assumption that all internal variability is quasi-random is likely wrong. In fact there is clear evidence of a 60y oscillation in the GMST data, probably related to the AMO/PDO – see realclimate. In this sense all models are likely wrong because they fail to include this non-random variation. Secondly, as I will show below, the observed 15y trends in Hadcrut4 are themselves not quasi-random. Thirdly, I demonstrate that the observed 60y trends after 1945 are poorly described by the models and that by 1954 essentially all of the models predict higher trends than those observed. This means that the ‘deterministic’ component of all CMIP5 models does indeed overestimate the GMST response to increasing greenhouse gas concentrations.

Evidence of regular climate oscillations

Hadcrut4 anomaly data compared to a fit with a 60y oscillation and an underlying logarithmic anthropogenic term.


Figure 1 shows that the surface data can be well described by a formula (described here) that includes both a net CO2 forcing term and a 60y oscillation, as follows:

DT(t) = -0.3 + 2.5\ln{\frac{CO2(t)}{290.0}} + 0.14\sin(0.105(t-1860))-0.003 \sin(0.57(t-1867))-0.02\sin(0.68(t-1879))
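The fitted formula can be evaluated directly. The sketch below is a plain transcription of the equation above into Python. The function name is illustrative, and the CO2 concentration (in ppm) has to be supplied by the caller because the CO2 series itself is not given here.

```python
import math


def fitted_anomaly(year, co2_ppm):
    """Temperature anomaly (C) from the fitted formula in the text.

    year    : calendar year (may be fractional)
    co2_ppm : CO2 concentration for that year, supplied by the caller
    """
    co2_term = 2.5 * math.log(co2_ppm / 290.0)
    oscillation = (0.14 * math.sin(0.105 * (year - 1860))
                   - 0.003 * math.sin(0.57 * (year - 1867))
                   - 0.02 * math.sin(0.68 * (year - 1879)))
    return -0.3 + co2_term + oscillation


# Period in years of the dominant sine term (angular frequency 0.105 per year).
main_period = 2.0 * math.pi / 0.105
```

With an angular frequency of 0.105 per year, the dominant sine term has a period of just under 60 years, and its amplitude of 0.14 C gives a peak-to-trough swing of 0.28 C.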

The physical justification for such a 0.2C oscillation is the observed PDO/AMO which, just like ENSO, can affect global surface temperatures, but over a longer period. No models currently include any such regular natural oscillations. Instead the albedo effects of aerosols and volcanoes have been tuned to agree with past GMST and follow its undulations. Many others have noted this oscillation in GMST, and even Michael Mann is now proposing that a downturn in the PDO/AMO is responsible for the hiatus.

15y and 60y trends in observations and models

I have repeated the analysis described in M&F. I use linear regression fits over periods of 15y and 60y to the Hadcrut4 data and also to the fitted equation described above. In addition I have downloaded 42 CMIP5 model simulations of monthly surface temperature data from 1860 to 2014, calculated the monthly anomalies and then averaged them over each year. Then for each CMIP5 simulation I calculated the 15y and 60y trends for increasing start year, as described in M&F.
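The sliding-window trend calculation can be sketched as follows. This is a minimal Python/NumPy illustration of the idea, not the code actually used for the figures; the function name and the choice to return slopes in C per year are assumptions.

```python
import numpy as np


def rolling_trends(years, anomalies, window):
    """Least-squares linear trend for every start year.

    Returns (start_years, slopes), where slopes[i] is the slope in C per
    year of a straight-line fit to `window` consecutive annual values
    beginning at start_years[i].
    """
    years = np.asarray(years, dtype=float)
    anomalies = np.asarray(anomalies, dtype=float)
    starts, slopes = [], []
    for i in range(len(years) - window + 1):
        x = years[i:i + window]
        y = anomalies[i:i + window]
        # Centre the years so the fit is numerically well conditioned.
        slope = np.polyfit(x - x.mean(), y, 1)[0]
        starts.append(int(years[i]))
        slopes.append(slope)
    return np.array(starts), np.array(slopes)
```

Calling it with `window=15` and `window=60` on the same annual series gives the two sets of trends compared in the text. Multiply the slopes by 10 to express them per decade.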

Figure 2 shows the calculated 15y trends in the H4 dataset compared to trends from the fit. For comparison we first show Fig 2a taken from M&F below.


Fig 2a: 15y trends from M&F compared to model regressions. Error bars for random internal variation are ±0.26 C, which dominate the ‘deterministic’ (AGW) error spread between models of ±0.11 C.

M&F’s regression analysis then goes on to show that the deterministic effects in the CMIP5 models should dominate for longer 60y trends. In particular the error on the 60y trends across models is ±0.081 C, which is 30% lower than random variation. Therefore the acid test of the models comes when comparing 60y model trends to the observations, because now statistical variation is much smaller. These are my results below.


a) 15y trends derived from Hadcrut4 data and the fit described above. Note how the trends are not random but also follow a regular variation in phase with the fit.
b) 60y trends in Hadcrut4 data (black circles) compared with the fit (blue line) and an ensemble of CMIP5 model calculations. The red curve is the average of all CMIP5 models.

This analysis shows two effects which were unreported by M&F. Firstly, the 15y variation in trends of the observed data is not random but shows a periodic shape, as is also reproduced by the fit. This is characteristic of an underlying natural climate oscillation. The quasi-random natural variation in the CMIP5 models, as shown in Fig 2a above, encompasses the overall magnitude of the variation but not its structure.

Secondly, the 60y trends also show a much smaller but still residual structure reflecting the underlying oscillation shown in blue. The spread in the 42 models is of course due to their different effective radiative forcings and feedbacks. The fact that before 1920 all model trends can track the observed trends is partly due to parametric tuning of aerosols to agree with hindcast temperatures. After 1925 the observed trend begins to fall beneath the average of CMIP5, so that by 1947 the observations lie below all 42 model trends in the CMIP5 ensemble. This increase in model trends above the observed 60y trend cannot now be explained by natural variation, since M&F argue that the deterministic component must dominate. The models must be too sensitive to net greenhouse forcing. However, M&F dismiss this fact simply because they can’t determine what component within the models causes the trend. In fact the conclusion of the paper is based on analysing model data and not the observational data. It is bizarre. They conclude their paper as follows:

There is scientific, political and public debate regarding the question of whether the GMST difference between simulations and observations during the hiatus period might be a sign of an equilibrium model response to a given radiative forcing that is systematically too strong, or, equivalently, of a simulated climate feedback α that is systematically too small (equation (2)). By contrast, we find no substantive physical or statistical connection between simulated climate feedback and simulated GMST trends over the hiatus or any other period, for either 15- or 62-year trends (Figs 2 and 3 and Extended Data Fig. 4). The role of simulated climate feedback in explaining the difference between simulations and observations is hence minor or even negligible. By implication, the comparison of simulated and observed GMST trends does not permit inference about which magnitude of simulated climate feedback—ranging from 0.6 to 1.8 W m⁻² °C⁻¹ in the CMIP5 ensemble—better fits the observations. Because observed GMST trends do not allow us to distinguish between simulated climate feedbacks that vary by a factor of three, the claim that climate models systematically overestimate the GMST response to radiative forcing from increasing greenhouse gas concentrations seems to be unfounded.

It almost seems like they have reached the conclusion that they intended to reach all along – namely that the models are fit for purpose and the hiatus is a statistical fluke, not unexpected in 15y trend data. This way they can save the conclusions of AR5, but only by ignoring the evidence that the observational data support the AMO/PDO oscillation and moderate global warming.

Physics has always been based on developing theoretical models to describe nature. These models make predictions which can then be tested by experiment. If the results of these experiments disagree with the predictions then either the model can be updated to explain the new data or else discarded. What one can’t do is to discard the experimental data because the models can’t distinguish why they disagree with the data.

My conclusion is that the 60y trend data show strong evidence that CMIP5 models do indeed overestimate global warming from increased greenhouse gases. The discrepancy between climate projections and observations will only get worse as the hiatus continues for probably another 10 years. The current 60y decadal trend is in fact only slightly larger than that in 1900. Once the oscillation reverses around 2030 warming will resume, but climate sensitivity is still much less than most models predict.


28 Responses to Marotzke & Forster Revisited

  1. DrO says:

    Dear Clive

    Nice to see some proper science entering the discussion.

    I offer some comments on your assessment, but I have also a separate matter for your consideration, which I leave to the end.

    1) It is extremely important, as you have done, to EMPHASISE that the IPCC et al tend to compare “models to models”, and very little comparison to reality.

    This is a standard policy at the IPCC. Whenever reality disagrees with their beliefs/models, they blame the data or claim their models are proven by other models.

    In fact, almost all physical evidence patently contradicts the IPCC et al model assumptions, and MASSIVELY.

    This is not simply just the “pause”, but many other crucial data, such as the comparison of satellite and ground based data. Amongst other things, the IPCC et al models are as much as 20W/m2 WRONG about forcing, and they use the WRONG SIGN.

    The list of real data contradicting/crushing the IPCC et al is very long.

    Proper scientists, when faced with contradictory data, “go back to the drawing board”. Instead the IPCC et al, every time they are contradicted by reality, become EVEN MORE CERTAIN they are right … too weird.

    2) It is highly OBJECTIONABLE to use the word “deterministic” in the context of any statistical methods whatsoever. I believe this to be yet another “spin” by the IPCC et al.

    If it is deterministic, then TAUTOLOGICALLY, it CANNOT have any RANDOM or statistical element. This has a profoundly DIFFERENT meaning.

    If what they mean to say is that they have “de-trended” the series, and they have extracted a “trend” from the “statistical” component, then that is what they should say. THAT IS NOT DETERMINISTIC.

    In fact, the climate is deterministic, but to call these statistical methods deterministic is a very great dishonesty/spin intended to give the impression of “reliability”, where none exists (at least not in the IPCC et al context).

    This may seem picayune to some, but it is yet another of many examples of the destruction of science and mathematics for the sake of ideology.

    3) Why would you use linear regression to assess for periodicities? That is too weird, and I am afraid misguided.

    While it is possible to make many guesses or automate that process, why not just apply a Fourier Transform and obtain the Power Spectrum, or use other methods that are “periodicity specific”? That is the preferred method when looking for periodicities.

    This has two immediate advantages:

    a) You will immediately see if there are significant periodicities (and there are some out to 60 yrs), without having to guess (at least with the data I have).

    b) You will also see that the Power Spectrum is “continuous”. This is a CLEAR sign that the process is APERIODIC. This has IMMEDIATE and massive (catastrophic) implications for any type of statistical analysis and “fitting” of any sort.

    Anybody who does not understand why that must be so will need to do a little homework. I may have a solution soon, see my last comment.
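A minimal sketch of the power-spectrum check suggested in point 3, assuming an evenly sampled annual series with no gaps. The function names are illustrative, and only the series mean is removed here (no detrending or windowing), so any long-term trend will leak into the lowest frequencies.

```python
import numpy as np


def power_spectrum(values, dt=1.0):
    """Raw periodogram of an evenly sampled series.

    values : samples spaced dt apart (dt in years for annual data)
    Returns (frequencies, power) with the zero-frequency bin dropped.
    """
    values = np.asarray(values, dtype=float)
    values = values - values.mean()           # remove the mean only
    power = np.abs(np.fft.rfft(values)) ** 2
    freqs = np.fft.rfftfreq(len(values), d=dt)
    return freqs[1:], power[1:]


def dominant_period(values, dt=1.0):
    """Period (in units of dt) of the strongest spectral peak."""
    freqs, power = power_spectrum(values, dt)
    return 1.0 / freqs[np.argmax(power)]
```

Note that the frequency resolution is 1/(N·dt), so with roughly 150 annual values a 60 year period spans only about 2.5 cycles and falls between coarse frequency bins.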

    4) In your first figure, and then again later, you compare the models to HadCrut4. This is a highly dubious choice.

    It is well known that many ground based temperature databases have a near continuum of so-called “administrative adjustments”. That is, the keepers of the databases “fiddle” the numbers for a variety of “questionable” reasons.

    What a surprise, the net effect of all of those gross “administrative adjustments” is to “increase” the global average temperature and its trend.

    For example, some databases have seen a 0.4C or more “up shift” compared to the start of the 20th century.

    This is a massive RELIABILITY problem for ground based data. The entire warming of the planet for the 20th century is 0.7C, and these so-called “administrative adjustments” amount to in excess of a 60% alteration.

    Thus, and especially for the period since the 1970s, satellite data is preferred.

    If you use unmolested satellite data, your charts will show a very much more significant disagreement between the models and reality.

    5) While there seems to be strong evidence of decadal and multi-decadal climate cycles, it is only in the last few decades that humans have started to assess and MEASURE those effects.

    So, how many cycles would you need to observe in the real world before any meaningful scientific statement could be made? My view is, at minimum, 3 – 5 cycles.

    Thus, to make meaningful statements about deep ocean cycles, and the like, 3 – 5 times 30 – 60 years means that humans need AT LEAST another 60-90+ YEARS before it is even possible to have the critical mass minimum data.

    As such, and particularly in the IPCC et al “let’s turn the world on its head, on a whim” context, it is NOT possible to make any meaningful statement about these crucial factors impacting the climate.

    Finally, on a separate but related matter, as you may recall I had promised a couple of Notes explaining a number of matters, particularly in relation to the CRUCIAL misapplication and misunderstanding of Non-Linear Dynamics (NLD), and thus in relation to aperiodic systems, chaos, and deterministic systems, plus other material on the “meaning” of satellite data and massive abuses by the IPCC et al, all leading to a thorough destruction of any notion of climate predictability in the IPCC et al context.

    In the event, what started out as a few 30 page Notes, has now exploded into several hundred pages, and there is maybe another 100 to go. This “scale” became necessary to provide sufficient material (and also in a sufficiently pedestrian language), to show with sufficient facts/precision what is ACTUALLY happening, and to do so with pretty much everything available in a single document.

    I would be grateful for a small number of individuals to act as “referees”.

    At the moment, I am relying on some material from various commercial sources, and it is not clear if I will be permitted to make the final product available without some “licensing” agreement. Thus, at least for the moment, I am obliged to distribute drafts in an encrypted PDF format.

    Please let me know if you would like to be a “referee”. If so, I will need an email address for you, which I won’t distribute or abuse in any way. Email me at droli (at) thebajors (dot) com.



    • Clive Best says:


      Your points

      1) I agree. There is now so much IPCC credibility invested in the models that they are quite prepared to rubbish the data rather than admit they may have made a mistake. There are literally trillions of dollars now committed to those model predictions being right. The next few years are going to be messy.

      2) Deterministic was the word used by M&F, implying that AGW emerges like a Phoenix above random natural variation.

      3) They use linear trends because the IPCC avoids addressing the possibility of any periodic variations, since the dogma is that of purely quasi-random signals, and these are suppressed over time periods above 60y.

      4) I use Hadcrut4 because it is the IPCC official global temperature time series. There isn’t much difference between any of the ground measurements. What GCN have been up to adjusting the past etc, or the Urban warming effect are not show stoppers. We have to use the same data to keep credibility.

      5) There are the Bond cycles during this interglacial and D-H events at ~1000y intervals during the last glaciation. These prove that there are natural sources of climate variation. The much larger 100,000y cycle of ice ages also proves that.

      I am happy to take a look at your NLD paper, although I don’t claim to be an expert! You can send it to me at

      clive (dot) best (at) gmail (dot) com



      • DrO says:

        Dear Clive

        Cheers for that. Just a couple of remarks for now:

        1) Re Hadcrut4 and land-based data: consistency is fine. Though it may not hurt to include ALSO the satellite data to show the discrepancy. Over the past 100 years or so the “administrative adjustments” (which are not simply for “urban heating”) now come to something like +0.68C … which is as much as the entire planet’s warming for that period … a very serious “reliability” issue for land-based data.

        2) I had forgotten to mention earlier that, as with ALL curve fitting, even if it is meaningful (and that has not been established), it applies ONLY for the period covered by the DATA. That is, the “trig fit” used above MAY NOT be used for any forecasting, unless it can be proven that the same periodicities and characteristics hold for the future … which clearly they do not. For example, what would the “trig fit” be for the period 1700 – 1800, and then “predict with that”?

  2. Ron Graf says:

    Clive, Thank you for doing this.

    I have a few questions:

    1) The paper relied on values diagnosed from plotting model simulations that were of pre-industrial air and sea and subjected to an abrupt 4XCO2 concentration. BTW, they do not mention any of this in the paper. One must read the footnoted paper. M&F say they utilized the CMIP5 archive (technically borrowing their own 2013 contribution I guess). They first diagnosed dF, alpha and kappa by plotting dT against time. They then plug alpha and kappa into historical runs of the same models with historical CO2 and take the average dF they arrive at to balance each year against dT. There was a lot of discussion about whether this biased the regression toward high variability since there was no e term in the diagnosis (it was assumed to average out). The e was only added for the regression. Do you feel the method was statistically sound, notwithstanding that kappa and alpha are not likely linear and other problems? Also, I agree that running OLS on model ensembles tells us little about the real world.

    2) Your plot of the models leaves Hadcrut4 for good at 1925 and M&F’s does not leave until 1985. Which of these factors do you think is most causing the discrepancy: that you started at 1860 and M&F started at 1900, that you used 42 models and they 36, or that your choice of statistical filter provided less distortion (smoothing)?

    3) Do you believe the paper has any scientific value?

    4) If no, was the paper motivated simply to have a pretext to make an international press release that the models had been “validated?”

    5) Do you believe there is any accuracy in the first line of the press release: “Sceptics who still doubt anthropogenic climate change have now been stripped of one of their last-ditch arguments.”?

    6) Do you find it deceptive that Marotzke in his interview implies that all the models were used, and that they repeatedly mention all 114 available models when they really meant model runs, not all 112 CMIP5 models (56 couples)?

    7) Whereas the models were programmed with artificial oscillations to follow a random ENSO cycle, and were also programmed to converge in the late 1990s-2000, do you find it deceptive that most of the runs cited and half the paper is devoted to “validating” the models by the fact that they were not the furthest off course by 2012 than they had ever been? (By the time the paper was published in Jan 2015 and Marotzke giving interviews the models certainly were off course more than ever.)

    8) Should a call be made for the paper to be withdrawn?

    9) Can you account for how the PDO/AMO signal you plot was overlooked before now?

    10) Do you plan to submit a paper of your own for peer review?

    11) Should CMIP5 be revised to incorporate AMO/PDO or do the IPCC models provide no scientific benefit anyway?

    • Clive Best says:

      I am travelling today so I have to be quick.

      This paper basically set out to prove the following based on their model studies:
      1) That the 17 year hiatus has no statistical significance because climate models show that random natural variations (epsilon) of the same magnitude occur over 15 year periods. Random fluctuation errors can be larger than the AGW signal.
      2) The AGW signal emerges from the noise over 60(62) year trends. The models are compatible with observed trends.

      I think they failed to prove the first point since the observations do not show a random scatter in temperature. Instead they favour an oscillation. I have shown that the second point is also untrue, since 60y model trends are all higher than observations for start dates later than 1948.

      Now your points.

      Point 1) I had not realised that was how they had derived F, alpha! The problem is that Nic Lewis and others on climateaudit have failed to give the knockout blow on the statistical arguments. They can only do this if they get their hands on the exact data used by M&F.

      My questions are anyway why bother:
      – If they wanted to measure what epsilon is, why not run the models with fixed CO2 levels for 100 years and measure epsilon directly!
      – If they want to measure what alpha & kappa are, why not switch off random variations in the models and simply measure DT and TOA forcing!

      Point 2) I am using a different set of model runs I guess. I am using the 42 monthly model data results made available a few months ago by Willis Eschenbach. He originally downloaded them from the CMIP5 site which requires access control. Who knows how M&F selected their model runs?

      Point 3) Not much. It has far more political value.

      Point 4) Yes

      Point 5) No

      Point 6) I didn’t see his interview. Strong statements are not justified by their paper. How did they select which models and which runs were used ?

      Point 7) The model hindcasting is done backwards from 2010 I think, so it is hardly surprising they are consistent around 2012. However, they are still above H4 data by then.

      Point 8) Who has the stature to do that? Steve McIntyre, Nic Lewis, ….

      Point 9) There is barely a mention in AR5 about PDO/AMO. However it is clear that it was in the back of everyone’s mind. It does rather nicely explain the hiatus. Even Michael Mann seems to be getting on the band-wagon.

      Point 10) I would be prepared to help write a joint paper with others. If I were to write one with just myself as author it would very likely be rejected by the editor of Nature. I am tired of expending huge effort to get a paper together only for it to be rejected with some remark like ‘we feel it is not of sufficient wide interest for Nature’.

      Point 11) The trouble is they do not know how to model PDO/AMO or even ENSO. I think it may be still based on random number generation.


  3. Ron Graf says:

    I also now remember the two best points I read in critique of M&F’s mathematics. In the second part of Forster’s diagnosis he used fixed alpha and kappa in his first step and thereby applied all unforced variability to ERF in the second, which is the opposite of what really happens. The variability is in alpha and kappa or affecting them, not ERF. Also, when M&F assume alpha and kappa are linear when they are not, the statistical mismatch gets thrown into (e) natural variability when it is really just a poorly fitting filter. Now we can take Clive’s PDO/AMO signal out of natural variability and that all adds up to a good amount of tightening of the error bars. Models are way off, way hot.

  4. Clive Best says:

    How could they distinguish between alpha and kappa in the real world without using observational data ?

  5. Ron Graf says:

    Clive, I will do my best to convey all I know to this point and see if I recruit help for you.

    M&F ignored Steve McIntyre’s request for their data made around Feb 7. They did respond on CLB, as you saw, but just to repeat their abstract, deny circularity and attack Nic Lewis and his direct observational methods for determining ECS in Lewis & Curry (2014). They admitted using Forster 2013’s data but I noticed the model list did not match up completely. They dropped IPSL-CM5B-LR and NorESM1-M (this last one is likely an omission since their list is short one) and added bcc-csm1-1, bcc-csm1-1-m, GISS-E2-R and MIROC5. Nic found their selection reasonable since they were the only models supplying top of atmosphere information and were run in the pre-industrial calibration ensemble. It seems that the models are so complex that they are almost as hard to study as the atmosphere directly. And they take so much supercomputing power that they make a predetermined matrix of “realizations” that were run in 2012 and became the core of the CMIP5 archive. This is why M&F went only to 2012 and even then they had to splice data from two runs. The realizations (runs) are described in a paper here:

    The overview says most models were fitted with a randomly starting ENSO wave.

    That ensemble of pre-industrial calibration with abrupt 4XCO2 was specifically done for the purpose of supplying data to be used with the OLS method to diagnose dF, alpha and kappa. The method was pioneered I believe by Forster & Taylor 2006. Here:

    There is a good comment that describes better the flaw with the OLS diagnostic method here:

    And there is another good paper on this by Gregory & Forster 2008 here:

    Here is Forster 2013:

    Here is AR5 Chap 9 on CMIP5:

    “Why bother?” I believe that McIntyre and McKitrick made a significant impact with their expose’ of Michael Mann’s hockey stick graph. Climategate was about hiding similar data from M&M and defaming M&M to prevent reporters from covering them and journals from accepting their work. Mann had the moxie to sue Mark Steyn and several other institutions for implying his work was tainted. I am hoping that comes to trial and it’s on TV.

    Marotzke has the IPCC stature of Mann. This paper is transparently useless except as a pretext to make an unfounded press release. The public should be informed properly. Citizens are the last recourse I believe. Willis, Soon, Pielke and others are under attack to chill response within the ranks.

    I have loved science since I was 7. I’m guessing you too. It’s ours to lose, Ben Franklin said.


    • DrO says:

      Apparently, one of those sued by MM was Tim Ball. MM’s legal case required MM to release his data to the Courts … he would not! So the Court dismissed MM’s case with costs.

      What does that say about MM? He wouldn’t release data that should be available without having to ask for it, but he would not. Then, he shoots himself in the foot in Court and destroys his own legal case for not releasing the data.

      … apparently, everybody is starting to distance themselves from Mann.

      … I empathise with the comments above “what are they doing to my beloved science?”

  6. Clive

    how well does your model ‘predict’ MWP?

    your co2 term has temperature increasing with co2 concentration forever. Do you really believe this?

    Why not get rid of co2 and add another sine term with period of roughly 200-300 years and see what happens. Then perhaps add yet another with period of about 1000 years.

    As things stand you’re just saying “I did a fit” wouldn’t it be more interesting to say I did 2 slightly different fits to see which worked best?

    At least one commenter has made a similar point on the WUWT version of your post



    • clivebest says:

      It does not predict the MWP. To do that you need a ~1000y cycle and the only possible explanation would be astronomical. So either the Sun, possibly the moon or a combination of the two.

      • If the model doesn’t predict MWP isn’t that a sign that something is missing and that missing something may reduce the importance of things in the model (eg CO2)?

        What do you think caused the 60 year cycles?

    • Jeremy
      IF a model “predicts” MWP = that model is wrong, because there was no MWP! Scientists didn’t start lying about the climate in the 80’s; lying was always their bread and butter.

      They concocted “global” warming for Dark Ages – from a record written on a goat’s skin – how much a person paid tax to the bishop, i.e. he produced more grain per acre than was produced during the Renaissance…?! That goat’s skin in England told them the temp for the WHOLE planet, for hundreds of years – no wonder that people don’t take you guys seriously.

      Truth: one thermometer cannot tell the temp for 10 000km2, maybe a thermometer is sufficient to monitor room temp, nothing more – even in a room, close to the floor is by 1C colder than close to the ceiling. You are wasting your life, on a “Sandpit job” collecting irrelevant / misleading data!!!

  7. Greg Goodman says:

    Strange, I thought I’d posted on this thread already.

    A quick observation on your periodic model. It seems to have a phase lag of somewhere between 10 and 20 years (by eye).

    I did a similar exercise a few years back for AMO (i.e. detrended).

    It was not regression fitted, just a demonstration of climate like variation from periodic fns.

    Assuming these periodics are “forcing”, you would presumably need to add some kind of relaxation response (convolution with a decaying exponential).

    A tau of 10 or 11y would probably correct your phase lag.
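One way to explore this suggestion is sketched below, assuming a simple first-order (single time constant) relaxation, which is only one possible reading of the comment. For a sinusoidal forcing of angular frequency ω, such a response lags the forcing by arctan(ωτ)/ω. The function names are illustrative.

```python
import math

import numpy as np


def relaxation_lag_years(period, tau):
    """Phase lag (years) of a first-order response with time constant
    tau to a sinusoidal forcing of the given period."""
    omega = 2.0 * math.pi / period
    return math.atan(omega * tau) / omega


def relax(forcing, tau, dt=1.0):
    """Causal convolution of a forcing series with a normalised
    decaying exponential kernel exp(-t / tau)."""
    forcing = np.asarray(forcing, dtype=float)
    n = len(forcing)
    kernel = np.exp(-np.arange(n) * dt / tau)
    kernel /= kernel.sum()                    # unit gain for constant forcing
    return np.convolve(forcing, kernel)[:n]   # keep only the causal part
```

Under this assumption, a 60 year forcing with τ = 10y gives a lag of roughly 7.7 years, and the lag is always smaller than τ itself. The response amplitude is also reduced, so the fitted amplitudes would need to be re-estimated after applying the filter.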

  8. Paul_K says:

    Can I ask you to double-check your calculation procedure for the production of your 60 yr trend lines. I get a quite different picture from the CMIP5 results, even when I test using Willis’s 42 series result, and using several different ways of arriving at the result.
    As a benchmark, I think that your “Series 13” should give a 60 yr trend of around .005 deg C per year forwards from 1950. Your graph shows no series with a trend below .01 deg C per year, if I am interpreting it correctly.

    • Clive Best says:

      All I did was to loop through each model taking an incremental start year. Then I make a linear fit to the 59 successive years to get the 60y trend. However, be careful. You first have to calculate annual anomalies based on the same Hadcrut4 algorithm. To do this you need to subtract the monthly averages from 1961-1990 and then take the yearly average.
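The anomaly step described here can be sketched in Python as follows. This is an illustration rather than the original IDL code; the function name is hypothetical, and it assumes the monthly series starts in January and has no missing months.

```python
import numpy as np


def annual_anomalies(monthly, first_year, base=(1961, 1990)):
    """Annual mean anomalies relative to a monthly baseline climatology.

    monthly    : monthly temperatures, starting in January of first_year
    first_year : calendar year of the first value
    base       : inclusive (start, end) years of the baseline period
    Returns (years, anomalies), one value per complete year.
    """
    monthly = np.asarray(monthly, dtype=float)
    n_years = len(monthly) // 12
    table = monthly[:n_years * 12].reshape(n_years, 12)  # rows: years, cols: months
    years = first_year + np.arange(n_years)
    in_base = (years >= base[0]) & (years <= base[1])
    climatology = table[in_base].mean(axis=0)            # one mean per calendar month
    return years, (table - climatology).mean(axis=1)
```

The annual values returned here are what would then be passed to the 60y linear fit for each start year.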

      If you look at the model trends in the appendix of M&F’s paper, you’ll see that they too have no trends below 0.1 C by 1950, and they are supposedly using over 100 CMIP5 simulations.

      • Paul_K says:

        Thanks Clive.
        The M&F graph is not based on the raw model temperatures. It is based on predicted temperatures using the derived “AF” forcing values. I had assumed that your graph was using the raw model temperatures as input. Is this correct?

        Over a 60 year trend period, it makes little difference if you deseasonalise the data or not.

        Again using Series 13 as a benchmark, I have calculated the 60 year trend by each of the three methods below:-
        (a) using the raw monthly data directly and calculating a forward trend over 720 months. Reaverage the trends into annual averages.
        (b) convert the data to deseasonalised anomalies, calculating forward trends each month and averaging the trends into an annual average
        (c) converting to deseasonalised anomalies, calculating an average annual anomaly and then calculating a forward 60 year trend on the annual values.

        I get similar answers for any of the above approaches.
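
A rough way to check the claim that deseasonalising makes little difference is to run methods (a) and (b) on the same series. The sketch below uses a made-up synthetic series (an assumed seasonal cycle, trend and noise level), not the CMIP5 data, and the helper names are hypothetical.

```python
import numpy as np

def deseasonalise(monthly, first_year, base=(1961, 1990)):
    """Subtract the base-period mean of each calendar month."""
    table = np.asarray(monthly, dtype=float).reshape(-1, 12)
    years = first_year + np.arange(table.shape[0])
    in_base = (years >= base[0]) & (years <= base[1])
    return (table - table[in_base].mean(axis=0)).ravel()

def forward_monthly_trends(series, window=720):
    """Least-squares slope (per year) over each forward window of `window` months."""
    t = np.arange(window) / 12.0
    t_c = t - t.mean()                       # centred time, so slope = sum(t_c*y)/sum(t_c^2)
    denom = (t_c ** 2).sum()
    n = len(series) - window + 1
    return np.array([(t_c * series[i:i + window]).sum() / denom for i in range(n)])

def annual_average(trends):
    """Average monthly-start trends into one value per start year."""
    n = (len(trends) // 12) * 12
    return trends[:n].reshape(-1, 12).mean(axis=1)

# Synthetic monthly series from 1861: seasonal cycle + slow trend + noise (all assumed values).
rng = np.random.default_rng(0)
t = np.arange(150 * 12) / 12.0
raw = 287.0 + 2.0 * np.cos(2 * np.pi * t) + 0.005 * t + 0.1 * rng.standard_normal(t.size)

method_a = annual_average(forward_monthly_trends(raw))                       # raw monthly data
method_b = annual_average(forward_monthly_trends(deseasonalise(raw, 1861)))  # deseasonalised
print("largest difference (deg C/year):", np.abs(method_a - method_b).max())
```

In this setup the two methods agree to floating-point precision after annual averaging, because a repeating 12-month pattern contributes slopes that cancel across 12 consecutive start months. Real series with gaps or a different averaging convention may behave differently.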

        Are we using the same data??

        • Clive Best says:

          I think we must be using the same data if it comes from Willis – 42 series. The first 3 values of series 13 are: 273.9759, 274.6008, 277.1515

          I used algorithm (c). I wrote the code in IDL using LINFIT.

          My immediate problem is that I am now in Italy with just an iPad so I can’t access IDL. I am pretty sure I have it right, but of course I may be wrong. Send me your result if you like: clive dot best at gmail dot com. You can see the calculated model anomalies compared to Hadcrut4 anomalies below.

          • Paul_K says:

            Hi Clive,
            Well I’m very puzzled. Apparently we are using different data, but I don’t understand how or why. Your temperature values seem too low to be average surface temperatures. To make a direct comparison with your graphs I downloaded Willis’s spreadsheet called “CMIP5 Models Air Temp One Member”. First 3 monthly records of Series 13 (for 1861 onwards) are…
            I will investigate a bit, and see if I can discover the source of difference.

  9. Paul_K says:


    No, the data I am using are different.

    I picked up Willis’s data from this posting:-

    The Excel file I downloaded came from this reference in the posting:-

    “The one-run-per-model data is here in a 1.2 Mb file called “CMIP5 Models Air Temp One Member.xlsx”. -w.”

    Where did your csv file come from?

    • Clive Best says:

      Sorry for delay in replying.

      I downloaded the .xlsx file from Willis immediately after he made it available. I then converted the first sheet to .csv for portability. So my data is exactly the same as “Willis’s Collation CMIP5 Models” (5.8MB file) sheet1 – which is global air temperature for models run on RCP4.5. Unfortunately I can’t rerun the analysis on the other file until after Easter, when I get back to the UK.

  10. Ron Graf says:

    Jochem Marotzke replied to Ken with data. It thus seems likely he might be responding to individual requests from those not named Steve Mc.

  11. Paul_K says:

    Let me make a more general comment on your post, while we continue to figure out the strange difference in data sourcing.

    I think that you raise one issue which is very important – the functional form of natural variation. The spectral characteristics of the observed temperature dataset(s) are not well matched by the AOGCMs. In particular, the quasi 60 year cycle seen in the instrumental temperature data can be tracked back for hundreds of years in long-term local temperature records, and for thousands of years in high resolution proxies. These cycles can also be tracked reliably through MSL (tide gauge) data back to at least 1700 (Jevrejeva 2008), and yet the PIControl runs show no evidence of such cycles.

    Natural variation in the GCMs is determined largely by ENSO events, and results in something akin to AR(1) noise in most, but not all, GCMs. Most of the GCMs do however manage to produce something which approximates the 60 year cycle in the 20th century by the judicious adjustment of aerosol forcing from mid-century onwards.
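
For readers unfamiliar with the term, AR(1) noise can be sketched as below. This is only an illustration of the statistical model, with an assumed persistence value `phi = 0.6` and hypothetical function names. It is not derived from any GCM output.

```python
import numpy as np

def ar1(n, phi, sigma=1.0, seed=0):
    """Generate AR(1) noise: x[i] = phi * x[i-1] + white noise."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, n)
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + eps[i]
    return x

def lag_autocorr(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).sum() / (x * x).sum()

noise = ar1(20000, phi=0.6)
for lag in (1, 2, 5):
    print(lag, lag_autocorr(noise, lag), 0.6 ** lag)   # sample value vs theoretical phi**lag
```

The autocorrelation of an AR(1) process decays as phi**lag with no preferred period, so a process of this kind has no spectral peak near 60 years. That is the contrast being drawn with the quasi 60 year cycle in the observations.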

    GCM modelers therefore explain these cycles (in the modern instrument period) as partially forced with some stochastic variation about the forced response, which variation has characteristics of long-term persistence by dint of high-lag autocorrelation. In fact, they have little choice in retaining this view to avoid a logical inconsistency, since their alternative would be to acknowledge that there is a critical climate feature which the GCMs are getting badly wrong. So “the mainstream view” continues to be that the 60-year cycles are not predictably recurrent events.

    You may believe that this is an ill-founded position. I certainly do. I also think that the view is slowly changing under the massive weight of the contrary evidence. However, to the extent that this view persists, I do not think that you can effectively challenge the M&F paper by invoking the 60-year cycles as a problem in their analysis – unless you are prepared to challenge simultaneously the mainstream view on this issue that their presence in the modern series is little more than a stochastic accident.

    • Clive Best says:

      Indeed the models are dependent on aerosols to reproduce 20th century warming, otherwise they would be oversensitive to CO2. The fine tuning of volcanic eruptions and anthropogenic aerosols almost reproduces the 60 year cycle. Interestingly, though, the net effect is to dampen the CO2 forcing of the models.

      I agree with your other comments. However, the modelers are struggling to explain the current pause since there has been no recent volcanic activity. Gavin Schmidt was even proposing carbon smog emissions in China as a solution at one point. If the underlying cause of the pause is really the downturn of the 60y oscillation then it will last at least another 10 years. M&F would be hard pushed to dismiss a 30 year statistical fluctuation. However, whether we can wait that long to find out is another matter!

  12. Ron Graf says:

    Paul, you bring up important but separate issues:

    1) The models do not deal well with the 60-year signal that even Mann now says is there.

    2) M&F’s claim that the models are off now just like they have been in hindcasting does not change with any new insights if the pause is also explained equally by the PDO/AMO. But the problem I have had with M&F is that one could have known everything they validly claim by a simple analysis of the observed record. The models have no prognostic value for the past.

    3) If the 60-year signal is responsible for the pause it was also responsible for the 1940s and 1980s rise, something that Mann mostly ignored. Perhaps a paper needs to be written that not only states this obvious fact but also makes a forecast from it.

    4) If the models adjust for the PDO/AMO wouldn’t this tighten the error bars and make any other bad assumptions more obvious?

  13. Paul_K says:

    Ron Graf,
    I promised you a lengthy comment on Lucia’s which I have not yet delivered on. I decided that I need to place it as an article because there is no way I can show what is happening in a limited comment. The M&F paper is unequivocally flawed, but it will take me a day or two to collate the material which proves the fact beyond any doubt.
    With respect to your points above:-

    1) Agreed.
    2) Well I agree that the M&F argument is a little bizarre, coming down to the argument that the models are as bad today as they have always been over a 15 year period. (I will hopefully demonstrate for you eventually that the argument in support of this conclusion is completely unsupportable, even though I think that the conclusion itself has more than a grain of truth in it.) However, in suggesting that “one could have known everything they validly claim by a simple analysis of the observed record”, I think that you are misunderstanding what I will call “the mainstream stance” on natural variability. The stance is that the observational record is just one realisation that happened to occur from a large range of possible outcomes, each of which would be fully compatible with unpredictable natural variability superposed on a deterministic forced response. If the natural variability is fully stochastic and only predictable within the broad limits of an as-yet-imperfectly-understood statistical model, then matching the observed data is neither a necessary nor sufficient condition to establish the validity of a model. (It IS a necessary condition that a model should be able to encompass the observation-space within its possible outcomes.)
    3) Agreed. Several papers have already been written on exactly this subject (e.g. Loehle and Scafetta, Scafetta, Wyatt and Curry, Kosaka and Xie and many others), but they are not widely accepted by the mainstream community, largely because the “predictable recurrence” of the 60 year cycle is not yet accepted as part of the mainstream position.
    4) Yes. However, there is some evidence that the 60 year cycles are forced cycles. Accepting that they are predictably recurrent is only part of the problem. Deciding where the exogenous forcing comes from, if they are indeed forced, makes a big difference to how they are included into the base case.

  14. Ron Graf says:

    Yes. Thank you. I did not realize the degree to which the variability was assumed to be unpredictable. If the phenomenon is unknowable I suppose the scientist is unaccountable.

    Ken replied with his results using M&F’s supplied data here:
