I have calculated from scratch the globally averaged temperature anomalies for the 73 proxies used by Marcott et al. in their recent Science paper. The method used is described in the previous post. It avoids any interpolation between measurements and is based on the same processing software that is used to derive HadCRUT4 anomalies from weather station data. Shown below is the result covering the last 1000 years averaged in 50-year bins. I am using the published dates. Re-dating is a separate issue discussed below.

Figure 1: Detail of the last 1000 years showing in black the globally averaged proxy data and in red the HadCRUT4 anomalies. The proxies have been normalised to 1961-1990. Shown in blue is the result for the proxies after excluding TN05-17.
There is no evidence of a recent uptick in this data. Previously I had noticed that much of the apparent upturn in the last 100-year bin was due to a single proxy, TN05-17, situated in the Southern Ocean (Lat=-50, Lon=6). The blue dashed curve shows the 50-year resolution anomaly result after excluding this single proxy.
Figure 2 shows the anomaly data using the modified carbon dating (re-dating). This has been identified by Steve McIntyre and others as the main cause of the uptick. However, I think this is only part of the story.

Figure 2: Global temperature anomalies using the modified dates (Marine09, etc.). Proxies are averaged in 50-year time intervals.
The new dating suppresses the anomalies from 1600-1800. There is a single high point for the period 1900-1950. The much larger spike evident in the paper around 1940 (see also here) is in my opinion mainly due to the interpolation to a fixed 20-year interval. This generates more points than there are measurements and is very sensitive to time-scale boundaries. It is simply wrong to interpolate the proxies to a 20-year time-base because most of the proxies only have measurement resolutions > 100 years. I believe you should only ever use measured values and not generated values.
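To illustrate the idea, the binning approach amounts to something like the toy sketch below. This is not the actual gridding script: the year/anomaly values are invented, and the real processing also grids the proxies spatially before averaging. Each measured value is simply assigned to a fixed 50-year bin and the bins are averaged, with no values generated in between.

#!/usr/bin/perl
# Toy sketch of fixed 50-year binning of measured proxy values only.
# Not the actual gridding scripts; the year/anomaly values are invented.
use strict;
use warnings;

my $bin_width = 50;    # years per bin

# Irregularly spaced measurements: year => anomaly (deg C).
my %proxy = ( 905 => -0.1, 1120 => 0.2, 1385 => -0.3, 1610 => -0.4, 1904 => 0.1 );

my ( %sum, %count );
while ( my ( $year, $anom ) = each %proxy ) {
    my $bin = int( $year / $bin_width ) * $bin_width;    # e.g. 1904 -> 1900
    $sum{$bin}   += $anom;
    $count{$bin} += 1;
}

# Each bin average uses measured values only; empty bins stay empty.
for my $bin ( sort { $a <=> $b } keys %sum ) {
    printf "%d-%d  %.2f  (%d measurements)\n",
        $bin, $bin + $bin_width - 1, $sum{$bin} / $count{$bin}, $count{$bin};
}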
There is no convincing evidence of a recent upswing in global temperatures in either graph, whether based on the published or the modified dates. I therefore suspect that Marcott's result is most likely an artefact of the interpolation of the measurement data to a fixed 20-year timebase, which is then accentuated by the re-dating of the measurements.
Updated 24/3: included the re-dating graph.
The Perl code used in the previous post and this one can be downloaded here.
- convert.pl reads the spreadsheet and generates “station files” (or download them here).
- marcott_gridder.pl generates a 5×5 grid with 50- or 100-year binning.
- marcott_global_average_ts_ascii.pl generates global, NH and SH averages.
- Step 1: >perl convert.pl
- Step 2: >perl marcott_gridder.pl | perl marcott_global_average_ts_ascii.pl > redate-results generates the results.
- You can avoid the first convert.pl step by downloading the generated station files here. Stations-new contains the re-dated stations. Stations_files contains the published dates.
The last 2 scripts are modified versions of the Met Office analysis software for CRUTEM4 – British Crown Copyright (c) 2009, the Met Office.
Well done, sir. I’m glad you chose to look into that paper. One clarification: when you say “Figure 2 shows the anomaly data using the modified carbon dating (re-dating)”, are you referring to the core-top date changes of order one thousand years that Steve McIntyre claims Marcott et al. performed, or to the changes of a few years that occurred when they re-calibrated their carbon dating, or both at the same time?
Yes – I am referring to exactly the same set of dates as Steve McIntyre. The Excel sheet provided as supplementary material to the Science paper contains two separate columns of dates, “Published Dates” and “Marcott” dates, labelled variously as Marine09, SHCal04, etc. One of these proxies has indeed been re-calibrated to be 1000 years earlier.
My main point, however, is that it is simply wrong to generate pseudo-data by interpolating measured data onto a 20-year time-base, especially because the resolution of the measurements is often 100-300 years! The re-dating just compounds the error.
Clive, do you know offhand how they performed the interpolation?
Sorry if I’m being lazy & it’s available in the article.
As far as I know, they use linear interpolation between individual proxy measurements. This puts all proxies on the same 20-year time-base. The area average is then done on this 20-year scale, so extra data points are generated.
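Something like the following, as far as I understand it. This is a toy sketch with invented numbers, not Marcott's actual code: each proxy is linearly interpolated onto a common 20-year grid, and the grid values are then averaged across proxies.

#!/usr/bin/perl
# Sketch of the procedure as I understand it: linearly interpolate each proxy
# onto a common 20-year grid, then average across proxies at each grid point.
# Not Marcott's code; the numbers are invented for illustration.
use strict;
use warnings;

my @grid = map { 1000 + 20 * $_ } 0 .. 10;    # 1000, 1020, ..., 1200

# Each proxy: a sorted list of [year, anomaly] pairs with coarse, irregular spacing.
my @proxies = (
    [ [ 980, -0.2 ], [ 1130, 0.1 ], [ 1260, -0.1 ] ],
    [ [ 1005, 0.0 ], [ 1210, 0.3 ] ],
);

sub interp {    # linear interpolation of one proxy at time $t
    my ( $series, $t ) = @_;
    for my $i ( 0 .. $#{$series} - 1 ) {
        my ( $t0, $y0 ) = @{ $series->[$i] };
        my ( $t1, $y1 ) = @{ $series->[ $i + 1 ] };
        return $y0 + ( $y1 - $y0 ) * ( $t - $t0 ) / ( $t1 - $t0 )
            if $t >= $t0 && $t <= $t1;
    }
    return undef;    # outside the proxy's measured range
}

for my $t (@grid) {
    my @vals = grep { defined } map { interp( $_, $t ) } @proxies;
    if (@vals) {
        my $sum = 0;
        $sum += $_ for @vals;
        printf "%d  %.2f\n", $t, $sum / @vals;
    }
    else {
        printf "%d  no data\n", $t;
    }
}

Note that every 20-year grid point between two real measurements gets a generated value, which is exactly the objection above.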
Hi Clive, It’s my impression that it’s all linear interpolation too.
However, in a case like this, that is an extremely poor approach. As I recall, Gergis used LOESS interpolation, which sounds more reasonable, but probably isn’t much better in practice.
The problem is that, when you have a frequency-limited series, polynomial-based interpolation aliases frequencies from bands where signal is present into bands where no signal is present. This has the predictable effect of attenuating the original signal in the frequency band of interest. And because the number of proxies varies over time, the amount of attenuation isn’t a constant, which leads to a systematic effect that varies over time (and in relation to the number and quality of proxies).
The other problem is the data are irregularly sampled, and that’s not a problem that’s found its way into most toolboxes yet.
Ideally you’d use a frequency-domain based approach. For regularly spaced data, you’d use “Fourier interpolation”, which exists in standard toolboxes. The more general case is not, as far as I’m aware, present in general toolbox form.
However, algorithms that work do exist.
I think it is always wrong to interpolate proxy data because any scheme that increases the time resolution over measurements is bound to introduce biases.
The best you can do is to make a time histogram with fixed time binning.
Here is a map interface to the raw proxy data. I know the graphics could be better – but it does show the limited space and time coverage of the raw data!
Clive, actually if the data are really bandwidth-limited, you can noiselessly interpolate using Fourier interpolation or other similar frequency-domain algorithms. That follows from Shannon’s Theorem.
You can’t expect polynomial interpolation to work though, which is why I’m interested in seeing how they treat this in their code…
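For regularly spaced samples, that kind of band-limited reconstruction can be written down directly as sinc (Whittaker–Shannon) interpolation. A toy sketch in Perl follows; the values are invented, and it does not address the irregular sampling of the real proxies.

#!/usr/bin/perl
# Sketch of band-limited ("Fourier") interpolation for REGULARLY spaced samples,
# i.e. Whittaker-Shannon sinc reconstruction. Illustration only.
use strict;
use warnings;
use Math::Trig qw(pi);

my $dt      = 100;                               # sample spacing in years
my @years   = map { 1000 + $dt * $_ } 0 .. 9;    # 1000, 1100, ..., 1900
my @samples = ( 0.1, -0.2, 0.0, 0.3, -0.1, -0.4, 0.2, 0.0, -0.3, 0.1 );

sub sinc { my $x = shift; return $x == 0 ? 1 : sin( pi * $x ) / ( pi * $x ); }

# Reconstruct the band-limited signal at an arbitrary time t.
sub reconstruct {
    my $t = shift;
    my $y = 0;
    for my $n ( 0 .. $#samples ) {
        $y += $samples[$n] * sinc( ( $t - $years[$n] ) / $dt );
    }
    return $y;
}

# Evaluate on a 20-year grid inside the sampled interval.
for ( my $t = $years[0] ; $t <= $years[-1] ; $t += 20 ) {
    printf "%d  %.3f\n", $t, reconstruct($t);
}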
I agree that Fourier interpolation would be a better representation. Shown below is Fourier smoothing applied to the original and re-dated data.
I still think that there is no need to interpolate all proxies to a fixed time base. You can see how noisy the individual proxies are here
Thanks Clive, that’s an interesting way to do it. Your Perl scripts work and I have got your Fig 2, though I had to fiddle a bit with marcott_gridder.pl to change it from 100-year binning to 50 years (4 comment-switches – would it be possible to make the bin size a parameter set just once at the top?). It’s interesting that an amateur blogger can provide working code but top climate scientists can’t.
It’s good to have another confirmation that the uptick is an artefact, though I think we knew that – you only have to look at the data, or Marcott’s thesis, or Marcott’s own figs S5 and S6. Again this seems to be beyond the abilities of the entire climate science community.
The next step is to try to reproduce their spurious uptick – have you got anywhere with that?
Sorry – the code was a complete hack. I should really have cleaned it up and commented it properly!
I think the spike can be understood by interpolating just one proxy, TN05-17. We only have two measured points in this period:
Date   Anomaly (deg C)
1904   2.3
1950   4.5
Interpolation onto the fixed 20-year grid then gives:
1900 2.0
1920 3.1
1940 4.0
1960 5.0
i.e. a linearly increasing spike!
These values are meaningless because the random variation of this one proxy is ±2 degrees over the last 10,000 years. Interpolation simply invents a trend where there isn’t one.
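The same arithmetic in a few lines of Perl, extending the two measured values above onto the fixed 20-year grid. Note that the 1900 and 1960 points lie outside the measured range, so they are extrapolations, and the rounding may differ slightly from the figures quoted above.

#!/usr/bin/perl
# Quick check of the TN05-17 spike: linearly extend the two measured points
# onto a fixed 20-year grid, as the interpolation scheme effectively does.
use strict;
use warnings;

my ( $t0, $y0 ) = ( 1904, 2.3 );    # measured
my ( $t1, $y1 ) = ( 1950, 4.5 );    # measured
my $slope = ( $y1 - $y0 ) / ( $t1 - $t0 );

for my $t ( 1900, 1920, 1940, 1960 ) {
    printf "%d  %.1f\n", $t, $y0 + $slope * ( $t - $t0 );
}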
Nick Stokes has an R program which generates a Marcott-like spike. It works on fixed vectors, and the proxies are interpolated to a 20-year time-base (as Marcott did) – see http://moyhu.blogspot.co.uk/2013/03/next-stage-of-marcott-et-al-study.html and the previous post.
Marcott did not interpolate; rather, he performed 1000 “perturbations” on the raw data, which perturbed each datum 1000 times time-wise within the age uncertainty of that datum. But Marcott set the age uncertainty of the 1940 bin to zero. Therefore the 1940 bin is protected from the homogenisation which affects all other bins, so its uptick is preserved.
Marcott writes in the supplementary material.
What is the incomprehensible “first of the perturbed time series”? I strongly suspect (perhaps mistakenly) that it is actually the unperturbed proxy measurement. If that is not linear interpolation to an unjustified finer grid, then I don’t know what is!
OK – sorry. I now understand what you are saying! His procedure smooths out the bulk of the central section but still has fixed “skipping rope” ends at the start and the end. I still think it is more honest to rely on measurement data rather than some Monte Carlo smoothing. It may make no real difference to the overall trend, but the press release to the media, backed up by interviews with the authors, purported to confirm that current temperatures are exceptional. This may possibly be the case, but please don’t use this paper as evidence. The normalisation of the Marcott anomalies to HadCRUT4 is still rather ad hoc.
Make special note that each perturbation is bounded (max/min time values) by the age uncertainty of the datum. Observe that Marcott set the age uncertainty of the 1940 AD data to zero, so the final 1940 AD bin was processed differently from all the others: it did not get flattened as all the other bins were. Cheers.
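A toy version of that mechanism (invented numbers and a uniform jitter, not Marcott's actual procedure) shows why a datum with zero age uncertainty never gets smeared into neighbouring bins:

#!/usr/bin/perl
# Toy illustration of the age-perturbation argument, NOT Marcott's code:
# jittering each datum's age within its uncertainty and re-binning smears
# sharp features, except where the age uncertainty is zero (the 1940 point).
use strict;
use warnings;

# [year, anomaly, age uncertainty in years] - invented values for illustration.
my @data = (
    [ 1700, 0.0, 100 ],
    [ 1800, 0.1, 100 ],
    [ 1900, 0.2, 100 ],
    [ 1940, 1.0, 0   ],    # zero age uncertainty: never moves between bins
);

my $bin_width = 50;
my $n_perturb = 1000;
my ( %sum, %count );

for my $k ( 1 .. $n_perturb ) {
    for my $d (@data) {
        my ( $year, $anom, $sigma ) = @$d;
        my $jitter = $sigma ? ( rand(2) - 1 ) * $sigma : 0;    # uniform +/- sigma
        my $bin = int( ( $year + $jitter ) / $bin_width ) * $bin_width;
        $sum{$bin}   += $anom;
        $count{$bin} += 1;
    }
}

for my $bin ( sort { $a <=> $b } keys %sum ) {
    printf "%d-%d  %.2f\n", $bin, $bin + $bin_width - 1, $sum{$bin} / $count{$bin};
}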
Thanks NZW. Although everyone is in agreement that the uptick is spurious, we now have at least three possible explanations for the mechanism:
1. Clive’s suggestion that it comes from proxy 46, TN05-17.
2. McIntyre’s suggestion that it comes from date shifts and different proxies running out at different times.
3. NZW’s suggestion that it comes from a perturbation process that fixes 1940.
Of course these are not exclusive – it could be a combination of all three.
Nick’s code gives a spike without doing the perturbations, so your explanation 3 is not the only mechanism.
The date shifts that McIntyre has identified were needed to give the 1940 bin the desired warm value. (Note that such selection carries with it an anti-selection of neighbouring bins, which are thus anomalously cool.) Then the perturbation process smooths everywhere except the 1940 data, so that the 1940 uptick is the only up- or down-tick remaining with such a steep slope.
When I processed the Marcott data I got Nick’s trendline (or close to it) before the perturbations were applied. My own trendline was jumpier, so my guess is that Nick did some smoothing.