## Abstract

Deriving global temperature anomalies involves the surface averaging of normalized ocean and station temperature data that are inhomogeneously distributed in both space and time. Different groups have adopted different averaging schemes to deal with this problem. For example, GISS use approximately 8000 equal-area cells and interpolate to near-neighbour stations. Berkeley Earth (Robert Rohde, 2013) fit a temperature distribution to a 1-degree grid, while Hadcrut4 (Osborn & Jones, 2014) use regular binning onto a 5-degree grid. Cowtan and Way (Cowtan & Way, 2014) then attempt to correct Hadcrut4 for spatial bias by kriging results into sparsely covered regions, guided by satellite data. In this paper we look at alternative methods which are based on averaging over the 3D spheroidal surface of the earth. It is shown that this approach alone removes any spatial bias, thereby avoiding direct interpolation. A spherical triangulation method is described which has the additional benefit of avoiding binning completely by using each data point individually. Longer-term 3D averaging is investigated using an equal-area icosahedral spherical binning. New monthly and annual temperature series are presented for each new method based on a) merging CRUTEM4 with HadSST3 (Hadcrut4.5), and b) merging GHCN V3C with HadSST3.

## Introduction

Hadcrut4 (Osborn & Jones, 2014) is one of the most widely used global temperature records and was originally adopted by the IPCC as their official measure of climate change. It is based only on direct measurements and avoids any extrapolation into un-sampled regions. Hadcrut4 has two main sources: CRUTEM4 (Jones P.D., 2012), covering land-based weather station data collected and processed by the Climatic Research Unit (CRU) at the University of East Anglia, and the sea surface temperature (SST) data HadSST3 (Kennedy J.J., 2011) processed by the Hadley Centre. An independent set of station data is also maintained by NOAA NCDC (Lawrimore, 2011). The latest set, V3C, has been processed in the same way and combined with HadSST3 to compare results.

The algorithm used to calculate the Hadcrut4.5 global average temperature anomaly for each month uses a 5×5° grid in latitude and longitude. The baseline climatology used to derive temperature anomalies for each station is the set of 12 monthly average temperatures measured between 1961 and 1990. All station anomalies within the same 5×5° bin are then averaged together, as are SST values. The monthly global average is then the area-weighted (cos(lat)) average over all occupied bins (of the 2592 in total). The yearly average is simply the 12-month average.
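The binning and weighting described above can be sketched in a few lines. This is a minimal Python/NumPy illustration, not the actual Hadcrut4.5 code: the function name `hadcrut_style_average` is mine, and the real processing includes error models and hemispheric weighting that are omitted here.

```python
import numpy as np

def hadcrut_style_average(lats, lons, anomalies):
    """Area-weighted global mean anomaly over a 5x5 degree lat-lon grid.

    Illustrative sketch of the binning scheme described above.
    """
    # Assign each measurement to a 5-degree bin (36 lat rows x 72 lon columns)
    lat_idx = ((np.asarray(lats) + 90.0) // 5).astype(int).clip(0, 35)
    lon_idx = ((np.asarray(lons) + 180.0) // 5).astype(int).clip(0, 71)

    grid_sum = np.zeros((36, 72))
    grid_n = np.zeros((36, 72))
    np.add.at(grid_sum, (lat_idx, lon_idx), anomalies)  # unbuffered accumulation
    np.add.at(grid_n, (lat_idx, lon_idx), 1)

    occupied = grid_n > 0
    grid_mean = np.where(occupied, grid_sum / np.where(occupied, grid_n, 1), 0.0)

    # cos(latitude) area weighting at each bin centre
    lat_centres = np.deg2rad(np.arange(-87.5, 90.0, 5.0))
    weights = np.cos(lat_centres)[:, None] * np.ones((1, 72))

    # Average over occupied bins only
    return np.sum(grid_mean * weights * occupied) / np.sum(weights * occupied)
```

A uniform anomaly field returns exactly that anomaly, and a warm equatorial bin outweighs an equally warm high-latitude bin because of the cos(lat) factor.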

In this study we look at alternative methods to calculate the global average, based only on measured data without any interpolation. Two of the methods use the 3D spherical surface of the earth rather than the usual 2D latitude-longitude projection. Direct triangulation methods have the advantage that they avoid any binning and use each station's data individually. Finally we compare the monthly and annual averages from each method with those of Hadcrut4.5, and also with the interpolated (kriged) results of Cowtan & Way (Cowtan & Way, 2014).

## Method 1. 2D Triangulation of measurements in (Lat,Lon)

The basic idea is to form a triangular mesh of all monthly measurements and then treat each triangle separately to avoid any binning.

IDL (Harris) is used to form a Delaunay triangulation in (lat, lon) of all locations where there are measurements recorded in each month. The grid itself will vary from one month to the next because station coverage is not continuous in time. The algorithm used to calculate the global average is the following:

- Each triangle contains one measurement at each vertex. We use Heron’s formula to calculate the area of the triangle.
- Calculate the centroid position and assign this an anomaly equal to the average of all 3 vertex values. This centroid value is then used as the average anomaly for the whole triangular area.
- Use a spatial weighting for each triangle in the integration equal to cos(lat) × area, where lat is the latitude of the centroid of the triangle.

The global average is then

$$\bar{T} = \frac{\sum_i w_i T_i}{\sum_i w_i}$$

where $T_i$ is the centroid anomaly of triangle $i$ and $w_i = \cos(\mathrm{lat}_i) \times \mathrm{area}_i$ is the spatial weighting.
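The steps above can be sketched as follows. The study itself uses IDL's triangulation routines; here `scipy.spatial.Delaunay` stands in for them, and the function name `triangulated_average` is illustrative.

```python
import numpy as np
from scipy.spatial import Delaunay

def triangulated_average(lats, lons, anomalies):
    """Global mean anomaly from a 2D (lat, lon) Delaunay triangulation.

    Each triangle is assigned the mean of its three vertex anomalies,
    weighted by cos(centroid latitude) * area (Heron's formula).
    """
    pts = np.column_stack([lons, lats])
    anomalies = np.asarray(anomalies)
    tri = Delaunay(pts)

    weighted_sum, weight_total = 0.0, 0.0
    for simplex in tri.simplices:
        p = pts[simplex]                      # 3 x 2 array of (lon, lat)
        # Heron's formula from the three side lengths
        a = np.linalg.norm(p[0] - p[1])
        b = np.linalg.norm(p[1] - p[2])
        c = np.linalg.norm(p[2] - p[0])
        s = 0.5 * (a + b + c)
        area = np.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))

        # Centroid anomaly and cos(lat) * area spatial weighting
        centroid_lat = p[:, 1].mean()
        w = area * np.cos(np.deg2rad(centroid_lat))

        weighted_sum += w * anomalies[simplex].mean()
        weight_total += w

    return weighted_sum / weight_total
```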

An example of such a triangulation grid is shown in Figure 1.

## Method 2. 3D Spherical Triangulation

The second method extends triangulation to three dimensions, which has the advantage of providing full coverage of the polar regions. Spherical triangulation over the earth's surface gives an elegant spatial integration of irregular temperature data: each measurement is treated equally, and the earth's surface is covered with a triangular mesh of station and SST nodes. Unlike the linear 2D triangulation described above, spherical triangulation spans the polar regions because it uses a 3D model. The triangulation returns the 3D Cartesian coordinates of each triangle vertex, and the temperature of each triangular area is set to the average of its three vertices. The global average is the area-weighted mean of all triangles. For the annual average we take the average of the 12 monthly global averages, because the grid changes from one month to another. We investigate how to solve this in the third method.
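One way to sketch this in Python (the study uses IDL's spherical triangulation): for points on the unit sphere, the facets of their 3D convex hull form a Delaunay triangulation of the surface, so `scipy.spatial.ConvexHull` can stand in. Planar facet areas are used as weights here, which is a close approximation to the true spherical triangle areas for a dense mesh; the function name is illustrative.

```python
import numpy as np
from scipy.spatial import ConvexHull

def spherical_average(lats, lons, anomalies):
    """Global mean anomaly via spherical (3D) triangulation.

    Points are mapped to unit vectors; the convex hull of points on a
    sphere yields a triangulation of the surface, including the poles.
    """
    lam, phi = np.deg2rad(lons), np.deg2rad(lats)
    xyz = np.column_stack([np.cos(phi) * np.cos(lam),
                           np.cos(phi) * np.sin(lam),
                           np.sin(phi)])
    anomalies = np.asarray(anomalies)

    hull = ConvexHull(xyz)
    weighted_sum, weight_total = 0.0, 0.0
    for simplex in hull.simplices:
        v = xyz[simplex]
        # planar area of the hull triangle via the cross product
        area = 0.5 * np.linalg.norm(np.cross(v[1] - v[0], v[2] - v[0]))
        weighted_sum += area * anomalies[simplex].mean()
        weight_total += area
    return weighted_sum / weight_total
```

Note that no cos(lat) factor appears: on the 3D sphere the triangle areas are already true surface areas, which is exactly why this approach removes the spatial bias of a 2D (lat, lon) grid.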

Spherical triangulation essentially also solves the coverage bias problem for Hadcrut4, without any need for interpolation or use of external satellite data. Figure 3 shows a comparison of the spherical triangulation data with the Cowtan & Way data (Cowtan & Way, 2014). The agreement between the two is remarkable. This means that Hadcrut4 can already deliver essentially full global coverage, if it is treated in three dimensions instead of two.

## Method 3. Icosahedral Grids

The last method addresses the problem of defining unbiased fixed spherical grids in 3D. All the major temperature series use latitude-longitude gridding to average spatial temperatures. Problems arise near the poles, because bin sizes shrink as cos(lat), and the pole itself is a singularity.

Spherical triangulation of monthly global temperatures is the most direct way to present temperature data, but it has one drawback. You cannot make annual or decadal averages directly on the triangular mesh, because it changes from one month to the next; you can only average the monthly global averages together. To make such local averages you really need a fixed grid. So does that mean we have to give up and return to fixed 2D (lat, lon) grids such as those used by CRU, Berkeley, NOAA or NASA? The answer is no, because there is a neat method of defining fixed 3D grids that maintain spherical symmetry, based on subdividing an icosahedron (Ning Wang, 2011).

We start with a 20-sided icosahedron, a 3D solid all of whose faces are equilateral triangles. We then divide each triangle into 4 equal triangles by connecting the edge midpoints. These new points are then projected outwards from the centre to lie on the surface of a unit sphere. The process is repeated n times to form a symmetric spherical mesh as shown in Figure 4. It can be shown that such a grid formed from an icosahedron is the most accurate representation of a sphere (ref).
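The subdivision can be sketched as follows (a standard "icosphere" construction; the study's own grid code is the IDL `Zicos` package mentioned in the comments, so this Python version is only illustrative). Each level multiplies the triangle count by 4, so level n has 20 × 4^n triangles: 5120 at level 4.

```python
import numpy as np

def icosahedron():
    """Vertices and faces of a regular icosahedron projected onto the unit sphere."""
    t = (1.0 + np.sqrt(5.0)) / 2.0  # golden ratio
    verts = np.array([[-1, t, 0], [1, t, 0], [-1, -t, 0], [1, -t, 0],
                      [0, -1, t], [0, 1, t], [0, -1, -t], [0, 1, -t],
                      [t, 0, -1], [t, 0, 1], [-t, 0, -1], [-t, 0, 1]], float)
    verts /= np.linalg.norm(verts, axis=1, keepdims=True)
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]
    return verts, faces

def subdivide(verts, faces, levels):
    """Split each triangle into 4 and project the edge midpoints to the sphere."""
    verts = list(map(tuple, verts))
    for _ in range(levels):
        cache, new_faces = {}, []
        def midpoint(i, j):
            key = (min(i, j), max(i, j))     # each shared edge is split once
            if key not in cache:
                m = (np.array(verts[i]) + np.array(verts[j])) / 2.0
                verts.append(tuple(m / np.linalg.norm(m)))
                cache[key] = len(verts) - 1
            return cache[key]
        for a, b, c in faces:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
        faces = new_faces
    return np.array(verts), faces
```

A level-n grid has 20 × 4^n faces and 10 × 4^n + 2 vertices, so level 4 gives 5120 triangles on 2562 vertices.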

A level-4 icosahedron with 5120 triangles provides the best match to the actual distribution of Hadcrut4 global temperature data. An example of such a mesh, centered on the North Pole, is shown in Figure 5.

The global anomaly is calculated by binning the data onto the grid in exactly the same way as is done for the 5×5 degree grid. A station lies within a given triangle if the triangle's boundary encircles it exactly once (the winding-number test). All stations within a given triangle and within a given time interval are then averaged. Such averages can be done monthly, annually or for each decade on the fixed grid. The global average is simply the numerical average over all bins, with no need for area weighting because the bins are all of very nearly the same area. Such grids allow annual and decadal averaging of temperature anomalies. Table 1 shows six successive decadal temperature averages calculated in this way using GHCNV3 and HADSST3, centered on Africa, while Table 2 shows the same decadal trends centered on the Arctic.
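For points and triangles on the sphere, the winding-number test described above is equivalent to checking that the station lies on the inner side of the three great-circle planes through the triangle's edges. A sketch of that check, assuming the triangle vertices are ordered counter-clockwise as seen from outside the sphere (the function names are mine):

```python
import numpy as np

def to_xyz(lat, lon):
    """Unit vector for a (lat, lon) position given in degrees."""
    phi, lam = np.deg2rad(lat), np.deg2rad(lon)
    return np.array([np.cos(phi) * np.cos(lam),
                     np.cos(phi) * np.sin(lam),
                     np.sin(phi)])

def in_spherical_triangle(p, a, b, c):
    """True if unit vector p lies inside the spherical triangle (a, b, c).

    Equivalent to a winding number of one: p must be on the same side
    of each great-circle plane through the edges ab, bc and ca.
    """
    for u, v in ((a, b), (b, c), (c, a)):
        # cross(u, v) is the normal of the great-circle plane through u, v
        if np.dot(np.cross(u, v), p) < 0:
            return False
    return True
```

Binning then reduces to testing each station against the fixed set of 5120 triangles (or, more efficiently, descending the subdivision hierarchy level by level).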

*(Image grid: six decadal anomaly maps, Western Hemisphere view: 1951-1960 (50s), 1961-1970 (60s), 1971-1980 (70s), 1981-1990 (80s), 1991-2000 (90s), 2001-2010 (2000s).)*

##### Table 1: 10-year averages of temperature anomalies for GHCNV3+HADSST3 on an icosahedral grid normalized to a 1961-1990 baseline.

*(Image grid: six decadal anomaly maps, Arctic view: 50s, 60s, 70s, 80s, 90s, 2000s.)*

##### Table 2: Decadal averages for Arctic regions.

The results show a clear decadal warming trend after 1980, especially in the Arctic. Spatial results can be viewed in 3D at http://clivebest.com/webgl/earth2016.html

## Results

CRUTEM4 station data and HadSST3 SST data have been processed using the three methods described above, and the resulting global averages compared to the Hadcrut4.5 data. Of the 7830 stations, some 6857 have sufficient information to calculate anomalies and positions. A very small random perturbation has been added to the position of each SST data point in order to triangulate them: otherwise they would lie on exactly parallel latitude lines and the triangulation algorithm would fail. Example triangulations are shown in Figures 1 and 2.
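The perturbation trick can be sketched as below (the scale of 1e-4 degrees is my assumption; anything far smaller than the SST grid spacing works, since it removes the degeneracy without measurably moving any point):

```python
import numpy as np

rng = np.random.default_rng(42)

def jitter(lats, lons, scale=1e-4):
    """Break the exact collinearity of regularly gridded SST positions.

    SST values sit on identical latitude rows, which makes a Delaunay
    triangulation degenerate; a tiny random offset removes the
    degeneracy while leaving positions effectively unchanged.
    """
    lats = np.asarray(lats) + rng.uniform(-scale, scale, len(lats))
    lons = np.asarray(lons) + rng.uniform(-scale, scale, len(lons))
    return lats, lons
```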

A comparison of the annual values calculated by all 3 methods to the quoted Hadcrut4.5 values is shown in Figure 6.

If instead of CRUTEM4 we use the NOAA GHCN V3C station data we get almost the same result. There are 300 more stations in V3C and, although there is a large overlap, the homogenisation procedures are different. Figure 7 shows the comparison between the two, plotted on the 1961-1990 baseline.

In general the agreement between all three averaging methods is good. However, there is a systematic difference after ~2000 between the Hadcrut4.5 binning and the triangulation methods, due to the different treatment of the polar regions. The spherical triangulation gives slightly higher net annual values than Hadcrut4.5 (up to 0.05 °C) after 2000.

The monthly comparison is shown in Figure 8, excluding the icosahedral binning. The agreement is good but the same systematic effects are observed for the same reasons.

The most interesting result is that spherical triangulation reproduces the Cowtan and Way result almost exactly. This shows that triangulation alone resolves any Hadcrut4 coverage biases in an internally consistent manner, without the need for interpolation or bin averaging.

Another advantage of triangulation is that land-ocean borders are respected in a consistent way. There is always a problem when averaging across a grid cell containing both land and ocean data: ideally the average should be weighted by the fraction of land and ocean within the cell. This problem is avoided in triangulation because all measurements contribute equally irrespective of location.

## Conclusions

We propose that for monthly averages the spherical triangulation method is the most accurate, because it does the best job of covering the poles. This conclusion is supported by the observation that it reproduces Cowtan & Way's result, without the need for satellite data or kriging. The drawback of triangulation is that the grid changes continually with time, making spatial averaging difficult. For this reason icosahedral grids are proposed as optimum for annual or decadal averaging, although the differences compared to latitude-longitude gridding are small. The traditional Hadcrut4.5 5×5 degree (lat, lon) binning is the most direct method, but it does slightly underestimate recent trends, which are concentrated in the Arctic. This is also partly because (lat, lon) cells can never span the poles.

All relevant software can be found at http://clivebest.com/data/t-study

### Acknowledgement

Useful discussions and advice from Dr. Nick Stokes (https://moyhu.blogspot.co.uk/).

## References

Cowtan, K., & Way, R. G. (2014). Coverage bias in the HadCRUT4 temperature series and its impact on recent temperature trends. *Q. J. R. Meteorol. Soc., 140*, 1935–1944.

Jones P.D., L. D. (2012). Hemispheric and large-scale land surface air temperature variations: an extensive revision and an update to 2010. *Journal of Geophysical Research, 117*.

Kennedy J.J., R. N. (2011). Reassessing biases and other uncertainties in sea-surface temperature observations since 1850, parts 1 & 2. *J. Geophys. Res., 116*.

Lawrimore, J. H. (2011). An overview of the Global Historical Climatology Network monthly mean temperature data set, version 3. *J. Geophys. Res., 116*.

Ning Wang, J.-L. L. (2011). Geometric properties of the icosahedral-hexagonal grid on the two-sphere. *SIAM J. Sci. Comput., 33*, 2536–2559.

Osborn, T., & Jones, P. (2014). The CRUTEM4 land-surface air temperature data set: construction, previous versions and dissemination via Google Earth. *Earth System Science Data, 6*, 61–68.

Robert Rohde, R. A. (2013). Berkeley Earth temperature averaging process. *Geoinfor Geostat*.

**Note:** I wrote this work up as a paper for possible publication. However journal charges have become so ridiculously expensive for any individual that I will simply post it here instead.

## Comments

Congratulations, Clive. I agree with you of course on triangular mesh; we’ve often discussed that. And I agree too on icosahedra. In the past I’ve used a cubed sphere in a similar way, with a similar bisection process. I’ve been slowly working on a post (any day now) which is similar for icosahedra. Mainly the idea is that, for routine use, you can escape from geometric details by just recording in which cell each station from the inventory lies, and the area of the cell. Then for each month, it’s just a lookup. I think area-weighting is still needed, because there is still some variation in area.

There remains the issue of empty cells. If they are just omitted, they are implicitly assigned the global average, and this is often inappropriate – Cowtan and Way’s paper was basically about fixing that. I use a diffusion method for the cubed sphere to get an average of neighbours. But there is a neat way that follows from the bisection method that you are using. You can average at each level of bisection. Then at the finest level, you can assign to each cell the average of data within it, if it has data, or if not, the average of the smallest cell from a coarser bisection that contains it and has data.

I think the triangulation method still has a land-ocean problem, because around coasts some triangles have nodes of mixed types, and within the triangle, you are interpolating with that land/ocean mix. It is possible to refine the mesh so that such triangles are greatly reduced in area. I described that here.

ps I was puzzled by your 2562 triangles. Doesn’t it have to be a multiple of 20?

The link to “all relevant software” gives a 404 error.

I need to set up a proper readme file with an explanation. It will be there soon.

The link should work now. You might be interested in the icosahedral code, which generates the grid to level ‘n’ and handles all the binning. They were written by N. Teanby in 2005:

http://clivebest.com/data/t-study/Zicos.zip

Although they are written in IDL it should be fairly easy to adapt them for R.

Thanks Nick,

The cubed sphere is an interesting idea. I suppose it could also work for an octahedron or a dodecahedron, but the maths would be a lot harder. There will always be an area-weighting problem, but I think it is insignificant for an icosahedron-generated mesh. Yes, 2562 is a mistake. It should have been 5120. I subdivide the icosahedron 4 times, so the correct figure is 20*256!

The triangulation method does a better job than binning when handling mixed land and ocean, because it treats each measurement individually. So for a triangle with one land and 2 ocean points, the average favours ocean and vice versa for 2 land and one ocean.

Your refinement of the grid using a land mask to refine the triangles along the coast is very clever, but you then have to assign an interpolated value to the new land/ocean intercept nodes.

Clive,

“I think it is insignificant for an icosahedron generated mesh”

When you do the first division, the centre triangle has area 0.180, and the three corner ones 0.149 (on the unit sphere). Not huge, but not nothing, and the ratio of differences persists.

“but you then have to assign an interpolated value”

Yes, but it’s a land-only interpolation on the land side (same on sea). I haven’t persisted, because it doesn’t make that much difference. The split triangles mostly balance out.

A dodecahedron actually gives least distortion, but you can’t do much with a pentagon face, except triangulate, which is then equivalent to icosahedron.

Re “journal charges have become so ridiculously expensive for any individual that I will simply post it here instead”

– QJRMS (in which Cowtan & Way was published) has no page charges,

– International Journal of Climatology has no page charges,

– Climate Dynamics has no page charges,

– Theoretical and Applied Climatology has no page charges,

– &c

Thanks,

Climate Dynamics has no up-front page charges, but if the paper gets through open review then you still have to pay several hundred euros to get it published. Initially you had to pay up front before review.

QJRMS looks interesting but you still have to pay for colour figures. I will look at the others though.

Congratulations Clive.

Better precision allows opportunities for more investigative tests and better science. With the global treasury and lives on the line this ought to be worth publishing in a journal.

Thanks Ron.

“Berkeley Earth (Robert Rohde, 2013) fit a temperature distribution to a 1-degree grid”

Ah no. The solution is a continuous surface. After the continuous solution is done we can sample that solution in any way we want. All that is needed is an elevation map.

For the gridded solution, we take a 1-degree elevation grid: the value of the elevation over that grid is used along with the functions for latitude and season to derive a value for that grid. For Google we did a 1 km grid; again, the same continuous surface is used for the climate and the weather, and the value for elevation is taken from the 1 km grid. If you wanted to you could use a continuous elevation surface, or any kind of polygonal structure you want.

In short: we grid as a POST PROCESSING step, not as a part of the averaging process.

How do the different techniques perform in cross-validation? That will give an indication as to whether it is appropriate or not.

I don’t understand what you mean by cross-validation. Please explain.

I don’t know what he means by cross-validation in your context either. But this is the way it works for a physics-based model. Say you split a time-series data set into two intervals. One interval is used for training and the other interval is used for test (and then reverse the roles). If the parameters of the model remain relatively invariant along with a reasonable fit after reversal, then you have gained confidence in the model, and that there is likely little over-fitting.

http://contextearth.com/2017/08/08/enso-split-training-for-cross-validation/