Yacht Races: an Infographic

Thanks very much to East Fremantle Yacht Club, who recently sent me this infographic detailing the great yacht races of the world. I'll let them introduce it properly:

The world’s oceans provide the setting for some of the most difficult and prestigious races in sailing, some of which have been in existence for a century or more. From the world-renowned America’s Cup, which has been sought after by yachting enthusiasts for more than 150 years, to more recent big prizes such as the Volvo Ocean Race and Vendée Globe, this infographic by East Fremantle Yacht Club (http://www.efyc.com.au/) traces the origins of these time-honoured sailing competitions. It also gives the reader a few interesting facts about each race, in addition to acknowledging the reigning champions.

Yacht races infographic from East Fremantle Yacht Club

F1 Tyre Saving

This is a guest post written by our friend Bill as a result of some discussion around the race traces I've been posting and in particular how straight the lines are in some races. Over to you Bill...

Tyre saving has become a hot topic in F1 over the past few seasons. I've heard engineers remind drivers to save tyres over the radio, and drivers blame it for poor race performance. But what should a team and driver do through a race to best exploit tyre saving? Is there anything that can be done?

To begin thinking about this, we need a couple of things in place - a model for tyre performance, and an idea of the size of the effects.

Models can be complicated, but luckily something simple seems to fit the evidence pretty well. If we assume that every lap at racing speed increases, by a constant amount, the minimum potential lap time a car can achieve on subsequent laps, then all looks good - we have an explanation for why lap times fall massively after a pit-stop (when tyres carrying many laps' worth of this “tyre degradation” are replaced), and for why lap times don't get much faster through a stint.

Why are constant lap times through a stint evidence of cumulative tyre deg? Well, we know F1 cars go faster when they are lighter - estimates in the public domain seem to hover around 0.03s/lap per kg. This means an F1 car should get quicker as it burns off fuel. So when it doesn't, we know a cumulative slowing effect must exist. If we assume all of this is tyre deg, then we instantly have an estimate for the size of the per-lap tyre degradation effect - it has to be about the same as the gain expected from burning one lap of fuel. If fuel consumption is 1.75 kg a lap, then that makes tyre deg about 1.75\,\mathrm{kg}\times0.03\,\mathrm{s/lap/kg} \approx 0.05 s/lap lost every lap. With this model and 20-lap-old tyres, we'd be going 20\times0.05 = 1 s/lap slower than if we were on fresh new tyres.
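As a sanity check, this arithmetic takes a couple of lines (the constants are just the rough public-domain figures quoted above, not team data):

```python
# Rough public-domain figures quoted above -- not team data.
FUEL_PER_LAP_KG = 1.75      # fuel burned per lap
PACE_GAIN_PER_KG = 0.03     # s/lap gained per kg of weight removed

# If constant lap times mean deg cancels the fuel-burn gain, then:
deg_per_lap = FUEL_PER_LAP_KG * PACE_GAIN_PER_KG      # s/lap lost per lap run
print(f"tyre deg: {deg_per_lap:.4f} s/lap per lap")              # ~0.05
print(f"20-lap-old tyres: {20 * deg_per_lap:.2f} s/lap slower")  # ~1 s/lap
```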

So what's tyre saving all about then? The idea seems to be that, if the driver goes slower at particular points on the lap, he can reduce the tyre degradation he accumulates on that lap. With our model for tyre deg, we know tyre deg slows him on all subsequent laps. Hence a reduction in tyre deg benefits him over the remainder of the laps he completes on those tyres. How a driver might save tyre deg during a lap is likely pretty complicated, and probably the focus of a fair amount of research in F1. Fortunately, we can make progress without it here - we can just model it as a function relating the deliberate slowness, y seconds, that a driver adds to his lap time to the per-lap tyre deg, x s/lap, that he accumulates on that lap.

What shape should this relationship take? The upper and lower end points seem pretty obvious - it seems unlikely that tyres would get faster no matter how slow a car goes, and you'd expect tyre deg to be maximal at a driver's flat-out speed, when he is doing no tyre saving. This suggests the most likely shape is something like an exponential decay:

x = c \mathrm{e}^{-\alpha y}

for  0 \le y < \infty

\alpha = a positive constant.
y = the deliberate slowness a driver adds to his laptime to save tyres on a lap (s).
x = tyre deg accumulated on this lap that will affect all subsequent laps (s/lap).
c = tyre deg with y = 0; no deliberate slowness (s/lap).

So if we go flat out (y = 0) we accrue our maximum deg (c), if we go really slow (big y), we accrue close to zero deg. If there are L laps left on these tyres, the effect on total race time of going y s/lap slow on this lap is the cost y on this lap + the tyre deg effect on every remaining lap. If we let change to race time due to saving on this lap = \Delta T:

\Delta T = y + L c \left(\mathrm{e}^{-\alpha y} - 1\right)

We can look at the minimum of this by differentiating it and setting the result to zero. This yields:

\frac{\mathrm{d}\Delta T}{\mathrm{d}y} = 0 = 1 - \alpha L c \mathrm{e}^{-\alpha y}

Which results in:

y = \frac{1}{\alpha} \ln\left(\alpha L c\right)

This solution has a problem: y can be negative - which means it can ask the driver to go faster than his fastest, accumulating even more tyre degradation than at his flat-out pace. We've specifically disallowed this as unhelpful in our model, and believe the driver is flat out when he says he is, at y = 0. This makes our real solution for the minimum:

y = \max\left(0, \frac{1}{\alpha} \ln\left(\alpha L c\right)\right)

This y doesn't depend on our behaviour on any other lap, so the fastest way to the end of the race is to go this optimal amount slower on every lap.

Was all this worth bothering about? Let's put some numbers in and see.

From our argument above, let's assume c is about the same as the fuel-burn weight effect: 0.05 s/lap.

If we set \alpha = 2, then going 1s/lap slower than flat out cuts the tyre deg accrued on that lap from 0.05 to about 0.007 s/lap - which doesn't seem ridiculous.
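Putting the closed-form solution and these numbers together, a few lines of Python give the optimal slowness for any number of laps remaining (c and \alpha are the illustrative values assumed above):

```python
import math

C, ALPHA = 0.05, 2.0   # tyre deg at flat out (s/lap) and saving-rate constant

def optimal_slowness(laps_left: int, c: float = C, alpha: float = ALPHA) -> float:
    """y = max(0, ln(alpha*L*c)/alpha): optimal s/lap to give away on this lap."""
    if laps_left <= 0:
        return 0.0
    return max(0.0, math.log(alpha * laps_left * c) / alpha)

# Saving only pays while alpha*L*c > 1, i.e. with more than 10 laps still to run
print([round(optimal_slowness(L), 3) for L in (20, 15, 10, 5)])
# [0.347, 0.203, 0.0, 0.0]
```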

With these settings, optimal slowness, y s/lap, looks like this:

Fig. 1: Optimal tyre saving with 20 laps remaining and \alpha = 2.

The most striking feature of this is that it is zero towards the end of the stint. This suggests the driver shouldn't be doing any saving from lap 11 onwards - just rinsing his tyres for all they are worth. It's just not worth going slow at all from here on in as there aren't enough future laps to recoup how slow you had to go on this lap to get the performance. All the important tyre saving is done at the start of a stint.

Slightly less expected is the behaviour at the start of a long stint with a lot of laps left - you don't go that much slower than on the previous lap. The function is convex in this area. Despite the massive gain you get by having your saving last for a lot of subsequent laps, you're already going quite slow and are well into the greatly diminishing part of the exponential and so barely get any return for going a lot slower.

So how would a car driven like this stack up in a race with a car driven flat out from the start? I've compared those two, and a car driven with optimal constant slowness and optimal linear reducing slowness in the race trace below.

Fig. 2: Race-trace for various tyre saving strategies, one stint only.

Pleasingly, our optimal deliberate slowness model wins. It performs only a little bit better than optimal linear reducing slowness, but a load better than going flat out every lap. The race trace shows the optimally driven car drops back by over a second over the first few laps, but then catches up and more as he puts his saved tyre performance to use.
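Because total stint time decomposes lap by lap under the model, the comparison is easy to simulate. Below is a minimal sketch (not the code behind the figure): it compares flat out, a linearly reducing slowness whose starting level is found by crude grid search, and the per-lap optimum, all with the example numbers above.

```python
import math

C, ALPHA, LAPS = 0.05, 2.0, 20   # the illustrative values assumed above

def stint_time(slowness):
    """Total time added over the stint (s), relative to zero-deg laps.

    Each lap costs its own deliberate slowness plus the deg already
    accumulated; a lap driven y slow adds C*exp(-ALPHA*y) of deg."""
    total, deg = 0.0, 0.0
    for y in slowness:
        total += y + deg
        deg += C * math.exp(-ALPHA * y)
    return total

flat_out = [0.0] * LAPS
# per-lap optimum: deg on lap n only hurts the LAPS-1-n laps after it
optimal = [max(0.0, math.log(ALPHA * L * C) / ALPHA) if L > 0 else 0.0
           for L in (LAPS - 1 - n for n in range(LAPS))]
# linearly reducing slowness, starting gradient picked by crude grid search
best_linear = min(
    ([y0 * (1 - n / (LAPS - 1)) for n in range(LAPS)]
     for y0 in (i / 100 for i in range(51))),
    key=stint_time,
)

# flat out adds 9.5 s over the stint; the other two save a few tenths
for name, ys in [("flat out", flat_out), ("linear", best_linear), ("optimal", optimal)]:
    print(f"{name:9s} {stint_time(ys):5.2f} s added")
```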

This profile (and the more extreme ones for higher deg) make for some interesting possibilities. In an effective two horse race, there seems little disadvantage to the second car in going a little bit slower than the lead car at the start of the stint – you will save tyre performance and be quicker at the end of the stint. If the first car is driven optimally, he won't catch it before the end of the race, but if the first car has any issues at all (safety cars, missing a chicane, a slow lap...) he will not only close the gap, but will have a faster car than his opponent for the remainder of the race. Moreover, if the other car has underestimated the actual tyre deg rate, he will be driving closer to optimal and be able to catch, and have the chance to pass him, before the end of the race.

The model also gives us a clue as to why tyre saving seems to be a relatively recent hot topic. With a modest reduction in tyre deg (to half our estimate, so that \alpha L c \le 1 even with 20 laps remaining), the optimal slowness is always zero and a driver should be going flat out from the start. No tyre saving helps.

As ever, the real situation is likely to be more complicated than we have modelled. We've ignored the probability of being slowed by other cars and we've assumed tyre deg behaviour is constant and known throughout a race, rather than variable and hard to predict. All of these are likely to be important factors, and must make for some interesting race day strategy debates within teams.

The (not-so) Exponential Growth of Knowledge

A couple of weeks ago I enjoyed a dinner in London with other alumni of University College Oxford. In speaking about the state of the college, the Master of Univ said that it was a constant struggle to decide what to include in undergraduate courses because,

"...our knowledge increases exponentially."

I know the Master was speaking figuratively, but being an engineer I started to wonder if knowledge actually does increase exponentially, and what conditions would be necessary to make this happen?

Where to start? Well, let's start with Einstein, or rather one of Einstein's collaborators, John Archibald Wheeler. In 1992 he said,

"We live on an island surrounded by a sea of ignorance. As our island of knowledge grows, so does the shore of our ignorance"

How fast does this island of knowledge grow? Let's start by making some assumptions about Wheeler's universe, namely:

  1. We expand the island by reclaiming 'knowledge land' from 'ignorance sea'.
  2. We always have enough resources on the island to reclaim land, but...
  3. We don't have any boats, so we have to stand on the island to do the island expansion work.

All of these assumptions add up to the realization that the rate of expansion of the island is proportional to the length of the shoreline. So if our island is round, or roughly round, and the area of the island corresponds to the amount of knowledge, K, then its shoreline length, s, equals the circumference of the circle, and we can write:

\frac{dK}{dt} = As = A\cdot2\sqrt{\pi K}


John Archibald Wheeler's Island of Knowledge surrounded by the Sea of Ignorance.

Here A is the ignorance-reclaiming speed in m/s. This doesn't give us the exponential rate that the Master spoke of; it's merely quadratic:

 K(t) = A^2\pi t^2 .
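This is easy to check numerically: a simple Euler march of the shoreline model tracks the closed form (A is an arbitrary reclaim speed; the units are arbitrary too).

```python
import math

A, DT = 1.0, 1e-4        # reclaim speed (arbitrary units) and time step
K, t = 1e-12, 0.0        # seed the island: from exactly K = 0 it never grows!
while t < 10.0:
    K += 2.0 * A * math.sqrt(math.pi * K) * DT   # dK/dt = A*s = 2A*sqrt(pi*K)
    t += DT

# numerical vs K(t) = pi*(A*t)^2 -- quadratic in t, nothing like exponential
print(K, math.pi * (A * t) ** 2)
```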

Perhaps if the island were a more convenient shape we could achieve exponential growth? In fact a circle is the worst possible shape for the island, because it has the lowest ratio of shoreline to area of any 2D shape. The best possible shape would be long and thin - infinitely thin, in fact - then area and shoreline would be proportional, and so our island would grow exponentially, wouldn't it?

Well, no, it wouldn't, at least not for very long. In order to maintain exponential growth the island must stay infinitely thin, and so it can only grow at its ends, but this isn't really the land-reclaim model we started off with. You can hardly say that shoreline length is the limiting factor in the growth of an island if you insist on reclaiming land only at the infinitely thin ends of the island!

In fact we find the interesting result that if we grow uniformly from each part of the shoreline, then no matter what clever shape we start off with, we'll always end up with a circular island! Even if we start with exponential growth of the island (perhaps from some clever fractal geometry), we will soon settle to the growth rate given above, which increases with time, but is nonetheless far from exponential.

In fact there may be a paradox even in the Master's statement of the problem: if knowledge grows exponentially, then the growth rate must be proportional to the amount of knowledge. Growth in knowledge requires research, and researchers, but the Master's statement was itself concerned with the fact that each year we must select a smaller fraction of our ever growing knowledge to teach to the next generation of researchers. If this fraction is decreasing, then it looks like we're on a circular island of knowledge - it will grow ever faster, but at the same time, ever slower than exponential growth.

"Going Downstairs" in Professional Darts

A week ago I flicked on the TV and caught the tail end of a quarter final in the World Darts Championship. One of the pundits (with lots of gold chains round his neck) was saying "if you're not hitting twenties, go for 19s, and if that doesn't work try 18s, even 17s if you have to!". The message was that if you start missing the treble 20 then you should switch and go for lower numbers until you find a treble you can hit more reliably. This got me thinking: do darts players actually do better if they switch to aiming at treble 19 after missing a couple of treble 20s? To do so they would surely have to get more accurate after switching to 19s than they were when aiming at 20s, to compensate for the lower score for each dart.

I mentioned this to Dave Millican at work and he suggested that 19 might be a safer target for a player with an accuracy problem, because the segments either side of it aren't such low scores as those surrounding the 20. This is a good point: the 20 is flanked by 5 and 1, whereas the 19 has the slightly friendlier 7 and 3 for company. The same logic doesn't support what the pundit was saying about going for 18 if 19 isn't working for you, though, as the 18 has the 1 and the 4 either side, which is worse than the 20!

A quick search on Google Scholar revealed a very interesting paper by researchers at Stanford who have looked into exactly this question. They calculated the optimal place to aim on the dartboard for players of varying levels of skill. Skilful players consistently land their darts close to the point they're aiming at, so when aiming at a fixed point their darts land in tight groups; in other words, the standard deviation of the distance between the landing point and the aiming point is small. Rubbish players' darts land all over the place; they have a high standard deviation. Tibshirani, Price, and Taylor found the best place for any player to aim by maximising the expected score for a given aiming point: the integral of the score over the board, weighted by the probability of the dart landing at each spot.

As you might expect, they found that a very good player, with a standard deviation of only a few millimetres, should aim at the treble 20 for maximum points. A rubbish player, with a very high standard deviation, should aim very near the centre of the board, to minimise the chance of missing the board altogether! The interesting bit is what happens in between. As standard deviation increases from zero, the optimum aiming point moves slightly up and to the left, to favour hitting the 5 instead of the 1 on the occasional wild dart. Then, when standard deviation increases beyond 16.4mm, the optimum aiming point jumps to treble 19! As the standard deviation increases further, perhaps after the third pint, the optimal aiming point curves upwards and then around to the right until it settles just to the left of the bullseye.

Figure 1: movement of the optimum aiming point as a player's standard deviation increases (reproduced with permission from "A Statistician Plays Darts", Tibshirani, Price, Taylor, JRSS Series A, Vol. 174, No. 1, pp. 213-226, 2011)
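Out of curiosity, the paper's expected-score calculation is easy to approximate with a Monte Carlo sketch. The dimensions below are the standard tournament board measurements (trebles 99-107mm from the centre, doubles 162-170mm, bulls at 6.35mm and 15.9mm radius); the symmetric Gaussian scatter and the neglect of wire thickness are simplifying assumptions, so the numbers are indicative rather than a reproduction of the paper's figures.

```python
import math
import random

# sector numbers clockwise, starting with the 20 at the top
SECTORS = [20, 1, 18, 4, 13, 6, 10, 15, 2, 17, 3, 19, 7, 16, 8, 11, 14, 9, 12, 5]

def score(x, y):
    """Points for a dart at (x, y) mm from the board centre, y upwards."""
    r = math.hypot(x, y)
    if r <= 6.35:
        return 50            # inner bull
    if r <= 15.9:
        return 25            # outer bull
    if r > 170.0:
        return 0             # off the board
    theta = math.degrees(math.atan2(x, y))   # clockwise angle from vertical
    base = SECTORS[int(((theta + 9.0) % 360.0) // 18.0)]
    if 99.0 <= r <= 107.0:
        return 3 * base      # treble ring
    if 162.0 <= r <= 170.0:
        return 2 * base      # double ring
    return base

def expected_score(aim_x, aim_y, sigma, n=200_000, seed=1):
    """Monte Carlo estimate of expected score with Gaussian aim error sigma (mm)."""
    rng = random.Random(seed)
    return sum(score(aim_x + rng.gauss(0, sigma), aim_y + rng.gauss(0, sigma))
               for _ in range(n)) / n

T20 = (0.0, 103.0)                                # centre of the treble 20 bed
T19 = (103.0 * math.sin(math.radians(198.0)),     # treble 19, 198 deg clockwise
       103.0 * math.cos(math.radians(198.0)))

for sigma in (5.0, 25.0, 100.0):
    print(sigma, expected_score(*T20, sigma), expected_score(*T19, sigma))
```

With a few millimetres of scatter the treble 20 comes out on top; crank sigma up and both aiming points decay, with the optimum (per the paper) eventually migrating towards the bull.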

So it seems that our pundit was right in one case: it is worth "going downstairs" for the treble 19 if your aim on treble 20s degrades past a certain point. But in terms of maximising expected score, it's not worth switching to 18 if your aim on 19 isn't good, nor to 17 if your aim on 18 isn't good. Of course there may be psychological factors, and a certain bias due to a player's dart distribution being skewed at an angle, making her better at hitting trebles at one angle than another (this is also investigated in the paper).

In terms of whether darts players do the right thing during matches (statistically speaking) there is still a question mark over switching to treble 19. Players obviously only switch to 19 if they start missing treble 20s, in other words their internal estimate of their standard deviation rises and they react by switching to 19. The question is how does this internal estimate compare to the true value, and how close to the ideal threshold of 16.4mm do they switch between 19 and 20? This is much more difficult to answer than our first question. To do so we would need dart by dart position data for a player over many many games, including many occasions where the player switched to 19, which most players don't do that often. Perhaps I should write a video analysis algorithm to watch and datafy TV footage of darts and then crunch the numbers to see which players' mental estimates are closest to the truth. My guess would be that some players tend to switch too soon because of one random dart going awry when in fact this is not sufficient evidence that the underlying standard deviation has actually changed. On the other hand, maybe the player has a good idea of not only what his aim is like at the moment, but whether he's feeling more tense and therefore about to get worse, enabling him to preemptively switch to 19 before hitting the 1!

Start-line Bias

Two weeks ago I did a yacht race in Shoreham and there was some interesting discussion before the start about which end of the line was favoured. Normally, when the windward mark is directly upwind from the start line, picking the favoured end is very simple. In this case it was a bit trickier; the start line was not square, which is not unusual, but also the 'windward' mark was so far off to starboard that the port layline for the windward mark actually bisected the start line. In this situation, starting from the starboard end of the line would mean that you would have to tack, whilst starting from the port end of the line would mean that you could reach to the mark in one. In the case of this race in Shoreham, the port (windward) end of the line was actually closer to the windward mark than the starboard end (as pinged by our GPS), so not only did it offer a faster point of sail, but less distance to sail as well: no-brainer, start at the port end! The more interesting question is what would have been the best part of the line if the windward end had been further from the mark than the leeward end?

Figure 1: the startline as it actually was, C-B, and the more interesting case, A-B.

Figure one shows the windward mark, W, the start line as it was in Shoreham, C-B, and the more interesting case of a squarer startline, A-B. When the windward end is further from the top mark than the leeward end there is a trade-off between the extra distance you have to sail and the extra speed you get on a lower point of sail. So how do you resolve this trade-off and find the optimal starting point?

The answer lies (as do the answers to so many sailing questions) in the polar diagram for the boat. The polar tells us how fast the boat will sail at every wind angle, so we must be able to use it to find the point on the start line which will get us to the windward mark in the least time, we just need the right bit of geometry.

The right bit of geometry is shown in figure two. Step one is to draw/trace/superimpose the polar diagram centred on the windward mark, but drawn upside-down relative to the wind direction; scale doesn't matter. Next we transpose the startline (maybe with a parallel rule) towards the windward mark until it just touches the polar. Finally, we draw a line through the windward mark and the point where the polar and the transposed startline touch, extending it until it intersects the actual startline. The point of intersection between this line and the startline is the optimal place to start, easy!

Figure 2: geometric construction to find optimal starting point, X. The upside-down polar is shown centred on the windward mark in blue. The port layline is shown in red. Transposed start-lines are shown in green.
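The same answer can be found numerically rather than geometrically: parameterise the start line, and for each candidate point compute the time to the mark as distance over polar speed at the resulting wind angle. Everything below is made up for illustration - a toy polar and an invented line/mark geometry, with the wind due north - so it's a sketch of the method, not a model of the Shoreham course.

```python
import math

def polar_speed(twa):
    """Toy polar (kn): no progress closer than 40 deg to the wind; speed
    builds as the boat bears away and flattens off by ~48 deg. A real
    polar would come from the boat's performance data."""
    if twa < 40.0:
        return 0.0               # can't fetch: would have to tack
    return 6.0 + 0.15 * (min(twa, 48.0) - 40.0)

W = (0.2, 1.0)                   # windward mark (nM), wind from due north

def time_to_mark(px, py=0.0):
    """Minutes to sail the rhumb line from (px, py) to W, one leg only."""
    dx, dy = W[0] - px, W[1] - py
    heading = math.degrees(math.atan2(dx, dy))   # bearing, 0 = north
    v = polar_speed(abs(heading))                # wind from north, so TWA = bearing
    return math.hypot(dx, dy) / v * 60.0 if v > 0 else math.inf

# grid-search the line from A(-1.2, 0) to B(0.2, 0)
xs = [-1.2 + 1.4 * i / 2000 for i in range(2001)]
best_x = min(xs, key=time_to_mark)
print(best_x, time_to_mark(best_x))
```

Points near B can't lay the mark at all (infinite time - they'd have to tack), and the best start sits partway along the line, beating both a layline start and the far windward end, just as the geometric construction finds.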

In the case of a one-design fleet sailing on the course shown in figure two, a boat starting from X will take about 4.5% less time to reach the mark than a boat starting on the layline (from L). So if these boats do 6kn (knots) when hard on the wind (i.e., along L-W) and the distance L-W is one mile, then the boat starting from L will reach the windward mark after 10 minutes, by which time the boat that started from X will be 27 seconds ahead - a lead of 83m!

Looking at it another way, the J92 I was sailing on in Shoreham is 9.2m long, so it does one length every 3s when travelling at 6kn, which means that even if the windward mark was only 250m from L, the boat starting from X would still be able to round the windward mark clear ahead of the boat that started from L without having to give mark room!

The table below lists the distance sailed, speed, and time to reach W for each of four boats that start at A, X, L, and B:

starting point   distance (nM)   speed (kn)   time
A                1.15            7.16          9:40
X                1.10            6.91          9:33
L                1.00            6.00         10:00
B                0.95            5.33         10:41
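The times in the table are just distance over speed. Since the quoted distances and speeds are themselves rounded, recomputed times can differ from the table by a second or two, but the ranking and X's winning margin check out:

```python
# distance (nM) and boat speed (kn) for each starting point, from the table
boats = {"A": (1.15, 7.16), "X": (1.10, 6.91), "L": (1.00, 6.00), "B": (0.95, 5.33)}

times_s = {name: d / v * 3600.0 for name, (d, v) in boats.items()}   # seconds
print(sorted(times_s, key=times_s.get))      # fastest first: X, then A, L, B
print(round(times_s["L"] - times_s["X"]))    # X's lead over L at the mark: 27 s
```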

So working out where on the line to start is clearly worth it in situations like this: had we assumed the line could only be biased towards one end or the other, we would have been over two lengths behind where we could have been had we started at X.

Figure three shows us practising our reach to the windward mark, that's me in red and white easing the jib sheet:

Figure 3: Upstart on a practise reach from the line to the windward mark.

The Quickest Way to Rack Pool Balls

Whilst playing pool in the college bar years ago my friend Chris posed an interesting question: after removing the balls from the end of the table and randomly dropping them into the triangle, what is the minimum number of 'moves' necessary to rearrange the balls into the correct pattern, where a 'move' is swapping the positions of two balls?

It turns out that if the black ball starts off in one of the centre positions (marked with an 'X' in Fig 1) then you can always get to the correct pattern in 3 swaps or fewer. If the black starts off in one of the outside positions then you can still get to the correct pattern in 3 swaps 75% of the time; the remaining 25% require one additional swap. An explanation of this result is given below...

The correct pattern, as defined by the World Eightball Pool Federation, is shown on the left in Fig 1:

Fig 1: The official pattern (left) and symmetrical variations of it (right).

To make our lives easier, we will also allow the three patterns on the right of the figure, as these are all symmetrical (in colour or side) versions of the official pattern. The table is symmetrical, so these 4 are all equivalent. We also note that turning the triangle a third of a turn either way is very easy, so we will not count this as a 'move'. This means that all 3 rotated versions of each of the above 4 patterns are also counted as correct. In total, then, we have 12 patterns which we consider to be 'solved'.

Let's number the positions, starting from 1 at the bottom left corner, and counting to the right along the bottom row, up to 5, then left to right along the second row, from 6 to 9 etc... If we represent each red ball as a '1', each yellow ball as a '0', and the black as a '2', then we can write down the pattern of balls in the official arrangement (left of Fig 1) as a list of numbers, or vector, namely:

[1 0 0 1 0 0 1 0 1 1 2 0 0 1 1]

If we swap the bottom left red with the yellow on its right (positions 1 and 2), we get a slightly different pattern:

[0 1 0 1 0 0 1 0 1 1 2 0 0 1 1]

If we have these two vectors on a computer we can easily find which elements don't match by using the 'not equal to' operation, which in Matlab is denoted by '~='. For example:

[1 0 0 1 0 0 1 0 1 1 2 0 0 1 1] ~= [0 1 0 1 0 0 1 0 1 1 2 0 0 1 1]

gives

[1 1 0 0 0 0 0 0 0 0 0 0 0 0 0]

In this way we can identify which, and how many, balls differ in position from one pattern to the other. One swap changes the positions of two balls, so if we divide the number of differences between any two patterns by two we get the number of swaps necessary to get from one pattern to the other. We have to round this number up, if it's not an integer, because if two patterns differ by the positions of three balls (one of which must be black) then two swaps will be required to change one into the other. We can't do one and a half swaps!

It is a fairly simple operation to compute all patterns that are possible with seven '1's, seven '0's and one '2'; there are 51,480 of them. Using the above method we can find the number of swaps necessary to change each of the 51,480 possible patterns into each of the 12 solutions. Taking the smallest of these 12 numbers for each pattern we find the answer to our question:

If the black is in one of the central positions (which happens in 3 of every 15 patterns) then the pattern can always be solved in 3 swaps; if the black is not in the centre then the pattern can be solved in three swaps 75% of the time, and always in 4 swaps!
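The enumeration and the swap-count rule are easy to reproduce (in Python rather than Matlab here). Building the full 12-pattern solution set also needs the triangle's rotation and mirror permutations, which are omitted, so this sketch just shows the counting machinery:

```python
from itertools import combinations
from math import comb

OFFICIAL = [1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 2, 0, 0, 1, 1]  # reds 1, yellows 0, black 2

def swaps_needed(pattern, target):
    """Mismatched positions, divided by two and rounded up (one swap fixes
    at most two mismatches; an odd count needs an extra swap)."""
    diff = sum(a != b for a, b in zip(pattern, target))
    return (diff + 1) // 2

# every placement: 7 red slots out of 15, then the black in one of the rest
patterns = []
for reds in combinations(range(15), 7):
    for black in set(range(15)) - set(reds):
        p = [0] * 15
        for i in reds:
            p[i] = 1
        p[black] = 2
        patterns.append(p)

print(len(patterns), comb(15, 7) * 8)   # 51480 both ways
```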