Going and Trip Preferences

Introduction

At Newbury last Saturday Al Ferof ran in the Denman Chase and Smad Place in the Novice Chase. Both horses had previously run below form when faced with the conditions they met on Saturday: the same 24f Trip in Al Ferof’s case, and the same Heavy Going in Smad Place’s case. Al Ferof ran 7lb below his previous best in the 24f King George VI Chase at Kempton on Boxing Day 2013, whilst Smad Place ran 13lb below his previous best on Heavy Going in the Long Walk Hurdle at Ascot in December 2012. So how important is previous form over the same Trip, or on the same Going, in assessing prospects when the same Trip or Going is faced again? On Saturday Al Ferof again ran below form, whereas Smad Place went on to win his race, posting his highest chase rating to date.

Data, Universe & Method

The source of the data is Raceform Interactive (RFI), with the analysis carried out in the R statistical environment. National Hunt (NH) races in the database from 2006 up to early February 2014 are considered. Each time a horse runs, its Racing Post Rating (RPR) is compared with the maximum RPR the horse has achieved up to and including race date. The difference between the two numbers is defined as the RPR relative (RPRrel). The maximum value RPRrel can achieve is zero. Following Timeform’s definition, if a horse runs within 5lb of its maximum rating it is considered to have Run To Form (RTF) – horses are then classified into two categories – either having RTF (given a 1) or below (given a 0). The Going on which races took place was classified into four categories: Quick, Good, Soft and Heavy. The distance over which races were run was classified into five Trip categories: up to 17f, between 17f and 21f, between 21f and 25f, between 25f and 29f, and 29f plus.

To assess the importance of previous form given the Going or Trip, the RTF classification as at a particular race (1 or 0) is compared with the RTF classification the horse achieves (1 or 0) in its next race. All of the previous races on the Going or over the Trip are used to determine the classification, so if a horse has RTF on any previous occasion on that Going or over that Trip it is allocated a 1. Horses that haven’t run on the Going or over the Trip are excluded.
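The RPRrel and RTF definitions above can be sketched in a few lines of Python. This is an illustration only, not the R code used for the analysis: the function names and the example ratings are hypothetical, and treating a run exactly 5lb below the best as RTF is an assumption.

```python
RTF_THRESHOLD_LB = 5  # "within 5lb of its maximum rating"; inclusive boundary is an assumption


def rpr_relative(rprs):
    """RPRrel per run: RPR minus the best RPR up to and including that run."""
    best = None
    out = []
    for rpr in rprs:
        best = rpr if best is None else max(best, rpr)
        out.append(rpr - best)  # never greater than zero
    return out


def run_to_form(rprs, threshold=RTF_THRESHOLD_LB):
    """1 if the run is within `threshold` lb of the running maximum, else 0."""
    return [1 if rel >= -threshold else 0 for rel in rpr_relative(rprs)]


# Hypothetical ratings for one horse, in date order
example_rprs = [120, 131, 128, 140, 133, 139]
print(rpr_relative(example_rprs))  # [0, 0, -3, 0, -7, -1]
print(run_to_form(example_rprs))   # [1, 1, 1, 1, 0, 1]
```

Note that a run which sets a new personal best always has RPRrel of zero and so always counts as RTF under this definition.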

In the jargon the result is a 2×2 contingency table for each Going and Trip category, which can be assessed for statistical significance using a chi-squared statistic.
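For readers unfamiliar with the test, a minimal sketch of the chi-squared calculation on a single 2×2 table follows. The counts are made up for illustration (the tables in this piece report percentages, not cell counts), the function name is hypothetical, and no continuity correction is applied, which may differ from the defaults used in R.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value (1 degree of freedom) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts. Rows: previously Below, previously RTF.
# Columns: Below next time, RTF next time.
counts = [[740, 260], [650, 350]]
stat, p_value = chi_squared_2x2(counts)
print(f"chi-squared = {stat:.2f}, p = {p_value:.2g}")
```

With tens of thousands of observations per category, even differences of a few percentage points produce very small p-values, which is why the results below are described as highly significant.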

Going Analysis

Each of the 2×2 contingency tables for the four Going categories is presented in Table 1 below, with the number of observations given in parentheses. Classification by previous RTF is given in the first column, by subsequent performance in the second and third columns. Horses that have RTF in the past have about a one in three chance of repeating; for horses that haven’t RTF the chances are just over one in four. The results are highly statistically significant.

Quick (29,127) Below RTF
Below 74% 26%
RTF 65% 35%

Good (134,422) Below RTF
Below 72% 28%
RTF 68% 32%

Soft (46,127) Below RTF
Below 74% 26%
RTF 68% 32%

Heavy (32,971) Below RTF
Below 75% 25%
RTF 68% 32%

Table 1: 2×2 Contingency Tables (RTF vs. Below Form) for Going Categories

The extent of the difference in performance can be seen in the graph below, which shows median differences for the four Going categories. Extremes of Going exhibit greater differences, with a range of between 1 and 3lb.

[Graph: GoingRTF – median differences by Going category]

Trip Analysis

Each of the 2×2 contingency tables for the five Trip categories is presented in Table 2 below, with the number of observations given in parentheses. Classification by previous RTF is given in the first column, by subsequent performance in the second and third columns. Horses that have RTF in the past have a decreasing chance of delivering an RTF performance next time out as Trip distance increases. This is probably because horses become more exposed as they are stepped up in Trip. In all cases, however, horses that have previously RTF outperform those that have not. The results are again highly statistically significant.

up to 17f (86,133) Below RTF
Below 66% 34%
RTF 61% 39%

17f – 21f (91,666) Below RTF
Below 74% 26%
RTF 68% 32%

21f – 25f (65,248) Below RTF
Below 78% 22%
RTF 71% 29%

25f – 29f (9,036) Below RTF
Below 80% 20%
RTF 75% 25%

29f plus (1,565) Below RTF
Below 85% 15%
RTF 79% 21%

Table 2: 2×2 Contingency Tables (RTF vs. Below Form) for Trip Categories

The extent of the difference in performance can be seen in the graph below, which shows median differences for the five Trip categories. Differences are greatest when horses race between 21f and 29f. There is no difference in medians at 29f plus, although there is a difference in means.

[Graph: TripRTF – median differences by Trip category]

Summary

If a horse has previously RTF on a particular type of Going or over a particular Trip in NH races, it is significantly more likely to do so again than horses that have not. Differences in probabilities are mostly between 5 and 7 percentage points (4 to 9 across all nine tables), with relative differences substantially higher. The difference in median performance ranges between 0 and 3lb. One possible confounding influence is the effect of horses that run consistently to form regardless of Going or Trip. However, if this were the overriding effect it might be expected that the difference in performance seen in the graphs would be constant across Going and Trip categories. This is not the case. Analysis using ANOVA (analysis of variance) would answer this concern more fully.
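The differences and ratios quoted in the summary can be recomputed from the rounded percentages in Tables 1 and 2. The sketch below does only that arithmetic; the dictionary layout and function name are illustrative, and because the inputs are rounded the results are approximate.

```python
# Chance (%) of running to form next time, taken from Tables 1 and 2:
# (previously Below, previously RTF)
rtf_next_time = {
    "Quick": (26, 35),
    "Good": (28, 32),
    "Soft": (26, 32),
    "Heavy": (25, 32),
    "up to 17f": (34, 39),
    "17f - 21f": (26, 32),
    "21f - 25f": (22, 29),
    "25f - 29f": (20, 25),
    "29f plus": (15, 21),
}


def difference_and_ratio(below_pct, rtf_pct):
    """Percentage-point gap and relative ratio between the two groups."""
    return rtf_pct - below_pct, rtf_pct / below_pct


summary = {name: difference_and_ratio(*pair) for name, pair in rtf_next_time.items()}
for name, (diff, ratio) in summary.items():
    print(f"{name:<10} difference = {diff} points, ratio = {ratio:.2f}")
```

A gap of 6 points on a base of 26% is a ratio of about 1.23, which is the sense in which the relative differences are larger than the headline percentage-point gaps.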

Running to Form: Days Off, Consistency and Race Type

Introduction

Prior to the recent running of 2014’s Cleeve Hurdle, Timeform’s Micheal Williamson blogged on the relationship between days off and the chances of horses running to form, with the backdrop of Big Buck’s long-awaited return to action. By coincidence I was in the middle of some analysis on the same subject, partly in anticipation of the start of the Flat season. The focus of this blog piece is the relationship between days off, consistency, race type and runs to form (RTF) for Flat and National Hunt (NH) races.

Data, Method & Universe

The source of the data is Raceform Interactive (RFI), with the analysis carried out in the R statistical environment. Flat and NH races are considered separately. Only races in GB and Ireland from 2007 up to mid-January 2014 are considered. Each time a horse runs, its Racing Post Rating (RPR) is compared with the maximum RPR the horse has achieved up to and including race date. The difference between the two numbers is defined as the RPR relative (RPRrel). The maximum value RPRrel can achieve is zero. Following Timeform’s definition, if a horse runs within 5lb of its maximum rating it is considered to have Run To Form (RTF). Horses are classified as having either high or low consistency according to the percentage of times a horse has RTF relative to a cut-off of 50%. The choice of 50% is arbitrary. No allowance is made for the number of times a horse runs, or its age, in assigning a consistency classification, even though the older a horse is and the more often it runs, the more likely it is to run more than 5lb below its previous best. Races are classified as either handicaps, pattern races or other. For Flat races the analysis is restricted to horses aged 4 and above. The reason for this is to reduce the influence on the analysis of the return to racing after a long break of unexposed, previously immature horses. For similar reasons in NH races the analysis is restricted to horses aged 5 and above. Days off between races are classified into the following 6 buckets: fewer than 10 days, 10 to 29 days, 30 to 59 days (1m to 2m), 60 to 179 days (3m to 6m), 180 to 364 days (6m to 1y) and 365 to 730 days (1y to 2y).
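As an illustration of the two classifications described above, here is a short Python sketch of the days off buckets and the consistency flag. It is not the code used for the analysis: the function names are hypothetical, breaks of more than 730 days are assumed to be excluded, and a horse with exactly 50% of its runs RTF is assumed to count as high consistency.

```python
from bisect import bisect_right

# Lower bounds (in days) of buckets 2 to 6, as listed in the text
BUCKET_LOWER_BOUNDS = [10, 30, 60, 180, 365]
BUCKET_LABELS = ["< 10 days", "10 to 29 days", "1m to 2m", "3m to 6m", "6m to 1y", "1y to 2y"]


def days_off_bucket(days):
    """Label for a break of `days` days, or None if outside 0 to 730 days."""
    if days < 0 or days > 730:
        return None
    return BUCKET_LABELS[bisect_right(BUCKET_LOWER_BOUNDS, days)]


def consistency(rtf_flags, cutoff=0.5):
    """'high' if the share of runs that were RTF meets the cut-off, else 'low'."""
    if not rtf_flags:
        return None
    share = sum(rtf_flags) / len(rtf_flags)
    return "high" if share >= cutoff else "low"


print(days_off_bucket(21))         # 10 to 29 days
print(days_off_bucket(200))        # 6m to 1y
print(consistency([1, 0, 1, 1]))   # high
```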

Analysis & Results

Days Off & Performance on the Flat

Table 1 shows the average RPRrel for each days off bucket. The number of runs per days off bucket is also given. Horses run closest to form with a break of less than a month between races. There is then a gradual deterioration in performance as the time between runs lengthens, until an improvement of ca. 0.5lb with a break of between 6 months and a year. It is possible this is the period when horses are returning after a voluntary winter break rather than an enforced mid-season break. Horses with a break of more than a year perform ca. 3lb worse on return than average.

   Days Off Category   Runs      Runs%    RPRrel avg (lb)
1  < 10 days           38,561    17.8%    -19.4
2  10 to 29 days       110,358   50.9%    -19.3
3  1m to 2m            32,299    14.9%    -19.8
4  3m to 6m            19,995    9.2%     -20.8
5  6m to 1y            13,012    6.0%     -20.3
6  1y to 2y            2,602     1.2%     -22.8
   TOTAL               216,827   100.0%   -19.6

Table 1: Breakdown of runs and RPRrel by days off category – Flat
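As a quick consistency check, the TOTAL row in Table 1 can be reproduced as a runs-weighted average of the bucket figures. The sketch below uses only the numbers printed in the table; the function name is illustrative.

```python
# (bucket, runs, average RPRrel in lb) from Table 1
flat_rows = [
    ("< 10 days", 38561, -19.4),
    ("10 to 29 days", 110358, -19.3),
    ("1m to 2m", 32299, -19.8),
    ("3m to 6m", 19995, -20.8),
    ("6m to 1y", 13012, -20.3),
    ("1y to 2y", 2602, -22.8),
]


def weighted_average(rows):
    """Total runs and the runs-weighted mean of the bucket averages."""
    total_runs = sum(runs for _, runs, _ in rows)
    weighted = sum(runs * avg for _, runs, avg in rows) / total_runs
    return total_runs, weighted


total_runs, overall = weighted_average(flat_rows)
print(total_runs, round(overall, 1))  # 216827 -19.6
print(round(-22.8 - overall, 1))      # -3.2, the gap for breaks of more than a year
```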

Days Off, Performance & Consistency on the Flat

Graph 1 below shows the deterioration in performance by bucket of days off by consistency. The rate of deterioration is similar for both categories. High consistency horses typically run ca. 13 lbs better than low consistency horses.

G2PerfConsisFlat

Graph 1: Days Off, Performance & Consistency on the Flat

Days Off, Performance & Race Type on the Flat

Graph 2 shows the deterioration in performance according to race type. In handicaps the deterioration in performance for a lay-off of more than a year is marked. Pattern race performers are not as affected by breaks, although they also perform less well after a break of more than a year. The ‘other’ category is a ragbag of maidens, claimers, sellers and conditions races, with results that are perverse relative to handicaps and pattern races.

G2PerfRtypeMk3

Graph 2: Days Off, Performance & Race Type on the Flat

Days Off & Performance for National Hunt Races

Table 2 shows the average RPRrel for each days off bucket for NH races. The number of runs per days off bucket is also given. Horses run closest to form with a break of less than 10 days or with a break of between 1 and 2 months. Longer breaks are associated with a deterioration in performance, with no equivalent of the upwards blip seen in Flat racing for breaks of between 6 months and a year. The rigours of National Hunt racing suggest a sport that requires a longer recovery period, and one where improvement for the first run of the season should be expected. Horses running after a break of more than a year perform slightly more than 3lb worse than the average.

   Days Off Category   Runs      Runs%    RPRrel avg (lb)
1  < 10 days           19,818    7.6%     -17.6
2  10 to 29 days       117,268   45.2%    -18.1
3  1m to 2m            57,475    22.2%    -17.7
4  3m to 6m            33,380    12.9%    -19.0
5  6m to 1y            23,425    9.0%     -19.4
6  1y to 2y            7,826     3.0%     -21.6
   TOTAL               259,192   100.0%   -18.3

Table 2: Breakdown of runs and RPRrel by days off category – NH

Days Off & Consistency for National Hunt Races

Graph 3 below shows the deterioration in performance by bucket of days off by consistency. The rate of deterioration is similar for both categories. High consistency horses typically run 15 lbs better than low consistency horses. National Hunt and Flat exhibit similar patterns.

G3PerfConsisNH

Graph 3: Days Off, Performance & Consistency for National Hunt

Days Off, Performance & Race Type for National Hunt Races

Graph 4 shows the deterioration in performance according to race type. Handicaps show deterioration in performance as the length of lay-off increases, with a marked drop for lay-offs of more than a year. In pattern races there appears to be a sweet spot of between 10 days and six months where horses have produced their best performances. Perhaps the recent trend for top class horses to be campaigned sparingly is justified. Lay-offs of longer than six months are more negative. In NH races the ‘Other’ category is less of a ragbag, containing maiden and novice races. The results here probably reflect progressive horses posting new rating highs and are therefore of less use in determining what relationship exists between days off and performance.

G4PerfRTypeNH

Graph 4: Days Off, Performance & Race Type for National Hunt

Summary

There is a strong relationship between days between runs and running to form. There are substantial differences between the results for handicaps and pattern races, and according to whether a horse has exhibited consistency in the past. All of these factors need taking into account in deciding upon the prospects for a horse in the context of how many days it has been since it last raced. There are some differences in the results between the Flat and National Hunt, and these differences are consistent with NH types needing a somewhat longer recovery period and benefiting from their first run of the season. In common with Timeform’s findings, there is little evidence to support the received wisdom that modern training methods mean long lay-offs are less of an issue, even when the data is split into pre- and post-2010 time periods. It is probably the case that the successful return to action of a few high profile, high consistency pattern race performers, who on the evidence presented above are likely to run reasonably close to their previous best, has caused this view to gain common currency.

High Class Novice Chase Candidates: Numbers, Yard Concentration 2008-13

In the last week Nicky Henderson complained about the programme for Novice Chasers, his comments culminating with the line “And that’s why there will be no chasers in three or four years time”. A forthright summary of his comments can be found on Dan Kelly’s excellent blog here, which firstly covers the ongoing concerns about the Betfair Chase distance, then moves on to the Novice Chase programme in the context of Nicky Henderson’s comments. So leaving the programme book aside, how does the pipeline of high class horses going Novice Chasing look year by year? Using Racing Post Ratings (RPR), the number of horses rated above 145, 150, 155 and 160 is given in Table 1 below for each of the years 2008-2013 inclusive. To qualify, horses must be with GB based trainers, never have run in a Chase, have achieved the rating at a GB track and have run within twelve months of the end of April of each of the years considered. These filters are designed to capture high class Hurdlers that are candidates for Novice Chasing. The filters will include Hurdlers that won’t go Chasing, and exclude recruits to Novice Chasing from overseas, so the list isn’t complete. Still, these effects should be the same year on year and not affect a year on year comparison. Table 1 shows the pool of candidate horses has varied between 46 and 70 in the last six years, with no clear trend. The numbers for 2013 suggest a healthy pool of candidate horses for Novice Chasing relative to the recent past.

Year RPR 145+ RPR 150+ RPR 155+ RPR 160+
2008 46 29 15 9
2009 70 41 23 10
2010 59 34 18 11
2011 63 38 21 13
2012 55 35 18 13
2013 64 40 22 13

Table 1: High Class Novice Chase Candidates 2008-13

Using horses rated 145+, how has the concentration of horses by training yard changed over the last six years? Table 2 shows the number of training yards that have 1 only, 2 to 5 and at least 5 high class Novice Chase candidates. So in 2008 17 yards had one candidate. In 2013 there were 19 such yards. No real pattern exists year by year. However, it is in the yards with more than one candidate that the picture has changed. In 2009 there were 11 yards with 2 to 5 candidates. By 2013 this had dropped to just four yards. The view that high class Novice Chase candidates have become increasingly concentrated at the largest training yards is borne out by the data. Table 3 shows the same information but represented by total number of horses. The number of candidates in 2013 at smaller yards is the lowest it has been in the last six years and the number in the larger yards the highest. Yard concentration is increasing.

Year   1 horse only   2 to 5 horses   5 plus horses
2008   17             7               1
2009   13             11              3
2010   17             8               2
2011   19             6               3
2012   11             8               3
2013   19             4               4

Table 2: Number of yards with Novice Chase candidates rated 145+

Year   Up to 5 horses   5+ horses   Total horses rated 145+
2008   39               7           46
2009   39               31          70
2010   41               18          59
2011   37               26          63
2012   31               24          55
2013   29               35          64

Table 3: Novice Chase Candidates Yard Concentration
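The shift is easier to see as a share of the total. The sketch below takes the Table 3 figures and works out what percentage of each year’s candidates sat in the larger (5+) yards; the function name is illustrative.

```python
# Year: (candidates in yards with up to 5, candidates in yards with 5+), from Table 3
table3 = {
    2008: (39, 7),
    2009: (39, 31),
    2010: (41, 18),
    2011: (37, 26),
    2012: (31, 24),
    2013: (29, 35),
}


def large_yard_share(small, large):
    """Percentage of all candidates trained in the larger yards."""
    return 100.0 * large / (small + large)


shares = {year: large_yard_share(*counts) for year, counts in table3.items()}
for year, share in shares.items():
    print(year, f"{share:.1f}%")
```

On these figures the larger yards held roughly 15% of candidates in 2008 and roughly 55% in 2013, the only year in which they held more than half.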

The falling field sizes in Novice Chases cannot be blamed upon the number of horses that could go Novice Chasing. Candidate numbers are healthy. So either the programme book or yard concentration is to blame. The changes made to the Novice Chase programme in the last year or two should have led to an increase in field sizes. The only explanation for their falling in the 2013-14 season so far is the refusal of the larger yards to race their best horses against each other. The campaigning of horses is largely a matter for the trainers and their owners. However, if the BHA react to the concentration of the best horses in a few yards by making changes to the programme book to reflect campaigning realities, it is difficult to imagine this leading to a dearth of Novice Chasers in a few years’ time.

Some trainers would argue that Novice Chasing is different from Novice Hurdling and their concern is primarily one of horse welfare. The first implication is that anyone arguing the opposite position does not have horse welfare at heart. That is not a position anyone wishes to inhabit lightly. The further implication is that a series of uncompetitive races should exist so that high class horses can learn the ropes. This will then benefit their long-term career, which, in turn, benefits racing.

Perhaps to address both small field sizes and welfare concerns a series of zero prize money Australian style ‘Barrier Trial’ Novice Chases at racecourses could be introduced, with the full cost of hosting these races borne entirely by the owners. No handicap marks would be awarded and no betting available. These trials would allow for legitimate schooling in public in near race conditions. Lowly handicapped horses could take part knowing their handicap marks will be unaffected; better horses could make their own way home, learning the ropes as desired by trainers. The quid pro quo would be that the Novice Chase programme would be further reduced. Welfare concerns are addressed by the existence of Barrier Trials, whilst field sizes in Novice Chases would increase because of the reduced number of races, improving the viewing spectacle for the racing public.

Winning Distances: Trip, Going, Field Size, Race Class & Handicapping

Introduction

The distance that horses finish relative to each other in horse racing is an important consideration in deciding what rating to apply to each horse post-race by public and private handicappers. Whilst it is obvious that trip will affect winning distances, what of going, field size and race class? To what extent do these factors make a contribution, and does the official handicapper take these factors into account in handicapping horses post-race?

Method, Dataset & Definitions

The analysis that underpins this piece was carried out in the R statistical environment accessing Raceform Interactive (RI) data for turf handicap races that took place during the 2011, 2012 and 2013 flat seasons in Great Britain. Races were placed into categories as follows:

Trip – Sprint (up to 6.5f), Mile (6.5-9.5f), Mid-distance (9.5-12.5f) and Long-distance (12.5f+)

Going – Heavy (HY), Soft (S), Good-Soft (GS), Good (G), Good-Firm (GF) and Firm (F)

Field Size – Tiny (fewer than 4 runners), Small (5 to 12 runners) and Large (more than 12 runners)

Race Class – High (Classes 1, 2 and 3) and Low (Classes 4, 5 and 6)
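The Trip and Race Class categories above translate directly into code. The sketch below is illustrative rather than the code used for the analysis: the function names are hypothetical, and a race at exactly 6.5f, 9.5f or 12.5f is assumed to fall into the longer category, which the text does not specify. Field Size is left out because the boundaries as stated do not say where a 4-runner race belongs.

```python
def trip_category(furlongs):
    """Trip category for a race distance in furlongs (boundary handling is an assumption)."""
    if furlongs < 6.5:
        return "SPRINT"
    if furlongs < 9.5:
        return "MILE"
    if furlongs < 12.5:
        return "MID"
    return "LONG"


def race_class_category(race_class):
    """HIGH for Classes 1 to 3, LOW for Classes 4 to 6."""
    if race_class in (1, 2, 3):
        return "HIGH"
    if race_class in (4, 5, 6):
        return "LOW"
    raise ValueError(f"unexpected race class: {race_class}")


print(trip_category(5), trip_category(8), trip_category(12), trip_category(16))
# SPRINT MILE MID LONG
print(race_class_category(2), race_class_category(5))
# HIGH LOW
```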

The number of races that took place in each category is given in Table 1 below.

Trip 2011 2012 2013
LONG 494 483 510
MID 450 421 448
MILE 1145 1103 1170
SPRINT 350 339 348

Table 1: Number of races by year by trip

Ground classifications used were those applied by RI rather than the official going. Proportions of races by Going category are given in Table 2a below. The effect of the wet weather in 2012 and the dry summer in 2013 can be seen in the proportion of races that took place on Soft and on Good-Firm in each year.

Going 2011 2012 2013
F 0.0% 0.6% 0.6%
GF 14.4% 14.6% 22.3%
G 65.5% 41.1% 57.4%
GS 14.8% 21.7% 12.6%
S 4.7% 17.1% 5.9%
HY 0.5% 4.9% 1.2%

Table 2a: Proportion of races by year by going category

The number of races by year by Field Size is given in Table 2b below. Field Sizes fell in 2013. The argument that fast going was responsible for the drop in Field Sizes is spurious. In 2011 there were 287 races on GF with small Field Sizes. In 2012 this decreased slightly to 262 races. In 2013 there was a substantial increase to 475 races. Other factors are responsible for the drop experienced in 2013.

Field Sizes 2011 2012 2013
LARGE 576 574 491
SMALL 1863 1772 1985
TINY 52 32 120

Table 2b: Number of races by year by field size

In Table 2c below the relationship between Race Class and Field Size is shown. As expected there is a greater proportion of High Class races with Large Field Sizes. Races categorised as TINY were excluded from the analysis that follows.

Race Class LARGE SMALL TINY
HIGH 558 963 29
LOW 1083 4656 175

Table 2c: Number of races by class and field size

Winning Distances and Going

Graph 1 below shows winning distances by Trip for each Going category. Winning distance is defined as the distance between the winner of a race and the horse coming third. There aren’t many races that take place on Firm Going and as a result its black line representation on the graph should be treated with caution. Notice how winning distances are similar for the GS, G and GF categories, whereas for Soft and Heavy Going winning distances are quite different. There is also a non-linear relationship between winning distance and Going as Trip increases in distance. The minimum number of categories of Going that best describes winning distances is three: Heavy, Soft and an amalgamation of the other Going categories. We know from Table 2a that few races take place on Heavy going. As a consequence, excluding these races, rather than amalgamating them with the Soft going category, will improve the balance of the analysis that follows.

plot1

Graph 1: Winning distance by going category

Winning Distance and Race Class

On average winning distances are higher in Low Class races. Graph 2 below shows median winning distance by Trip by Race Class. The relationship is linear with trip for Low Class races. For High Class longer distance races the median winning distance is lower than for mid distance races. This is counterintuitive. It could be explained by High Class long distance races being run at a different pace – more of a crawl and sprint, resulting in compressed winning distances, rather than an end to end gallop.

DistClass

Graph 2: Winning distance and Race Class

Winning Distance and Field Size

Winning distances are higher in Small Field Size races. Graph 3 below shows the median winning distance for Small and Large Field Sizes. It is possible the Field Size and Race Class winning distance effects are related due to the high relative proportion of High Class races with Large Field Sizes.

DistFSize

Graph 3: Winning Distance and Field Size

Contributions to Winning Distances

The information presented above shows that winning distances are affected by Trip, Going, Field Size and Race Class. Since some of these categories are related to each other, analysis of variance (ANOVA) is used to attempt to disentangle the effects and see if all or just a subset of categories are important. In addition we can identify interaction (non-linear) effects, such as that between winning distance and Going. In Table 3 below a summary of the ANOVA table is presented. Apart from the obvious result that Trip and Going are highly significant in terms of explaining winning distances, Field Size and Race Class are important in their own right. In addition two interaction variables are included – Trip with Going and Trip with Race Class. The former is intuitive, the latter less so.

Category      F Value    p-value
Trip          187.626    2e-16
Going         91.278     2e-16
Field Size    85.227     2e-16
Race Class    64.553     1e-15
Trip*Going    8.237      2e-05
Trip*Class    6.904      0.000122

Table 3: ANOVA table of contributions to Winning Distance
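To show where an F value comes from, here is a minimal one-way ANOVA calculation in Python. It is a deliberately simplified stand-in: the model summarised in Table 3 has several factors plus interaction terms and was fitted in R, whereas this sketch handles a single factor. The winning distances used are made up.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a single factor: between-group variance over within-group variance."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical winning distances (lengths) for three Trip categories
sprint = [1.0, 1.5, 2.0]
mile = [2.0, 2.5, 3.0]
long_distance = [3.0, 3.5, 4.0]
print(one_way_anova_f([sprint, mile, long_distance]))  # 12.0
```

The larger the F value, the more of the variation in winning distance is explained by the category relative to the scatter within each category, which is how the rows of Table 3 should be read.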

Winning distances and Subsequent Handicap Changes

The official handicapper has detailed his policy with respect to handicapping here. Given the wide range of inputs that he states go into his handicapping decisions, we should find a relationship between changes in handicap mark and the race categories examined in the previous section. A variable that takes into account handicap mark changes and winning distances is defined as follows:

lbperL = (Mark change for winner – Mark change for 3rd)/(Winning distance 1st-3rd)
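A worked example of the pounds per length variable, written as a small Python function. The function name and the figures are illustrative; returning None when the winning distance is zero is an assumption, since the text does not say how such races were handled.

```python
def lb_per_length(winner_mark_change, third_mark_change, winning_distance):
    """Pounds per length: difference in mark changes divided by the 1st-3rd distance."""
    if winning_distance <= 0:
        return None  # undefined when there is no measurable distance
    return (winner_mark_change - third_mark_change) / winning_distance


# Hypothetical race: winner raised 6lb, third raised 1lb, 2.5 lengths between them
print(lb_per_length(6, 1, 2.5))  # 2.0
```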

Graph 4 below shows winning distance on the x-axis and handicap changes winner to third on the y-axis. Whilst there is a relationship (correlation=0.6) there are other factors in addition to winning distances that are used to revise handicap ratings.

DistORchg

Graph 4: Winning distance and handicap changes

Pounds Per Length and Going

Handicap changes per length are lower for races that take place in Soft going. The median difference is 0.25 lbperL. So for winning distances of 2 lengths, median handicap changes in Soft going are ca. 0.5lb less than on quicker Going.

ORchgGoing

Graph 5: Pounds per Length and Going

Pounds Per Length and Race Class

Handicap changes per length are higher for High Class races. The difference is 0.34 lbperL. With winning distances lower in High Class races, it appears as if the handicapper applies a standard handicap increase to the rating of winners regardless of Race Class.

ORchgClass

Graph 6: Pounds per Length and Race Class

Pounds Per Length and Field Size

Handicap changes per length are higher for races with larger Field Sizes. The difference is 0.33 lbperL. As with Race Class, it appears as if the handicapper applies a standard handicap increase to the rating of winners regardless of Field Size.

ORchgFS

Graph 7: Pounds per Length and Field Size

Understanding the Contributors to Handicap Changes

ANOVA is used to check if the differences seen in the graphs above are statistically significant. Table 4 below shows the handicapper does take into account Going, Field Size and Race Class in the handicap changes he applies to winning horses – the p-values show that each category explains a significant component of the lbperL variable. In the next section we examine if sufficient account is taken of the different race categories.

Category      F Value    p-value
Trip          106.119    < 2e-16
Going         27.191     1.90e-07
Race Class    45.673     1.52e-11
Field Size    42.151     9.10e-11

Table 4: ANOVA table of contributions to handicap changes per length (lbperL)

Is Sufficient Account Taken of Different Race Categories?

If the handicapper takes sufficient account of race categories it should be the case that horses run equally well in their next race. The variable PctBtn (thanks to Simon Rowlands of Timeform for suggesting this variable, for example here) is defined as the percentage of horses beaten next time out by the winner of each race. If the handicapper has done his job, there should be no difference in the average PctBtn variable by race category. ANOVA is used again. Table 5 contains the results. The results for Field Size are statistically significant. It appears as if the handicapper does not raise the handicap mark of winners of large Field Size races by enough, since they beat a higher proportion of their rivals next time out than winners of races in other categories.

Category      F Value    p-value
Trip          0.442      0.7230
Race Class    0.098      0.7545
Field Size    4.821      0.0282
Going         0.668      0.4137

Table 5: ANOVA table of contributions to PctBtn
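For clarity, the PctBtn variable used in Table 5 can be sketched as follows. The exact formula is an assumption based on the description above: the winner’s finishing position next time out is converted into the percentage of rivals it beat, with rivals taken to be the number of runners minus one. How non-finishers were treated is not stated, so they are not handled here.

```python
def pct_btn(finish_position, runners):
    """Percentage of rivals beaten, given a finishing position and field size."""
    if runners < 2 or not 1 <= finish_position <= runners:
        raise ValueError("need at least 2 runners and a valid finishing position")
    return 100.0 * (runners - finish_position) / (runners - 1)


print(pct_btn(1, 11))   # 100.0
print(pct_btn(3, 11))   # 80.0
print(pct_btn(11, 11))  # 0.0
```

If handicap marks fully reflected the merit of each win, the average of this figure should be the same whatever category of race the horse had just won.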

Summary

In addition to the obvious effect of Trip and Going on winning distances, Field Size and Race Class are also significant contributors. Whilst the handicapper appears to take these factors into account in setting handicap marks, winners of large Field Size handicaps appear to be insufficiently penalised. It is a small step to suggest that placed horses from large Field Size races are worthy of particular attention next time out.