I only have a few memories of my great-grandfather, Horace Barker. He was one of only three people I met whom I know to have been born in the 19th Century and he died a few weeks before my 5th birthday so we didn’t have a lot of time to get to know each other.
There are only a few facts about him I can recall: he was a kindly old man in his late seventies, married to my great-grandmother, Hilda. They lived in a bungalow with an immaculate garden and a greenhouse full of the sweetest tomatoes you’ve ever smelled. Unfortunately, my own insight ends there and I have to rely on other data sources to complete my picture of him.
If you know where to look, you can find out more about him. Various census sheets and official documents confirm that Horace was born in Pemberton, Wigan on 29th October 1897 and that he was a coal-miner, man and boy. He married Hilda in 1921 and together they had a daughter, Marjorie, on 21st March 1924. He went to seek his fortune in Canada for a few months in 1929 but while he was there, Wall Street crashed – which may have influenced his decision to return home. At the outbreak of war in September 1939, Horace was recorded in the National Register as living at 'Marus Bridge Shop' and working as a "Colliery…Chargehand" (under-supervisor), as well as a first-aider and an Air Raid Precautions (ARP) warden. His wife Hilda was listed as a 'Grocer and Confectioner'.
He was hardened by his experiences at the unforgiving coalface and later, as Colliery Manager, he bore the responsibility for the lives of the men who worked under him. The daily obligation to make life-or-death decisions undoubtedly shaped his outlook – and it's no surprise to reflect that coal-mining was a formative part of the lives of some of the most revered working-class heroes of his generation; men like Matt Busby and Bill Shankly.
It's not a fatuous comparison. 'Pop' (as he was known in later life) once told his grandson – my Dad – during a callow attempt to make ambitious structural changes to a farm building, "tha'll ne'er do it" [You'll never do it], knowing well that saying so would provide the extra determination to succeed. It worked. Like Shankly and Busby, he was what footballers would call a 'psychologist': adept at understanding and motivating others with a mixture of high standards and a gruff, uncompromising demeanour. By all accounts, he was a formidable character – and it's easy to see why he needed to be.
Today, more than half a lifetime after his death, the world is a vastly different place. Fossil fuels – and their effects – are (literally) unsustainable and we've made great strides to power our future by harnessing the natural resources around us. That which we used to have to mine out of the ground to add value to our lives is necessarily diminishing in long-term value. And yet, over the same five decades, humanity has also created something in such vast quantities that it now forms a mined resource more valuable than anything based in carbon.
Since 2017, it’s become widely accepted that data has become the world’s most valuable commodity, overtaking that long-standing former favourite, oil. The world’s most valuable companies now trade in quadrillions of bits, rather than billions of barrels. Carbon is just so finite, so boringly elusive, so…analogue. Data is different: it’s so dynamic, so ubiquitous, so…sustainable. And, just as with coal, the very juiciest bits of all this data, that inform decisions which can make or break fortunes, are there to be mined from the vastly more voluminous, less valuable stuff, all around it.
To do that, you need to be able to find relevant data, verify its accuracy and understand its meaning. For this you must also have a clear understanding of the problem that the data is being used to solve. You must also be aware of the statistical pitfalls of sticking different data together and drawing conclusions that assume the correlations in the data unambiguously answer the questions being posed. To those who are not familiar with it, data mining may seem like a very indistinct process, maybe even a pseudoscience. But it's simply a case of trying to create a 'picture' of knowledge about a group or an individual, based on available facts, cross-tabulated with other known information, to build a profile. If that still sounds unhelpfully abstract, then re-read the first five paragraphs above and you'll see that's exactly what I was doing there; turning documented fact into reasoned propensity.
Obviously, data-mining is not remotely dangerous; the work is not back-breaking and there's little chance of contracting long-term health conditions due to the working environment, but it's essentially the same principle – although I'm not sure that miners of old would see it that way. In 'The Road To Wigan Pier', George Orwell describes at length the awesome physicality demanded of coal miners, even comparing them to Olympic athletes. Pop once said of his own brother-in-law (whom he considered to be a less capable individual) "I'd durst let him't strike at mi arse wi' a pick". If you cut through the old Lancashire dialect and the, er, slightly industrial language, it was a scathing put-down: 'I'd dare let him swing a pick-axe at my backside' – believing him to be too weak to do any harm.
Horace “Pop” Barker’s miner’s lamp. Photo: Paul Bentham
We still have his miner’s lamp, although the reason for its presentation (long-service, retirement or just his actual working lamp, polished up) is now lost in the mists of time. There’s also a brilliantly evocative picture of him, arms folded, his coal-blackened face staring defiantly into the camera, taken at the pit-head – I believe at Chisnall Hall Colliery near Coppull. He died in 1978, before the final decline of the industry that sustained his whole life.
I often wonder what he would have made of the miners’ strike of the 1980s, of Arthur Scargill’s leadership of the National Union of Mineworkers, of Health & Safety law, of the demise of ‘Old King Coal’ and even of the shift to renewable energy.
More than anything, I’d love to explain to him the parallels between his industry and mine: the intricacies of data, profiling and algorithms. With the arrogance of (relative) youth, I might expect the ‘wonders’ of the digital age to blow his Victorian mind. I’d tell him how confidently I could pinpoint the addresses of all the greenhouse-owning pensioners in Standish, based on a few data sources and the internet. I’d like to think he’d tell me I’d “ne’er do it”.
But then I shouldn’t be surprised if it left him largely unimpressed – a lot of statistical inference could easily be termed ‘common sense’. If you’ve had any experience of retail, as he did, you soon develop a sense of what ‘type’ each customer is, based on their buying history and their responses to different stimuli. Grocers in 1939 didn’t need a suite of linked tables to understand which customers would be best suited to which products; their database was in their heads. Computers have merely added the capability to make the same predictions on a far greater scale and with ever-increasing complexity.
Nor would he necessarily be a stranger to the more contemporary concerns of wholesale data collation. As a coal miner in Wigan in the 1930s, he is likely to have been well aware of the famous Orwell book about his hometown. Had he gone on to discuss Orwell's most famous novel, 'Nineteen Eighty-Four', published just over a decade later, he would have become well-acquainted with that age's most prescient description of data use and misuse – and a delicious historical irony would have followed. I remember the death of the aforementioned 'pick axe' brother-in-law in December 1983. At his memorial service, in the days after Christmas, the sermon made reference to the incoming new year (1984) and the parallels in the book of the same name that we should consider. Two or three weeks before Apple famously did it, a vicar in Wigan was riffing on the warnings of the coming year.
There'll always be a limit to what I can know about Horace Barker, and what I can reliably surmise. There are many closed-off avenues that, tantalisingly, could be re-opened with the provision of just a little more data. That's the frustration of genealogy – the suspicion that one small discovery may set off a chain reaction of greater understanding. Exactly the same can be said of data mining – which makes the quest for the knowledge it can provide all the more enticing.
Racehorse Handicapping: Predicting the Unpredictable?
The role of a horseracing handicapper is to ensure that each horse in a race is carrying enough weight to offset their differing capabilities and their varying levels of form. It’s seen as a vital task because it means that, in theory at least, champion horses in the peak of their form are matched more evenly with their less illustrious competitors, ensuring a more tightly contested, less predictable race.
Taking the logic to its natural conclusion, the handicapper will only have done their job correctly if all horses in a race cross the line at the same time. While it’s possible (but still unusual) to have a dead heat in a two-horse or, in extremely rare cases, in a three-horse race, it’s functionally impossible for this ever to happen in a race involving a larger number of horses.
Famously, the Grand National is never a close race, using the definition of closeness as the difference between first and last places – indeed many horses fail to complete the course each year and the favourite rarely wins. There are just too many horses, too many obstacles, there is too much distance and arguably, there is too much that is unusual about the preparation to ever confidently hope to call a winner, let alone be able to harmonise the finish across the whole field. In probability terms, there are simply far too many unknown variables to trust any form of predictive modelling that would ever enable a handicapper to achieve the ‘Holy Grail’ of all horses crossing the finish line at the same time. In the face of such overwhelming statistical evidence to suggest its basic futility, why is handicapping necessary?
The answer is that ensuring a dead heat is not the point of handicapping at all. Handicapping is there to offset perceived differences in horses’ abilities and form. It acts as a regulator for betting, ensuring that favourites will not be favoured by the betting public by as wide a margin and that ‘dark horses’ will be viewed less darkly than they would be without handicapping. It serves the industry behind the sport, not the sport itself. There is no handicapping in Athletics purely because the sport exists primarily as a discipline to discern which athlete is the fastest (and by how much). Only the overlay of betting leads to the necessity of handicapping – something which many might see as a perversion of the conventions of pure sport.
Uncovering the 'real' reason behind the point of handicapping seems rather dull, irrelevant and perhaps even a little dispiriting, but the subject is still of value because it acts as an interesting analogy that mirrors the issues of what can and what perhaps can't be predicted – and to what extent the distinction between the two states may become blurred.
Direct Marketing & Parallels with Racehorse Handicapping
The role of a Direct Marketer is to predict, accurately, the event of each customer choosing to make a purchase from an offering in a given time-frame – or not, as the case may be. As with handicapping, various models exist to discern the factors that most affect future behaviour. As with handicapping, these models are widely accepted as being able to predict the general level of behaviour more reliably than would otherwise be the case. As with handicapping, there are far too many variables to translate such improvements to the individual level. Even the offer to give away £1,000 of vouchers with every £10 order will still only yield a certain percentage of response – it will not motivate every customer into action, often for a variety of what appear to be illogical reasons.
It may be suggested that the 'Holy Grail' of Direct Marketing is just as simple and just as unobtainable as the race where all horses cross the line together: a campaign segmented using a profile which selects only those customers who will order.
In reality, for this to occur, not only must this segmentation yield a 100% activation rate for the successful segment, but it must also be shown that all other segments will always yield a 0% activation rate – a practical impossibility.
Just as a handicapper may occasionally achieve a 2-way dead heat, a Marketer may occasionally achieve a 100% activation in a segment with a very small sample. In that circumstance, the Direct Marketer's expectation is always that that offer, made more broadly, must be transferable to other segments, uplifting their performance. The activity is then repeated through various other segments with the expectation that it keeps performing profitably until it fails. In short, the ultimate goal state of a Marketer can never be reached, as another sale can always be found.
Even if a model existed to find just the people who would ever respond to a given stimulus (in effect, a perfect measure of the magnitude available), it would still be akin to believing "this is all the sales you can ever make". It would be perfectly efficient, of course, but it doesn't necessarily mean that revenue is increased by all that much. It just clarifies the process of when to stop chasing the extra sales.
In reality, this is a problem we're highly unlikely ever to face. Customers are people and people are (at the individual level) incredibly difficult to predict. The 'Holy Grail' state just shows us what a perfect level of predictability would look like, which is useful when it comes to comparing and evaluating our own methods.
Applying a Predictive Model in Direct Marketing
As a contrast to the imaginary problem above, real-world examples of response rates across the segments of an activity tend to adhere to a more familiar principle: the law of diminishing returns.
This is taken from campaign data from a previous Spring/Summer campaign, using segments driven by our prior ‘Points Analysis’ method of segmentation and recorded from response codes given during telephone orders. For this reason (as it therefore ignores web orders from that campaign), the percentages are not relevant here, just the shape of the curve.
As with the 'Holy Grail' curve above, it starts off steeply, implying that this is a clear way to predict the responsiveness of one group over another. However, as the trendline (I've used a logarithmic trendline, by the way) progresses along the segments, it flattens, so that by the lower segments it almost represents an admission that the model can't really say whether the second-to-last segment contains significantly more predictable customers than the last segment.
Using the ‘revenue-building’ logic discussed above, this uncertainty can be (and often is) presented as a positive feature. As long as the responsiveness is at a profitable level, this ‘long tail’ becomes something of an asset, as it assures the Marketer that more sales can be added, with a positive ROI until the point on the axis where the curve touches the break-even point of response. The fact that these sales happen to come with decreasing levels of efficiency may be seen as a price worth paying.
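To make that 'break-even' idea concrete, here's a minimal sketch of fitting the same kind of logarithmic trendline to segment response rates and estimating where it crosses an assumed break-even level. The segment figures and the 3% break-even are entirely illustrative, not our actual campaign data.

```python
# A minimal sketch: fit a logarithmic trendline to segment response rates and
# estimate where it crosses break-even. The figures are illustrative, not real.
import numpy as np

segments = np.arange(1, 13)                              # segment rank, best first
response = np.array([9.0, 6.5, 5.2, 4.4, 3.9, 3.5,
                     3.2, 3.0, 2.8, 2.7, 2.6, 2.5])      # % response by segment

a, b = np.polyfit(np.log(segments), response, 1)         # y = a*ln(x) + b
break_even = 3.0                                         # assumed break-even response %
cutoff = np.exp((break_even - b) / a)                    # solve a*ln(x) + b = break_even

print(f"trendline: y = {a:.2f}*ln(x) + {b:.2f}")
print(f"break-even reached around segment {cutoff:.1f}")
```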
One rather fundamental problem in the collation of the above chart was that the response metric was based on order-level, not customer-level, responses. At this point, we need to be rather pedantic: the issue of predictiveness relies ultimately on the response of an individual to a stimulus, which is then grouped by the segments of similar individuals. Using the principles of RFM (the categorisation of customers by Recency, Frequency and Monetary Value), order-level analysis conflates the effects of both R and F, when we require them to be viewed in isolation. To illustrate this point, consider that one hundred orders from a given segment may imply one hundred responding customers but could in reality translate to just one very responsive customer – or any combination in between.
Since then, we've adopted the more standard Binary segmentation model, which ensures the monitoring is at the customer level, preferring the percentage metric 'Activation' (customers who ordered in a given season as a percentage of customers stimulated, by category) over the more traditional, order-level metric 'Response Rate' (orders received using a given response code as a percentage of catalogues circulated with that media code). The uncertainty of one customer ordering a hundred times versus a hundred customers ordering once each has subsequently been removed. We can now monitor precisely how many customers have ordered, as well as the number of orders those customers have placed, collectively and individually.
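As an illustration of why the distinction matters, here is a small sketch (column names and volumes are invented) showing how the same hundred orders can produce a healthy-looking order-level Response Rate but a tiny customer-level Activation when they all come from one customer.

```python
# A sketch of Response Rate (order-level) versus Activation (customer-level).
# Column names and volumes are invented for illustration.
import pandas as pd

orders = pd.DataFrame({
    "customer_id": [1] * 100,          # one customer placing all 100 orders...
    "media_code": ["SS_CAT"] * 100,    # ...against a single response code
})
catalogues_circulated = 10_000
customers_stimulated = 10_000

response_rate = len(orders) / catalogues_circulated                   # order-level
activation = orders["customer_id"].nunique() / customers_stimulated   # customer-level

print(f"Response Rate: {response_rate:.2%}")   # 1.00% - looks like 100 responders
print(f"Activation:    {activation:.2%}")      # 0.01% - actually just one customer
```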
The Activation performance of the Binary list for the most recent Spring/Summer campaign, expressed for each group, shows a similar curve, implying the same adherence to the law of diminishing returns as the older Points Analysis-derived curve above.
Once again, the asymptotic (flattening) curve implies a longer tail beyond the limits of the mailing list, which, using the methodology of the Binary process (with its allocation of decreasing points for customers ordering increasingly further back in time), also implies that further revenue can only be attracted at a less efficient rate. In effect, it’s almost telling us that after a certain point, we can mail anyone using this rationale and we’ll probably get the same return, whatever it is. This is hardly what you would call a predictive model.
All this is implied but none of it can be taken for granted, just as no segment that yields 100% Activation ever implies that the ‘Holy Grail’ has been achieved – there is always the question “what further potential is there?” to answer. It’s clear that we need other means of predictiveness to unlock the secrets of the deeper recesses of our mailing list.
The Limitations of the Binary System
Largely as a result of the paranoia/healthy scepticism (call it what you will) about putting all our eggs in the basket that is Binary segmentation, we have, since adopting Binary, also endeavoured to add a wider pool of customers to our recent mailing selections than merely those segments suggested by that system. It's not unusual or ground-breaking to do so; it's a practice routinely followed by even the most faithful proponents of Binary segmentation and it's called deep-diving.
Using our previous (semi-proven) Points Analysis system as our deep-dive axis, we mailed representative samples from these deeper segments of customers and named them groups -1 to -5, in accordance with the Binary nomenclature.
What we found was that a huge proportion of the -1 group customers were activated (far more than we had anticipated), the equivalent of the Group 12 Binary segment, i.e. the best segment of the ‘Good’ portion of the list. Thereafter, the activation rate dropped massively for the -2 segment and continued to tail off gradually through to the -5 segment.
Perhaps it should come as no real surprise that there is a significant increase in activation in any Binary analysis from the '1' segment to whatever is essentially the 'best of the rest'. I have to presume that a known increase in activation at this point in the list is not only common but probably also a phenomenon that is to be expected. Conversely, I have no idea whether the level of disparity at this point is generally as great as we have found it to be. I rather suspect it isn't.
There are two benefits to this figure being so notably high, which represent the twin roles of predictive segmentation I have already outlined. Firstly and most prosaically, it represents almost 7,000 activated customers and almost £300,000 of additional revenue. Secondly, it gives us a definition of customer type that we know we can continue to stimulate efficiently and it strongly indicates at what point this metric provides segments that are inefficiently stimulated. It also calls into question the wider viability of a system that seems to ignore a cohort of customers who are capable of yielding half as many activations as those it selects.
Ordinarily, as the Direct Marketing wheel turns and the results of one campaign’s test shape the standard practice in the next campaign, thoughts turn to the question of what methodology to test next. With such a statistical disparity as this, it’s also difficult to escape from the conclusion that the Binary model as it stands may not be wholly suitable for our requirements. This is not to say that the practice hasn’t been worthwhile or indeed that the notion of measuring campaign performance at the customer level isn’t of value. In fact the opposite is true: With ever more ordering methods, media codes as a means of recording performance are dying and, even if we could resurrect them, we would return to the same non-relational order-level analysis that tells us nothing about the customers on whom our business depends.
I would always advocate a customer-level metric, even if I might always wish for a method of segmentation that is more clearly suited to our list profile. The reporting disciplines required and indeed the limitations that customer-centricity can have on budgeting for additional in-season activities are all, in my view, a small price to pay for the insight the analysis can give to actual customers. As we move inexorably to a more sophisticated multi-channel interaction-based data model which encompasses customers’ web visits, email responses, retail transactions and even social media activity, it is clear that our basic ‘currency’, the only differentiating factor we have, to analyse anything of significance will eventually (and then always) be at the customer level.
Having said that, if we’re at the point of re-drawing the boundaries of what constitutes ‘very good’ customers from ‘good’ and so on, we can also have an eye on what shape of curve we’d like it to produce, based on recent customer behaviour tracked against information known about customers before that activity occurred. As I have already outlined, the process of measuring the performance of an activity has two basic roles: to assess both its magnitude and its efficiency. A curve that simply emphasises the magnitude of success is too steep and does little to imply where further success might be found. A curve which places too much concentration on efficiency tends to be too horizontal and very quickly can become practically non-predictive.
Obviously, there will always be customers who are more responsive than others in any database so it’s true to say that any curve will show degradation. In fact, as it’s a symptom of a correct profiling methodology, activation curves should have a degrading, downward-sloping shape from customers who are predicted to be the most responsive, down. It’s also fair to presume that if you measure a list against any given single metric, there will always be a ‘best of the rest’, chosen using a different metric which may out-perform the usual list, so at some point a secondary or even a tertiary segmentation metric should be considered. A problem can occur if those segments suggested by other metrics out-perform the primary-metric segments by too much. This may imply that a better, more appropriate primary profile would have included those names in the first place, something which would ensure the risk of missing such customers from a future campaign is minimised.
An Easy Win: Challenging the Timeframe
One way to improve the primary metric we have (Binary) may be to re-define the timescale of the selection. The version of Binary that we've adopted is based on Yes/No (or 1/0, hence the name 'Binary') classifications of a customer's ordering profile over each of the last four six-month seasons. It is entirely predicated on the fairly standard assumption that a customer is a customer from the date of their first order until exactly two years beyond the date of their last order. By extension, anyone on the list who hasn't ordered for over two years must be considered a lapsed customer and is removed from the house list. They may continue to be contacted, but only as part of a reactivation programme.
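Expressed as a rule, that assumption is very simple. A minimal sketch, with field names of my own choosing and two years taken as 730 days:

```python
# A minimal sketch of the two-year dormancy rule (field names are illustrative;
# two years taken as 730 days).
from datetime import date, timedelta

def is_lapsed(last_order_date: date, as_of: date,
              dormancy: timedelta = timedelta(days=730)) -> bool:
    """True if the customer falls outside the house list on the given date."""
    return (as_of - last_order_date) > dormancy

print(is_lapsed(date(2015, 3, 1), as_of=date(2017, 11, 20)))   # True  - reactivation pool
print(is_lapsed(date(2016, 9, 1), as_of=date(2017, 11, 20)))   # False - still on the list
```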
The Binary system's two-year basis and its adoption by 'mainstream' catalogue operators such as Littlewoods and La Redoute seem to go hand in hand. I have always been (and remain) dubious that the simplistic 'two-year rule' applies as strongly in a niche market such as our own. As a 'safety net' against pinning our performance on adhering to it, I ensured that our mailings included a 'best of the rest' deep-dive, based on high point-scoring customers (who would therefore have been mailed under our previous segmentation model) who, being outside of the Binary segments, would have been inactive for over two years.
As we have seen from the most recent data, this 30,000-deep segment yielded a response (and therefore a Return on Investment) performance similar to the '12' group in the standard 4-season Binary model. Evidently, our less Recent, more Frequent and/or higher Monetary-value individuals were able to outperform most of their more Recent counterparts. The cut-off at two years has always seemed arbitrary and inappropriate for us – and these figures appear to support that position. Recency is therefore not necessarily 'king' in a niche market, even if it may be considered as such by more mainstream operators.
To corroborate this view, perhaps it’s helpful to contrast the characteristics of a mainstream proposition and a mainstream customer with those propositions in a more niche market context.
Mainstream v Niche: Some Observations
Mainstream catalogue companies have tended to define their core markets more by the way they choose to buy (i.e. by choosing not to walk into a shop) far more than by the type of products they buy. They are in competition with a far wider section of the market, selling standard products to a broad section of the public. Light fittings, pyjamas, holiday footwear and all the other day-to-day offerings were always generally available on any high street or in a plethora of other catalogues or websites, in which there is usually massive competition. It is therefore difficult for them to create a sense of what their brand represents beyond their pricing, the quality of their merchandise and their service – certainly no-one can define their range as a whole as representing and supporting a ‘lifestyle choice’. Even before the further commodification of retail by search engine and affiliate sites, their offering was often close to being commodified by the presence of so much competition.
It is easy (and perhaps fair) to conclude that they must therefore adopt a 'plenty more fish in the sea' approach, favouring customer acquisition over retention. If customers are that easily acquired, and if retention can prove to be so difficult, it follows that it is seen as far easier to entice a new customer than to win back one who has been away for even a relatively short amount of time. It's dangerous to suggest they acted arbitrarily in arriving at two years as the determinant of dormancy; it seems reasonable to expect that it was driven by their data, suggesting a parameter that was appropriate for their purposes.
Conversely, niche market businesses tend to define their customers by a specific activity or affinity, which is to a greater or lesser extent important to all of their customers. They may find that the percentage of customers willing to buy remotely in that market is far higher than in general (historically) because of the relative lack of credible alternatives. Broader ranges of products that appertain to that activity or affinity may be more difficult to build, depending on the obscurity or the scope of that activity or affinity. Wider competition will always be present but, at their strongest, these niche markets are filled with customers who define their interest as a ‘lifestyle choice’. These brands do not just purvey goods, they represent or even define a lifestyle.
In a niche market, almost by definition, there aren’t quite so many ‘other fish in the sea’ and even customers who have been lapsed for a number of years are a far greater prospect to approach once more than any attempt to trawl for a fresh batch. If customers are not so easily acquired, and if retention proves less difficult than in the mainstream sector, it follows that it is disproportionately easier to entice an older customer than it is to acquire a new one. It seems clear that these markets inherently find the mainstream parameter of dormancy at two years to be inappropriate for their purposes.
Extending the Binary System from Two Years to Three
The Binary system's strengths are its customer-centricity, its ability consistently to predict the difference in response between more regular-and-recent and less regular-and-recent customers, and its scalability. Its weakness is the fact that we can prove that it has omitted perfectly responsive customers. Perhaps this can be corrected by using its scalability to ensure that they are re-admitted into the process. Under a four-season (two-year) standard model, the categories are defined by fifteen groups, which is the number of permutations of order activity (or inactivity) across four seasons. One point is awarded for an order in the least recent reported season (four seasons ago), two points for an order three seasons ago, four points for orders from the penultimate season and eight for the last season. The number of points awarded doubles the more recent the season, which seems like an arbitrary system but is actually an ingenious mechanism to ensure that every single permutation is represented by a different number of points.
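A sketch of that scoring, as I understand it from the description above (the function name and flag ordering are my own):

```python
# A sketch of the doubling score: flags are 0/1 per season, oldest first, worth
# 1, 2, 4 and 8 points respectively, so each permutation maps to a unique group.
def binary_group(seasons_ordered):
    """seasons_ordered: list of 0/1 flags, oldest season first."""
    return sum(flag << i for i, flag in enumerate(seasons_ordered))

print(binary_group([0, 0, 1, 1]))   # 12 - ordered in the last two seasons only
print(binary_group([1, 0, 0, 0]))   # 1  - ordered only four seasons ago
print(binary_group([1, 1, 1, 1]))   # 15 - ordered in every season
```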
In this way, we may contend that Recency is a vital factor in predictive modelling whilst also expecting to target customers who are patently less Recent in profile. The crucial point is that we have evidence suggesting we cut off responsive customers too readily by adhering to a 'two-year rule'. By re-introducing segments of longer-dormant customers, we become able to evaluate their relative value – and therefore the predictiveness of this wider flavour of Binary analysis. As with the current four-season model, there's also the thorny issue of how many high-performing customer segments even this model may continue to ignore.
We can’t turn back time but we can simulate the conditions of a six-season Binary selection. It is possible to re-order the customers we may have selected for the current Autumn/Winter campaign using a six-season Binary model. From there, we can identify not only which customers were mailed but also which customers placed an order in the current campaign and compare them with the equivalent responses using the usual 15-point, four-season Binary model. With six seasons, the number of permutations of orders increases from 15 to 63.
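The same doubling scheme extends naturally, since six seasons give 2**6 - 1 = 63 non-empty permutations, each with its own group number. A quick check, re-using the sketch above:

```python
# Six-season extension of the doubling scheme: 2**6 - 1 = 63 non-empty
# permutations, each mapping to a unique group number (re-using binary_group).
from itertools import product

groups = {binary_group(list(flags)) for flags in product([0, 1], repeat=6)} - {0}
print(len(groups))                          # 63
print(binary_group([0, 0, 0, 0, 1, 1]))     # 48 - ordered only in the two most recent seasons
```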
This hugely increases the level of granularity that the list analysis can give and will also help to establish the importance of the 5th-last and 6th-last season on predictiveness for a forthcoming campaign. Using four seasons (two years), the Binary graph for Activations from the current Autumn/Winter campaign to late November looks like this:
The same response data under a six-season (three year) Binary grouping shows a similar degradation but with more definition between high-performing and low-performing segments.
The added granularity helps to provide more evidence of predictiveness at each end of the Binary spectrum. Almost two hundred more customers are classified in groups which yielded an Activation rate of over 40% than in the 15-point model, and over six hundred more customers are classified in groups which yielded less than a 10% Activation rate. If a 10% rate were shown to be the break-even point for inclusion, then this information would identify names for whom the Binary model would not predict a sufficient response. If no other justification could be found to mail those names, then that information could demonstrate a saving of unnecessary expenditure.
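A sketch of that suppression logic, using placeholder Activation figures rather than the campaign's actual numbers and an assumed 10% break-even:

```python
# A sketch of the suppression logic: flag six-season groups whose Activation
# falls below an assumed 10% break-even. The rates here are placeholders.
activation_by_group = {63: 0.46, 62: 0.41, 48: 0.31, 32: 0.18, 8: 0.09, 1: 0.04}
break_even = 0.10

suppress = sorted(g for g, rate in activation_by_group.items() if rate < break_even)
print(suppress)   # groups we wouldn't mail without some other justification
```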
A ‘Health Warning’ for Any Model of Segmentation with a Single Axis
As we’ve already seen, demonstrating a suitably stratified segmentation model is only the first requirement of achieving a fully-optimised list. We must also ensure that no other potentially responsive segments are omitted. I’ve also highlighted the almost inevitable need for some subsequent segmentation criteria to exist beyond the reaches of the primary (in this case, Binary) model. Not only should this process stimulate as many as possible of the remaining responsive segments (a ‘best-of-the-rest’ group), it should also seek to test other responsive techniques beyond that.
A good example of that methodology would be the segmentation of customers, irrespective of Binary and Points, who have previously ordered during a Sale, for a mailing of a Sale Catalogue. This is based on a given principle (that a customer is a known Sale responder). In the field of probability, this is known as Conditional Probability: the existence of a given condition results in outcomes with a higher degree of probability and therefore predictiveness. The methodology appears sound, but the results may or may not agree; either way, they will shape our future selections.
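In code terms, the comparison is simply the overall activation rate versus the activation rate among customers who satisfy the condition. A toy example with invented flags and figures:

```python
# A toy comparison of overall activation versus activation among known Sale
# responders (the conditional). Flags and figures are invented.
import pandas as pd

customers = pd.DataFrame({
    "previous_sale_order": [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "activated_this_sale": [1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
})

p_activate = customers["activated_this_sale"].mean()
p_given_sale = customers.loc[customers["previous_sale_order"] == 1,
                             "activated_this_sale"].mean()

print(f"P(activate)              = {p_activate:.0%}")     # 30%
print(f"P(activate | Sale buyer) = {p_given_sale:.0%}")   # 67%
```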
Currently, our preferred secondary metric is the Points from our long-standing ‘PointsAnalysis’ table, which was created for our previous segmentation technique, where customers accrue 100 points every time they order, gain 1.5 points for every pound they spend and lose a point for every day that passes without an order.
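A sketch of that scoring, assuming the daily deduction is counted from the most recent order (the rules above don't spell that detail out):

```python
# A sketch of the Points rules: +100 per order, +1.5 per pound spent, -1 for each
# day since the most recent order (my assumption about how the deduction applies).
from datetime import date

def points_score(orders, as_of):
    """orders: list of (order_date, order_value_gbp) tuples; as_of: scoring date."""
    if not orders:
        return 0.0
    earned = sum(100 + 1.5 * value for _, value in orders)
    days_since_last = (as_of - max(d for d, _ in orders)).days
    return earned - days_since_last

history = [(date(2017, 3, 10), 85.0), (date(2017, 9, 2), 40.0)]
print(points_score(history, as_of=date(2017, 11, 20)))   # 387.5 - 79 = 308.5
```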
In order to pursue this line of segment development, we will need to more clearly record what segments were used and on what basis. Where transient variables such as Points are used, the figures at the date of segmentation need to be written back to the database to enable better, easier analysis and cross-reference between the segments used and their eventual performance.
As part of my reconstruction of a six-season Binary in Excel, I have been able to identify customers mailed with and responding to the Spring/Summer Deep Dive catalogues. I have also been able to reverse engineer their historic Points level at around January 15th, based on their November Points and their known activity since January. This graph is what that analysis suggests. I can’t guarantee perfect accuracy within each Points band but I can say that the totals for each group match those given by a report for the activity of groups -1 to -5.
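For what it's worth, the reverse-engineering amounts to undoing the Points rules over the intervening period. A simplified sketch, which treats the daily deduction as linear and ignores its reset at each intervening order (one reason the band-level figures can't be perfectly accurate); all figures are illustrative:

```python
# A simplified reversal of the Points rules over the intervening period. It treats
# the daily deduction as linear, ignoring resets at intervening orders, so it is
# only an approximation. All figures are illustrative.
def points_at_earlier_date(points_now, orders_since, spend_since_gbp, days_elapsed):
    return points_now - (100 * orders_since) - (1.5 * spend_since_gbp) + days_elapsed

# e.g. 620 points in November, 2 orders worth 90 GBP placed in the 309 days since 15th January
print(points_at_earlier_date(620, orders_since=2, spend_since_gbp=90, days_elapsed=309))
# -> 620 - 200 - 135 + 309 = 594
```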
These responses strongly suggest that there are responsive customers to be found outside of the 4-season Binary model we employed in that season, bearing in mind that the Activation level for Binary group 1 was 4.6% and the “-1” Deep Dive group yielded around 18% Activation.
Conclusion
There’s nothing wrong with mailing across multiple axes of segmentation, as long as the hierarchy is established (if a customer qualifies for a segment in each method, which one wins and which method is left with the rest?) and as long as each segment is performing well. Curves which become too horizontal may still be predictive at the level of each category but also show that the method itself has begun to lose its predictiveness at that point. Thought should be given to the point in the list/on the axis at which one model is abandoned and another is given free rein to replace it.
For example, using the derived ‘live’ data for the current campaign below, comparing six-season Binary with Deep-Dive, based on Points, it may be concluded that, long tail or not, groups 1-4 should not be mailed but those quantities replaced by the best of the rest on the Points scale.
There are of course too many variables to merely prescribe a 'one-size-fits-all' answer here. Issues of the quantities of names available within each group, together with associated AOVs and break-even Activation levels, all play a part. The main issue at this stage is that we give ourselves the impetus, and the tools, to break away from a single system of segmentation, as long as our focus remains at the customer level.
Whatever we do, it should be a far more scientific process than simply betting on the horses…
If you’re in business, you’ll probably be aware of the impending changes to data privacy law (known as GDPR), about to take effect. You may also have noticed lots of companies have recently started to ask you to continue to opt-in to their emails. If this is all news to you and you don’t know about GDPR, it might be a good time to read up about how it will affect your business.
Basically, the new law effectively resets many of the permissions we currently have to hold and use your contact data. If we’ve been emailing you for many years, it’s possible that we’re holding your contact data without all the permissions that GDPR now requires – which means we won’t be able to contact you from May 25th without good reason to do so, or your given permission.
Obviously, we don’t want to lose contact with any of our valued customers so between now and then, like many other companies, we’ll be encouraging as many of our contacts as possible to re-subscribe to our emails. We apologise in advance if it looks like we’re over-doing the reminder activity but we expect every other business to be doing the same – and we’ll all have to stop doing it by May 25th!
In order to keep receiving our important industry and CSG-related information via email, all you have to do is visit our Contacts page and fill in the brief form, confirm you're not a robot and click on the red button. To make it interesting, we challenge you to see if you can do the whole thing in under sixty seconds. Are you up for that?