Saturday, 2 February 2019

Receivables Segmentation using Decision Trees : Part 2

To better understand what the two formulas described in the previous post really mean, let's build a practical use case: building simple rules to determine whether an account will be recovered.

The table below contains 10 accounts. For example, the first account is under 30 days DSO, has a high Amount due and Credit Score, and has been recovered.

In our example, as the root node contains 6 recovered accounts and 4 not recovered, we need to split the root node. This process is then repeated to create child nodes with the least impurity.

If we split the root node based on DSO, we will get the following entropies :

Entropy(Under 30) = −[(4 / 4) * Log2 (4 / 4) + (0 / 4) * Log2 (0 / 4)] = 0

Entropy(Between 31 and 60) = −[(1 / 4) * Log2 (1 / 4) + (3 / 4) * Log2 (3 / 4)] = 0.811

Entropy(Over 61 days) = −[(1 / 2) * Log2 (1 / 2) + (1 / 2) * Log2 (1 / 2)] = 1

Because the dataset contains four accounts Under 30, four Between 31 and 60 days, and two accounts Over 61 days, the impurity of this split is calculated as:

Impurity(DSO) = (4 / 10) * 0 + (4 / 10) * 0.811 + (2 / 10) * 1 = 0.52

To build this calculation in Excel we will use :

  • SUM to compute the number of accounts for a given attribute value,
  • the division operator (/) to compute the fraction of observations having each possible attribute value,
  • LOG to calculate each entropy term, wrapped in IFERROR so that the undefined term 0 * Log2(0) is replaced by 0,
  • SUM to compute the entropy of each possible node split,
  • SUMPRODUCT to calculate the impurity of each split.
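The Excel steps above can be mirrored in a few lines of Python as a sanity check (a minimal sketch; the class counts come from the 10-account table, and the helper names are ours, not part of the spreadsheet):

```python
import math

def entropy(counts):
    # Entropy of a node, using the convention that 0 * Log2(0) = 0
    # (the role played by IFERROR in the spreadsheet).
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# (recovered, not recovered) counts for each DSO value in the table.
splits = {"Under 30": [4, 0], "Between 31 and 60": [1, 3], "Over 61": [1, 1]}
n_total = sum(sum(c) for c in splits.values())  # 10 accounts

# Impurity of the split = weighted average of the child-node entropies
# (the role played by SUMPRODUCT in the spreadsheet).
impurity = sum(sum(c) / n_total * entropy(c) for c in splits.values())
print(round(impurity, 2))  # → 0.52
```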

As shown in the result above, the impurity of DSO (0.52) is the smallest among the 3 features. We should then start by splitting on the DSO feature.

The same process can then be repeated on the new subtree. The DSO Under 30 and DSO Over 61 nodes are pure, so we don't need to split them.

DSO Between 31 and 60 is impure. Splitting it again on DSO gives an impurity of 0.8, whereas splitting on Credit Score or Amount due yields an impurity of 0. So we can split on either the Credit Score or the Amount due variable.

Finally, we are able to build the following decision tree, holding a simple classification rule of Accounts Receivables.

This simple rule-based classification tree might help Receivables Analysts to prioritize their actions towards receivables that might need more effort to collect.

Get the Receivables Decision Tree spreadsheet template for free

If you are interested in having your own copy of the Decision Tree template, implementing the entropy and impurity calculations for Receivables Analytics, please subscribe below to get the download link.


In this article, we learned how to model a simple decision tree using Excel. The key issue in creating a decision tree is its size. By building deep trees we might obtain pure terminal nodes, but this usually overfits, which means the model will perform poorly on new data (different from the training set). This overfitting problem gave birth to a lot of research, such as Leo Breiman's CART algorithm, and to more advanced tree-based models such as Random Forest and GBT, all of which have been successfully implemented in R and Python packages. Some of these implementations, XGBoost for instance, are considered among the most winning models in Data Science competitions (Kaggle).

Tuesday, 22 January 2019

Receivables Segmentation using Decision Trees : Part 1

Who said we cannot build Decision Trees in Excel? Probably you didn't, but I bet you thought it was impossible.

This post is the first in a series of two posts where we will explore how a Receivables Analyst can use Microsoft Excel to build a simple Decision Tree model helping to segment Accounts Receivable.

What is a Decision Tree?
Decision Tree-based prediction models are one of the most widely used families of classifiers in Machine Learning. You can have a look at The Victory Lab (by Sasha Issenberg) to understand how decision trees were used in a very concrete case: Obama's successful 2012 reelection campaign.
One of the important advantages of Decision Trees is their simplicity: their decision model can be understood by people with little statistical training, while helping to predict the value of a binary outcome from several independent variables.

A bit of theory about Decision Trees

Let's start with some theory. A decision tree starts with a root node that includes all combinations of attribute values; then we decide which are the "pertinent" variables to "split" the root node on to create the best improvement in class separation. This process is then repeated to build all the branches of the tree.

The split of a node should only take place on impure data, and a node is considered pure if all the data points associated with it have the same value of the dependent variable.

Impurity of a split can be measured as the weighted average of the impurities of the child nodes created by the split, where the weight of a child node is proportional to the number of data points in that node.

The impurity of a child node might be calculated in different ways: Entropy, Gini Index, Classification error. In this article, we will use entropy to measure the impurity of a node.

To measure the entropy of a node, suppose there are p possible values (0, 1, ..., p−1) for the dependent variable, and assume the child node is defined by the independent variable X being equal to a. The entropy of the child node is then computed by the following equation:

H(X) = −Σ (i = 1 to p) P(xi) * Log2 (P(xi))
                                 Where P(xi)  is the fraction of observations in class i

The impurity associated with the split is then calculated as 

Impurity = (Σ  ni * Entropy(i))/ N

Where ni is the number of observations having the split variable equal to i,
N is the total number of observations in the parent node.
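As a quick numeric check of the entropy formula, here is a minimal Python sketch (the function name and test fractions are illustrative, not from the post):

```python
import math

def entropy(fractions):
    # H(X) = -sum of P(xi) * Log2(P(xi)) over the p classes;
    # the term for P(xi) = 0 is taken as 0 by convention.
    h = -sum(p * math.log2(p) for p in fractions if p > 0)
    return h + 0.0  # normalise IEEE -0.0 to 0.0 for pure nodes

print(entropy([1.0, 0.0]))  # → 0.0 (pure node, no split needed)
print(entropy([0.5, 0.5]))  # → 1.0 (maximally impure node)
```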

Monday, 10 December 2018

Collections Connect Rate in DAX

Collections Connect Rate (CCR) is one of the first Key Performance Indicators that needs to be implemented within the Collections departments. It measures the number of outbound calls that succeeded in connecting to a valid phone number over the total number of outbound calls.

Naturally, a healthy Collections process requires a high CCR. Benchmarks show that the average value of that KPI is about 34.9%.

Accounts having a low value of this KPI might be considered unlikely to be recovered, thus increasing charge-off rates, which might increase financial risks, especially regarding the IFRS 9 expected loss calculation.

A low value of the CCR indicates inefficiencies within the customer relationship management processes that handled the customer accounts before they were transferred to Collections. This KPI can be improved by strengthening the Data Quality process, organising and auditing contact data on a regular basis, and using multiple company and public databases to enrich contact records.

The calculation of CCR takes into consideration all outbound calls connected to a valid phone, including right party contacts (borrower or other trusted party) and wrong party contacts. The only condition is that the contact phone number should be in service. The CCR is calculated as follows:

CCR = 100 * Outbound Calls Connected to a Valid Phone Number / Total Number of Outbound Calls
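The arithmetic can be sketched in a few lines of Python on toy data (the call list below is made up for illustration; the status values follow those used for NewStatus later in the post):

```python
# Toy list of outbound-call outcomes; every status other than
# "Call not connected" means the call reached an in-service number.
calls = ["Call Connected", "Call not connected", "Right party contact",
         "Call not connected", "Promise to Pay"]

connected = sum(1 for status in calls if status != "Call not connected")
ccr = 100 * connected / len(calls)
print(ccr)  # → 60.0
```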

Suppose CollectionActions[] is the table containing details about the Collection outbound calls such as the AccountId, Date, Time, ScriptId, AgentID, PreviousStatus, NewStatus...

The PreviousStatus and NewStatus can have the following values depending on the result of the collection action: Call not connected, Call Connected, Right party contact, Promise to Pay, Partial Payment, Installment payments, payment in full...

Let's calculate CCR in DAX.


'CollectionActions'[CallsValidNumber] =
    'CollectionActions'[NewStatus] IN { "Call Connected" }

CCR =
    -- DIVIDE returns BLANK when there are no outbound calls
    DIVIDE (
        CALCULATE (
            COUNTROWS ( 'CollectionActions' ),
            'CollectionActions'[CallsValidNumber] = TRUE ()
        ),
        COUNTROWS ( 'CollectionActions' )
    ) * 100


Tuesday, 27 November 2018

Aging Accounts Receivable in DAX

When your collection teams are working old accounts receivable while catching up with current billing, things can quickly get overwhelming.
Sure, the account balance is an important priority criterion, but you also need to track other features and prioritise your efforts based on them. Probably the simplest feature you can build is the account's aging.
To do so, the simplest way is to use your favourite Self-Service BI tool (I have a preference for Microsoft Power BI) and build your aging reports by classifying accounts that are more than 60, 90, 120, and 180 days old, which will help you focus first on the ones you have a better chance to collect. Naturally, the older the A/R is (180-day+), the less likely you are to get paid on it.

Then, you can decide, based on your write-off policy (which will impact your expected credit losses, IFRS 9), which accounts should be written off, considered bad debt, and probably outsourced to Debt Collection Agencies.
We will show in the following how you can define your buckets in DAX (the Power BI language):

Suppose Receivables[] is the table containing details about Receivables accounts such as Due Date, Due Amount, Customer etc..

For instance, let's build the measure for the 1 to 60 Days bucket:


1 to 60 Days =
    CALCULATE (
        [Due Amount],
        FILTER (
            VALUES ( 'Receivables'[Due Date] ),
            INT ( TODAY () - 'Receivables'[Due Date] ) >= 1
                && INT ( TODAY () - 'Receivables'[Due Date] ) <= 60
        )
    )
This formula can then be rewritten for the other buckets: 61 to 90 Days, 91 to 120 Days, 121 to 180 Days and Over 180 Days.
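For readers outside Power BI, the same bucketing logic can be sketched in Python (a minimal illustration; the function name and the "Current" label for not-yet-due items are our assumptions):

```python
from datetime import date

def aging_bucket(due_date, today):
    # Classify a receivable by its age in days past the due date,
    # using the buckets described in the post.
    age = (today - due_date).days
    if age <= 0:
        return "Current"
    if age <= 60:
        return "1 to 60 Days"
    if age <= 90:
        return "61 to 90 Days"
    if age <= 120:
        return "91 to 120 Days"
    if age <= 180:
        return "121 to 180 Days"
    return "Over 180 Days"

print(aging_bucket(date(2018, 9, 1), today=date(2018, 11, 27)))  # → 61 to 90 Days
```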

Friday, 19 October 2018

Analytics for Accounts Receivable Collection

Accounts Receivable Collection is probably one of the most data-intensive fields. Collection agents handle data all along the way while managing their debt portfolio.

It starts with the Data Cleaning and Enrichment process, where missing and erroneous data are wrangled and transformed, then enriched with important contact data such as debtors' emails, phone numbers, and addresses formatted to the local postcode standard.

The second Data-Driven process is Portfolio Analytics, where portfolio managers operate Portfolio Segmentation, whose main aim is to split the debt portfolio into various coherent buckets using multiple criteria such as Receivables Aging, Debt balance, Credit Bureau Score, Location, Purchased products, etc.

In addition to the rule-based segmentation described above, Debt Collection Analysts might use Machine Learning techniques to learn from historical debtor behaviour regarding collection actions. This learning helps build a Debt Collection Score describing the probability of recovery of the debt, and can even recommend the best actions and timing to undertake to maximise the recovery likelihood.

Last, but not least, the Data-Driven process to take into consideration is Key Performance Indicator (KPI) tracking. Data Analytics can help managers analyse collection logs and produce pertinent KPIs describing the efficiency of the collection process, from Data Cleaning and Enrichment, to Generating Promises, Keeping Promises, and Payments. The most important KPIs to track are the following:

  • Percentage of Accounts with Missing or Incomplete Data
  • Percentage of Accounts Requiring Skip tracing
  • Average Age of Purchased Debt
  • Right Party Contact (RPC) Rate
  • Percentage of Outbound Calls Resulting in Promise to Pay
  • Percentage of  Promises to Pay Kept
  • Days Sales Outstanding (DSO)
  • Collection Effectiveness Index (CEI)
  • Collections Revenue per Collections Agent

Friday, 14 September 2018

AI based Pricing and the trade-off between efficiency and fairness

In theory, the AI based Pricing Process may provide a technical framework for building an economically efficient Dynamic and Personalised Pricing, capturing the perceived value of the product for every single customer. 

In practice, such Pricing tactics might cause some unwanted serious side-effects, and raise huge concerns among customers regarding the fairness of such decisions.

Indeed, in the airline industry, although time-based price discrimination (where companies change prices for all customers based on the remaining time to flight) is today relatively well accepted, other pricing tactics that were experimented with, based on the customer's specific OS (Mac vs Microsoft) and the use of cookies, were rejected.

In other words, the experience of Pricing Optimisation and Revenue Management developed in the airline industry showed that customers are not radically against the principle of price discrimination, but are rather concerned with the discrimination criteria.

Therefore, the massive use of more and more complicated Machine Learning and AI models, such as Ensemble Models and Neural Networks, with the sole objective of capturing the maximum value from each single customer transaction, makes the discrimination criteria difficult to interpret and understand. This may be badly perceived by customers and considered an ambiguous, arbitrary and unfair way of making decisions.

In order to make this personalised, AI-based Pricing more acceptable and reduce its perceived unfairness, companies need to put more constraints on those models regarding their price-update frequency and variance. Also, instead of dynamically and publicly updating product prices, companies may use targeted discounts as a way to implement pricing changes. And above all, companies need to share and communicate the pricing and discounting rules to their customers in a transparent manner.

Thursday, 16 August 2018

Dynamic, Customised Pricing using AI

Pricing is one of the most important exercises in Marketing; it's the final validation of the value of products to customers.

We might succeed brilliantly in the customer acquisition process, using the best SEO, targeted ads and Marketing Automation techniques, and yet fail at the customer activation step because the suggested price is beyond the customer's perceived value for that product.

Dynamic and Personalised Pricing

This relationship between pricing and value reveals the problem we face: trying to build a decision about the Price, a concrete and measurable attribute of the product, based on an intangible, subjective and dynamic concept, the Value.

In the modern pricing theory,  the pricing optimisation problem takes into consideration the consumer surplus hypothesis, where we assume that the buying decision takes place only if the perceived value is higher than the price of the product and that every customer seeks to maximise his consumer surplus.

From that perspective, companies optimise their pricing tactics within the interval between the product cost and the perceived value of the product, subject to the constraints dictated by the company's Pricing Strategy, whether it's a skimming, penetration or price-war strategy, etc.

To build a great Pricing strategy, companies need to develop a deep understanding of the perceived value of the product to their customers.

The value is dynamic because it keeps changing with the current situation of the customer. Obviously, an umbrella on a rainy day does not have the same value as on a sunny one.

The value is also personalised because it depends on the background of the customer and the context of the transaction. A customer performing an extensive search for a product at 4 a.m. probably does not value the product the same way as if he were searching for it at 4 p.m.

In theory, that means that to build Dynamic and Customised Pricing tactics, we need to take into consideration the timing, the context and the customer profile. This may result in different prices, at different times, for each different customer.

Big Data and real facts instead of intentions

One of the most common approaches used in the past to infer the value of a product was based on survey data: Marketing Analysts show consumers several product profiles, varying Price and other product attributes (ideally orthogonally designed), and ask them to rank these profiles. Analysts then use Full-Profile Conjoint Analysis, Hybrid Conjoint Analysis or Discrete Choice Analysis to infer the importance of Price (and the other product attributes) to customers and to estimate its price elasticity. The pricing problem is then framed as an optimisation problem maximising Revenue or Profit, and tools such as the Excel Solver engines (Simplex, GRG and Evolutionary) were then used to find the best price.

Today, Big Data enables us to be more factual. Indeed, we no longer need to rely on customers' declared intentions about whether they would buy and at which price. We can simply use server log data, containing detailed information about the real choices customers made.

Feature engineering for Pricing

Once we have collected all the transaction log data, building the right Dynamic and Customised Pricing tactics is first a matter of Feature Engineering: defining and selecting the most important attributes to take into consideration in the pricing process.

Those features could be Product related, such as brand, price, colour, technical features of the product. They could be Customer related, such as sociodemographic attributes and customer’s purchase history.

The Pricing could also be built using more advanced features, such as mouse-click data, the identification of the acquisition channel, and the customer's interactions with the company's communication campaigns via direct emails, Facebook, Twitter, etc.

Machine Learning Techniques

Once the Feature Engineering is done, using the data sources available at the company level, comes the modelling task.

The Pricing problem can be modelled as a two-step classification problem: first, we predict whether the user will buy the product; if the prediction is positive, we then predict the price he might be willing to pay. The two classifiers can be trained as supervised machine learning models, learning from historical data and using the engineered features.

Many classifiers can be used, such as Logistic Regression, Generalised Linear Models (GLM), Extreme Gradient Boosting (XGBoost) and Neural Networks.

The choice between those techniques depends mainly on the trade-off between the accuracy and the scalability we might need.
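To make the two-step framing concrete, here is a minimal Python sketch with placeholder rule-based models standing in for the trained classifiers (all feature names and thresholds are illustrative assumptions, not from the post):

```python
def will_buy(features):
    # Step 1: placeholder purchase classifier (stands in for a trained model).
    return features["pages_viewed"] >= 3

def price_band(features):
    # Step 2: placeholder willingness-to-pay classifier, only consulted
    # when step 1 predicts a purchase.
    return "premium" if features["past_purchases"] > 5 else "standard"

def predict(features):
    # Chain the two classifiers: no purchase predicted means no price to set.
    if not will_buy(features):
        return None
    return price_band(features)

print(predict({"pages_viewed": 4, "past_purchases": 8}))  # → premium
```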

Monday, 15 January 2018

Thoughts about GDPR

The General Data Protection Regulation (GDPR) (Regulation (EU) 2016/679) stands for the regulation by which the European Parliament, the Council of the European Union and the European Commission intend to strengthen and unify data protection for all individuals within the European Union (EU). It also addresses the export of personal data outside the EU.

Let me share in this post some thoughts about the GDPR and its challenges:
  • It's about consent: companies need the customer's consent for each specific use of their data. This consent can be withdrawn whenever the customer wants, and if a company obtains new data sources, the previous consent doesn't cover them.
  • In case of a security breach, companies have to report, within 72 hours, the data involved, what happened, how quickly it was resolved, and its implications and impact.
  • The right to be "forgotten": Each individual has the right to ask for erasing all the data related to him or her.
  • Justify the use of data: Regulators may ask questions about the use of data: why needed, why holding it so long, how was it validated, compliance with the customer's consent.
  • Portability of the data: Customers may request all the data a company holds on them to be transferred to another company.
  • Creation of a Data Protection Officer role within companies: DPO
  • If the data is passed to an outsourcer or a partner, the company's responsibility also covers the actions undertaken by that outsourcer or partner.
  • Fines up to 4% of turnover, depending on cases.

IMHO, to date some of the requirements of the GDPR are still ambiguous and need clarification before being implemented:

  • The scope of the data considered in the GDPR regulation
  • I'm wondering if the GDPR gives the Regulators the authority to question the decision made by algorithms, and the models used for such decisions (such as scoring)
  • In case a customer asks for their data to be erased, what data may the companies keep? For instance, can they keep the data used to meet the legal obligations of the collection process?

Definitely, the GDPR regulation raises two opportunities:
- The first one is within the Consulting business, to help companies have a safe drive into GDPR compliance.
- The second one is about Engineering challenges, to help companies build the right IT, enabling their DPOs to monitor, protect, report, erase and transfer data in line with the requirements of the regulation.