I will start with a brief introduction to Business Intelligence terminology to clarify the topic covered here. A metric, or fact, is a specific data point:
– turnover % (heads terminated/total [active] heads)
– revenue per person (total revenue $/total [active] heads)
– ratio of HR administrators to Employees (total [active] HR employee heads: total [active] heads)
Business Intelligence facts are fixed calculations that are charted over time for different organizational segments. They can be absolute counts, percentages, ratios, or any other mathematical measure. While they are open to interpretation, they usually consist of a fixed mathematical operation such as count, sum, minimum, maximum, mean, standard deviation, or variance.
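To make the idea of a fixed calculation concrete, here is a minimal Python sketch of the three metrics listed above. The function names and the input figures (10,000 active heads, 1,000 terminations, $500 million in revenue, 100 HR heads) are hypothetical and chosen only for illustration; they do not come from any real organization or reporting tool.

```python
def turnover_pct(terminated_heads, active_heads):
    """Turnover %: heads terminated / total active heads, as a percentage."""
    return 100.0 * terminated_heads / active_heads


def revenue_per_person(total_revenue, active_heads):
    """Revenue per person: total revenue ($) / total active heads."""
    return total_revenue / active_heads


def hr_ratio(active_hr_heads, active_heads):
    """Ratio of HR administrators to employees, formatted as '1:N'."""
    employees_per_hr_head = active_heads / active_hr_heads
    return f"1:{employees_per_hr_head:g}"


# Hypothetical inputs, for illustration only.
ACTIVE_HEADS = 10_000
TERMINATED_HEADS = 1_000
TOTAL_REVENUE = 500_000_000
ACTIVE_HR_HEADS = 100

print(f"Turnover: {turnover_pct(TERMINATED_HEADS, ACTIVE_HEADS):.1f}%")
print(f"Revenue per person: ${revenue_per_person(TOTAL_REVENUE, ACTIVE_HEADS):,.0f}")
print(f"HR to employee ratio: {hr_ratio(ACTIVE_HR_HEADS, ACTIVE_HEADS)}")
```

Because each fact is just a fixed operation over counts, the same functions can be run per department, region, or period to chart a metric over time.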
Analytics refers to the mining, measurement and reporting of facts or metrics (often presented visually) over time as they relate to one another. Although this realm grows extremely complex in areas such as quantitative risk analysis (often used in financial services), for the purposes of Human Resources we only need to be familiar with some basic concepts:
positive correlation: across a sample set of data points, two metrics tend to move in the same direction (when one rises, so does the other), tracking in parallel like ‘//’. The strength of the relationship is expressed as a correlation coefficient between 0 and +1.
negative or inverse correlation: across a sample set of data points, two metrics tend to move in opposite directions (when one rises, the other falls), crossing like an ‘X’. Here the correlation coefficient falls between 0 and -1.
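A common way to put a number on how two metrics move together is the Pearson correlation coefficient, which ranges from -1 (perfect inverse movement) to +1 (perfect parallel movement), with 0 meaning no linear relationship. The sketch below computes it by hand in Python. The monthly figures and metric names are invented for illustration and are not taken from any real data set.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-movements around the means, and each series' own spread.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly metrics for one department (illustrative values only).
overtime_hours = [10, 12, 15, 18, 20, 25]
absence_days = [2, 3, 3, 5, 6, 7]        # tends to rise with overtime
engagement_score = [9, 8, 8, 6, 5, 3]    # tends to fall as overtime rises

print(f"overtime vs absences:   {pearson(overtime_hours, absence_days):+.2f}")
print(f"overtime vs engagement: {pearson(overtime_hours, engagement_score):+.2f}")
```

A coefficient near +1 or -1 only says the two metrics moved together in this sample. It does not say that one causes the other, which matters for everything that follows.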
For a more detailed intro to HR Analytics read the blog post: Human Resources Analytics Primer.
Predictive Modeling Errors
I buy mostly business books from Amazon.com, and for that reason I am presented with quirky recommendations such as “Who Moved My Cheese” courtesy of Amazon’s effective predictive analytics engine. While some of the recommendations are interesting, others miss the mark completely. The most frightening result of my shopping habits, as it applies to the Human Resources domain, is that the engine now recommends only a very narrow genre of books. I’ve lost all the random and creative associations that used to lead me to surprising new genres and authors. It has become very Stepford in its associations because the data it leverages is incomplete.
This same Stepford effect is a risk of predictive analytics in Human Resources. If I happen to hire a few great-performing 20-something-year-old men, that pattern becomes part of the predictive criteria and in turn encourages me to hire more young males. This is exaggerated to make a point, but it is the same effect witnessed in the failing risk models behind the subprime-mortgage-induced market crash (NY Times), and it should not be underestimated or dismissed as unrelated. In both cases the full set of data was either not possible to obtain or was not considered. Quantifying and predicting human behavior requires far more data than is possible to obtain with today’s Human Resources software and technology. This affects many areas of Human Resources, but it especially impacts the more behaviorally nuanced areas such as Talent Analytics: Performance, Recruiting, and Training.
Data Aging is one challenge. In short, predictive models are only good for a certain amount of time, because the factors involved in behavioral economics change over the years along with macro- and microeconomic trends. At a macroeconomic level, different age groups often weigh the same factors differently. External changes like inflation, unemployment, and wars all shape human behavior. Below is an example which aimed to show whether high-performing employees, by age and generation bands, stayed or voluntarily left an organization depending on compensation changes over 4 years. What is interesting is the variance between who left and who stayed by age (generation) and gender:
Aside from external factors, employees also face internal “microeconomic” influences, some of which can be measured and some of which cannot. For example, an employee’s long commute time can be captured in metrics. However, many personal factors cannot (for example, a spouse losing a job or becoming ill). So, with both internal and external factors constantly shifting, predictive models must be maintained regularly and sometimes redeveloped from scratch.
Data Volume is also a critical factor in predictive modeling for HR. Let’s say I would like to create a risk model that lets managers and HR determine which employees might be a flight risk. In a company of 10,000 employees, let’s say my voluntary turnover is 10% per year. That gives me a sample of only 1,000 leavers per year to work with. To increase that, I may look back over 3 years so I have 3,000 to work with. I probably want to limit my study to high performers (assuming I don’t care as much if low performers go), and that likely brings me to half that population, or 1,500. Additionally, I would want to include some demographic dimensions, as those heavily impact models in HR.
So let’s say I want to cut it by age and gender (following the example above), because a single woman in her 40s with 2 children is going to value different things in the employer relationship than a single man in his mid-twenties. Assuming an even split, I now have about 750 male and 750 female. Cutting each of those groups by generation leaves roughly: Baby Boomers 25% (188), Generation X/Y 40% (300), Millennials 35% (263). At these sample sizes the use of the word “predictive” becomes somewhat misleading. Yes, you can perhaps forecast better than before, but marketing materials often gloss over the weaknesses of these kinds of analyses. For companies of 5,000 employees and under, this is an important distinction.
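The shrinking sample can be traced step by step. The Python sketch below reproduces the arithmetic from the example, using integer percentages so the rounding matches the figures in the text. It then adds one thing that is my own assumption and not part of the original example: a textbook worst-case 95% margin of error for an estimated proportion, 1.96 * sqrt(0.25 / n), which assumes a simple random sample. Treat it as a rough rule of thumb for how noisy each cell is, not as a full statistical treatment. All function names are hypothetical.

```python
import math


def pct_of(count, pct):
    """Integer percentage of a head count, rounded half up."""
    return (count * pct + 50) // 100


def flight_risk_cells(employees=10_000, turnover_pct=10, years=3,
                      high_performer_pct=50, gender_pct=50,
                      generation_pcts=None):
    """Walk the sample down from all employees to one gender/generation cell."""
    if generation_pcts is None:
        generation_pcts = {"Baby Boomers": 25, "Generation X/Y": 40, "Millennials": 35}
    leavers = pct_of(employees, turnover_pct) * years        # 1,000 per year over 3 years
    high_performers = pct_of(leavers, high_performer_pct)    # keep high performers only
    per_gender = pct_of(high_performers, gender_pct)         # assumes an even gender split
    cells = {gen: pct_of(per_gender, pct) for gen, pct in generation_pcts.items()}
    return leavers, high_performers, per_gender, cells


def worst_case_margin(n, z=1.96):
    """Rule-of-thumb 95% margin of error for a proportion estimated from n cases."""
    return z * math.sqrt(0.25 / n)


leavers, high_performers, per_gender, cells = flight_risk_cells()
print(f"Leavers over 3 years: {leavers}")
print(f"High performers:      {high_performers}")
print(f"Per gender:           {per_gender}")
for generation, n in cells.items():
    print(f"{generation}: n={n}, margin of error about +/-{worst_case_margin(n):.1%}")
```

Under these assumptions, a cell of 188 people carries an uncertainty of roughly 7 percentage points on any proportion estimated from it, and adding one more dimension (tenure, location, job family) shrinks the cells further.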
Predictive Analytics are what they “eat”, and in Human Resources that still amounts to limited and tightly controlled data (although this is changing rapidly).
Technology Outpacing Operations
Technology also quickly outpaces corporations’ ability to comprehend it and train people effectively on its use. Predictive Analytics reporting especially, due to its complex composition, is not something the average manager is trained to reverse engineer (without great effort) in order to challenge the underlying assumptions. Managers are therefore typically forced either to ignore the information or to accept it at face value. This is akin to relying on autopilot without being able to fly the plane manually. As referenced earlier, this is also what afflicted the financial services companies that fell victim to the subprime mortgage fallout. Many people at the top of these firms had very little insight into the details built into the technical risk models used to manage the trading of these instruments.
HR has historically been a far less quantitative business function than other areas such as Finance or Technology. The lack of insight into these models can erode competitive capabilities very quickly and by the time you begin to lose key resources it may be too late to address the operational causes.
Dehumanizing Human Resources
People need shortcuts in the modern world to cope with the vast amount of data we synthesize every second of every day. Visual reporting has come a long way, and when it is supported by details for further evaluation it is a powerful tool. However, there is a sinister side to visual reporting and data mining: it can (intentionally or unintentionally) influence business decisions where it should not.
When I was consulting for a client several years ago, there was a request for our team to produce a report to show a “day in the life” of users by name. We went to great lengths to produce a nicely formatted, visual report allowing managers to see what activities were being completed by which users and when. We were proud of this user-friendly development until we learned later that it was being used to help justify layoffs. The tool was not written to account for the many non-system-related tasks, nor did it by any measure show how productive an employee was in his or her role. The reality was that Management wanted to dehumanize difficult decisions. They wanted the job of laying people off to be made less personal and more quantifiable, whether it was accurate for this specific purpose or not.
Reporting software vendors are all too willing to oblige, and managers are all too willing to take the reporting at face value without proper due diligence. In Human Resources in particular, Predictive Analytics is inevitably turned into a “who-to-hire and who-to-fire” tool. Hiring and firing are a major part of the HR function’s responsibilities, and they should not be blindly automated. While the function of Finance is often to look quantitatively at decision points, HR should offer a counterbalance to this.
Making a Positive Impact
The objective is not to argue that Predictive Analytics cannot be used effectively or responsibly in HR with positive impacts. There are many beneficial ways to use Predictive Analytics in HR, including (among countless others):
- Determine which new Benefit plans employees are likely to leverage
- Estimate spikes in Absences or Overtime at certain points during the year
- Estimate correlated voluntary turnover as a function of involuntary workforce reductions (collateral turnover)
- Identify turnover risks
There is a great book that should be mandatory reading for Analytics professionals in general: Thinking, Fast and Slow by Daniel Kahneman. Although its focus is on the mind and its shortcomings in critical thinking, it also covers a lot of the risks alluded to above in data mining and analytics.