This article makes the case for data and explains why learning the language of data is important.
There is a gap between most people's statistical understanding and the practice of data science. The tech world has an unsavory drug culture-inspired name for individuals like you --- you are known as a "user." Then, to complete the analogy, this makes the tech companies the "dealers" or "pushers." The statistics taught in school are not as useful as they could be to guide individual "users" to make the most of data in the real world. This article provides intuition from data science and business practitioners. Many resources are furnished to help you dig deeper and make the language of data your Information Age superpower!
Yes, I have been on the dealer side. But now it is time to help the users fight back.
Data and algorithms are different.
I offer this disclaimer because data and algorithms are often confused. Data represents our past reality. Algorithms are used to transform data. They are different. Data has already happened. An algorithm is a tool for transforming data, intended to predict and impact the future. An organization's data-transforming algorithm may be helpful to you - especially when your interests are aligned with that algorithm's objective. More often today, however, an organization's data-transforming algorithm is tuned to optimize some other objective -- such as maximizing shareholder profit or filling political party coffers. Please see the appendix for more context.
But algorithms are not just for organizations trying to sell you stuff. You should identify, test, and periodically update an intuitive set of personal algorithms in order to make a lifetime of good decisions. “Intuitive algorithms” are a personal set of rules you use to transform your data. Your intuitive algorithms may be informal or, as is increasingly necessary today, enhanced with the help of personal decision tools. Together, we will build an intuitive understanding of data in the service of building your personal algorithms. Our focus is on using the statistical moments as the bedrock for that data understanding. During our data exploration, Bayesian inference and choice architecture tools like Definitive Choice will be introduced. Choice architecture is a helpful tool for implementing your personal algorithms.
Choice Architecture and personal algorithms.
Behavioral economist and Nobel laureate Richard Thaler said:
“Just as no building lacks an architecture, so no choice lacks a context.”
A core behavioral economics idea is that all environments in which we need to make a choice have a default setting. There is no "Neutral Choice" environment. For example, a company may provide 300 mutual funds and investment strategies to assist people in making a retirement plan decision. Their rationale for the volume of choices is that the company does not want to influence the employee with a particular retirement strategy. They want to ensure the employee's choice occurs from a wide array of possible alternatives.
But let’s look at it from the employee’s standpoint. This choice environment, with its high volume of complicated-looking alternatives, seems noisy and overwhelming. In fact, research shows this sort of choice environment discourages selecting ANY retirement plan. A typical employee narrative may be: "300 choices!? Wow, this looks hard! Plus, I have so much to do to get onboarded and productive in my new company. I will wait until later to make a decision." ... and then - later never comes.
A complicated, overwhelming choice environment causes savings rates to be lower than they otherwise could have been. A compounding factor is that, traditionally, the default choice provided by the employer is for the employee NOT to participate in the retirement program. This means that if the employee does not complete some HR form with a bunch of complicated choices, then they will not save for their own retirement. Thus, it is not surprising that a difficult choice that does not have to be made is often not made.
Regarding company incentives, the company will usually match employee contributions to the retirement plan. So if the employee does not participate, the company does not need to contribute. An employee not participating reduces the company's retirement expense. Thus, the unused match drops to the bottom line and is made available to the equity owners. A company's default choice environment is going to be a function of its complex motivations and self-interests. As we discuss in the appendix, the employee is only one of four stakeholder groups whose benefits a company needs to balance.
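To make the cost of the non-participation default concrete, here is a minimal sketch of the forgone employer match. All inputs are hypothetical assumptions for illustration only, not figures from any particular plan.

```python
# Minimal sketch: future value of an employer match that is never collected
# because the employee keeps the "do not participate" default.
# Salary, match rate, return, and horizon are hypothetical assumptions.
def forgone_match_value(salary=60_000, match_rate=0.04, annual_return=0.06, years=30):
    balance = 0.0
    for _ in range(years):
        # Grow last year's balance, then add this year's unclaimed match
        balance = balance * (1 + annual_return) + salary * match_rate
    return balance

print(f"Hypothetical forgone match after 30 years: ${forgone_match_value():,.0f}")
```

Even under these modest, illustrative assumptions, the unclaimed match compounds to a six-figure sum - which is why the default setting matters so much.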
So, choice architecture is essential for building your personal algorithm. A company's choice architecture is likely motivated by objectives NOT necessarily aligned with your welfare. As such, you should augment the company's choice architecture with a choice architecture of your own!
Data is the foundation.
On the way to implementing or updating your personal algorithms, we must begin by building your data bridge foundation. Personal algorithms are greatly improved when the owner of those algorithms has a solid understanding of data and statistics.
Motivation connects our past data
to our algorithmic-influenced future
About the author: Jeff Hulett is a career banker, data scientist, behavioral economist, and choice architect. Jeff has held banking and consulting leadership roles at Wells Fargo, Citibank, KPMG, and IBM. Today, Jeff is an executive with the Definitive Companies. He teaches personal finance at James Madison University and provides personal finance seminars. Check out his new book -- Making Choices, Making Money: Your Guide to Making Confident Financial Decisions -- at jeffhulett.com.
In my undergraduate personal finance class, part of the curriculum is to help students understand the interaction of data, the power of algorithms, and how to leverage or overcome them with a robust decision process.
From data scarcity to data abundance
In the last half of the 20th century, the world shifted from the industrial era to the information era. The changing of eras is very subtle. For those of us who lived through the era change, it is not like there was some official government notice or a “Welcome to the information era” party to usher in the new era. It just slowly happened – like the “boil the frog” parable - as innovation accelerates and our cultures adapt. Era changeovers are very backward-looking. It is more like a historian observing that so much had changed that they decided to mark the late 20th century as the start of the information era.
This big change requires people to rethink their relationship with data, beliefs, and decision-making. Prior to the information age, data was scarce. Our mindset evolved over many millennia to best handle data scarcity. In just the last few decades, the information age has required us to flip our mindset 180 degrees. Today, a data abundance mindset is necessary for success. Our genome WILL catch up some day… perhaps in a thousand or more years, as evolution does its inevitable job. Until then, we need to train our brains to handle data abundance. The objective of this article is to make the case for how best to handle data abundance. Cognitive gaps, such as the one created by the difference between our data scarcity-based genome and our data abundance-expecting culture, have only accelerated during the information era.
In the industrial era, computing power was needed but not yet widely available. As a result, math education taught people to do the work of computers. In many ways, people were the gap fillers for society's increasing computational needs. Our education system trained people to provide the needed computational power before powerful computers and incredible data bandwidth became available.
Over time, digital data storage increased. However, even during the industrial era, those data stores took effort to locate. Data was often available only to those with a need to know or those willing to pay for access. The average person during the industrial era did not regularly interact with data beyond what they observed in their local, analog life.
The information era is different. Today, powerful computers exist and computing power is both ubiquitous and inexpensive. Digital data stores are no longer like islands with vast oceans around them for protection. Data stores are now among easy-to-access cloud networks. Also, many consumers are willing to trade personal data for some financial gain or entertainment. While this attitude is subject to change, this trade seems to be working for both the consumers and those companies providing the incentives. Data abundance is the defining characteristic of today's information era. Success comes from understanding your essential data and leveraging that data with available computing technology.
See: A Content Creator Investment Thesis - How Disruption, AI, and Growth Create Opportunity. This article provides background for why people are willing to give up their data to the social media platforms.
For most people, today's challenge is less about learning to do the work of a computer. Today's challenge concerns using abundant data and leveraging technology to serve human-centered decisions. Our formal math education systems have been slow to change and tend to favor industrial era computation needs over information era data usage. [i] This is unfortunate, but it only emphasizes the need to build and practice your statistical understanding even if you did not learn it in your formal education.
The big change – From data scarcity to data abundance
In the data scarcity era, the most challenging part of a decision was collecting data. The data was difficult to track down. People were like data foragers, filling a basket with the few pieces of difficult-to-obtain data they needed for a decision. Since there was not much data, it was relatively easy to weigh the data and decide once it was located.
Data abundance has changed our relationship with data 180 degrees in just the last few decades. Consider your smartphone. It is like the end of a data firehose. Once the smartphone is opened, potentially millions of pieces of data come spewing out. And it is not just smartphones; data is everywhere. But it is not just the volume of data, it is also the motivation of the data-focused firms. Their data usage has a purpose, and that purpose is probably not your welfare.
"The best minds of my generation are thinking about how to make people click ads. That sucks." - Jeff Hammerbacher, a former Facebook data leader.
The challenge is no longer foraging for data. Our neurobiology, as tuned by evolution, is still calibrated to the data scarcity world. It is like no one told our brains that how we make decisions is dramatically different today. The challenge is now being clear about which of the overwhelming flood of data is actually needed. The challenge is now to curate data, subtract the unneeded data, and use the best decision process. Unfortunately, the education curriculum often teaches students as if we are still in the data scarcity world.
Economics teaches us that scarcity creates value. So, if data is abundant, what is it that creates value? In the information era, it is scarce human attention that creates value for companies trading in data abundance.
For a "Go West, Young Man" decision made during the 1800s as compared to a similar decision today, please see the article:
Our past reality is diverse
Our world can be interpreted through data. After all, data helps people form and update their beliefs. Often, our family of origin and communities help people form their initial beliefs, especially when they are young. This makes statistics the language of interpreting our past reality in the service of updating those beliefs. Like any other language, the language of statistics has grammar rules. Think of statistical moments as the grammar for interpreting our past realities. The better we understand the grammar rules, the better we can:
Learn from our past reality,
Update our beliefs, and
Make confidence-inspired decisions for our future.
‘Past reality’ may be a nanosecond ago, which is as long as it takes for the light of the present to reach our eyes. Alternatively, ‘past reality’ could be what we learned from our distant ancestors. A group of people is known as a population. Populations are mostly described by diverse distributions. A distribution describes the unique factors of a population and how often those unique factors occur. How often a unique factor occurs relative to the total is also described as a probability. Understanding the probabilities based on your past reality helps you infer future outcomes. While people may share some similarities, we also share incredible uniqueness. Understanding that uniqueness is at the core of statistics and of helping us make good decisions.
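As a minimal sketch of those basics (the data below is hypothetical), a distribution is simply a count of how often each unique factor appears in a population, and dividing each count by the total turns the distribution into probabilities:

```python
from collections import Counter

# Hypothetical sample of one population characteristic (eye color)
population = ["brown", "blue", "brown", "green", "brown", "blue", "hazel", "brown"]

counts = Counter(population)                     # the distribution: factor -> frequency
total = sum(counts.values())
probabilities = {factor: n / total for factor, n in counts.items()}  # relative frequency

for factor, p in probabilities.items():
    print(f"{factor}: {p:.2f}")
```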
Diversity goes beyond typical characteristics, like gender, race, and eye color. Even more impactful is our diverse behavior generated by the uncertainty we face in our lives. Those uncertainty characteristics include:
a) the incomplete and imperfect information impacting most situations,
b) the dynamic, interrelated nature of many situations, and
c) the unseen neurobiological uniqueness we each possess.
This means rationality itself has been redefined. There was once a belief that rationality could be robotically assigned as a single point upon which all people converge. Instead, today, rationality is better understood through the eyes of the diverse beholder. The same individual is often diverse across different situations because of uncertainty, framing, and anchors. This means the “you” of one situation is often different than the “you” of another situation because of our state of mind at the time the situation is experienced and how seemingly similar situations inevitably differ. Certainly, the different "us" of the same situation and time are also divergent, owing to individual neurodiversity.
This graphic is an excerpt from the article: Becoming Behavioral Economics — How this growing social science is impacting the world
Our hunt is to understand the population by learning of its past reality. But rarely can data be gathered on the entire population. More often, we must rely on samples to make an inference about the population.
Tricky samples and cognitive bias
Owing to situational uncertainty, framing, and anchors, samples can be tricky. The sample data from others in the population may be challenging to interpret. But even more troublesome, our own brains may play tricks on us. These tricks have grown in significance because of how the information era has evolved. These tricks may lead us to conclude the sample data we used to confirm a belief is representative and appropriate for making an inference. It takes careful inspection to guard against these tricks, known as confirmation bias. Next is a typical decision narrative, descriptive of the environment that leads to confirmation bias and a less-than-accurate decision:
This narrative is typical of how people experience their decision environment and motivations. The challenge is that the past outcome is a single observation in the total population. Your sample size of one is likely too small to make a robust inference. To be clear, this does NOT mean your past experience has no decision value... of course it does. However, blindly following our past experiences as a guide to the future may exclude other past realities that could help inform our decisions.
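Here is a small, simulated illustration of why a sample of one is fragile. The population parameters are made up; the point is that the wobble in a sample-based estimate shrinks roughly with the square root of the sample size, so a single observation is far noisier than a wider sample.

```python
import random
import statistics

random.seed(42)
TRUE_MEAN, TRUE_SD = 100, 15   # hypothetical population parameters

def sample_mean(n):
    """Mean of one random sample of size n drawn from the hypothetical population."""
    return statistics.fmean(random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n))

for n in (1, 10, 100):
    # How much the estimate varies across many repeated samples of size n
    estimates = [sample_mean(n) for _ in range(2000)]
    print(f"sample size {n:>3}: typical estimation error ~ {statistics.stdev(estimates):.1f}")
```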
Robyn Dawes (1936-2010) was a psychology researcher and professor. He formerly taught and researched at the University of Oregon and Carnegie Mellon University. Dr. Dawes said:
"(One should have) a healthy skepticism about 'learning from experience.' In fact, what we often must do is to learn how to avoid learning from experience."
Properly understanding your past reality in the present decision context is doable with the appropriate decision process. Part of being a good data explorer is using a belief-updating process including a suitable integration of our and others' past reality. A proper decision process helps you avoid confirmation bias and achieve conviction in your decision confidence.
Think of confirmation bias as a mental shortcut gone bad. Most mental shortcuts provide effective or at least neutral heuristic-based signals. But confirmation bias occurs when a mental shortcut leads us to make a poor decision. As the next graphic illustrates, confirmation bias occurs when only a subset of evidence is used to make a decision. While the current set of information may be convenient and apparently confirms a previous belief, the decision-maker ignores a fuller set of data that may be contrary to the existing belief. This kind of cherry-picking bias leads to a reasoning error called an error of omission. Errors of omission are tricky because technically the subset of information is not wrong, it is simply incomplete to draw the appropriate conclusion.
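A tiny, made-up numerical illustration of an error of omission: every cherry-picked value below is individually accurate, yet the convenient subset supports a very different conclusion than the full data set.

```python
import statistics

# Hypothetical full set of customer ratings (1 to 5)
all_ratings = [5, 5, 4, 2, 1, 5, 2, 1, 3, 2]

# Confirmation bias: keep only the evidence that confirms the prior belief
confirming_only = [r for r in all_ratings if r >= 4]

print("Average of the full data set:      ", statistics.mean(all_ratings))      # 3.0
print("Average of the cherry-picked data: ", statistics.mean(confirming_only))  # 4.75
```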
A politician example for reasoning errors: Fact-checking is often done to detect incorrect statements of the data the politician provides. A false statement is also known as an error of commission. However, the challenge is not necessarily what the politician said, but what the politician did NOT say. Politicians regularly engage in providing incomplete fact sets. Errors of omission are a) different than their error of commission cousins and b) generally tolerated or not detected by the public. Politicians regularly and conveniently leave out data - an error of omission - when trying to sell a particular policy or campaign plank.
Could you imagine a politician saying, “Here are all the reasons why this is a great policy decision! But wait! Here are several other reasons that may make this policy decision risky and potentially not effective. There are many tradeoffs. The chance of success depends greatly on the complex and unknowable future!” A politician who honestly presented all the facts and tradeoffs necessary to make a great decision would likely struggle to get elected. Political theater and a complete rendering of complex policy decisions are very different.
It is not clear whether the politician is selfishly motivated to commit errors of omission, as part of a goal to grow their power base. Alternatively, those errors may be selflessly motivated, recognizing that most people need help clarifying complex situations. It is likely some of both. However, regardless of the politician's motivation, errors of omission are rampant.
Bertrand Russell (1872-1970) - the late, great mathematician and philosopher - offered a timeless aphorism that reminds us of the politician's reasoning challenge:
"The whole problem with the world is that fools and fanatics are always so certain of themselves, and wiser people so full of doubts."
Being on the lookout for confirmation bias is essential for the successful data explorer. Confirmation bias is a type of cognitive trick called cognitive bias. All people are subject to cognitive biases. Mental shortcuts, also known as heuristics, are a helpful feature of the human species. Their related cognitive bias cousins are a heuristic byproduct and something we all share. The transition to the data-abundant and attention-scarce era has caused those byproduct cognitive biases to be more impactful upon decision-making.
A great challenge of cognitive biases is that they come from the emotional part of our brain, which lacks language. [iii] This means that, other than vague feelings, we have no signal to warn us when we are under the spell of a cognitive bias. In the earlier typical decision narrative, the pain or joy of those outcomes was remembered. The challenge is that those emotions carry no explicit weight as an input to the current decision. Also, that feeling has no way to integrate with all the other data you need to make the best decision. Confirmation bias occurs when we do not weigh our data signals - inclusive of emotion - correctly. Inaccurate weighting goes both ways — one may be under-confident or over-confident when interpreting emotion-based data.
In order to learn and infer from our past reality, one must either have a) an unbiased sample or b) at least understand the bias so inferential corrections can be made. Statistics help us use a wider set of data and properly integrate our own experience, including those vague feelings. This is in the service of taking a less biased, outside-in view to better understand our data.
Please see the following VidCast for more information on how confirmation bias leads to reasoning errors. This VidCast shows the slippery slope of how confirmation bias may devolve to cancel culture and allowing others to determine an individual’s self-worth. Political leaders may aspire to this level of followership. Social Media echo chambers are a hotbed for confirmation bias and cancel culture.
Being Bayesian and the statistical moments' map
In the next few paragraphs, Bayesian Inference will be introduced. Consider this a starting point. You will want to circle back to Reverend Bayes' work after walking through the statistical moments framework found in the remainder of this article. That circle-back resource is provided next.
The story of Thomas Bayes is remarkable. He lived over 250 years ago and created an approach to changing our minds. The Bayesian approach disaggregates the probabilistic steps to update our beliefs. Effectively changing our minds is a core human challenge - mostly unchanged by evolution. Belief updating is a challenge in today’s information-overloaded world. Bayes' treatise is a beacon for helping people change their minds when faced with uncertainty. Being a successful data explorer often requires us to actively manage our cognitive biases by curating and refining valid data and subtracting the data that is irrelevant or wrong. That is at the core of Bayes' work, called Bayesian inference.
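As a minimal sketch of the mechanics, with hypothetical numbers (not the example from the article linked below), Bayes' rule combines a prior belief with the likelihood of new evidence to produce an updated, posterior belief:

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability that a belief is true, given one new piece of evidence."""
    numerator = prior * likelihood_if_true
    denominator = numerator + (1 - prior) * likelihood_if_false
    return numerator / denominator

# Hypothetical example: belief = "this job change will make me happier."
posterior = bayes_update(
    prior=0.50,                   # initial belief before new information
    likelihood_if_true=0.80,      # chance of a glowing employee review if the belief is true
    likelihood_if_false=0.30,     # chance of the same review even if the belief is false
)
print(f"Updated belief: {posterior:.2f}")   # about 0.73
```

With these illustrative inputs, one piece of supporting evidence moves the belief from 50% to roughly 73% - more confident, but still far from certain.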
Please see the following article for the Bayesian inference approach and an example of using Bayesian inference to make a job change decision. Bayesian inference is a time-tested belief-updating approach VERY relevant to today’s world. Bayesian inference enables us to make good decisions by understanding our priors and appropriately using new information to update our beliefs. Bayesian inference helps us use our good judgment and overcome our cognitive biases. The Definitive Choice app is presented to implement a Bayesian approach to your day-to-day decision-making.
For an example of using Bayesian inference to help make a decision after a scary terrorist attack, please see the article:
As we discussed near the beginning, Bayesian Inference and Definitive Choice are types of personal algorithms. They implement a robust personal decision process as an outcome of being a good data explorer.
To summarize, the case for being a successful data explorer:
a) data exploration is important in the data abundant / attention scarcity era,
b) data exploration is tricky to manage,
c) data exploration requires a statistical understanding, and
d) data exploration benefits from a robust decision process to appropriately manage.
Next, the statistical moments are placed in the context of scientific inquiry. Mathematician William Byers describes science as a continuum. [ii] At one extreme is the science of certainty and at the other extreme is the science of wonder. The statistical moments' grammar rules fall along the science continuum. At the left end of the continuum, the initial statistical moments describe a more certain world. As we go along the continuum from left to right, risk and variability enter the world picture. Then, uncertainty and unknowable fat tails give way to wonder.
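As a minimal sketch of that continuum, here are the first four statistical moments computed on simulated, purely illustrative data: the mean sits toward the 'certainty' end, variance introduces risk, and skewness and kurtosis begin to describe the fat-tailed end where wonder takes over.

```python
import random
import statistics

random.seed(7)
data = [random.gauss(0, 1) for _ in range(10_000)]   # simulated, illustrative sample

mean = statistics.fmean(data)                                  # 1st moment: central tendency
variance = statistics.pvariance(data, mu=mean)                 # 2nd moment: spread / risk
skew = statistics.fmean((x - mean) ** 3 for x in data) / variance ** 1.5   # 3rd: asymmetry
kurt = statistics.fmean((x - mean) ** 4 for x in data) / variance ** 2     # 4th: tail weight

print(f"mean={mean:.2f}  variance={variance:.2f}  skew={skew:.2f}  kurtosis={kurt:.2f}")
```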
Just like grammar rules for language, statistical moments are essential for understanding and capturing the benefits accrued from our past reality. And, just like grammar rules for language, statistical moments take practice. This practice leads to an effective understanding of our past reality and helps statistical moments become a permanent feature of your information-era success. Data, as a representation of our past reality, contains nuance and exceptions that add context to that historical understanding. Also, there are even more grammar rules that help guide us in more unique circumstances. Building statistical intuition is your superpower in the Information Age.
Please see the following article to step deeper into being a data explorer. This article explores the statistical moments, proceeding from the science of certainty and concluding with the science of wonder.
Appendix - How well are algorithms aligned to you?
This appendix supports the "Data and algorithms are different" disclaimer found at the end of the introduction.
Generally, public companies have 4 major stakeholders or "bosses to please" and you - the customer - are only one of the bosses. Those stakeholders are:
The shareholders,
The customers (YOU),
The employees, and
The communities in which they work and serve.
Company management makes trade-off decisions to please the unique needs of these stakeholder groups. In general, available capital for these stakeholders is a zero-sum game. For example, if you give an employee a raise, these are funds that could have gone to shareholder profit or one of the other stakeholders.
This means the unweighted organizational investment and attention for your customer benefit is one in four, or 25%. The customer weight could certainly be below 25%, especially during earnings season. Objectively, given the competing interests and tradeoffs, this means a commercial organization's algorithms are not explicitly aligned with customer welfare. Often, the organization's misaligned algorithm behavior is obscured from view. This obscuring is often facilitated by the organization's marketing department. Why do you think Amazon's brand image is a happy smiley face? :) For more context on large consumer brands and their use of algorithms, please see the next article's section 5, called "Big consumer brands provide choice architecture designed for their own self-interests."
This article’s focus on data will help you make algorithms useful to you and identify those algorithms and organizations that are not as helpful. Understanding your data in the service of an effective decision process is the starting point for making data and algorithms useful.
While this article is focused on the data, please see the next article links for more context on algorithms:
An approach to determine algorithm and organizational alignment in the Information Age:
How credit and lending use color-blind algorithms but accelerate systemic bias found in the data:
Notes and a word about citations
Citations: There are many, many references supporting this article. Truly, the author stands on the shoulders of giants! This article is a summarization of the author's earlier articles. Many of the citations for this article are found in the linked supporting articles provided throughout.
[i] The challenge of how high school math is taught in the information age is well known. The good news is that it is recognized that the traditional, industrial age-based high school "math sandwich" of algebra, geometry, trigonometry, and calculus is not as relevant as it used to be, whereas information age-based data science and statistics have dramatically increased in relevance and necessity. The curriculum debate comes down to purpose and weight.
Purpose: If the purpose of high school is to a) prepare students for entrance to prestigious colleges requiring the math sandwich, then the math sandwich may be more relevant. If the purpose of high school is to b) provide general mathematical intuition to be successful in the information age, then the math sandwich is much less relevant. I argue the purpose of high school should be b), with perhaps an option to add a) for a small minority of students. Also, it is not clear whether going beyond a) should be taught in high school or be part of the general college education curriculum or other post-secondary curriculum. Today, the math sandwich curriculum alone lacks relevance for most high schoolers. As many educators appreciate, anything that lacks relevance will likely not be learned.
Weight: Certainly, to be successful in statistics or data science, one must have a grounding in basic math. The reality is that high school has a fixed 8-semester time limit. (Education entrepreneurs like Sal Khan of Khan Academy argue against tying mastery to a fixed time period.) But, for now, let's assume the 'tyranny of the semester' must be obeyed. As such, the courses that are taught must be weighed within the fixed time budget. Then, the practical question is this: "If statistics and data science become required in high school, which course comes out?" I suggest the math sandwich curriculum be condensed to 4 to 5 semesters, with the information age curriculum emphasized in 3 to 4 semesters.
The tyranny of the semester can be overcome with education platforms like Khan Academy. Since the high school math curriculum increasingly lacks relevance, an enterprising learner or their family can take matters into their own hands. Use Khan Academy outside of regular class to learn the data science and statistics-related classes you actually need to be successful in the information era.
[ii] Byers, The Blind Spot: Science and the Crisis of Uncertainty, 2011
[iii] See Our Brain Model to explore 1) the parts of the brain lacking language, called the fast brain, and 2) people’s abilities to see through the big block.
1) The Fast Brain: The human ability to quickly process information through our emotions is anchored in the right hemispheric attention center of our brain. Please see the “The high emotion tag & low language case” for an example.
2) The Big Block: The human ability to forecast the future based on past inputs is anchored in the left hemispheric attention center of our brain. Please see the “The low emotion tag & high language case” for an example.
Hulett, Our Brain Model, The Curiosity Vine, 2020