In Math We Trust

Summary:
It has become fashionable to use Big Data to solve pressing global issues (pandemics, water shortages, poaching, voting, etc.). Data is looked upon as an unalloyed blessing, and more of it, according to this formulation, can only be for the better. That rosy view discounts the faulty assumptions that Big Data operates under. Because Big Data relies on mathematics, people assume it is value neutral and fail to hold companies that use Big Data accountable for the very real negative consequences for consumers and subjects. This book takes a critical look at how Big Data products inflict serious harm on a large section of the population. The author chalks it up to 3 characteristics of these models - Opacity (details of the algorithms underlying these models are fiercely protected by companies as proprietary), Scale (these algorithms process data about massive numbers of people) and Damage (errors in these models have real life negative consequences). She calls them Weapons of Math Destruction - WMDs. The book goes through different sectors - high finance, teacher evaluations in schools, college ranking systems, courts, employment (applying for a job, scheduling used in jobs, wellness benefits offered by employers) and elections.
India has rolled out Aadhaar on a massive scale as an identification mechanism for its citizens. One of its purported benefits is to plug leaks in the delivery of government subsidies. Indians willingly handed over their private details to the government, responding to exhortations about Aadhaar being in the national interest. Like any other massive data set of personal information, the initiative faces routine challenges of hacking and security breaches. The more intriguing aspect of Aadhaar has been the attempt to leverage its identity validation mechanism for accessing private services in India - Internet connections, SIM card approval, etc. India's Supreme Court invalidated the push to make Aadhaar mandatory for consuming private products and services and narrowed its scope to its original intent. Since poor people are the primary beneficiaries of subsidies provided by the Indian government, the entrenchment of Aadhaar increases their burden when accessing public services (gas, PDS, etc.). Time will tell whether private companies will use Aadhaar to build value added models that offer identity validation on their own (separate from the Indian government) and what the ramifications of those models will be.
Analysis:
While Big Data is portrayed as a solver of many of the pressing problems across the world, it also plays a huge part in fueling them. Some of the most devastating crises to hit America in recent years - the housing crisis, for example - have been driven by math nerds wielding their fearsome knowledge for immense profit. As information about people has proliferated, mathematics and statistics have provided the tools to analyze consumer desires and spending power. They are also used to evaluate the trustworthiness and creditworthiness of people (with creditworthiness sometimes becoming a proxy for trustworthiness). In addition to private companies, countries and their governments have hitched themselves to the Big Data bandwagon. Big Data processes look to the past as a road map for the future. For people at the top of a society (in wealth or political power), those processes serve to solidify their perch by improving their life chances. The choices undergirding mathematical models in Big Data are driven by human assumptions. In this manner, prejudice and bias are systematically built into Big Data systems. When Big Data systems remain opaque, are massively scalable and cause real life damage to people, they are classified as Weapons of Math Destruction - WMDs. Mathematical models are simplifications of complex processes. As a result, there will always be mistakes. Simplicity in models, by itself, is not a disqualifying factor. In certain situations (for example, using smoke to detect fire in a house), simpler models are sufficient.
The fidelity of a mathematical model lies in its ability to use feedback to resolve any discrepancies between its results and the observed reality. Companies like Google or Amazon use statistics to fine tune the layout of their pages and increase traffic to their sites. They do so by continuously testing user behavior on their websites while tweaking different variables (different colored buttons, the positioning of those buttons, etc.). Baseball is an example of mathematical modeling and Big Data systems done the right way. The variables governing the results of a game and individual performances in it are clearly defined and available to anyone (Transparency). Since baseball data is updated constantly, it provides a large enough data set for analysis (Scale). Statisticians are able to run a multitude of scenarios looking for optimal combinations. Mathematical modeling in baseball analysis also uses the actual attributes of a baseball game instead of proxies. Proxies, or stand ins, are usually used when the actual data underlying an attribute is unavailable or erroneous. An example would be a test score standing in for a student's potential. Statisticians compare the results of their computations with the actual results of baseball games, thereby putting in place a very strong feedback loop. When the predictions (for example, that a particular pitcher will throw well against a particular batter) do not match the actual results, statisticians can go back and tweak their models. This allows baseball models to be fine tuned, increasing the probability that their predictions will match reality.
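The feedback loop described above can be made concrete with a small sketch. This is only an illustration under stated assumptions, not how any baseball team actually builds its models: the function names, the learning rate and the idea of tracking a single probability are all hypothetical. It shows a prediction being nudged toward each observed outcome so that the model drifts toward reality over time.

```python
def update_estimate(estimate: float, outcome: int, learning_rate: float = 0.1) -> float:
    """Nudge a predicted probability toward an observed outcome.

    outcome is 1 if the predicted event happened (e.g. the batter got a hit)
    and 0 if it did not. learning_rate is a hypothetical tuning knob.
    """
    return estimate + learning_rate * (outcome - estimate)


def run_feedback_loop(initial_estimate: float, outcomes: list[int],
                      learning_rate: float = 0.1) -> list[float]:
    """Return the estimate after each observed outcome, starting with the initial one."""
    estimate = initial_estimate
    history = [estimate]
    for outcome in outcomes:
        # Compare the prediction with what actually happened, then correct it.
        estimate = update_estimate(estimate, outcome, learning_rate)
        history.append(estimate)
    return history
```

A model without this loop, like the teacher evaluation and hiring systems discussed later, would keep returning its initial estimate no matter what the outcomes turned out to be.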
The Achilles heel of the mathematical models widely used by the financial industry before the 2008 recession was the assumption that past history is a reliable predictor of future behavior. The models operated on the assumption that quants at all the hedge funds and banks knew the underlying risk of the different assets they were trading (opaqueness). They also assumed that not every borrower would default at the same time. As a result, they expected defaulters to be balanced out by people who would continue to pay their mortgages like clockwork. When both these assumptions fell flat, the resulting mayhem in mortgage backed securities caused extensive damage to people by sinking the value of their homes. For all the brilliance in slicing, dicing and packaging mortgages into exotic securities, the mathematical models could not untangle those securities to isolate the messy mortgages (for the purpose of insulating the really bad ones from the rest). That work had to be done painstakingly by humans. The use of mathematical models and Big Data started in finance and has now spread to other domains. Big Data initiatives across all these domains gobble up the same pool of talent from elite universities (MIT, Harvard, Stanford), people who have been conditioned to look towards external metrics (SAT scores, college admissions) all their lives. Data scientists look at the amount of revenue they generate for their companies and use it as a proxy to assure themselves that their life is on the right track.
Teacher evaluation models using Big Data are marketed as a surefire way to improve educational outcomes in US schools. Supporters argue that these models succeed by getting rid of ineffective teachers and retaining good ones, thereby improving the educational outcomes of students. These models treat students' test scores as a reflection of a teacher's teaching ability, discounting other factors like the teacher's level of engagement with students, the socio economic status of students, etc. Big Data systems in teacher evaluations suffer from the size of the data set. While the number of users clicking on a Google website is in the millions, the number of students contributing to a teacher's evaluation is in the tens. The negative impact of the minuscule data set is further exacerbated by a lack of feedback. For example, students provide feedback about a teacher, and when the final ranking of the teacher contradicts that feedback, the models are not tweaked to account for the discrepancy. The rush to introduce Big Data systems in teacher evaluations has led schools to manipulate students' test scores to come out ahead. When the mathematical models and Big Data systems come up with a low score for a teacher, action is taken against that teacher. To overturn the verdict, the teacher needs airtight evidence. The mathematical models and Big Data systems do not have to adhere to the same iron clad standards when coming up with the original score. In the private sector, Big Data systems are used to identify ways to make a profit. As long as the Big Data system brings in revenue (irrespective of the negative effects on the borrower/consumer/subject), the company assures itself that the system works.
Data scientists in charge of putting together the mathematical model push aside any erroneous results as rare exceptions rather than treating them as an opportunity to incorporate a feedback loop that would provide better results in the next iteration of the model. The mathematical models used for teacher evaluations trade fairness for efficiency. While they help administrators make decisions, those decisions have a high probability of turning out wrong.
College admissions in the US are a rat race, with prospective students using the college rankings published by US News and World Report to narrow down their choices. Colleges try to manipulate their position in these rankings so they can attract more students. US News and World Report was a struggling magazine in 1983 when it came up with the idea of ranking all the colleges and universities across the US. In the beginning, it ranked them based on the results of a survey sent to school presidents. When the rankings came out, school administrators started complaining that their school deserved a higher place in the list. To counter that complaint, the magazine picked a set of proxies and came up with a mathematical model in 1988. The proxies included SAT scores, student teacher ratios, acceptance rates, the percentage of incoming freshmen who made it to sophomore year, the percentage of those who graduated and the percentage of living alumni who contributed to their alma mater. These proxies made up 75% of the ranking, while the remaining 25% was made up of survey results from school administrators across the US. The model used proxies that reflected the elite universities of the time - Harvard, Stanford, Princeton, Yale. All these elite institutions had students with high SAT scores, high graduation rates and wealthy alumni. The rankings did not take into account the cost of education at these universities. Once the rankings became popular, they set in motion a vicious feedback loop - showing up in the lower part of the rankings seriously dented a college's reputation, as a result of which student applications dried up and professors stayed away from the school. Alumni stopped their contributions and the college's ranking tanked further. To counter it, administrators started to focus obsessively on their school's ranking in US News and World Report.
Some colleges and universities, including Bucknell University and Claremont McKenna College, sent false test scores for their students to improve their ranking. Others, like Baylor University, paid students to retake their SATs so they would get a higher score. US News and World Report defended itself by saying that the college rankings forced colleges and universities to set meaningful goals and focus on hitting them. Because US News and World Report used proxies in its model, colleges and universities were able to manipulate it to their heart's content. Colleges and universities across the US have invested in things far removed from educational quality. As an example, TCU in 2009 spent hundreds of millions of dollars to improve its fund raising (which also happened to be one of the proxies at that time). It succeeded in moving up the rankings and attracted a higher share of academically inclined students (going by higher SAT scores). Because the cost of education was not included in the criteria for the mathematical model, colleges and universities figured out that they could invest in new stadium facilities, luxury dormitories and glass enclosed student centers and pass on the costs to students in the form of tuition and fees.
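To see why a proxy based ranking is so easy to game, here is a minimal sketch of a weighted score using the 75%/25% split described above. Everything else is a hypothetical simplification: the real model combines many separate proxies with its own weights, and the class, field names and 0-100 scales below are invented for illustration.

```python
from dataclasses import dataclass

PROXY_WEIGHT = 0.75   # share of the ranking driven by proxies, per the summary above
SURVEY_WEIGHT = 0.25  # share driven by the administrator survey


@dataclass
class College:
    name: str
    proxy_score: float   # hypothetical 0-100 composite of SAT scores, graduation rate, alumni giving, ...
    survey_score: float  # hypothetical 0-100 reputation score from the survey
    tuition: float       # recorded here, but deliberately NOT used in the score


def composite_score(college: College) -> float:
    """Weighted score; note that tuition never enters the calculation."""
    return PROXY_WEIGHT * college.proxy_score + SURVEY_WEIGHT * college.survey_score


def rank(colleges: list[College]) -> list[College]:
    """Order colleges from highest to lowest composite score."""
    return sorted(colleges, key=composite_score, reverse=True)
```

In this sketch a college that raises its proxy score from 70 to 78 while doubling its tuition moves ahead of a rival sitting at 80/60, because cost is simply not an input. That is the incentive the paragraph above describes.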
Big Data has now extended from ranking colleges to ranking the incoming student body at these universities. Predictive analytics packages are now sold to college and university administrators that allow them to rank incoming prospects by geography, gender, ethnicity, field of study, academic standing or any attribute they can come up with (if they cannot collect data on an attribute, they resort to proxies). These packages work under the assumption that the US News and World Report rankings are the final word and come up with ways to target students who will bump up the college's ranking. This has in turn convinced students and parents to invest huge sums of money in coaches and tutors to make sure their kid gets into the right college. Because the exact working of the algorithm is a mystery, parents and students take on large debt loads to gain admission to college and finish their degrees. When the Obama administration pushed the idea of changing the college ranking criteria to include affordability, there was immediate push back from college presidents. With no support forthcoming from the US Congress, the Obama administration released a massive amount of data on college admissions so that parents and students could analyze the success rates of different colleges themselves.
The obsession with college degrees has provided online education companies with a steady revenue flow. These companies use internet advertising and Big Data to target the most vulnerable and the poor and sell them on the benefits of an online degree. Even though the benefits of the online degree do not materialize (with very few exceptions), the student ends up in a sinkhole of debt. The target audience is usually immigrants and the poor, for whom private universities look more valuable and trustworthy than public universities. Poor immigrants come from countries where very little money is invested in public education, which makes private schools seem more worthy to them. Once the prospective student clicks on a targeted advertisement (or performs a Google search) for a for profit institution, a recruiter for the online university sells them on the worth of a degree at that institution. To ensure better than average response rates, the online university runs repeated tests on which kinds of targeted ads are most effective at attracting prospective students. Another method is to publish false or misleading claims about US government plans to radically change the education sector, leading people to click on them and divulge personal information that then allows a recruiter to contact them. As an example, students in the US use the College Board website to sign up for SAT tests and research higher education options. That website generally directs poorer students towards for profit universities. Once the student indicates interest in getting a degree from the for profit university, they are evaluated for their financial ability. For profit universities have a good idea of the credit worthiness of the student since the student has already provided their personal information.
The US government has a 90-10 rule, according to which colleges cannot get more than 90% of their funding from federal student aid (a facility set up by the US government to provide low interest loans to qualified students; at the end of the day it is still debt, albeit at a lower interest rate than private student loans). For profit universities convince their students to scrounge up a small amount towards tuition, which they then use as leverage to have the student apply for federal student aid. This allows the for profit university to pocket the tuition and fees (backed by the federal government) while repaying the debt becomes the responsibility of the student (student loans are among the few loans in the US that generally cannot be discharged even in personal bankruptcy). The sad part is that the community college system in the US is robust and provides the same level of education at a fraction of the cost of a for profit university.
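The arithmetic behind the 90-10 rule explains why a small out of pocket payment is so valuable to a for profit university. The sketch below is a simplification under stated assumptions: the rule is treated as a simple ratio over a college's total funding, and the function names are hypothetical rather than taken from any regulation.

```python
CAP_PERCENT = 90  # at most 90% of funding may come from federal student aid


def federal_share(federal_aid: float, other_revenue: float) -> float:
    """Fraction of total funding that comes from federal student aid."""
    total = federal_aid + other_revenue
    if total <= 0:
        raise ValueError("total funding must be positive")
    return federal_aid / total


def max_federal_aid(other_revenue: float) -> float:
    """Largest amount of federal aid that keeps the federal share at or below the cap."""
    # With a 90% cap, every non-federal dollar supports up to nine federal dollars.
    return other_revenue * CAP_PERCENT / (100 - CAP_PERCENT)
```

Under these assumptions, every $1,000 a student scrounges up lets the university collect up to $9,000 in federal aid, all of which the student later has to repay.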
Police departments routinely use predictive analytics and Big Data systems in their drive to reduce crime in their jurisdictions. The mathematical models used by police are good at predicting violent crimes like homicide, arson and assault. However, police have started using the same models to target lower level crimes like vagrancy, aggressive panhandling and trading in small quantities of drugs. That is the influence of the Broken Windows policing theory that was popularized by Bill Bratton in New York City in the 1990s. Even after that theory has been shown to be over hyped for its modest influence on crime reduction, it continues to be an article of faith among police and policymakers. Because data is collected on all types of crime, whether violent or non violent, police are now deployed en masse even for minor crimes. The original authors of the Broken Windows theory actually showed how police could be effective by adjusting their enforcement to local community standards. In their telling, broken windows policing would mean leaving addicts alone when they shoot up drugs sitting on steps but arresting them if they were sprawled on the sidewalk. The intent behind the theory was to uphold local community standards while being flexible enough to allow deviations from the norm. Collecting data on both major and minor crimes has increased the volume of data available for analysis. Because minor crimes make it easier for police to target poorer sections of society, police do not face any push back from the majority of voters. Using Big Data models allows a community to criminalize poverty while holding steadfastly to the belief that the tools it uses are scientific and fair. Legal standards in democracies are comfortable sacrificing efficiency for fairness, as expressed in the oft repeated dictum (Blackstone's formulation): 'It is better that ten guilty persons escape than that one innocent suffer'.
Big Data systems go in the opposite direction, sacrificing fairness for efficiency. Data scientists and public policy experts make a choice when they include minor crimes in their models but exclude white collar crimes from them. Because white collar crimes are usually committed by well off people (who also vote), the lack of fairness in Big Data systems does not affect them at all. That explains why they are usually in favor of using these systems for crime prevention in their neighborhoods.
When a candidate applies for a job, their evaluation is nowadays predominantly handled by Big Data systems. Recruiting software routinely uses personality tests to evaluate potential candidates. Because software cannot predict how a particular candidate will perform in a particular position at a particular company, it uses proxies to arrive at an educated guess. Research has shown that personality tests do a poor job of predicting the future performance of potential employees. Cognitive exams and reference checks have much higher success rates when it comes to predicting on the job performance. Defenders of personality tests do not provide enough information to independently evaluate their efficacy, which makes the tests opaque. In contrast, Big Data systems in baseball publish their criteria and statistics for everyone to consume. A prospective employee rejected by one company's personality test might become a superstar at another company. In that situation, the first company does not go back and look at its personality test to determine why the model filtered out the prospective employee. In contrast, baseball teams go back and tweak their models when their results are proven wrong. A team might trade a player based on conclusions from its model. If that player proceeds to shine for the other team, the team that traded the player away goes back to its model and looks at how future mistakes can be avoided. Companies argue that personality tests are a cost effective replacement for human resources professionals. However, the lack of feedback inherent in these models makes them rigid and inflexible. Companies generally view their employees as replaceable, and seen in that light, they portray Big Data systems as a success because they do not quantify the loss to the company from rejecting a potentially superior candidate.
These Big Data systems operate on a logical basis, which makes them ripe for manipulation by prospective employees - anyone who has puffed up their resume with trendy looking phrases has indulged in this. The belief of companies in their Big Data systems is very similar to the craze for phrenology in the 19th century, when scientists relied on the shape of a person's skull to divine their personality traits.
Once a person gets the job, their scheduling is handled by Big Data systems. Using mathematical models, companies look for the minimum number of employees in any block of time that will generate maximum profit. This has resulted in employees being shunted around without rhyme or reason. Employees, in turn, find it difficult to plan their personal lives (with damaging effects on their children), which has knock on effects on their productivity. One example is usually referred to as clopening - an employee who works the late night shift is asked to open the shop the next morning as well, for resource optimization reasons. Companies are always looking to cut costs and improve their bottom line so they can be rewarded by their shareholders. The mathematical models companies use for scheduling aim to maximize efficiency and profitability while ignoring the well being of employees. Research has shown that employees are most productive when they are allowed to take breaks and socialize with other team members on non work related activities. However, the mathematical models treat each employee as a singular unit whose primary purpose is to ensure profit maximization for the company. The software tools used to track employee behavior can sometimes be used to reduce employee headcount. Because these mathematical models do not take employee feedback into account, they usually live in a reality of their own, far removed from the actual situation on the ground.
Consumers are at the mercy of Big Data systems when it comes to their credit health. Before Big Data systems became popular, a local banker would use his/her own model, which operated on proxies like job situation, family dynamics, race and ethnic background, to evaluate the credit worthiness of a borrower. The downside of that approach was that minorities and women were routinely denied credit because they did not fit into the local banker's model. Earl Isaac and Bill Fair devised the FICO model to evaluate the credit worthiness of a borrower. The FICO model is currently used by the credit bureaus (Experian, TransUnion, Equifax) to come up with a credit score for a borrower. The FICO model is an example of Big Data systems and mathematical models operating correctly. Any errors in transactions reported to the credit bureaus can be challenged, and credit companies can see who defaults on their loans and check that against their credit scores (Feedback loop). The FICO website also describes ways in which borrowers can improve their credit score (Transparency). Recently, other companies have started building value added models that add new attributes to a borrower's credit score. An example is Neustar, which uses a proprietary model based on credit scores to identify the most profitable customers for call center companies. These new scoring systems use geo tags and click streams to augment the FICO model. These attributes are proxies and carry with them all the features that make such models into WMDs. The FICO model was a step forward from the local banker model in that it did not use proxies but used actual consumer transactions to come up with a score. The new scoring systems are a step back into the local banker era due to their reliance on proxies. While a person who has aced the hiring process waits for a job offer, companies use their FICO score to evaluate their credit worthiness.
This leads to poorer sections of the population facing a vicious cycle - a spotty credit record excludes them from a job while lack of a job forces them to indulge in risky financial behavior (payday loans) that further depresses their credit worthiness. Companies collect all kinds of data on consumer behavior and use it to build a new model for evaluating credit worthiness. An example is ZestFinance, a company that considers all data to be credit data. So, data on payment of phone bills can impact a borrower's credit score and the proprietary nature of these models makes any recourse for an affected borrower unlikely.
Insurance grew out of actuarial science, which was first developed in Europe when the middle class there became prosperous enough to plan for the future. A draper in London named John Graunt went through birth and death records and came up with the first study of mortality rates in 1662. The insurance industry does not attempt to predict the fate of each individual. It tries to predict the prevalence of tragic events (accidents, fires, deaths) among clusters of people. Big Data systems have allowed insurance providers to reduce the size of those clusters, leading them to come up with their own categories. Big Data systems for insurance use proxies to evaluate the risks inherent in a group of drivers. These proxies include demographic data and credit scores. They count for far more than the actual driving record of a driver. For example, drivers in Florida with a clean driving record and poor credit scores paid an average of $1,552 more than drivers with excellent credit scores and a drunk driving conviction. This approach aligns very well with the profit maximization drive of insurance companies - after all, someone with an excellent credit score will have enough money to dispute the claims of insurance companies. That avenue is foreclosed for someone with a poor credit score. Insurance companies have started widely marketing a small telemetry unit to be placed in cars, offering discounted insurance rates in return. These units collect data on different attributes of the driver - geography, frequency of driving, etc. This puts poorer drivers at a disadvantage - because of their chaotic work schedules, the frequency of their driving causes their insurance rates to increase. Insurance companies currently offer this as an opt in feature. As they collect more and more data, these units will become a standard offering in insurance products, with the discounts going away. At that point, drivers who do not want these units will be charged more for that privilege.
Wealthier drivers who can afford to pay higher premiums will choose to maintain their privacy by opting out of these units.
Facebook and Google are textbook examples of WMDs in operation. Both of them are opaque (Google refuses to divulge its search ranking algorithm, and Facebook refuses to divulge the algorithm behind its news feed, in the name of proprietary content), have a massive presence (Facebook has more than a billion users) and are damaging (a negative portrayal on Facebook has serious real life consequences, and a lower ranking in Google search results severely impacts a business). In the 2010 and 2012 US elections, Facebook performed experiments to improve voter turnout. Those experiments are considered to have increased voter turnout by 340,000, which in US elections is significant enough to swing the results. Elections are supposed to be free of private influence, and the Facebook experiments hastened that belief's demise. Russia took advantage of Facebook's gullibility by actively intervening in the 2016 US Presidential election. Users of Facebook perceive it to be a neutral broker. However, Facebook is a publicly owned corporation making specific choices that determine what shows up on a user's news feed. Because this book was written in 2016, before the election of Donald Trump as President, it takes a neutral stance towards the complicity of Google and Facebook in manipulation of the US election system. While Democrats are furious at Facebook for tilting the election towards Donald Trump, they are not blameless when it comes to using Big Data to their electoral benefit. The Obama and Hillary Clinton political organizations used Big Data to drive voter enthusiasm and turnout during their campaigns. Big Data systems allow politicians to target micro groups of voters and ply them with different messages. The extensive micro targeting indulged in by US political campaigns has splintered the consensus among the people. A democratic government works under the assumption that people with common grievances can band together and change their situation.
The author provides some solutions for countering the pernicious effects of Big Data on people's lives. She proposes that data scientists hold true to something similar to the Hippocratic oath for medical professionals, tailored for data science. She also points to the necessity of including fairness and the public good in mathematical models, in addition to efficiency and profit maximization. She suggests conducting algorithmic audits of Big Data models. Since companies think of their mathematical models as proprietary, she suggests looking at the outputs of the models to understand the criteria behind them. Big Data models in certain sectors should be portrayed as what they actually are - simplistic models that serve to confirm the inherent biases of decision makers. She insists that government has to play a strong regulatory role in ensuring algorithmic audits of Big Data models. The Americans with Disabilities Act (ADA) needs to be updated to prohibit Big Data personality tests and health and reputation scores. Where the health data of individuals is collected by employers, health apps and Big Data companies, HIPAA should be updated to extend protections to those individuals. The US should seriously consider adopting the data privacy rules laid down by the European Union (GDPR).
 
Other Books for Reference:
You Are Not a Gadget - Jaron Lanier
To Save Everything, Click Here: The Folly of Technological Solutionism - Evgeny Morozov
Dataclysm: Love, Sex, Race and Identity - What Our Online Lives Tell Us About Our Offline Selves - Christian Rudder
All You Can Pay: How Companies Use Our Data to Empty Our Wallets - Anna Bernasek, D.T. Mongan
