Increased datafication of debt raises ethical questions and calls for a new approach to regulating lending

Throughout history, society has debated the morality of debt. In ancient times, debt—borrowing from another on the promise of repayment—was viewed in many cultures as sinful, with lending at interest especially repugnant. The concern that borrowers would become overindebted and enslaved to lenders meant that debts were routinely forgiven. These concerns continue to influence perceptions of lending and the regulation of credit markets today. Consider the prohibition against charging interest in Islamic finance and interest rate caps on payday lenders—companies that offer high-cost, short-term loans. Likewise, proponents of debt forgiveness appeal in part to morality when they advocate relieving hard-up debtors of the burden of unsustainable debt.

“Datafied” lending

In much of this debate, the principal moral value at play is fairness; specifically, distributional fairness. Debt is deemed to be unfair and thus immoral because of the inequality of knowledge, wealth, and power between borrowers and lenders, which lenders can and often do exploit. Recent technological advances in lending have added new dimensions to debt’s morality. Notably, the datafication of consumer lending has amplified moral concerns about harm to individual privacy, autonomy, identity, and dignity. Datafication in this context describes the rapidly growing use of personal data for consumer credit decision-making—particularly “alternative” social and behavioral data, such as a person’s social media activity and mobile phone data—together with more sophisticated data-driven machine learning algorithms to analyze those data (Hurley and Adebayo 2017).

These techniques enable lenders to predict the behavior of consumers and shape their financial identities in much more granular ways than in the past. For example, it has been shown that borrowers who use iOS devices, have larger and more stable social networks, or spend more time scrolling through a lender’s terms and conditions are more likely to be creditworthy and repay debt on time (of course, many of these variables proxy for fundamental credit life-cycle variables, such as income). Innovation in datafied lending has been driven largely by fintech start-ups, particularly peer-to-peer lending platforms such as LendingClub and Zopa and Big Tech companies like Alibaba/Ant Group. However, alternative data and machine-learning techniques are increasingly being adopted by traditional bank lenders, as highlighted by recent surveys from the Bank of England and the Cambridge Centre for Alternative Finance.

These practices diminish consumers’ ability to craft their own identity as they become increasingly chained to their “data self,” or algorithmic identity. Moreover, the ubiquitous collection of data and surveillance that fuels datafied lending constrains consumers from acting freely lest their actions negatively affect their creditworthiness. And the commodification of certain types of personal data for lending decisions raises moral concern about harm to individual dignity. Is it moral for lenders to use highly intimate health and relationship data—for example, captured from social media and dating apps—to determine consumer creditworthiness? Consumers may willingly share their data in specific contexts and for specific purposes, such as to facilitate online dating and social interaction. However, this does not imply that they consent to the use of that information in new contexts and for different purposes, particularly commercial purposes such as credit scoring and marketing.

Datafication also amplifies existing concerns about fairness and inequality in consumer lending. Lenders are prone to abuse data-driven insights, for example, to target the most vulnerable consumers with unfavorable credit offers. Data-driven profiling of borrowers also facilitates more aggressive and intrusive debt-collection practices against the poor. And more accurate screening and price discrimination using alternative data and machine learning increase the cost of borrowing for consumers previously subsidized by hidden information (Fuster and others 2020).

In addition, increasingly data-driven, algorithmic lending could amplify unfairness as a result of racial and gender-based discrimination, as highlighted by the recent Apple Card debacle, when women were offered smaller lines of credit than men. In particular, biases and proxy variables in the data used to train machine-learning models could exacerbate indirect discrimination in lending against minority groups—particularly where the data reflect long-standing structural discrimination. Alternative data, such as social media data, are typically more feature-rich than financial credit data and thus embed more proxy variables for protected characteristics, such as race and gender. The limited interpretability of certain machine-learning methods (such as deep neural networks) could impede efforts to detect discrimination by proxy. Deploying these machine-learning models without rigorously testing their results, and without meaningful human oversight, therefore risks reinforcing social biases and historical patterns of unlawful discrimination, perpetuating the exclusion of less-advantaged and minority groups from consumer lending markets.
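One simple form that such testing of results could take is a comparison of approval rates across groups. The sketch below is illustrative only: the group labels, the toy decisions, and the 0.8 threshold are assumptions made for the example (the threshold is borrowed from the four-fifths rule of thumb used in US employment-selection guidance and is not a legal standard for lending). A low ratio is a signal for closer human review, not proof of discrimination.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Return the share of approved applications per group.

    `decisions` is an iterable of (group, approved) pairs, where
    `approved` is True or False.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {g: approved[g] / totals[g] for g in totals}


def adverse_impact_ratio(decisions, reference_group):
    """Approval rate of each group divided by the reference group's rate."""
    rates = approval_rates(decisions)
    ref = rates[reference_group]
    if ref == 0:
        raise ValueError("reference group has no approvals")
    return {g: r / ref for g, r in rates.items() if g != reference_group}


# Hypothetical toy data: group "A" is approved 8 times out of 10,
# group "B" 5 times out of 10.
toy_decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)

THRESHOLD = 0.8  # assumed review threshold, not a regulatory requirement

ratios = adverse_impact_ratio(toy_decisions, reference_group="A")
flagged = [g for g, r in ratios.items() if r < THRESHOLD]
print(flagged)  # ['B']
```

A check of this kind looks only at outcomes, so it can be run even when the underlying model is hard to interpret. It does not identify which input variables act as proxies for a protected characteristic.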

Yet the datafication of consumer lending could also uphold the morality of debt, by improving other dimensions of distributional fairness in consumer credit markets. Notably, more accurate credit assessment thanks to machine learning and alternative data in algorithmic credit scoring will improve access to credit, particularly for (creditworthy) “thin-file” and “no-file” consumers previously locked out of mainstream credit markets because of insufficient credit data, such as a credit history (Aggarwal 2019). Estimates from Experian and the US Consumer Financial Protection Bureau suggest, respectively, that nearly 10 percent of the UK population, and nearly 15 percent of the US population, have thin files or no files (also described as “credit invisibles”) and lack access to affordable credit. In developing economies, this figure is several times greater. According to the World Bank Global Financial Inclusion Index, more than 90 percent of people living in south Asia and sub-Saharan Africa lack access to formal credit.

Given that these consumers are often the least-advantaged members of society, typically from ethnic minority and lower-income groups, improving their access to credit supports financial inclusion and enhances fairness—as well as efficiency—in consumer lending markets. Datafied, algorithmic lending also stands to support fairness by reducing more visceral forms of direct discrimination in lending—for example, stemming from sexist or racist preferences of a (human) loan officer (Bartlett and others 2017). Moreover, better access to credit and the accompanying opportunities can enhance the autonomy and dignity of consumers.

More broadly, the digitalization and automation of lending stand to increase financial inclusion by reducing transaction costs and making it more feasible for lenders to extend small-value loans and reach consumers traditionally excluded from borrowing by their remote physical location (for example, a lack of bank branches in “banking deserts”). Data-driven technology also can support financial inclusion by improving consumer financial literacy and personal debt management. For example, automated saving and debt pay-down features of many fintech credit apps can help overcome some of the more common behavioral biases that undermine sound personal financial management.

Recasting regulation

The rise of machine learning and datafied lending renders the morality of debt much more nuanced. The Goldilocks challenge for regulators is to find the right balance between the benefits and harms of datafied lending. They must protect consumers from its greatest harms—in terms of privacy, unfair discrimination, and exploitation—while still capturing the key benefits, particularly improved access to credit and financial inclusion. However, existing regulatory frameworks governing consumer credit markets and datafied lending in places such as the United Kingdom, United States, and European Union do not strike the right balance. In particular, they do not sufficiently alleviate the privacy, autonomy, and dignity harms of datafied lending.
The prevailing approach to regulating consumer privacy in these jurisdictions is distinctly individualistic. It relies on consumers to consent to all aspects of data processing and to self-manage their privacy—for example, by exercising their right to access, correct, and erase their own data. However, this approach cannot protect consumers in ever-more-datafied consumer credit markets. These markets display steep asymmetries of information and power between borrowers and lenders, negative externalities related to data processing, and biases that impede consumer decision-making, such that individuals cannot on their own safeguard their privacy and autonomy.

In a new article in the Cambridge Law Journal, I recommend ways to address these inadequacies and close the privacy gap in consumer credit markets through substantive and institutional regulatory reforms (Aggarwal 2021). To begin with, a more top-down regulatory approach is needed. Firms should be subject to more rigorous obligations to justify the processing of personal data under the paradigm of datafied lending. This should include stricter ex ante restrictions on the types and granularity of (personal) data that can be used for credit decision-making. For example, the use of intimate, feature-rich data, such as social media data, should be explicitly prohibited, and anonymization of personal data should be the default.

Firms should, moreover, bear a higher burden of proof regarding the necessity and proportionality of processing personal data and thus their encroachment on consumer privacy. This should include stricter, ongoing model validation and data quality verification obligations, particularly for nonbank fintech lenders. For example, in the context of algorithmic credit scoring, lenders should be required to demonstrate that the processing of alternative data yields a sufficiently significant improvement in the accuracy of creditworthiness assessment.
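To make the idea of demonstrating an accuracy improvement concrete, the sketch below compares a baseline score with a score that also uses alternative data, measured on the same set of loans with known outcomes. Everything in it is an assumption made for illustration: the use of AUC as the accuracy measure, the toy labels and scores, and the `MIN_GAIN` threshold. A real validation exercise would use held-out data and account for statistical uncertainty.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    Equals the probability that a randomly chosen repaid loan (label 1)
    receives a higher score than a randomly chosen default (label 0),
    with ties counted as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one repaid and one defaulted loan")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy_gain(labels, baseline_scores, augmented_scores):
    """AUC of the augmented model minus AUC of the baseline model."""
    return auc(labels, augmented_scores) - auc(labels, baseline_scores)


# Hypothetical toy data: 1 = repaid, 0 = defaulted.
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]   # traditional data only
augmented_scores = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]  # plus alternative data

MIN_GAIN = 0.05  # assumed materiality threshold, chosen for illustration

gain = accuracy_gain(labels, baseline_scores, augmented_scores)
justified = gain >= MIN_GAIN
print(justified)  # True
```

Under a rule of this kind, the additional intrusion on privacy would be permitted only when `justified` is true. Where to set the threshold, and which accuracy measure to use, are policy choices that the code does not settle.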

These reforms should be accompanied by changes to the regulatory architecture to improve the enforcement of consumer privacy protection in consumer credit markets. In particular, regulatory agencies responsible for consumer financial protection, such as the UK Financial Conduct Authority, should have expanded authority to enforce privacy and data protection in consumer credit markets. I argue that data protection is consumer financial protection. Given their expertise and experience working with consumer credit firms, sectoral agencies are in many ways better positioned than cross-sectoral data protection and consumer protection agencies to enforce data protection in consumer financial markets. However, they should continue to collaborate with cross-sectoral regulators, such as the UK Information Commissioner’s Office, that have expertise in data protection regulation.

Of course, these reforms are not needed only for datafied consumer lending and its regulation. To truly safeguard the privacy of (credit) consumers, stricter limits on the processing of personal data are called for in all contexts, not only consumer credit markets, and on all actors in the development life cycle of consumer-facing information systems. Likewise, in an increasingly datafied economy, the optimal institutional arrangement for data protection regulation entails a greater role for sectoral regulators and deeper collaboration between sectoral and cross-sectoral regulators everywhere—not just in consumer credit markets.


NIKITA AGGARWAL is a research associate at the Oxford Internet Institute’s Digital Ethics Lab, University of Oxford.


Aggarwal, Nikita. 2019. “Machine Learning, Big Data and the Regulation of Consumer Credit Markets: The Case of Algorithmic Credit Scoring.” In Autonomous Systems and the Law, edited by N. Aggarwal, H. Eidenmüller, L. Enriques, J. Payne, and K. van Zwieten. Munich: C. H. Beck.

———. 2021. “The Norms of Algorithmic Credit Scoring.” Cambridge Law Journal.

Bartlett, Robert, Adair Morse, Richard Stanton, and Nancy Wallace. 2017. “Consumer-Lending Discrimination in the FinTech Era.” University of California, Berkeley, Public Law Research Paper.

Fuster, Andreas, Paul Goldsmith-Pinkham, Tarun Ramadorai, and Ansgar Walther. 2020. “Predictably Unequal? The Effects of Machine Learning on Credit Markets.”

Hurley, Mikella, and Julius Adebayo. 2017. “Credit Scoring in the Era of Big Data.” Yale Journal of Law and Technology 18 (1): 147–216.

Opinions expressed in articles and other materials are those of the authors; they do not necessarily represent the views of the IMF and its Executive Board, or IMF policy.