
Social protection systems provide critical support during crises, boost productivity and protect vulnerable populations. The COVID-19 pandemic has raised global extreme poverty for the first time in 20 years, making the need for effective social protection programs more urgent than ever. However, targeting eligible households in low- to middle-income countries is very difficult. This is because traditional administrative data such as tax records are often unavailable due to the high proportion of informal workers.
The paper, published by researchers at the University of California, Berkeley, and the World Bank, explores machine learning applied to non-traditional administrative data, such as call detail records (CDRs) from a major mobile operator in Afghanistan, as a promising solution. It does so in the context of a government anti-poverty program that targets the poorest households. CDRs contain information about phone numbers, communication patterns, contact networks, recharge patterns, and more.
The paper evaluates and compares three methods for correctly identifying ultra-poor households: a supervised machine learning model trained on CDR data, an asset-based wealth index, and a consumption measure. The latter two are commonly used as proxies for measuring poverty in low- and middle-income countries. The supervised model was trained on 797 behavioral metrics computed from the CDR data, using a gradient boosting model that outperformed other common machine learning algorithms. These metrics cover communication patterns, contact networks, spatial patterns, and recharge patterns. In addition, the paper examines the accuracy of a combined method that feeds all three of the above into a logistic regression to classify households as ultra-poor or non-ultra-poor. To assess the accuracy of each method, the study used ROC and precision-recall curves and calculated the standard deviation of each accuracy metric from 1,000 bootstrap samples.
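As a rough illustration of the evaluation step described above, the sketch below computes an AUC score and estimates its standard deviation from 1,000 bootstrap resamples. The toy labels and scores, the function names, and the choice to skip single-class resamples are all assumptions made for this example; they are not taken from the paper.

```python
import random
import statistics


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_sd(labels, scores, n_boot=1000, seed=0):
    """Standard deviation of the AUC across bootstrap resamples of households."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample contains one class only
        estimates.append(auc(ys, [scores[i] for i in idx]))
    return statistics.stdev(estimates)


# Hypothetical toy data: 1 = ultra-poor, 0 = not; higher score = predicted poorer.
labels = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7, 0.6, 0.35]

point_estimate = auc(labels, scores)
sd = bootstrap_sd(labels, scores)
print(f"AUC = {point_estimate:.3f}, bootstrap SD = {sd:.3f}")
```

With a real survey sample, `labels` would be the ground-truth ultra-poor status and `scores` the output of whichever targeting method is being assessed.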
The accuracy of the CDR-based method for identifying ultra-poor households was found to be comparable to that of the other two methods, achieving a precision and recall of 42% (versus 49% for the asset-based method and 45% for the consumption-based method). The trade-off between false positives and false negatives was assessed using ROC curves, and the area under the curve (AUC) scores were also comparable across methods, with the asset-based method slightly outperforming the consumption-based and CDR-based methods (AUC = 0.73, 0.71, and 0.68, respectively).
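A single figure can stand for both precision and recall when the evaluation uses a quota, that is, when the number of households targeted is set equal to the number that are truly ultra-poor. Whether the paper evaluates in exactly this way is an assumption here; the sketch below only shows why the two metrics coincide under such a quota, using hypothetical toy data.

```python
def precision_recall_at_quota(labels, scores):
    """Target the k highest-scoring households, where k = number of true positives.

    Because the number targeted equals the number truly ultra-poor, precision
    (true positives / targeted) and recall (true positives / truly ultra-poor)
    share the same denominator and are therefore equal.
    """
    k = sum(labels)
    if k == 0:
        raise ValueError("no ultra-poor households in the sample")
    ranked = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    targeted = ranked[:k]
    true_pos = sum(labels[i] for i in targeted)
    precision = true_pos / len(targeted)
    recall = true_pos / k
    return precision, recall


# Hypothetical toy data: 1 = ultra-poor, 0 = not; higher score = predicted poorer.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.3, 0.1, 0.4]

precision, recall = precision_recall_at_quota(labels, scores)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```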
The combined method, which used logistic regression to classify households as ultra-poor or non-ultra-poor from all three data sources, showed the most promising result with an AUC of 0.78, better than any method using only one or two data sources. However, collecting consumption data for a large population is impractical, so a combined method using only CDR and asset-based wealth data may be the most feasible option (AUC = 0.76).
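To make the idea of a combined method concrete, here is a minimal sketch that fits a logistic regression on three per-household scores using plain gradient descent. The toy scores, the learning rate, the number of epochs, and the function names are illustrative assumptions, and the model is fit and inspected on the same rows only for brevity; a real evaluation would use held-out households.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this household under the current parameters.
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that a household is ultra-poor."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical rows: (CDR score, asset score, consumption score), each scaled
# so that a higher value means "more likely ultra-poor".
X = [
    [0.8, 0.9, 0.7], [0.6, 0.7, 0.9], [0.7, 0.4, 0.6], [0.3, 0.8, 0.5],
    [0.5, 0.3, 0.4], [0.2, 0.4, 0.3], [0.4, 0.2, 0.1], [0.1, 0.3, 0.2],
    [0.6, 0.5, 0.2], [0.3, 0.1, 0.4],
]
y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

w, b = fit_logistic(X, y)
probs = [predict_proba(w, b, x) for x in X]
mean_pos = sum(p for p, yi in zip(probs, y) if yi == 1) / y.count(1)
mean_neg = sum(p for p, yi in zip(probs, y) if yi == 0) / y.count(0)
print(f"mean P(ultra-poor): positives {mean_pos:.2f}, negatives {mean_neg:.2f}")
```

Dropping the third column from each row gives the two-source variant that leaves out consumption data.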
Another important advantage of CDR-based targeting over currently used methods (proxy means testing, community-based targeting, or consumption-based targeting) is that it reduces both the time and the marginal cost of implementing a targeted program. For example, community-based targeting and proxy means testing are estimated to add $276,000 and $503,000, respectively, representing 2.18% and 3.97% of the total program budget, while the marginal cost of screening households with CDRs is negligible.
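The two cost figures above can be cross-checked with simple arithmetic: dividing each cost by its stated share of the budget should give roughly the same total. The sketch below does that; the dictionary keys and the helper name are labels invented for this example, and the implied total is derived from the quoted figures rather than stated in the article.

```python
def implied_total_budget(cost_usd, share_of_budget):
    """Total budget implied by a cost and the fraction of the budget it represents."""
    return cost_usd / share_of_budget


# (added cost in USD, share of total program budget) as quoted in the text.
targeting_costs = {
    "community_based": (276_000, 0.0218),
    "means_testing": (503_000, 0.0397),
}

implied = {
    name: implied_total_budget(cost, share)
    for name, (cost, share) in targeting_costs.items()
}

for name, total in implied.items():
    print(f"{name}: implied total budget of about ${total:,.0f}")
```

Both figures imply a total program budget of roughly $12.7 million, so the quoted costs and percentages are mutually consistent.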
Using CDR data for targeting raises ethical concerns and limitations that must be considered. First, the approach requires access to phone data, so targeting accuracy falls when that data is unavailable for some segments of the population (for example, people who do not own a phone, or whose provider does not grant access to the data). Second, CDR-based targeting involves access to sensitive personal information such as phone numbers and geolocation traces, requiring informed consent and clear privacy standards, which do not exist today. One potential way to mitigate privacy risks is data minimization, which limits the model to the features that pose the least privacy risk, though at the cost of less accurate targeting. Finally, using CDRs to determine program eligibility may create incentives for individuals to act strategically to game the system, for example by refraining from using their phones. Complex machine learning algorithms can reduce the scope for such manipulation, but societies often demand transparency in algorithmic decision-making, and black-box decisions are difficult to audit and explain.
In conclusion, the integration of machine learning and CDR data has the potential to revolutionize the targeting of economic interventions or aid programs by reducing costs and complementing existing survey-based methods. However, there are practical and ethical issues to consider, such as access to data, privacy issues, and potential data manipulation. It is essential to weigh these limitations against the potential benefits of CDR-based targeting in each particular context. As machine learning continues to evolve and shape our world, it is important to approach its applications thoughtfully and responsibly, in line with ethical standards, and prioritizing the well-being of individuals and communities.
Nathalie Crevoisier holds a Bachelor's and Master's degree in Physics from Imperial College London. She spent one year of her degree studying Applied Data Science, Machine Learning, and Internet Analytics at the Swiss Federal Institute of Technology in Lausanne (EPFL). During her studies she developed a strong interest in AI, and after graduating she joined Meta (formerly Facebook) as a data scientist. During her four-year tenure at the company, Nathalie worked on various teams, including Ads, Integrity, and Workplace, applying cutting-edge data science and ML tools to solve complex problems affecting billions of users. Seeking more independence and time to stay on top of the latest AI discoveries, she recently decided to transition to a freelance career.
