Karen Hao, author of the New York Times bestseller Empire of AI: Dreams and Nightmares in Sam Altman’s OpenAI, has spent years studying the human and environmental costs behind Silicon Valley’s rush to build out AI infrastructure and models. She was recently in India to attend Synapse, a society and technology conference, and spoke with Sujit John about the “imperial” nature of big tech and why India must forge its own AI path by focusing on small models and affordable infrastructure.
India is starting to see a massive ramp-up of AI data center infrastructure. How do you see it?
What I’ve seen in many countries that have this catch-up mentality is that they look at the Silicon Valley model and say, we need to catch up to it. And that in itself is what I criticize as imperialistic. You’re just taking an idea that came from the empire and adopting it as a template. Instead, we need to fundamentally rethink what AI should be for this country. I attended many sessions during the AI Summit that were about open-source models and small-scale models. For AI to work at scale in India, it can’t really be a large language model (LLM), because it needs to run very cheaply. It needs to be able to run on someone’s mobile device without the internet, so that a farmer can detect a disease in a crop, or a doctor in a rural area can actually use this kind of technology. If India were to think from scratch from that perspective, catching up in the AI era might look completely different. India already has all the necessary ingredients: a great talent base, infrastructure capabilities for small-scale models, and high-quality data to train these application-specific AI technologies. Then we wouldn’t need these big partnerships [with American companies] anymore, and we wouldn’t actually be surrendering our sovereignty.
Are any major countries adopting such a small-scale model approach? What is China doing?
China has very limited computing power, so they’re actually developing a lot of small-scale models. China also has a very different approach to AI development than Silicon Valley, which has a mindset of pushing technological progress just for the sake of technological progress. And now Silicon Valley is facing this problem of a lack of product-market fit, so they’re trying to find ways to convince people to buy their products. During the AI Summit, Brad Smith (President of Microsoft) gave a keynote address and used a very specific phrase: “Governments need to help create demand for our technology.” He actually said out loud something that is not usually said. Chinese companies have a different mindset. Their venture capital and investment models are very different, so they can’t afford to build technology that no one will use. VCs don’t want to wait years to get their money back. So they’re thinking more about what applications will meet users where they are. And in many cases, those applications don’t require the scale that American companies are building.
But they are also building large language models…
They’re building them too, but theirs are actually much cheaper to compute. This is what DeepSeek confirmed. DeepSeek has the same capabilities but is significantly cheaper to build and run. And that’s why so many companies in the U.S. that are consumers of AI technology rather than model developers are now using Chinese models instead of Silicon Valley models: the quality is the same or sometimes better, and the price is lower. I’m not saying everyone should use Chinese models, but this is an example of how countries and companies are actually fundamentally rethinking which models work for them. And I think India has the potential to do the same.
In India, the work of annotating data is often done by the very poor, primarily women, and tends to be presented as work that improves their living conditions. But you have criticized it, characterizing it as a form of “modern imperialism.”
What I show over and over again in the book is that as the technology industry’s desire for resources accelerates exponentially, where are they going to look for those resources? Whether it’s human resources, such as the labor needed to train the models, or physical resources, such as the minerals that are mined and the energy that is consumed, they always go to the poorest communities. India is a huge hub of labor supplying these companies. And the way they treat that labor is terrible. A new study published in the Guardian has found that women from some of India’s poorest regions are being forced to moderate pornographic and child sexual abuse content. We’re now talking about video generation models that can produce this kind of stuff, so companies that want to make sure their models don’t produce it are building content moderation filters that are trained by humans in these very poor contexts. And the women have protested, saying: this is affecting my well-being, my family’s well-being, and it’s destroying our community. They were never told that this is part of the job. They’re being told: your job is data annotation, and this is data annotation.
What is the answer to these content moderation problems?
The reason there is all this harmful content that needs to be filtered out is that these companies are training their models on the entire internet. That’s why the harmful content is there. If we’re talking about a very application-specific technology, such as helping farmers detect diseases in crops, why are we training on porn? It won’t help detect crop diseases. You don’t have to deal with content moderation at all, because you just take selected photos of different types of diseases in different types of crops and train the model on those. But companies are taking the opposite approach, scraping up everything and then hoping that the poorest communities become human shields for everyone else.
