Firefox uses AI to make browsing more accessible

It seems like every piece of software these days is powered by AI, offering customers features of questionable quality, practicality, and security. Mozilla and its Firefox browser are not immune to the widespread use of AI, which the company wants to implement to improve accessibility.

In a recent post on Mozilla Hacks, Tarek Ziadé explained how Firefox uses artificial intelligence to improve accessibility by providing AI-generated image captions for people who rely on assistive technologies like screen readers.

Image captions, or “alt text,” provide readers with necessary context. Unfortunately, many writers skip alt text, leaving almost half of images without a proper description. Recent advances in AI make it possible to run machine learning models locally and auto-generate captions without sending potentially sensitive information to remote servers.

Firefox 130 ships in the Nightly channel with a new feature in its PDF editor that uses a small open source Transformer-based machine learning model to generate alternative text. Mozilla claims the model describes images well without consuming a lot of resources, so Firefox users will be able to get image descriptions (in PDFs, for the first time) even on less powerful devices.

According to the blog post, the tiny model has over 200 million parameters, takes up less than 200MB of disk space, and generates alternative text in a matter of seconds. Its output is less detailed and accurate than that of heavyweight LLMs like GPT-4o, but the developers don't want to overwhelm users with too much information, so Firefox focuses on generating one-sentence descriptions like:

A group of people in an office are celebrating a birthday with a lit birthday cake in the foreground and a smiling woman in the background.

Using local models offers several benefits: increased privacy (images are not sent anywhere for processing), improved resource efficiency, greater transparency, reduced CO2 emissions (training large models generates a lot of carbon emissions), and frequent updates.

For more technical details, please see the official post.
