Radar Trends to Watch: July 2023 – O’Reilly



A surprising number of the AI entries are about generative models that don’t produce text or artwork; specifically, they produce human voices and music. Will voice be the next frontier for AI? Google’s AudioPaLM, which integrates speech recognition, speech synthesis, and language modeling, could point the way. There are also growing concerns about the consequences of training AI on AI-generated data. With less input from real humans, will “model collapse” lead to output that is mediocre at best?

AI

  • RoboCat is an AI model for controlling robots that learns how to learn. Unlike most robots, which are designed to perform a small number of tasks, RoboCat can learn new tasks after deployment, and the learning process speeds up as it learns more tasks.
  • AudioPaLM is Google’s new language model that combines speech generation, speech understanding, and natural language processing. This is a large language model that understands and generates speech.
  • Voicemod is a tool that converts human voices into AI-generated voices in real time. The company offers a number of “Sonic Avatars” that can be further customized.
  • Tree-of-thought prompting extends chain-of-thought prompting by having the language model consider multiple reasoning paths in the process of generating output.
  • Facebook/Meta has built a new speech generation model called Voicebox and claims that it outperforms other models. They haven’t released an open source version. Their paper describes several methods for distinguishing generated speech from human speech.
  • MIT Technology Review explains the key points of the EU’s draft proposal to regulate AI. It will likely take at least two years for the proposal to make its way through the legislative process.
  • OpenLLM provides support for running a number of open source large language models in production. It includes the ability to integrate with tools like Bento. Support for LangChain is promised soon.
  • Infinigen is a photorealistic generator of natural 3D scenes, designed to produce synthetic training data for AI systems. It currently generates natural phenomena such as terrain, plants, animals, and weather; constructed objects may be added later.
  • Facebook/Meta has created a new large model called I-JEPA (Image Joint Embedding Predictive Architecture). Meta claims that it is more efficient than other models and that it works by building higher-level models of the world, as humans do. It is a first step toward realizing Yann LeCun’s ideas for next-generation AI.
  • MusicGen is a new generative model for music from Facebook/Meta. It sounds somewhat more compelling than other musical models, but it’s not clear if it can do much more than reframe musical clichés.
  • OpenAI now has a “function calling” API. The API allows an application to describe functions to the model. When GPT needs to call one of these functions, it returns a JSON object describing the function call. The application can then call the function and return the result to the model.
  • Research shows that Amazon Mechanical Turk workers are using AI to do their jobs. Mechanical Turk is often used to generate or label training data for AI systems. How will using AI to generate training data affect future generations of AI?
  • What happens when a generative AI system is trained on generated data? When Copilot is trained on code generated by Copilot, or GPT-4 on web content generated by GPT-4? Model collapse: the “long tail” of the distribution disappears and the quality of the output declines.
  • FrugalGPT is an idea for reducing the cost of using large language models like GPT-4. The authors propose using a pipeline of language models (GPT-J, GPT-3, and GPT-4), refining the prompt at each stage so that most of the processing is done by free or inexpensive models.
  • DeepMind’s AlphaDev used AI to speed up sorting algorithms. The software worked at the assembly language level. Once the routines were complete, the code was converted back to C++ and submitted to the LLVM project, which incorporated it into its C++ standard library.
  • An artist used Stable Diffusion to create a functional QR code that is also a work of art and posted it on Reddit.
  • The movement to regulate AI needs to learn from nuclear non-proliferation, where the key elements are not hypothetical harms (we all know what bombs do), but traceability and transparency. Model cards and datasheets for datasets are a good start.
  • Sam Altman talks about ChatGPT’s plans, saying that it is currently compute-bound and needs more GPUs. This bottleneck is delaying features such as custom fine-tuning of models, expanded context windows, and multimodality (i.e., images).
  • Facebook/Meta’s LIMA is a 65B-parameter language model based on LLaMA, but without RLHF (Reinforcement Learning from Human Feedback); it was fine-tuned on only 1,000 carefully selected prompts and responses.
  • Some things just have to happen. Gandalf is a prompt injection game. Your task is to get the AI to reveal its password.
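
The function-calling flow described in the OpenAI item above (the model returns a JSON object describing a call, the application runs the function and hands back the result) can be sketched on the application side as follows. This is a minimal illustration, not OpenAI’s client code: no API request is made, `get_weather` is a hypothetical function, and the shape of the call object (a `name` field plus an `arguments` field holding a JSON-encoded string) is an assumption that should be checked against the current API documentation.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical application function; returns canned data for this sketch."""
    return {"city": city, "forecast": "sunny"}


# Functions the application has described to the model, keyed by name.
AVAILABLE_FUNCTIONS = {"get_weather": get_weather}


def dispatch(function_call: dict) -> str:
    """Run the function named in a model's function-call object.

    Assumes the object carries a "name" and an "arguments" field, where
    "arguments" is a JSON-encoded string. Returns a JSON string that the
    application could send back to the model as the function's result.
    """
    name = function_call["name"]
    if name not in AVAILABLE_FUNCTIONS:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        kwargs = json.loads(function_call["arguments"])
    except json.JSONDecodeError:
        # Model output is not guaranteed to be valid JSON, so fail softly.
        return json.dumps({"error": "arguments were not valid JSON"})
    result = AVAILABLE_FUNCTIONS[name](**kwargs)
    return json.dumps(result)


# A stand-in for what a model might return; it is hard-coded here.
simulated_call = {"name": "get_weather", "arguments": '{"city": "Boston"}'}
reply = dispatch(simulated_call)
print(reply)
```

Keeping an explicit registry of callable functions means the application, not the model, decides what can actually be executed; anything the model names outside that registry is rejected.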

Programming

  • Leptos is a new open source, full-stack, fully typed web framework for Rust. (How many days has it been since the last new web framework?)
  • WebAssembly may replace containers in the not too distant future. Software deployed as WebAssembly is portable and much smaller.
  • Adam Jacob discusses reinvigorating DevOps with a new generation of tools that use insights from multiplayer games and digital twins.
  • Alex Russell on improving web performance for the majority of users, who have mid-range or low-end smartphones: JavaScript is useful, but for many sites it’s a heavy burden.
  • Doug Crockford says it’s time to stop using JavaScript and move to a newer and better next generation programming language.
  • Wing is a new programming language with high-level abstractions for the cloud. The claim is that these abstractions make it easier to write cloud-native programs with AI code generation.
  • Simpleaichat is a Python package that simplifies writing programs that use GPT-3.5 or GPT-4.
  • StarCoder and StarCoderBase are open source language models (similar to Codex) for writing software. They were trained on a “large collection of permissively licensed GitHub repositories with inspection tools and an opt-out process.”
  • How do you measure the developer experience? Metrics tend to be technical, ignoring personal issues such as developer satisfaction, the friction developers encounter on a daily basis, and other aspects of their real-world experience.
  • OpenChat is an open source chat console designed to connect to large language models (currently GPT-*). Anyone can create their own customized chat bot. It supports unlimited memory (using PineconeDB) and plans to add support for other language models.
  • WebAssembly promises to improve runtime performance and latency both in the browser and on the backend. It also promises to let developers create packages that can run in any environment, including Kubernetes clusters, edge devices, and more. However, this capability is still under development.
  • People are starting to talk about software defined cars. This is an opportunity to either rethink security from the ground up or create a larger attack surface.
  • LMQL is a programming language designed for prompting language models. It is an early example of a formal informal language for communicating with AI systems.
  • Memory Spy is a web application that runs simple C programs and shows you how variables are represented in memory. You don’t have to be a C programmer to learn a lot about how software works. Memory Spy was created by Julia Evans (@b0rk). Julia’s latest zine, about how computers represent integers and floating point numbers, is also worth reading.
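
To give a flavor of what the Memory Spy item above is about, here is a small sketch of how integers and floating point numbers look as raw bytes. It is not Memory Spy’s code: it uses Python’s standard `struct` module as a stand-in for inspecting a C program, and it assumes 32-bit little-endian values, which is a common layout but not the only one. The helper names are made up for this example.

```python
import struct


def int32_bytes(value: int) -> str:
    """Hex of the 4 bytes of a little-endian signed 32-bit integer."""
    return struct.pack("<i", value).hex()


def float32_bytes(value: float) -> str:
    """Hex of the 4 bytes of a little-endian IEEE 754 single-precision float."""
    return struct.pack("<f", value).hex()


def float32_roundtrip(value: float) -> float:
    """Store a value as a 32-bit float, then read it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


print(int32_bytes(1))                 # 01000000: least significant byte first
print(int32_bytes(-1))                # ffffffff: two's complement
print(float32_bytes(1.0))             # 0000803f: sign, exponent, and mantissa bits
print(float32_roundtrip(0.1) == 0.1)  # False: 0.1 has no exact binary representation
```

The last line shows why floating point comparisons can surprise people: squeezing 0.1 into 32 bits and reading it back yields a slightly different number.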

Augmented reality and virtual reality

  • David Pogue’s review of Apple Vision, the $3500 AR headset: it is limited in ways reminiscent of the original iPhone, but, he writes, no headset or other device has ever scored so high on the scale of astonishment.
  • Apple did it: they announced AR/VR goggles. They are very expensive ($3499), look like ski goggles, and have 2 hours of battery life with an external battery pack. Apple might manage to make them fashionable, but it’s hard to imagine wearing them in public.
  • The big problem with Apple’s Vision Pro goggles may be getting people to use them. Will developers be able to create compelling apps? Converting a 2D app to a 3D environment is not enough. How can software actually take advantage of 3D?
  • Tim Bray’s post on what augmented reality is and what it demands of software developers is a must-read. It’s not Apple Vision.
  • Hachette created a metaverse experience named “Beyond the Pages” as part of an attempt to attract a younger audience. The initial experience was only open for two days, but Hachette has promised that more are planned.

Security

  • Ransomware is getting faster, giving organizations even less time to respond to attacks. To avoid becoming a victim, focus on the basics like access controls, strong passwords, multi-factor authentication, zero trust, penetration testing, and good backups.
  • The number of attacks against systems running in the “cloud” is growing rapidly. The biggest danger remains errors in basic hygiene, such as misconfiguration of identity and access controls.
  • AI Package Hallucination is a new technique for distributing malware. Ask a question that causes an AI to hallucinate a package or library. Create malware with that package name and place it in the appropriate repository. Wait for others to get the same recommendation and install the malware. (This assumes that AI hallucinations are consistent, but I’m not sure if that’s true.)

Web

  • A new standard allows NFTs to contain wallets, which in turn contain NFTs. Users can build collections of related resources. This could be used for games (where a character “owns” its gear), as well as travel (a trip that includes tickets to events) and customer loyalty programs.
  • The W3C has announced a new web standard for secure payment confirmation. The standard is intended to make checkout simpler and less prone to fraud.
  • Tyler Cowen argues that cryptocurrencies will play a role in transactions between AI systems. AI systems are not allowed to have their own bank accounts, and that is unlikely to change in the near future. But as they become more prevalent, they will need a way to make transactions.
  • Web and mobile performance aren’t discussed as much as they should be. Here’s a great post about improving Wikipedia’s performance by eliminating certain blocking issues: removing unnecessary JavaScript and optimizing what’s left.

Quantum computing






