Become an architect! Discover Generative AI on AWS

Machine Learning


Generative artificial intelligence (generative AI) is a type of AI used to generate content such as conversations, images, videos, and music. Generative AI can be used directly to build customer-facing capabilities (chatbots, image generators) or serve as an underlying component of more complex systems. For example, you can generate embeddings (or compressed representations) and other artifacts needed to improve downstream machine learning (ML) models or backend services.
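To make the idea of embeddings concrete, here is a minimal sketch. The vectors and sentences below are invented toy values, not output from any real AWS model: embeddings map text to vectors so that semantic similarity becomes a geometric comparison such as cosine similarity, which downstream ML models and services can consume.

```python
import math

# Toy illustration: real embeddings come from an embedding model (for
# example, one hosted on AWS); the 3-dimensional vectors below are made
# up purely to show the mechanics.
def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for three sentences.
embeddings = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "I forgot my login credentials": [0.8, 0.2, 0.1],
    "What is the weather today?": [0.0, 0.1, 0.9],
}

query = embeddings["How do I reset my password?"]
scores = {text: cosine_similarity(query, vec) for text, vec in embeddings.items()}
# Sentences about the same topic score higher than unrelated ones.
```

In a real system the vectors would come from an embedding model and be stored in a vector database, but the comparison step works exactly like this sketch.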

With the advent of generative AI, it's important to understand what it is, how it works under the hood, and what options are available to bring it into production. In some cases, it can also be helpful to move closer to the underlying model to fine-tune it or drive domain-specific improvements. In this edition of Become an architect!, we explore these topics and share an initial set of methodologies for bringing generative AI into production. We start with a broad introduction to the domain and share a mix of videos, blogs, and practical workshops.

Many teams are turning to open source tools running on Kubernetes to accelerate their ML and generative AI efforts. In this video session, the speaker discusses why Kubernetes is well suited to ML and tackles challenges such as dependency management and security. Learn how tools like Ray, JupyterHub, Argo Workflows, and Karpenter can accelerate your journey to building and deploying generative AI applications on Amazon Elastic Kubernetes Service (Amazon EKS). A real-world example shows how Adobe leveraged Amazon EKS to accelerate time to market and reduce costs. The session also introduces Data on EKS, a new AWS project that provides best practices for deploying a variety of data workloads on Amazon EKS.

Take me to this video!

Figure 1. Containers are powerful tools for creating reproducible research and production environments for ML.

This video session is designed to thoroughly explore emerging concepts in generative AI. By delving into real-world applications and detailing implementation best practices, it provides a concrete understanding that helps businesses leverage the full potential of these technologies. You'll gain valuable insights into navigating the complexities of generative AI, equipping you with the knowledge and strategies you need to stay ahead of the curve and harness the transformative power of these new techniques. If you want to learn more, check out this Generative AI Best Practices post.

Take me to this video!

Figure 2. Models are growing exponentially; improved capability comes at a higher production cost.

Handling AI/ML and generative AI workloads in production requires proper system design and, in a SaaS context, careful consideration of tenant isolation. You need to decide how to map different tenants to models, how to scale inference, how to integrate your solution with upstream and downstream services, and how to work with large language models (LLMs).

This video dives into multi-tenancy for AI/ML workloads, covering the common design, performance, isolation, and operational challenges you'll encounter in practice. You will also learn concepts such as Retrieval Augmented Generation (RAG, used to enrich an LLM with contextual information) and fine-tuning, illustrated with real-world examples.
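A minimal sketch of the multi-tenant RAG pattern may help. Everything below is invented for illustration: the tenant names, the documents, and the retriever (a simple keyword-overlap ranking standing in for a real vector search). The point is that each tenant gets its own document store, and retrieved context is injected into the prompt before it reaches the LLM.

```python
# Per-tenant document stores; in production this would be a vector
# database with one index or namespace per tenant.
TENANT_DOCS = {
    "tenant-a": [
        "Tenant A's refund window is 30 days.",
        "Tenant A ships only within the EU.",
    ],
    "tenant-b": [
        "Tenant B's refund window is 14 days.",
        "Tenant B offers worldwide shipping.",
    ],
}

def retrieve(tenant_id, question, k=1):
    """Return the k documents sharing the most words with the question
    (a stand-in for embedding-based similarity search)."""
    q_words = set(question.lower().split())
    docs = TENANT_DOCS[tenant_id]
    ranked = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(tenant_id, question):
    """Assemble the augmented prompt that would be sent to the LLM."""
    context = "\n".join(retrieve(tenant_id, question))
    return f"Use only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("tenant-a", "What is the refund window?")
```

Because retrieval is scoped by `tenant_id`, tenant A's answer is grounded only in tenant A's documents, which is the isolation property the session discusses.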

Take me to this video!

Figure 3. To support different tenants, you may need RAG to retrieve different contextual information, or offer different options for fine-tuning.

As the research behind the book Accelerate shows, DevOps Research and Assessment (DORA) metrics, which measure key DevOps performance indicators such as lead time, are essential to healthy engineering practices. By applying generative AI, the zAdviser Enterprise platform can now provide detailed insights and actionable recommendations that help organizations optimize their DevOps practices and drive continuous improvement. This blog post shows how generative AI can be applied well beyond language and image generation.

Take me to this blog post.

Figure 4. Generative AI is used to summarize, analyze, and provide recommendations for improvement based on DORA metrics.

Hands-on Generative AI: AWS Workshop

Hands-on is often the best way to understand how everything actually works and create a mental model that connects the theoretical foundations to real-world applications.

Generative AI in Amazon SageMaker shows you how to build, train, and deploy generative AI models. You can explore options to use an existing out-of-the-box model, customize an existing open source model, or fine-tune a model for your needs.

Building with Amazon Bedrock and LangChain shows how you can use fully managed AWS services when working with foundation models, covering a variety of use cases. And if you want a quick introduction to prompt engineering, check out the PartyRock lab in the workshop.
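As a taste of what the workshop covers, here is a hedged sketch of invoking a model on Amazon Bedrock with boto3. The request body follows the Anthropic Messages format used by Claude models on Bedrock; the model ID is just an example, and the network call is wrapped in a function so nothing runs until you call it with valid AWS credentials and model access.

```python
import json

MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"  # example model ID

def build_request(prompt, max_tokens=256):
    """Serialize a single-turn chat request body in the Anthropic
    Messages format expected by Claude models on Bedrock."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

def invoke(prompt):
    """Call Bedrock; requires AWS credentials and model access."""
    import boto3  # imported here so the sketch loads without boto3 installed
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(modelId=MODEL_ID, body=build_request(prompt))
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]
```

The workshop builds on the same idea through LangChain's higher-level abstractions, which wrap this request/response handling for you.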

Figure 5. An example of the image replacement exercise from the workshop.

See you next time!

Thank you for reading! We hope you've gained insight into the applications of generative AI and discovered new strategies for using it. In our next post, we'll dig deeper into machine learning.

To revisit previous posts or explore the entire series, visit the Become an architect! page.


