Google releases Gemini Embedding 2 for multimodal AI applications

google has released Gemini Embedding 2. this is, gemini architecture.

This model extends beyond previous text-only embedding systems by mapping text, images, video, audio, and documents into a single unified embedding space. Captures semantic meaning across over 100 languages and supports AI tasks such as search augmented generation (RAG), semantic search, sentiment analysis, and data clustering.

Gemini embedding 2

Gemini Embedding 2 uses the multimodal capabilities of the Gemini architecture to generate embeddings from different types of data.

This model supports interleaved multimodal input, allowing developers to combine inputs such as text and images in a single request. This allows the system to capture relationships between different media types and process datasets containing multiple formats.

Main features

Multimodal input support

Text: Supports up to 8,192 input tokens
Images: Processes up to 6 images per request and supports PNG and JPEG formats.
Video: Supports up to 120 seconds of video input in MP4 and MOV formats
Audio: Process audio directly without the need for transcription.
Documentation: Supports embedding PDF files up to 6 pages

Interleaved multimodal input

This model can handle multiple media types within a single request, allowing for contextual understanding between inputs such as images and text.

Matryoshka Representation Learning (MRL)

Gemini Embedding 2 includes the following features Matryoshka expression learningThis allows the embedding vector to be scaled across different dimensions. The default dimension is 3,072, and developers can reduce the size to manage storage and performance requirements.

Recommended output dimensions:

Model features

According to Google, this model introduces multimodal embedding support across text, image, video, and audio tasks, and also adds native audio processing capabilities.

Supported use cases

Search extension generation (RAG)
Semantic search
sentiment analysis
data clustering
Large scale data management

availability

Gemini Embedding 2 is available in public preview below. Gemini API and Vertex AI. Developers can access models through integration with frameworks and vector database tools, including:

rung chain
llama index
haystack
Weaviate
quadrant
chroma DB

This model can also be used with vector search systems for multimodal data processing.

Source link

binance "oppna konto commented on Forget Ray-Ban Meta smart glasses. We tested cheaper ones that support ChatGPT.: Thanks for sharing. I read many of your blog posts
Binance账户 commented on The Smartest Man Who Ever Lived: Your point of view caught my eye and was very inte
打开Binance账户 commented on Top 10 Machine Learning Jobs with the Best Salaries in 2023: Your point of view caught my eye and was very inte
binance Registrera dig commented on Generative-AI-Jobs: Die 11 gefragtesten KI-Berufe: Thanks for sharing. I read many of your blog posts
create a binance account commented on WHOOP 4.0 review: Fitness tracker brand launches new AI features: Can you be more specific about the content of your

Google releases Gemini Embedding 2 for multimodal AI applications

Gemini embedding 2

Main features

availability

RECENT POSTS

How AI is turning weeks of cybercrime into days

DoorDash CEO says AI coding alone is not enough to improve engineering productivity

Learn AI basics for free and get certified – Letem svetem Applem

Gemini embedding 2

Main features

availability

Related Posts