Salesforce AI Introduces CodeT5+: A New Family of Open Code Large Language Models with an Encoder-Decoder Architecture



https://arxiv.org/abs/2305.07922

Modern Large Language Models (LLMs) deliver strong performance on code understanding and generation tasks, lowering the barrier to entry into the once-arcane field of computer programming. Architecturally, however, existing code LLMs tend to adopt encoder-only or decoder-only designs, which suit only a subset of comprehension and generation tasks and can limit performance on any specific task. They also typically rely on a limited set of pre-training objectives, leading to weaker performance on downstream tasks that are less related to those objectives.

Salesforce’s AI research team has announced CodeT5+, an innovative family of encoder-decoder code foundation LLMs that can be easily customized for strong performance on a wide variety of code understanding and generation tasks. To achieve this, the team trains CodeT5+ with a broad mixture of pre-training objectives on both unimodal and bimodal data, yielding code LLMs that can be easily adapted to different downstream tasks.

What is CodeT5+?


CodeT5+ is a family of large language models for analyzing and generating code. The framework incorporates a wide range of unimodal and bimodal pre-training objectives, and CodeT5+’s modules can be flexibly separated and recombined to meet the needs of various zero-shot, fine-tuning, and instruction-tuning applications.

The encoders learn to produce contextual representations from code/text sequences (whole, partial, or span-masked), while the decoders are trained to generate different kinds of output depending on the pre-training task.

  • CodeT5+ is first pre-trained on large unimodal code data from public platforms such as GitHub. This stage employs several objectives, including span denoising, decoder-only causal LM, and seq2seq causal LM tasks, to teach the model to recover code context at the level of code spans, subprograms, and whole programs.
  • The second stage of pre-training uses bimodal text-code data, i.e., pairs of text and code that capture the semantics of a code function. To strengthen its cross-modal understanding and generation capabilities, CodeT5+ is pre-trained with cross-modal contrastive learning, matching, and causal LM tasks.
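To make the first-stage objectives concrete, below is a minimal sketch of T5-style span denoising applied to a tokenized code snippet. The function, its span-sampling strategy, and the sentinel format are illustrative assumptions rather than CodeT5+’s exact corruption recipe: the model sees the source with sentinel placeholders and learns to reconstruct the masked spans in the target.

```python
import random

def span_denoise(tokens, num_spans=2, span_len=2, seed=0):
    """Replace non-overlapping spans with T5-style sentinel tokens.

    Returns (source, target): the source keeps the visible context with one
    sentinel per masked span; the target lists each sentinel followed by the
    tokens it hides. (Illustrative sketch, not CodeT5+'s exact corruption.)
    """
    rng = random.Random(seed)
    starts, candidates = [], list(range(len(tokens) - span_len + 1))
    while len(starts) < num_spans and candidates:
        s = rng.choice(candidates)
        starts.append(s)
        # drop candidate positions that would overlap the chosen span
        candidates = [c for c in candidates if abs(c - s) >= span_len]
    starts.sort()

    source, target, prev = [], [], 0
    for i, s in enumerate(starts):
        sentinel = f"<extra_id_{i}>"
        source += tokens[prev:s] + [sentinel]
        target += [sentinel] + tokens[s:s + span_len]
        prev = s + span_len
    source += tokens[prev:]
    return source, target

tokens = "def add ( a , b ) : return a + b".split()
source, target = span_denoise(tokens)
print(source)
print(target)
```

Because the masked content moves to the target, the combined length of source and target always equals the original length plus two tokens per masked span (one sentinel on each side).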

Thanks to this two-stage pre-training procedure, spanning a seq2seq generation task, a decoder-only objective, and comprehension-based tasks, CodeT5+ can adapt its performance to many different tasks.
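The decoder-only and seq2seq causal-LM objectives just mentioned can be sketched as two views of the same training sample. The helper below is a hypothetical illustration of how such views could be constructed, not the paper’s actual data pipeline:

```python
def causal_lm_views(tokens, pivot_frac=0.5):
    """Build two causal-LM training views of one token sequence (illustrative).

    seq2seq causal LM: the encoder reads a prefix, the decoder generates the suffix.
    decoder-only causal LM: the source is empty, the decoder generates everything.
    """
    pivot = max(1, int(len(tokens) * pivot_frac))
    seq2seq_view = (tokens[:pivot], tokens[pivot:])   # (encoder input, decoder target)
    decoder_only_view = ([], tokens)                  # (empty source, full target)
    return seq2seq_view, decoder_only_view

tokens = "x = 1 + 2".split()
print(causal_lm_views(tokens))
```

The same sequence thus supplies both an encoder-decoder generation example and a pure decoder example, which is what lets one model serve both operating modes.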

In an empirical study, the team evaluated CodeT5+ on over 20 benchmark datasets against state-of-the-art code LLMs such as LaMDA, GPT, and StarCoder, covering zero-shot, fine-tuning, and instruction-tuning settings. Competing with OpenAI’s strong code-cushman-001 model, CodeT5+ achieved state-of-the-art (SOTA) results on the HumanEval coding task in the zero-shot setting.

In summary

CodeT5+ is a new family of open-source large language models with an encoder-decoder architecture that can operate in several modes (encoder-only, decoder-only, and encoder-decoder) to support different code understanding and generation tasks. CodeT5+ was trained with a variety of pre-training tasks, such as span denoising, causal language modeling, contrastive learning, and text-code matching, to gain a comprehensive understanding of both unimodal code data and bimodal code-text data.
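As a concrete illustration of the contrastive objective mentioned above, here is a minimal in-batch InfoNCE-style loss over text and code embeddings, written in plain Python. The toy embeddings and temperature are arbitrary assumptions; CodeT5+’s actual implementation operates on learned encoder representations.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def contrastive_loss(text_embs, code_embs, temperature=0.1):
    """In-batch InfoNCE: each text's positive is the code at the same index;
    all other codes in the batch act as negatives. (Illustrative sketch.)"""
    loss = 0.0
    for i, t in enumerate(text_embs):
        logits = [cosine(t, c) / temperature for c in code_embs]
        m = max(logits)  # numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += -(logits[i] - log_z)
    return loss / len(text_embs)
```

Training drives matched text-code pairs toward high similarity and mismatched pairs toward low similarity, so the loss is small when embeddings are aligned and large when they are shuffled.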

This work demonstrates that the proposed CodeT5+ open code LLM can flexibly operate in encoder-only, decoder-only, and encoder-decoder modes, matching and even improving on SOTA performance across a wide range of downstream code tasks. The team believes that CodeT5+ can also be deployed as a unified retrieval-augmented generation system, and is open-sourcing all CodeT5+ models to facilitate further research.


Please check out the paper and the GitHub repository for more details.


Dhanshree Shenwai is a computer science engineer with extensive experience in FinTech companies covering the fields of finance, cards and payments, and banking, with a strong interest in AI applications. She is passionate about exploring new technologies and advancements in today’s evolving world to make life easier for everyone.




