Applications of large-scale language models in biotechnology and pharmaceutical research


ProGen

ProGen is a deep learning language model that can generate protein sequences with predictable function across large protein families. It was trained on 280 million protein sequences from more than 19,000 families, and its input is augmented with control tags that specify protein properties. ProGen can be further fine-tuned on curated sequences and tags to generate proteins from a given family more accurately.

ChemCrow

LLMs have shown good performance on tasks in a wide variety of fields, but they often struggle with chemistry-related problems. Furthermore, these models do not have access to external knowledge sources, which limits their usefulness in scientific research. ChemCrow is an LLM chemistry agent that aims to address these problems. It is designed to accomplish tasks spanning drug discovery, organic synthesis, and materials design.

ChemCrow integrates 13 expert-designed tools to improve LLM performance in chemistry. The agent can assist expert chemists and lower the barrier to entry for non-experts. By bridging the gap between experimental and computational chemistry, it may also help accelerate scientific progress.

ChatGPT in drug discovery

Researchers at Michigan State University investigated the use of ChatGPT in drug discovery. They identified the following potential uses:

  • ChatGPT can be fine-tuned on scientific literature and used to generate summaries of the latest research on specific diseases. This can help researchers identify potential new targets or better understand the current state of research in a particular area.
  • By training ChatGPT on a set of established drug-like molecules, it is possible to generate new chemical structures with similar properties. This approach can help scientists identify new lead compounds with a high likelihood of success in preclinical and clinical studies.
  • ChatGPT can help predict the pharmacokinetics and pharmacodynamics of new drugs and support virtual screening of chemical libraries in early-stage drug discovery.
  • ChatGPT can be trained on toxicity data and then used to predict potential toxic effects of new drugs.

Using ChatGPT/GPT-4 in Computational Biology

Below are some of the ways computational biologists are using ChatGPT/GPT-4 to optimize their workflows.

  • ChatGPT can improve the readability and documentation of your code.
  • ChatGPT helps you write efficient code.
  • Researchers can integrate ChatGPT into their IDEs via plugins for RStudio and Visual Studio Code.
  • ChatGPT can improve your scientific writing by helping you express your ideas more clearly.
  • ChatGPT can be used to clean and refine your data.
  • ChatGPT can suggest new visualization techniques and enhance existing figures, so you can improve your data visualization.
  • Using the GPT API, you can fine-tune the system for your specific application and adjust parameters that control the creativity and reproducibility of its responses.
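The last point can be made concrete with a small sketch. The article does not name the parameters involved; the sketch assumes the one meant is the sampling temperature that such APIs commonly expose. It does not call any API. It only illustrates, with made-up scores, how dividing a model's raw scores by a temperature reshapes the probability of each candidate token: low values concentrate probability on the top choice (more repeatable output), high values spread it out (more varied output). The function name and the scores are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after scaling by a temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens (illustrative only).
logits = [2.0, 1.0, 0.1]

low_temp = softmax_with_temperature(logits, 0.2)   # sharply peaked
high_temp = softmax_with_temperature(logits, 2.0)  # much flatter

print("temperature 0.2:", [round(p, 3) for p in low_temp])
print("temperature 2.0:", [round(p, 3) for p in high_temp])
```

With the low temperature almost all probability mass sits on the first token, which is why low settings tend to give near-identical answers across runs.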

ChatGPT in bioinformatics

A group of researchers demonstrated the feasibility of using ChatGPT in bioinformatics education to help students generate code for scientific data analysis tasks. In their study, ChatGPT generated code to align short reads to the human reference genome and summarized the alignments into genome-wide counts.
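As a rough illustration of the summarization step described above, the following sketch counts primary alignments per reference sequence from SAM-formatted text. It is not the code from the study; the function name and the example records are hypothetical, and it assumes the aligner's output is available as tab-separated SAM lines.

```python
from collections import Counter


def count_reads_per_reference(sam_lines):
    """Count mapped primary alignments per reference name (SAM column 3)."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        reference = fields[2]
        if flag & 0x4 or reference == "*":
            continue  # read is unmapped
        if flag & 0x900:
            continue  # secondary (0x100) or supplementary (0x800) alignment
        counts[reference] += 1
    return counts


# Hypothetical SAM records, for illustration only.
example_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
    "read2\t16\tchr1\t250\t60\t50M\t*\t0\t0\t*\t*",
    "read3\t0\tchr2\t75\t60\t50M\t*\t0\t0\t*\t*",
    "read4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read1\t256\tchr2\t900\t0\t50M\t*\t0\t0\t*\t*",
]

counts = count_reads_per_reference(example_sam)
for reference, n in sorted(counts.items()):
    print(reference, n)
```

For real data a dedicated tool or library would normally handle this step; the sketch only shows the kind of logic such generated code has to get right, such as skipping unmapped and secondary records.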

ChatGPT can also assist students with phylogenetic analysis. The researchers used R code generated by the model to build phylogenetic trees for nine species. They also showed that ChatGPT can act as a virtual teaching assistant, teaching students the divide-and-conquer method.

ChatGPT in drug development

A group of researchers evaluated how well ChatGPT predicts and explains common drug-drug interactions (DDIs). They prepared a total of 40 DDI lists from previously published literature. Their study showed that ChatGPT is partially effective in predicting and explaining DDIs.

Patients without ready access to medical facilities could use ChatGPT to get information about DDIs. However, the model may provide incomplete guidance, so further improvement is needed before patients can rely on it for this purpose.

ChatGPT in pharmacometrics

Below are some examples of how ChatGPT can be used in pharmacometrics.

  • ChatGPT can accurately retrieve typical PK parameters from the scientific literature.
  • ChatGPT can generate code for a population PK model in R.
  • ChatGPT can develop interactive Shiny applications for visualization.
  • ChatGPT allows you to develop R code with minimal coding knowledge, and it makes debugging errors easier as well.
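To give a sense of what a population PK model involves, here is a minimal simulation sketch. The source describes R code; Python is used here instead, and the sketch only simulates concentration profiles rather than estimating parameters from data. It assumes a one-compartment model with an intravenous bolus dose, C(t) = (dose / V) * exp(-(CL / V) * t), and log-normal between-subject variability on clearance (CL) and volume (V). All parameter values and function names are hypothetical.

```python
import math
import random


def concentration(dose_mg, cl_l_per_h, v_l, t_h):
    """One-compartment IV bolus: C(t) = (dose / V) * exp(-(CL / V) * t)."""
    return (dose_mg / v_l) * math.exp(-(cl_l_per_h / v_l) * t_h)


def simulate_population(n_subjects, dose_mg, times_h,
                        cl_pop=5.0, v_pop=50.0,
                        omega_cl=0.3, omega_v=0.2, seed=0):
    """Simulate one concentration-time profile per subject.

    Individual parameters are the typical (population) value multiplied by
    exp(eta), with eta drawn from a normal distribution with mean 0 and the
    given standard deviation. All default values are illustrative.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs match
    profiles = []
    for _ in range(n_subjects):
        cl_i = cl_pop * math.exp(rng.gauss(0.0, omega_cl))
        v_i = v_pop * math.exp(rng.gauss(0.0, omega_v))
        profiles.append([concentration(dose_mg, cl_i, v_i, t) for t in times_h])
    return profiles


times = [0.0, 1.0, 2.0, 4.0, 8.0]
profiles = simulate_population(5, 100.0, times)
for i, profile in enumerate(profiles, start=1):
    print(f"subject {i}:", [round(c, 2) for c in profile])
```

Fitting such a model to observed data, which is what dedicated pharmacometric packages do, is a considerably larger task than this simulation.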

GeneGPT

GeneGPT is a method for teaching LLMs to use the National Center for Biotechnology Information (NCBI) Web APIs to answer genomics questions. GeneGPT achieved state-of-the-art results on 75% of one-shot tasks and 80% of zero-shot tasks on the GeneTuring dataset. It shows the potential of augmenting LLMs with domain tools to improve access to biomedical information.
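For readers unfamiliar with the NCBI Web APIs, the sketch below builds a search request for the E-utilities `esearch` endpoint, the kind of call a tool-using model would need to compose. It only constructs the URL and does not send it. The helper name and the example search term are hypothetical, and this is not GeneGPT's own code.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(database, term, retmax=5):
    """Build an esearch URL that asks for matching record IDs as JSON."""
    params = {
        "db": database,     # NCBI database to search, e.g. "gene"
        "term": term,       # free-text query
        "retmode": "json",  # response format
        "retmax": retmax,   # maximum number of IDs to return
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


url = build_esearch_url("gene", "BRCA1")
print(url)
```

The returned IDs would then typically be passed to a second call that retrieves the full records, which is the multi-step pattern the model has to learn.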

CancerGPT

CancerGPT is a first-of-its-kind few-shot learning model that uses LLMs to predict drug pair synergy in rare tissues that lack structured data and features. With about 124M parameters, its performance is comparable to that of the much larger fine-tuned GPT-3 model with about 175B parameters. CancerGPT demonstrates the potential of LLMs to provide an alternative approach to biological inference.

ChatGPT in medical research

ChatGPT can analyze large amounts of data, including scientific articles, medical reports, and patient records. This analysis can provide new insights into the symptoms of orthopedic diseases and their treatment options.

ChatGPT can extract relevant information from text and present it in a structured format. It can also help researchers develop new hypotheses. Additionally, ChatGPT can support the development of clinical decision support systems by analyzing patient records and identifying common patterns.

ChatGPT in medicine

ChatGPT can keep researchers informed about the latest literature in a particular field. It can also draft patient discharge summaries after surgery, summarize recent trials, and provide information on ethical guidelines.


References:

  • https://www.nature.com/articles/s41587-022-01618-2
  • https://arxiv.org/abs/2304.05376
  • https://chemrxiv.org/engage/chemrxiv/article-details/63d56c13ae221ab9b240932f
  • https://arxiv.org/abs/2303.16429
  • https://www.biorxiv.org/content/10.1101/2023.03.07.531414v1.abstract
  • https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10105894/
  • https://europepmc.org/article/ppr/ppr650004
  • https://arxiv.org/abs/2304.09667
  • https://arxiv.org/abs/2304.10946
  • https://link.springer.com/article/10.1007/s00167-023-07355-6
  • https://link.springer.com/article/10.1007/s11845-023-03377-8

I am a Civil Engineering Graduate (2022) from Jamia Millia Islamia, New Delhi with a strong interest in Data Science, especially Neural Networks and its applications in various fields.


