Virginia Tech researchers explore innovation trends with machine learning

Virginia Tech graduate student Shehryar Khan is applying machine learning to a growing challenge in research and innovation: vast amounts of patent data. The number of patents granted each year has more than doubled over the past 15 years, making it increasingly difficult for experts to assess the novelty of new ideas and prompting Khan to seek automated solutions. Khan works in collaboration with the University Libraries’ Research Impact and Intelligence team, with the ultimate aim of extracting meaningful insights from complex data. “My work focuses on optimizing and building systems that make research information more organized, accessible, and accurate,” Khan said. “I like to say that I do ‘research within research.’” The new study, intended for publication with Khan as lead author, explores how to use models like ChatGPT to generate new patent ideas while avoiding concepts that already exist.

Applying machine learning to patent data and innovation trends

The number of patents granted each year has more than doubled over the past 15 years. That growth is driving demand for new methods to assess the true novelty of inventions, and researchers are increasingly turning to machine learning as a solution. “We are researching how to push models like ChatGPT to generate new ideas. To try this, we aim to have these models generate patents while avoiding ideas that already exist,” Khan said, highlighting the challenge of moving beyond simple predictive algorithms. This approach addresses a critical gap in current AI research, as models are often perceived as simply extrapolating from existing knowledge. Sarah Over, assistant director of Research Intelligence, emphasizes the broader importance of this initiative to Virginia Tech. “Virginia Tech needs to continue to advance in areas like machine learning and AI,” she said, noting that patent analysis complements other research efforts within the university’s libraries. A forthcoming publication with Khan as lead author, based on a conference paper currently under review, will detail his findings on assessing novelty in patent applications.

Khan’s research at Virginia Tech’s Institute for Advanced Computing

Shehryar Khan’s research at Virginia Tech’s Institute for Advanced Computing focuses on extracting order from rapidly expanding volumes of patent data, a challenge that is becoming increasingly important as the pace of innovation accelerates. The project aims to move beyond simple predictive text generation, which critics often say is all that large language models do. It also addresses the need to determine whether a proposed invention truly represents a novel contribution, a task complicated by the scale of existing intellectual property. Khan’s research is more than just academic: it aims to bridge the gap between theoretical machine learning and real-world applications.

Assessing patent novelty using machine learning models

“Patents granted annually have more than doubled over the past 15 years,” said Sarah Over, assistant director of Research Intelligence, a trend that creates a major challenge for professionals trying to keep up with innovation. Khan sees the project as combining intellectual property, research evaluation and advanced computing. The ultimate goal is to create a system that not only organizes research information and makes it accessible, but also derives, as Khan puts it, “meaningful insights from very messy data sources” to better understand innovation trends and research implications.