How does transfer learning benefit AI models in leveraging knowledge?

Machine Learning


Overview

Artificial intelligence (AI) and machine learning (ML) are demanding technologies: training AI models requires collecting large amounts of data on an ongoing basis, yet regulatory compliance, cost, and ethical concerns make it difficult to access large new datasets. Transfer learning addresses this problem by allowing pre-trained models to serve as starting points for training new models.

What is transfer learning?

Transfer learning is like a master chef who draws on existing culinary expertise and familiar techniques to create new dishes. An AI model is first trained on a dataset so that it can recognize the objects in it by their properties and patterns. Reusing the same model to recognize a different but related object greatly reduces the training time and data required, and therefore the cost of the process.

How does transfer learning work?

Transfer learning follows a standard, well-defined process:

  • An AI model is trained on a large dataset with the goal of recognizing the elements within it. An example of a large dataset is ImageNet, a free-to-use database of labeled images. The model acquires knowledge about the objects in the dataset and later uses it to recognize different but related objects.
  • An AI model has different layers that perform different tasks. For example, one layer may recognize the overall shape of an object such as a bird, while another recognizes body parts such as wings, beak, and eyes. In the process, the model acquires general knowledge about birds. This stage is known as pre-training.
  • The model is then adapted to a different but related task. For example, if it learned to recognize small birds such as sparrows and kingfishers during pre-training, it can now be deployed to recognize large birds such as vultures. This stage is known as fine-tuning.
  • During fine-tuning, the layers trained in the pre-training phase are typically frozen or trained only lightly, while the remaining layers are trained extensively on the new dataset. The model thus combines the knowledge gained in pre-training with the knowledge gained in fine-tuning to identify objects accurately.
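The freeze-and-fine-tune workflow above can be sketched in PyTorch. This is a minimal illustration, not a production recipe: the small convolutional backbone stands in for a real pre-trained network (in practice you would load one from a model zoo, e.g. a ResNet trained on ImageNet), and the two-class "large birds" task and dummy batch are hypothetical.

```python
import torch
import torch.nn as nn

# Stand-in "pre-trained" backbone; in practice this would be loaded
# from a model zoo (e.g. a ResNet pre-trained on ImageNet).
backbone = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone: its weights stay fixed during fine-tuning,
# preserving the features learned in pre-training.
for p in backbone.parameters():
    p.requires_grad = False

# New task-specific head for a hypothetical 2-class "large birds" task.
# Only this layer is trained during fine-tuning.
head = nn.Linear(16, 2)
model = nn.Sequential(backbone, head)

# Pass only the trainable (head) parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-3)

# One fine-tuning step on a dummy batch of 32x32 RGB images.
x = torch.randn(4, 3, 32, 32)
y = torch.tensor([0, 1, 0, 1])
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
optimizer.step()
```

In a real pipeline, "lightly trained" pre-trained layers would instead be left unfrozen but given a much smaller learning rate than the new head.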

Case Study

Transfer learning has already helped doctors identify disease accurately and quickly. In 2017, researchers at Stanford University used transfer learning to improve medical image processing and diagnosis. They developed a deep learning model called CheXNet, trained on over 100,000 chest X-ray images, to detect various diseases of the chest. The images came from the National Institutes of Health (NIH) Clinical Center.

CheXNet leveraged a pre-trained model known as DenseNet to accelerate learning and disease identification. It can identify 14 medical conditions, including pulmonary nodules, pneumonia, and pneumothorax. To verify rigorously that CheXNet's results were accurate enough for clinical use, the researchers compared its findings on 420 chest radiographs with the observations of expert, experienced radiologists. CheXNet's findings agreed closely with those of the radiologists, and the work was therefore considered an important advance.

Limitations of transfer learning

Transfer learning is a powerful way to train AI models because it saves time and accelerates learning. However, the process has certain limitations, explained below.

Overfitting

Overfitting occurs when a pre-trained model becomes too specialized in a particular dataset. A deeply specialized model makes it difficult for later models built on it to recognize similar but different objects. For example, a pre-trained model might recognize all kinds of small birds, yet its knowledge of their basic characteristics fails to transfer to recognizing larger birds.

Negative transfer

Negative transfer occurs when there is a mismatch between the dataset used to pre-train a model and the dataset of the subsequent model that builds on it. Such mismatches can defeat the purpose of training. Every model must be provided with a well-matched dataset, or the development team must define a set of logical parameters to ensure the datasets match. Otherwise, the later model is very likely to identify something different from what the pre-trained model learned, creating overhead for engineering and development teams.
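The "set of logical parameters" a team might check before fine-tuning can be sketched as a simple dataset-compatibility test. The field names (`input_shape`, `domain`) and the two example specifications are illustrative assumptions, not a standard API; real checks depend on the team's pipeline.

```python
def check_compatibility(pretrain_spec: dict, finetune_spec: dict) -> list:
    """Return a list of mismatches between two dataset specifications."""
    problems = []
    if pretrain_spec["input_shape"] != finetune_spec["input_shape"]:
        problems.append("input shapes differ; images must be resized")
    if pretrain_spec["domain"] != finetune_spec["domain"]:
        problems.append("domains differ; risk of negative transfer")
    return problems

# Hypothetical specs: a natural-image pre-training set vs. a medical target set.
imagenet_spec = {"input_shape": (224, 224, 3), "domain": "natural images"}
xray_spec = {"input_shape": (224, 224, 3), "domain": "chest radiographs"}

print(check_compatibility(imagenet_spec, xray_spec))
# → ['domains differ; risk of negative transfer']
```

A warning like the one above does not forbid transfer (CheXNet succeeded across exactly this kind of domain gap), but it flags where extra validation effort is needed.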

Conclusion

Transfer learning is an efficient way to train AI models to recognize objects and take actions, as many case studies demonstrate. Its limitations, however, can seriously impede progress. Engineers must guard against overfitting; otherwise, subsequent models will fail to identify their target objects. Similarly, organizations need a consistent policy and a parameter checklist to ensure that the pre-trained model and all subsequent models work with well-matched datasets; without them, integrating the data will be difficult.
