Pre-Trained Model

A pre-trained model is a machine learning model that has already been trained on a large dataset for a general task — so you don’t have to train it from scratch.

Instead of teaching a model everything from zero, you start with one that already “knows” a lot about the world (like language, images, or sounds). You can then fine-tune it — that is, slightly adjust it — to specialize in your specific task.

Pre-training is connected to representation learning and, in particular, to the concept of “transfer learning”: using knowledge learned on one task or dataset to help with another.

A pre-trained model is basically the result of transfer learning’s first step — it’s trained on a broad, general dataset (like ImageNet or Wikipedia). Then you fine-tune (transfer) it to your specific, smaller task.
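The two-step idea can be sketched in plain NumPy. Here a fixed random projection stands in for the “pre-trained” part (in reality it would be a network trained on a large dataset); its weights stay frozen, and only a small task-specific head is trained on a tiny synthetic dataset. The dataset, dimensions, and learning rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Step 1 (stand-in): a "pre-trained" feature extractor. ---
# In practice these weights come from training on a broad dataset
# (e.g. ImageNet or Wikipedia); here a random projection plays that role.
W_pretrained = 0.1 * rng.normal(size=(20, 8))  # frozen, never updated below

def extract_features(x):
    """Frozen pre-trained layers: map raw input to features."""
    return np.tanh(x @ W_pretrained)

# --- Step 2: fine-tune a small head on the specific, smaller task. ---
X = rng.normal(size=(200, 20))                 # tiny synthetic dataset
y = (X[:, 0] + X[:, 1] > 0).astype(float)      # binary labels

feats = extract_features(X)                    # pre-trained part, not updated
w, b, lr = np.zeros(8), 0.0, 0.5               # task-specific head, trained
for _ in range(300):                           # plain logistic regression
    p = 1 / (1 + np.exp(-(feats @ w + b)))
    w -= lr * feats.T @ (p - y) / len(y)
    b -= lr * np.mean(p - y)

acc = np.mean((p > 0.5) == y)
print(f"head-only training accuracy: {acc:.2f}")
```

Only `w` and `b` are updated; the frozen features do the heavy lifting, which is why fine-tuning needs far less data than training from scratch.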

For example:

  • BERT is pre-trained on massive text corpora to understand language structure and meaning. You can then fine-tune it for sentiment analysis, question answering, or text classification. (See NLP - Natural Language Processing)
  • In computer vision, models like ResNet are pre-trained on ImageNet to recognize thousands of object categories; you can fine-tune them to recognize medical images or specific products.