Supervised Learning
Supervised learning's tasks are well-defined and can be applied to a multitude of scenarios, like identifying spam or predicting precipitation.
Foundational supervised learning concepts
Supervised machine learning is based on the following core concepts:

- Data
- Model
- Training
- Evaluating
- Inference
Data
Data is the driving force of ML. Data comes in the form of words and numbers stored in tables, or as the values of pixels and waveforms captured in images and audio files. We store related data in datasets. For example, we might have a dataset of:

- Images of cats
- Housing prices
- Weather information
Datasets are made up of individual examples that contain features and a label. You can think of an example as analogous to a single row in a spreadsheet. Features are the values that a supervised model uses to predict the label. The label is the "answer," or the value we want the model to predict. In a weather model that predicts rainfall, the features could be latitude, longitude, temperature, humidity, cloud coverage, wind direction, and atmospheric pressure. The label would be rainfall amount.
Examples that contain both features and a label are called labeled examples.
Two labeled examples
In contrast, unlabeled examples contain features but no label. After you create a model, the model predicts the label from the features.
Two unlabeled examples
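To make the distinction concrete, here is a minimal sketch in Python. The feature names follow the weather example above; all values are invented for illustration:

```python
# A labeled example pairs features with the known answer (the label).
labeled_example = {
    "features": {"temperature": 21.5, "humidity": 0.62, "pressure": 1012.0},
    "label": 0.75,  # rainfall amount: the value we want the model to predict
}

# An unlabeled example has the same features but no label; once trained,
# the model predicts the label from these features.
unlabeled_example = {
    "features": {"temperature": 18.0, "humidity": 0.80, "pressure": 1006.0},
}
```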
Dataset characteristics
A dataset is characterized by its size and diversity. Size indicates the number of examples. Diversity indicates the range those examples cover. Good datasets are both large and highly diverse.
Datasets can be large and diverse, large but not diverse, or small but highly diverse. In other words, a large dataset doesn't guarantee sufficient diversity, and a highly diverse dataset doesn't guarantee sufficient examples.
For instance, a dataset might contain 100 years' worth of data, but only for the month of July. Using this dataset to predict rainfall in January would produce poor predictions. Conversely, a dataset might cover only a few years but contain every month. This dataset might also produce poor predictions, because it doesn't contain enough years to account for variability.
Check Your Understanding
What attributes of a dataset would be ideal to use for ML?
Large size / High diversity
A large number of examples that cover a variety of use cases is essential for a machine learning system to understand the underlying patterns in the data. A model trained on this kind of dataset is more likely to make good predictions on new data.
Large size / Low diversity
Machine learning models are only as good as the examples used to train them. A model will produce poorer predictions on novel data that it never trained on.
Small size / High diversity
Most models can't find reliable patterns in a small dataset. The predictions will be less reliable than those from a larger dataset.
Small size / Low diversity
If your dataset is small and without much variation, you may not benefit from machine learning.
A dataset can also be characterized by the number of its features. For example, some weather datasets might contain hundreds of features, ranging from satellite imagery to cloud coverage values. Other datasets might contain only three or four features, like humidity, atmospheric pressure, and temperature. Datasets with more features can help a model discover additional patterns and make better predictions. However, datasets with more features don't always produce models that make better predictions, because some features might have no causal relationship to the label.
Model
In supervised learning, a model is the complex collection of numbers that defines the mathematical relationship between specific input feature patterns and specific output label values. The model discovers these patterns through training.
Training
A supervised model must be trained before it can make predictions. To train a model, we give it a dataset with labeled examples. The model's goal is to work out the best solution for predicting the labels from the features. The model finds the best solution by comparing its predicted values to the labels' actual values. Based on the difference between the predicted and actual values, defined as the loss, the model gradually updates its solution. In other words, the model learns the mathematical relationship between the features and the label so that it can make the best predictions on unseen data.
For example, if the model predicted 1.15 inches of rain but the actual value was .75 inches, the model modifies its solution so that its prediction is closer to .75 inches. After the model has looked at each example in the dataset (in some cases, multiple times), it arrives at a solution that makes the best predictions, on average, for each of the examples.
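This predict-compare-update loop can be sketched in a few lines. The following toy example (a sketch, not the course's implementation) fits a one-feature linear model with gradient descent on a squared-error loss; the humidity and rainfall values are invented:

```python
# Toy training loop: learn rainfall (inches) from humidity with a linear
# model. Invented data; the point is the predict -> compare -> update
# cycle driven by the loss.
examples = [(0.30, 0.1), (0.55, 0.4), (0.70, 0.7), (0.90, 1.1)]  # (humidity, rainfall)

weight, bias = 0.0, 0.0
learning_rate = 0.1

for epoch in range(200):                   # look at the dataset multiple times
    for humidity, rainfall in examples:
        prediction = weight * humidity + bias
        error = prediction - rainfall      # difference from the actual value
        # Gradient of the squared-error loss nudges the solution toward
        # predictions that are closer to the actual values.
        weight -= learning_rate * 2 * error * humidity
        bias -= learning_rate * 2 * error

print(weight * 0.70 + bias)  # prediction for humidity 0.70
```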
The following steps demonstrate how a model is trained:
1. The model takes in a single labeled example and provides a prediction.

Figure 1. An ML model making a prediction from a labeled example.

2. The model compares its predicted value with the actual value and updates its solution.

Figure 2. An ML model updating its predicted value.

3. The model repeats this process for each labeled example in the dataset.

Figure 3. An ML model updating its predictions for each labeled example in the training dataset.
In this way, the model gradually learns the correct relationship between the features and the label. This gradual understanding is also why large and diverse datasets produce better models: the model sees more data across a wider range of values and refines its understanding of the relationship between the features and the label.
During training, ML practitioners can make subtle adjustments to the configurations and features the model uses to make predictions. For example, certain features have more predictive power than others, so practitioners can select which features the model uses during training. Suppose a weather dataset contains time_of_day as a feature. In that case, an ML practitioner can add or remove time_of_day during training to see whether the model makes better predictions with or without it.
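One way to run that experiment is to train the same model twice, once with and once without the extra feature, and compare the errors. The sketch below is a hypothetical illustration with invented data; a real pipeline would measure error on held-out data rather than the training set:

```python
# Hypothetical feature-ablation sketch: train the same tiny linear model
# with and without a time_of_day column and compare errors.
# All data values are invented for illustration.

def train_linear(rows, labels, lr=0.05, epochs=300):
    """Per-example gradient descent on squared error."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = sum(w * xi for w, xi in zip(weights, x)) + bias
            err = pred - y
            weights = [w - lr * 2 * err * xi for w, xi in zip(weights, x)]
            bias -= lr * 2 * err
    return weights, bias

def mean_abs_error(weights, bias, rows, labels):
    preds = [sum(w * xi for w, xi in zip(weights, x)) + bias for x in rows]
    return sum(abs(p - y) for p, y in zip(preds, labels)) / len(labels)

# Columns: humidity, time_of_day (both scaled to 0-1).
with_tod = [[0.3, 0.1], [0.5, 0.9], [0.7, 0.2], [0.9, 0.8]]
labels = [0.1, 0.4, 0.7, 1.1]
without_tod = [row[:1] for row in with_tod]  # drop the time_of_day column

w_full, b_full = train_linear(with_tod, labels)
w_small, b_small = train_linear(without_tod, labels)

print(mean_abs_error(w_full, b_full, with_tod, labels))
print(mean_abs_error(w_small, b_small, without_tod, labels))
```

If the errors are close, the extra feature is adding little; if dropping it clearly hurts, it carries predictive power.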
Evaluating
We evaluate a trained model to determine how well it learned. When we evaluate a model, we use a labeled dataset, but we give the model only the dataset's features. We then compare the model's predictions to the labels' true values.
Figure 4. Evaluating an ML model by comparing its predictions to the actual values.
Depending on the model's predictions, we might do more training and evaluation before deploying the model in a real-world application.
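A minimal evaluation sketch: the model sees only the features, and we measure how far its predictions fall from the labels we held back. Here model_predict stands in for a hypothetical trained model, and the examples are invented:

```python
# Evaluation sketch: compare the model's predictions against the true
# labels it was never shown. `model_predict` is a hypothetical stand-in
# for a trained model; the evaluation examples are invented.

def model_predict(features):
    return 1.2 * features["humidity"]  # hypothetical learned relationship

eval_examples = [
    ({"humidity": 0.5}, 0.55),  # (features, actual rainfall)
    ({"humidity": 0.8}, 1.00),
]

errors = [abs(model_predict(f) - label) for f, label in eval_examples]
mean_absolute_error = sum(errors) / len(errors)
print(mean_absolute_error)
```

A large mean error here would suggest more training (or better data) is needed before deployment.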
Check Your Understanding
Why does a model need to be trained before it can make predictions?
A model needs to be trained to learn the mathematical relationship between the features and the label in a dataset.
Inference
Once we're satisfied with the results of evaluating the model, we can use it to make predictions on unlabeled examples; this is called inference. In the weather app example, we would give the model the current weather conditions, like temperature, atmospheric pressure, and relative humidity, and it would predict the amount of rainfall.
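As a sketch, inference is simply calling the trained model on features alone, with no label in sight. model_predict and the pressure_anomaly feature below are hypothetical:

```python
# Inference sketch: feed current conditions (features only) to the
# trained model and read off the predicted rainfall.
# `model_predict` and the `pressure_anomaly` feature are hypothetical.

def model_predict(features):
    return 1.2 * features["humidity"] - 0.1 * features["pressure_anomaly"]

current_conditions = {"humidity": 0.7, "pressure_anomaly": -0.5}
predicted_rainfall = model_predict(current_conditions)
print(predicted_rainfall)
```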
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated (UTC): 2025-07-27.