[[["わかりやすい","easyToUnderstand","thumb-up"],["問題の解決に役立った","solvedMyProblem","thumb-up"],["その他","otherUp","thumb-up"]],[["必要な情報がない","missingTheInformationINeed","thumb-down"],["複雑すぎる / 手順が多すぎる","tooComplicatedTooManySteps","thumb-down"],["最新ではない","outOfDate","thumb-down"],["翻訳に関する問題","translationIssue","thumb-down"],["サンプル / コードに問題がある","samplesCodeIssue","thumb-down"],["その他","otherDown","thumb-down"]],["最終更新日 2025-07-27 UTC。"],[[["\u003cp\u003eSupervised learning uses labeled data to train models that predict outcomes for new, unseen data.\u003c/p\u003e\n"],["\u003cp\u003eThe training process involves feeding the model labeled examples, allowing it to learn the relationship between features and labels.\u003c/p\u003e\n"],["\u003cp\u003eModels are evaluated by comparing their predictions on unseen data to the actual values, helping to refine their accuracy.\u003c/p\u003e\n"],["\u003cp\u003eOnce trained and evaluated, models can be used for inference, making predictions on new, unlabeled data in real-world applications.\u003c/p\u003e\n"],["\u003cp\u003eThe quality of the dataset, including its size and diversity, significantly impacts the model's performance and ability to generalize.\u003c/p\u003e\n"]]],[],null,["\u003cbr /\u003e\n\nSupervised learning's tasks are well-defined and can be applied to a multitude\nof scenarios---like identifying spam or predicting precipitation.\n\nFoundational supervised learning concepts\n\nSupervised machine learning is based on the following core concepts:\n\n- Data\n- Model\n- Training\n- Evaluating\n- Inference\n\nData\n\nData is the driving force of ML. Data comes in the form of words and numbers\nstored in tables, or as the values of pixels and waveforms captured in images\nand audio files. We store related data in datasets. 
For example, we might have a dataset of the following:

- Images of cats
- Housing prices
- Weather information

Datasets are made up of individual [examples](/machine-learning/glossary#example) that contain [features](/machine-learning/glossary#feature) and a [label](/machine-learning/glossary#label). You can think of an example as analogous to a single row in a spreadsheet. Features are the values that a supervised model uses to predict the label. The label is the "answer," or the value we want the model to predict. In a weather model that predicts rainfall, the features could be *latitude*, *longitude*, *temperature*, *humidity*, *cloud coverage*, *wind direction*, and *atmospheric pressure*. The label would be *rainfall amount*.

Examples that contain both features and a label are called [labeled examples](/machine-learning/glossary#labeled-example).

**Two labeled examples**

In contrast, unlabeled examples contain features, but no label. After you create a model, the model predicts the label from the features.

**Two unlabeled examples**

Dataset characteristics

A dataset is characterized by its size and diversity. Size indicates the number of examples. Diversity indicates the range those examples cover. Good datasets are both large and highly diverse.

Datasets can be large and diverse, large but not diverse, or small but highly diverse. In other words, a large dataset doesn't guarantee sufficient diversity, and a highly diverse dataset doesn't guarantee sufficient examples.

For instance, a dataset might contain 100 years' worth of data, but only for the month of July. Using this dataset to predict rainfall in January would produce poor predictions. Conversely, a dataset might cover only a few years but contain every month.
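A dataset's diversity can be checked mechanically. As a minimal sketch (the dataset contents and field names are hypothetical), counting which calendar months a weather dataset covers exposes the 100-years-of-July pitfall described above:

```python
from datetime import date

# A hypothetical weather dataset: each example pairs features with a label.
# Features: (observation_date, temperature_c, humidity); label: rainfall_mm.
dataset = [
    ((date(2023, 7, 1), 31.0, 0.62), 0.0),
    ((date(2023, 7, 15), 29.5, 0.80), 4.2),
    ((date(2024, 7, 3), 30.1, 0.71), 1.1),
]

# Diversity check: which calendar months do the examples cover?
months_covered = {features[0].month for features, label in dataset}

print(sorted(months_covered))     # every example here is from July
print(len(months_covered) == 12)  # False: January predictions would be poor
```

The converse check, counting distinct years instead of months, would flag the second pitfall: plenty of months but too few years to capture year-to-year variability.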
This dataset might produce poor predictions because it doesn't contain enough years to account for variability.

Check Your Understanding

What attributes of a dataset would be ideal to use for ML?

- **Large size / High diversity.** Correct. A large number of examples that cover a variety of use cases is essential for a machine learning system to understand the underlying patterns in the data. A model trained on this type of dataset is more likely to make good predictions on new data.
- **Large size / Low diversity.** Machine learning models are only as good as the examples used to train them. A model produces poorer predictions on novel data of a kind it never trained on.
- **Small size / High diversity.** Most models can't find reliable patterns in a small dataset. The predictions will lack the confidence a larger dataset provides.
- **Small size / Low diversity.** If your dataset is small and without much variation, you may not benefit from machine learning.

A dataset can also be characterized by the number of its features. For example, some weather datasets might contain hundreds of features, ranging from satellite imagery to cloud-coverage values. Other datasets might contain only three or four features, like humidity, atmospheric pressure, and temperature. Datasets with more features can help a model discover additional patterns and make better predictions. However, datasets with more features don't *always* produce models that make better predictions, because some features might have no causal relationship to the label.

Model

In supervised learning, a model is the complex collection of numbers that defines the mathematical relationship from specific input feature patterns to specific output label values. The model discovers these patterns through training.

Training

Before a supervised model can make predictions, it must be trained. To train a model, we give the model a dataset with labeled examples.
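Before looking at how training updates a model, it helps to see how small the "collection of numbers" from the Model section can be. In this sketch (every number is hypothetical, not from a trained model), the model is literally a few weights plus a bias, and prediction is a weighted sum of the features:

```python
# A hypothetical "model": its entire knowledge is these four numbers.
weights = [0.9, 0.05, -0.3]  # one weight per feature
bias = 0.1

def predict(features):
    """Map feature values to a predicted label (e.g. rainfall in inches)."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Feature values (scaled): humidity, cloud coverage, atmospheric pressure.
print(predict([0.8, 0.5, 0.2]))  # a weighted sum of the features, plus the bias
```

Training is the process of finding values for those numbers that make the predictions match the labels.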
The model's goal is to work out the best solution for predicting the labels from the features. The model finds the best solution by comparing its predicted value to the label's actual value. Based on the difference between the predicted and actual values, defined as the [loss](/machine-learning/glossary#loss), the model gradually updates its solution. In other words, the model learns the mathematical relationship between the features and the label so that it can make the best predictions on unseen data.

For example, if the model predicted `1.15 inches` of rain, but the actual value was `.75 inches`, the model modifies its solution so its prediction is closer to `.75 inches`. After the model has looked at each example in the dataset, in some cases multiple times, it arrives at a solution that makes the best predictions, on average, for each of the examples.

The following demonstrates training a model:

1. The model takes in a single labeled example and provides a prediction.

   **Figure 1**. An ML model making a prediction from a labeled example.

2. The model compares its predicted value with the actual value and updates its solution.

   **Figure 2**. An ML model updating its predicted value.

3. The model repeats this process for each labeled example in the dataset.

   **Figure 3**. An ML model updating its predictions for each labeled example in the training dataset.

In this way, the model gradually learns the correct relationship between the features and the label. This gradual understanding is also why large and diverse datasets produce a better model: the model has seen more data with a wider range of values and has refined its understanding of the relationship between the features and the label.

During training, ML practitioners can make subtle adjustments to the configurations and features the model uses to make predictions.
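The three numbered steps above can be condensed into a tiny training loop. This is a minimal illustration under assumed data, not the course's implementation: a one-feature linear model fit to made-up rainfall numbers with gradient descent.

```python
# Hypothetical labeled examples: (humidity, rainfall_inches).
# The labels follow rainfall = 1.5 * humidity - 0.2 exactly,
# so the loop can drive the loss all the way to zero.
examples = [(0.9, 1.15), (0.4, 0.4), (0.7, 0.85), (0.2, 0.1)]

weight, bias = 0.0, 0.0  # the model starts out knowing nothing
learning_rate = 0.1

for _ in range(1000):                       # many passes over the dataset
    for humidity, actual in examples:
        predicted = weight * humidity + bias        # 1. make a prediction
        error = predicted - actual                  # 2. compare to the label
        weight -= learning_rate * error * humidity  # 3. nudge the solution
        bias -= learning_rate * error               #    toward the actual value

print(round(weight, 2), round(bias, 2))  # converges near the true 1.5 and -0.2
```

Each pass shrinks the gap between predicted and actual values, which is exactly the gradual update the figures describe.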
For example, certain features have more predictive power than others. Therefore, ML practitioners can select which features the model uses during training. Suppose, for instance, that a weather dataset contains `time_of_day` as a feature. In this case, an ML practitioner can add or remove `time_of_day` during training to see whether the model makes better predictions with or without it.

Evaluating

We evaluate a trained model to determine how well it learned. When we evaluate a model, we use a labeled dataset, but we give the model only the dataset's features. We then compare the model's predictions to the label's true values.

**Figure 4**. Evaluating an ML model by comparing its predictions to the actual values.

Depending on the model's predictions, we might do more training and evaluating before deploying the model in a real-world application.

Check Your Understanding

Why does a model need to be trained before it can make predictions?

- A model needs to be trained to learn the mathematical relationship between the features and the label in a dataset. (Correct.)
- A model doesn't need to be trained. Models are available on most computers.
- A model needs to be trained so it won't require data to make a prediction.

Inference

Once we're satisfied with the results from evaluating the model, we can use the model to make predictions, called [inferences](/machine-learning/glossary#inference), on unlabeled examples.
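A minimal sketch of that inference step (the weights and feature values below are hypothetical, standing in for an already-trained rainfall model):

```python
# Hypothetical weights from an already-trained rainfall model.
# Features: (temperature_c, pressure_scaled, relative_humidity).
weights = [0.02, -0.5, 2.0]
bias = -0.4

def predict_rainfall(features):
    """Inference: features go in, a predicted label comes out."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# An unlabeled example: current conditions, with no rainfall label attached.
current_conditions = [18.0, 0.8, 0.9]
print(predict_rainfall(current_conditions))  # the model's inferred rainfall
```

Note that, unlike in training and evaluation, no label is involved: the model supplies the answer.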
In the weather app example, we would give the model the current weather conditions, like temperature, atmospheric pressure, and relative humidity, and it would predict the amount of rainfall.

**Key Terms:**

- [example](/machine-learning/glossary#example)
- [feature](/machine-learning/glossary#feature)
- [inference](/machine-learning/glossary#inference)
- [labeled example](/machine-learning/glossary#labeled-example)
- [label](/machine-learning/glossary#label)
- [loss](/machine-learning/glossary#loss)
- [prediction](/machine-learning/glossary#prediction)
- [training](/machine-learning/glossary#training)