Test Your Understanding
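The following questions help you solidify your understanding of core ML concepts.
Predictive power
Supervised ML models are trained on datasets of labeled examples. The model learns how to predict the label from the features. However, not every feature in a dataset has predictive power. In some cases, only a few features act as predictors of the label. In the dataset below, use price as the label and the remaining columns as the features.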
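Which three features do you think are likely the greatest predictors of a car's price?
Make_model, year, miles.
A car's make/model, year, and miles are likely to be among the strongest predictors of its price.
Color, height, make_model.
A car's height and color are not strong predictors of its price.
Miles, gearbox, make_model.
The gearbox isn't a main predictor of price.
Tire_size, wheel_base, year.
Tire size and wheel base aren't strong predictors of a car's price.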
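One rough way to check a hunch about predictive power is to measure how strongly each numeric feature correlates with the label. The sketch below assumes a tiny, invented set of cars (the values are hypothetical and are not the dataset from this page) and ranks three numeric columns by the absolute Pearson correlation with price. Correlation is only a heuristic, and a categorical feature such as make_model would need to be encoded before it could be scored this way.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical toy data, invented for illustration only.
cars = {
    "year": [2012, 2015, 2017, 2019, 2021],
    "miles": [98000, 72000, 51000, 30000, 12000],
    "tire_size": [16, 18, 15, 17, 16],
}
price = [6500, 9800, 13200, 17500, 23000]  # the label

def rank_features(features, label):
    """Sort features from strongest to weakest absolute correlation with the label."""
    scores = {name: abs(pearson(values, label)) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_features(cars, price)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data, year and miles track price closely while tire_size does not, which mirrors the reasoning in the answers above.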
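Supervised and unsupervised learning
Depending on the problem, you'll use either a supervised or an unsupervised approach. For example, if you know beforehand the value or category you want to predict, you'd use supervised learning. However, if you want to learn whether your dataset contains any segments or groupings of related examples, you'd use unsupervised learning.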
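Suppose you had a dataset of users for an online shopping website, and it contained the following columns:
If you wanted to understand the types of users that visit the site, would you use supervised or unsupervised learning?
Unsupervised learning.
Because we want the model to cluster groups of related customers, we'd use unsupervised learning. After the model clustered the users, we'd create our own names for each cluster, for example, "discount seekers," "deal hunters," "surfers," "loyal," and "wanderers."
Supervised learning, because I'm trying to predict which class a user belongs to.
In supervised learning, the dataset must contain the label you're trying to predict. In this dataset, there is no label that refers to a category of user.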
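To make the clustering idea concrete, here is a minimal k-means sketch in plain Python. Everything in it is an assumption for illustration: the two features (visits per month, fraction of orders placed with a discount), the six users, and the choice of k-means itself, which is just one common clustering method. Note that the data carries no label; the algorithm only groups similar rows, and naming the groups is left to us.

```python
def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means with fixed starting centroids; returns (centroids, assignments)."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (squared distance).
        assignments = [
            min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            for point in points
        ]
        # Update step: move each centroid to the mean of its current members.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments

# Hypothetical users: (visits per month, fraction of orders using a discount).
users = [(2, 0.9), (3, 0.8), (1, 0.85), (20, 0.1), (22, 0.05), (18, 0.15)]

# Start from two of the users as initial centroids so the run is deterministic.
centroids, groups = kmeans(users, [users[0], users[3]])
print(groups)
```

With real data the features would normally be scaled first, because otherwise the column with the largest numeric range dominates the distance.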
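Suppose you had an energy usage dataset for homes with the following columns:
What type of ML would you use to predict the kilowatt hours used per year for a newly constructed house?
Supervised learning.
Supervised learning trains on labeled examples. In this dataset, "kilowatt hours used per year" would be the label because this is the value you want the model to predict. The features would be "square footage," "location," and "year built."
Unsupervised learning.
Unsupervised learning uses unlabeled examples. In this example, "kilowatt hours used per year" would be the label because this is the value you want the model to predict, so the problem is a supervised one.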
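As a sketch of what "training on labeled examples" means here, the snippet below fits a straight line by ordinary least squares using a single feature. The numbers are invented and deliberately perfectly linear so the result is easy to follow; a real model would also use the other features (location, year built) and would not fit this cleanly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled examples: feature = square footage, label = kWh per year.
square_feet = [1000, 1500, 2000, 2500]
kwh_per_year = [6000, 8000, 10000, 12000]

slope, intercept = fit_line(square_feet, kwh_per_year)

def predict(sq_ft):
    """Predict yearly kWh for a house that was not in the training data."""
    return slope * sq_ft + intercept

new_house_kwh = predict(1800)
print(round(new_house_kwh))
```

The label column is what makes this supervised: the fit is driven entirely by the known kWh values.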
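Suppose you had a flight dataset with the following columns:
If you wanted to predict the cost of an airplane ticket, would you use regression or classification?
Regression
A regression model's output is a numeric value.
Classification
A classification model's output is a discrete value, normally a word. In this case, the cost of an airplane ticket is a numeric value.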
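Based on the dataset, could you train a classification model to classify the cost of an airplane ticket as "high," "average," or "low"?
Yes, but we'd first need to convert the numeric values in the airplane_ticket_cost column to categorical values.
It's possible to create a classification model from the dataset. You would do something like the following:
- Find the average cost of a ticket from the departure airport to the destination airport.
- Determine the thresholds that would constitute "high," "average," and "low."
- Compare the predicted cost to the thresholds and output the category the value falls within.
No. It's not possible to create a classification model. The airplane_ticket_cost values are numeric, not categorical.
With a little bit of work, you could create a classification model.
No. Classification models only predict two categories, like spam or not_spam. This model would need to predict three categories.
Classification models can predict more than two categories. Such models are called multiclass classification models.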
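The thresholding steps described above can be sketched as follows. The ticket prices are invented, and the rule that "average" means within 15% of the route's mean is an arbitrary assumption chosen for the example; in practice the thresholds would come from the data or from business needs.

```python
from statistics import mean

def make_labeler(costs, band=0.15):
    """Build a function that maps a numeric cost to "low", "average", or "high".

    `band` is a hypothetical tolerance: costs within +/- band of the mean
    count as "average".
    """
    average_cost = mean(costs)                 # step 1: average cost for the route
    low_cutoff = average_cost * (1 - band)     # step 2: thresholds
    high_cutoff = average_cost * (1 + band)

    def label(cost):
        # Step 3: compare a cost to the thresholds and return its category.
        if cost < low_cutoff:
            return "low"
        if cost > high_cutoff:
            return "high"
        return "average"

    return label

# Hypothetical airplane_ticket_cost values for a single route.
ticket_costs = [180, 200, 220, 310, 90]
label = make_labeler(ticket_costs)
categories = [label(cost) for cost in ticket_costs]
print(categories)
```

The resulting three-valued column could then serve as the label for a multiclass classification model.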
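Training and evaluating
After we've trained a model, we evaluate it by using a dataset with labeled examples and comparing the model's predicted values to the labels' actual values.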
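One simple way to turn that comparison into a number is the mean absolute error: the average distance between each prediction and its actual label. The page does not prescribe a metric, so this choice and the values below are assumptions made for illustration.

```python
def mean_absolute_error(predictions, labels):
    """Average absolute difference between predicted and actual values."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(abs(p - y) for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical evaluation set: actual labels and the model's predictions.
actual = [200, 340, 150]
predicted = [210, 300, 165]

mae = mean_absolute_error(predicted, actual)
print(f"{mae:.2f}")
```

A large error on held-out examples is the signal that the predictions are "far off" in the sense used by the next question.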
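Select the two best answers for the question.
If the model's predictions are far off, what might you do to make them better?
Retrain the model, but use only the features you believe have the strongest predictive power for the label.
Retraining the model with fewer features that have more predictive power can produce a model that makes better predictions.
You can't fix a model whose predictions are far off.
It's possible to fix a model whose predictions are off. Most models require multiple rounds of training before they make useful predictions.
Retrain the model using a larger and more diverse dataset.
Models trained on datasets with more examples and a wider range of values can produce better predictions because the model learns a more general relationship between the features and the label.
Try a different training approach. For example, if you used a supervised approach, try an unsupervised approach.
A different training approach would not produce better predictions.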
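You're now ready to take the next step in your ML journey:
People + AI Guidebook (https://pair.withgoogle.com/guidebook/). If you're looking for a set of methods, best practices, and examples for using ML, presented by Googlers, industry experts, and academic researchers.
Problem Framing. If you're looking for a field-tested approach for creating ML models and avoiding common pitfalls along the way.
Machine Learning Crash Course. If you're ready for an in-depth and hands-on approach to learning more about ML.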
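Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2025-07-27 UTC.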