Traditionally, training a computer vision model such as a classifier requires labeled data. Every example in the training set must be a pair: an image and a human-generated label describing that image.
Semi-supervised learning (SSL) lets a model learn from both labeled and unlabeled data; unlabeled data consists of images alone, without any labels. Recently, new SSL techniques have produced the most accurate models in computer vision on classic challenges such as ImageNet.
SSL is great because unlabeled data is usually far more plentiful than labeled data, especially once a model is deployed to production. SSL also reduces the time, cost, and effort of labeling.
But how can a model learn from images without labels? The key insight is that the images themselves contain information. The magic of SSL is that it extracts information from unlabeled data by automatically clustering images that are similar based on their structure, and this clustering provides additional signal for the model to learn from.
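To make this concrete, here is a minimal sketch of one common SSL recipe, pseudo-labeling (this is an illustration only, not the specific algorithm used later in this tutorial): the model predicts on unlabeled images, and only predictions above a confidence threshold are kept as synthetic labels for further training.

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    """Keep only the unlabeled examples the model is confident about.

    probs: (num_examples, num_classes) softmax outputs on unlabeled data.
    Returns (indices, labels) for the confident subset.
    """
    confidence = probs.max(axis=1)
    mask = confidence >= threshold
    indices = np.where(mask)[0]
    labels = probs[mask].argmax(axis=1)
    return indices, labels

# Toy example: 3 unlabeled examples, 3 classes.
probs = np.array([
    [0.95, 0.03, 0.02],   # confident -> kept, pseudo-label 0
    [0.40, 0.35, 0.25],   # uncertain -> discarded
    [0.10, 0.85, 0.05],   # below threshold -> discarded
])
indices, labels = select_pseudo_labels(probs)
print(indices, labels)  # only example 0 survives, with pseudo-label 0
```

The confident subset is then mixed into the labeled training data, and the loop repeats as the model improves.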
This tutorial uses several common Python libraries included in Google Colab: matplotlib, numpy, and TensorFlow. If you need to install them, you can typically run

```shell
!pip install --upgrade pip
!pip install matplotlib numpy tensorflow
```

in a Jupyter notebook, or the same commands without the leading exclamation marks from the command line.
If you are using Google Colab, make sure to change the runtime type to GPU.
In this tutorial, we will train a classifier on the CIFAR-10 dataset, a classic research dataset of natural images. Let's load it and take a look. We'll see some of the CIFAR-10 classes: frog, ship, automobile, truck, deer, horse, bird, cat, dog, and airplane.
```python
import matplotlib.pyplot as plt
import tensorflow as tf

NUM_CLASSES = 10

def plot_images(images):
    """Simple utility to render images."""
    # Visualize the data.
    _, axarr = plt.subplots(5, 5, figsize=(15, 15))
    for row in range(5):
        for col in range(5):
            image = images[row * 5 + col]
            axarr[row, col].imshow(image)

# Load the data using the Keras Datasets API.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

plot_images(x_test)
```
Create the Model
In general, you'll want to use an off-the-shelf model architecture, which saves you the effort of fiddling with architecture design. The general rule of model sizing is to pick a model big enough to handle your data, but not so big that it becomes slow at inference time. For a very small dataset like CIFAR-10, we'll use a very small model. For larger datasets with larger image sizes, the EfficientNet family is a good choice.
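One quick way to compare candidate sizes before committing is to instantiate the architectures and compare parameter counts (a rough proxy for capacity and inference cost). The sketch below assumes the Keras Applications API; exact counts vary by Keras version.

```python
import tensorflow as tf

# Instantiate two off-the-shelf architectures with random weights
# and a 10-class head, just to inspect their sizes.
small = tf.keras.applications.MobileNet(
    input_shape=(32, 32, 3), weights=None, classes=10)
large = tf.keras.applications.EfficientNetB0(
    input_shape=(32, 32, 3), weights=None, classes=10)

print(f"MobileNet params:      {small.count_params():,}")
print(f"EfficientNetB0 params: {large.count_params():,}")
```

For CIFAR-10's tiny 32x32 images, the smaller MobileNet is plenty.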
```python
def get_model():
    return tf.keras.applications.MobileNet(
        input_shape=(32, 32, 3),
        weights=None,
        classes=NUM_CLASSES,
        classifier_activation=None,
    )

model = get_model()
```
Prepare the Data
Now, let's prepare the data for training by converting the labels (integers from 0 to 9 representing the 10 object classes) into one-hot vectors like [1,0,0,0,0,0,0,0,0,0] and [0,0,0,0,0,0,0,0,0,1]. We'll also rescale the image pixels into the range [-1, 1] expected by the model architecture.
```python
def normalize_data(x_train, y_train, x_test, y_test):
    """Utility to normalize the data into standard formats."""
    # Update the pixel range to [-1, 1], which is expected by the model architecture.
    x_train = tf.keras.applications.mobilenet.preprocess_input(x_train)
    x_test = tf.keras.applications.mobilenet.preprocess_input(x_test)

    # Convert to one-hot labels.
    y_train = tf.keras.utils.to_categorical(y_train, NUM_CLASSES)
    y_test = tf.keras.utils.to_categorical(y_test, NUM_CLASSES)
    return x_train, y_train, x_test, y_test

x_train, y_train, x_test, y_test = \
    normalize_data(x_train, y_train, x_test, y_test)
```
The dataset includes 50,000 training examples. Let's use 5,000 of them as labeled images and 20,000 as unlabeled images.
```python
import numpy as np

def prepare_data(x_train, y_train, num_labeled_examples, num_unlabeled_examples):
    """Returns labeled and unlabeled datasets."""
    num_examples = len(x_train)
    assert num_labeled_examples + num_unlabeled_examples <= num_examples

    # Generate some random indices.
    indices = np.arange(num_examples)
    generator = np.random.default_rng(seed=0)
    generator.shuffle(indices)

    # Split the indices into two sets: one for labeled, one for unlabeled.
    labeled_train_indices = indices[:num_labeled_examples]
    unlabeled_train_indices = indices[
        num_labeled_examples : num_labeled_examples + num_unlabeled_examples]

    x_labeled_train = x_train[labeled_train_indices]
    y_labeled_train = y_train[labeled_train_indices]
    # Since this data is unlabeled, we won't need a corresponding y.
    x_unlabeled_train = x_train[unlabeled_train_indices]
    return x_labeled_train, y_labeled_train, x_unlabeled_train

NUM_LABELED = 5000
NUM_UNLABELED = 20000

x_labeled_train, y_labeled_train, x_unlabeled_train = \
    prepare_data(x_train, y_train,
                 num_labeled_examples=NUM_LABELED,
                 num_unlabeled_examples=NUM_UNLABELED)

del x_train, y_train
```
Baseline Training
To measure the performance improvement from SSL, let's first measure the model's performance with a standard training loop that doesn't use SSL.
Let's set up a standard training loop with some basic data augmentation. Data augmentation is a form of regularization: it fights overfitting and lets your model generalize better to data it has never seen.
The hyperparameter values below (learning rate, epochs, batch size, etc.) are a combination of common defaults and hand-tuned values.
The result is a model with roughly 44% accuracy. (Remember to read the validation accuracy, not the training accuracy.) Our next task is to figure out whether we can improve the model's accuracy with SSL.
```python
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=[tf.keras.metrics.CategoricalAccuracy()],
)

# Set up Keras augmentation.
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    featurewise_center=False,
    featurewise_std_normalization=False,
    horizontal_flip=True,
)
datagen.fit(x_labeled_train)

batch_size = 64
epochs = 30
model.fit(
    x=datagen.flow(x_labeled_train, y_labeled_train, batch_size=batch_size),
    shuffle=True,
    validation_data=(x_test, y_test),
    batch_size=batch_size,
    epochs=epochs,
)

baseline_metrics = model.evaluate(x=x_test, y=y_test, return_dict=True)
print('')
print(f"Baseline model accuracy: {baseline_metrics['categorical_accuracy']}")
```
Output:

```
Epoch 1/30
79/79 [==============================] - 4s 23ms/step - loss: 2.4214 - categorical_accuracy: 0.1578 - val_loss: 2.3047 - val_categorical_accuracy: 0.1000
Epoch 2/30
79/79 [==============================] - 1s 16ms/step - loss: 2.0831 - categorical_accuracy: 0.2196 - val_loss: 2.3063 - val_categorical_accuracy: 0.1000
Epoch 3/30
79/79 [==============================] - 1s 16ms/step - loss: 1.9363 - categorical_accuracy: 0.2852 - val_loss: 2.3323 - val_categorical_accuracy: 0.1000
Epoch 4/30
79/79 [==============================] - 1s 16ms/step - loss: 1.8324 - categorical_accuracy: 0.3174 - val_loss: 2.3496 - val_categorical_accuracy: 0.1000
Epoch 5/30
79/79 [==============================] - 1s 16ms/step - loss: 1.8155 - categorical_accuracy: 0.3438 - val_loss: 2.3339 - val_categorical_accuracy: 0.1000
Epoch 6/30
79/79 [==============================] - 1s 15ms/step - loss: 1.6477 - categorical_accuracy: 0.3886 - val_loss: 2.3606 - val_categorical_accuracy: 0.1000
Epoch 7/30
79/79 [==============================] - 1s 15ms/step - loss: 1.6120 - categorical_accuracy: 0.4100 - val_loss: 2.3585 - val_categorical_accuracy: 0.1000
Epoch 8/30
79/79 [==============================] - 1s 16ms/step - loss: 1.5884 - categorical_accuracy: 0.4220 - val_loss: 2.1796 - val_categorical_accuracy: 0.2519
Epoch 9/30
79/79 [==============================] - 1s 18ms/step - loss: 1.5477 - categorical_accuracy: 0.4310 - val_loss: 1.8913 - val_categorical_accuracy: 0.3145
Epoch 10/30
79/79 [==============================] - 1s 15ms/step - loss: 1.4328 - categorical_accuracy: 0.4746 - val_loss: 1.7082 - val_categorical_accuracy: 0.3696
Epoch 11/30
79/79 [==============================] - 1s 16ms/step - loss: 1.4328 - categorical_accuracy: 0.4796 - val_loss: 1.7679 - val_categorical_accuracy: 0.3811
Epoch 12/30
79/79 [==============================] - 2s 20ms/step - loss: 1.3962 - categorical_accuracy: 0.5020 - val_loss: 1.8994 - val_categorical_accuracy: 0.3690
Epoch 13/30
79/79 [==============================] - 1s 16ms/step - loss: 1.3271 - categorical_accuracy: 0.5156 - val_loss: 2.0416 - val_categorical_accuracy: 0.3688
Epoch 14/30
79/79 [==============================] - 1s 17ms/step - loss: 1.2711 - categorical_accuracy: 0.5374 - val_loss: 1.9231 - val_categorical_accuracy: 0.3848
Epoch 15/30
79/79 [==============================] - 1s 15ms/step - loss: 1.2312 - categorical_accuracy: 0.5624 - val_loss: 1.9006 - val_categorical_accuracy: 0.3961
Epoch 16/30
79/79 [==============================] - 1s 19ms/step - loss: 1.2048 - categorical_accuracy: 0.5720 - val_loss: 2.0102 - val_categorical_accuracy: 0.4102
Epoch 17/30
79/79 [==============================] - 1s 16ms/step - loss: 1.1365 - categorical_accuracy: 0.6000 - val_loss: 2.1400 - val_categorical_accuracy: 0.3672
Epoch 18/30
79/79 [==============================] - 1s 18ms/step - loss: 1.1992 - categorical_accuracy: 0.5840 - val_loss: 2.1206 - val_categorical_accuracy: 0.3933
Epoch 19/30
79/79 [==============================] - 2s 25ms/step - loss: 1.1438 - categorical_accuracy: 0.6012 - val_loss: 2.4035 - val_categorical_accuracy: 0.4014
Epoch 20/30
79/79 [==============================] - 2s 24ms/step - loss: 1.1211 - categorical_accuracy: 0.6018 - val_loss: 2.0224 - val_categorical_accuracy: 0.4010
Epoch 21/30
79/79 [==============================] - 2s 21ms/step - loss: 1.0425 - categorical_accuracy: 0.6358 - val_loss: 2.2100 - val_categorical_accuracy: 0.3911
Epoch 22/30
79/79 [==============================] - 1s 16ms/step - loss: 1.1177 - categorical_accuracy: 0.6116 - val_loss: 1.9892 - val_categorical_accuracy: 0.4285
Epoch 23/30
79/79 [==============================] - 1s 19ms/step - loss: 1.0236 - categorical_accuracy: 0.6412 - val_loss: 2.1216 - val_categorical_accuracy: 0.4211
Epoch 24/30
79/79 [==============================] - 1s 18ms/step - loss: 0.9487 - categorical_accuracy: 0.6714 - val_loss: 2.0135 - val_categorical_accuracy: 0.4307
Epoch 25/30
79/79 [==============================] - 1s 16ms/step - loss: 1.1877 - categorical_accuracy: 0.5876 - val_loss: 2.3732 - val_categorical_accuracy: 0.3923
Epoch 26/30
79/79 [==============================] - 2s 20ms/step - loss: 1.0639 - categorical_accuracy: 0.6288 - val_loss: 1.9291 - val_categorical_accuracy: 0.4291
Epoch 27/30
79/79 [==============================] - 2s 19ms/step - loss: 0.9243 - categorical_accuracy: 0.6882 - val_loss: 1.8552 - val_categorical_accuracy: 0.4343
Epoch 28/30
79/79 [==============================] - 1s 15ms/step - loss: 0.9784 - categorical_accuracy: 0.6656 - val_loss: 2.0175 - val_categorical_accuracy: 0.4386
Epoch 29/30
79/79 [==============================] - 1s 17ms/step - loss: 0.9316 - categorical_accuracy: 0.6800 - val_loss: 1.9916 - val_categorical_accuracy: 0.4305
Epoch 30/30
79/79 [==============================] - 1s 17ms/step - loss: 0.8816 - categorical_accuracy: 0.7054 - val_loss: 2.0281 - val_categorical_accuracy: 0.4366
313/313 [==============================] - 1s 3ms/step - loss: 2.0280 - categorical_accuracy: 0.4366

Baseline model accuracy: 0.436599999666214
```
Training with SSL
Now, let's see whether we can improve the model's accuracy by adding the unlabeled data to training. We will use Masterful, a platform that implements SSL for computer vision models like our classifier.
Let's install Masterful. In Google Colab, we can pip install from a notebook cell; we can also install from the command line. See the Masterful installation guide for more details.
```python
!pip install --upgrade pip
!pip install masterful

import masterful

masterful = masterful.register()
```
Output:

```
Loaded Masterful version 0.4.1. This software is distributed free of charge for
personal projects and evaluation purposes. See
http://www.masterfulai.com/personal-and-evaluation-agreement for details.
Sign up in the next 45 days at https://www.masterfulai.com/get-it-now
to continue using Masterful.
```
Set Up Masterful
Now, let's set up a few configuration parameters for Masterful.
```python
# Start fresh with a new model.
tf.keras.backend.clear_session()
model = get_model()

# Tell Masterful that your model is performing a classification task
# with 10 labels and that the image pixel range is [-1, 1]. Also, the
# model outputs logits rather than a softmax activation.
model_params = masterful.architecture.learn_architecture_params(
    model=model,
    task=masterful.enums.Task.CLASSIFICATION,
    input_range=masterful.enums.ImageRange.NEG_ONE_POS_ONE,
    num_classes=NUM_CLASSES,
    prediction_logits=True,
)

# Tell Masterful that your labeled training data is using one-hot labels.
labeled_training_data_params = masterful.data.learn_data_params(
    dataset=(x_labeled_train, y_labeled_train),
    task=masterful.enums.Task.CLASSIFICATION,
    image_range=masterful.enums.ImageRange.NEG_ONE_POS_ONE,
    num_classes=NUM_CLASSES,
    sparse_labels=False,
)

unlabeled_training_data_params = masterful.data.learn_data_params(
    dataset=(x_unlabeled_train,),
    task=masterful.enums.Task.CLASSIFICATION,
    image_range=masterful.enums.ImageRange.NEG_ONE_POS_ONE,
    num_classes=NUM_CLASSES,
    sparse_labels=None,
)

# Tell Masterful that your test/validation data is using one-hot labels.
test_data_params = masterful.data.learn_data_params(
    dataset=(x_test, y_test),
    task=masterful.enums.Task.CLASSIFICATION,
    image_range=masterful.enums.ImageRange.NEG_ONE_POS_ONE,
    num_classes=NUM_CLASSES,
    sparse_labels=False,
)

# Let Masterful meta-learn ideal optimization hyperparameters like
# batch size, learning rate, optimizer, learning rate schedule, and epochs.
# This will speed up training.
optimization_params = masterful.optimization.learn_optimization_params(
    model,
    model_params,
    (x_labeled_train, y_labeled_train),
    labeled_training_data_params,
)

# Let Masterful meta-learn ideal regularization hyperparameters.
# Regularization is an important ingredient of SSL. Meta-learning can
# take a while, so we'll use a precached set of parameters.
# regularization_params = \
#     masterful.regularization.learn_regularization_params(
#         model,
#         model_params,
#         optimization_params,
#         (x_labeled_train, y_labeled_train),
#         labeled_training_data_params)
regularization_params = masterful.regularization.parameters.CIFAR10_SMALL

# Let Masterful meta-learn ideal SSL hyperparameters.
ssl_params = masterful.ssl.learn_ssl_params(
    (x_labeled_train, y_labeled_train),
    labeled_training_data_params,
    unlabeled_datasets=[((x_unlabeled_train,), unlabeled_training_data_params)],
)
```
Output:

```
MASTERFUL: Learning optimal batch size.
MASTERFUL: Learning optimal initial learning rate for batch size 256.
```
Train!
Now, we are ready to train with SSL techniques! We'll call masterful.training.train, the entry point to the Masterful training engine.
```python
training_report = masterful.training.train(
    model,
    model_params,
    optimization_params,
    regularization_params,
    ssl_params,
    (x_labeled_train, y_labeled_train),
    labeled_training_data_params,
    (x_test, y_test),
    test_data_params,
    unlabeled_datasets=[((x_unlabeled_train,), unlabeled_training_data_params)],
)
```
Output:

```
MASTERFUL: Training model with semi-supervised learning enabled.
MASTERFUL: Performing basic dataset analysis.
MASTERFUL: Training model with:
MASTERFUL:   5000 labeled examples.
MASTERFUL:   10000 validation examples.
MASTERFUL:   0 synthetic examples.
MASTERFUL:   20000 unlabeled examples.
MASTERFUL: Training model with learned parameters partridge-boiled-cap in two phases.
MASTERFUL: The first phase is supervised training with the learned parameters.
MASTERFUL: The second phase is semi-supervised training to boost performance.
MASTERFUL: Warming up model for supervised training.
MASTERFUL: Warming up batch norm statistics (this could take a few minutes).
MASTERFUL: Warming up training for 500 steps.
100%|██████████| 500/500 [00:47<00:00, 10.59steps/s]
MASTERFUL: Validating batch norm statistics after warmup for stability (this could take a few minutes).
MASTERFUL: Starting Phase 1: Supervised training until the validation loss stabilizes...
Supervised Training: 100%|██████████| 6300/6300 [02:33<00:00, 41.13steps/s]
MASTERFUL: Starting Phase 2: Semi-supervised training until the validation loss stabilizes...
MASTERFUL: Warming up model for semi-supervised training.
MASTERFUL: Warming up batch norm statistics (this could take a few minutes).
MASTERFUL: Warming up training for 500 steps.
100%|██████████| 500/500 [00:23<00:00, 20.85steps/s]
MASTERFUL: Validating batch norm statistics after warmup for stability (this could take a few minutes).
Semi-Supervised Training: 100%|██████████| 11868/11868 [08:06<00:00, 24.39steps/s]
```
Analyze the Results
The model you passed into masterful.training.train has now been trained and updated in place, so you can evaluate it just like any other trained Keras model.
```python
masterful_metrics = model.evaluate(
    x_test, y_test, return_dict=True, verbose=0)
print(f"Baseline model accuracy: {baseline_metrics['categorical_accuracy']}")
print(f"Masterful model accuracy: {masterful_metrics['categorical_accuracy']}")
```
Output:

```
Baseline model accuracy: 0.436599999666214
Masterful model accuracy: 0.558899998664856
```
Visualize the Results
As you can see, accuracy improved from roughly 0.44 to 0.56. Of course, a more rigorous study would try to eliminate other differences between baseline training and training with SSL via the Masterful platform, and would also repeat the runs several times to generate error bars and p-values. For now, let's plot a chart to help communicate our results.
```python
data = (baseline_metrics['categorical_accuracy'],
        masterful_metrics['categorical_accuracy'])
fig, ax = plt.subplots(1, 1)
ax.bar(range(2), data, color=('gray', 'red'))
plt.xlabel("Training Method")
plt.ylabel("Accuracy")
plt.xticks((0, 1), ("baseline", "SSL with Masterful"))
plt.show()
```
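If you do rerun both setups several times, as a rigorous comparison would, you can summarize each method with a mean and spread rather than a single number. The accuracies below are hypothetical placeholders for illustration.

```python
import numpy as np

def summarize_runs(accuracies):
    """Return mean and sample standard deviation over repeated runs."""
    accs = np.asarray(accuracies, dtype=np.float64)
    return accs.mean(), accs.std(ddof=1)

# Hypothetical accuracies from three repeated runs of each setup.
baseline_runs = [0.43, 0.45, 0.44]
ssl_runs = [0.55, 0.57, 0.56]

for name, runs in [("baseline", baseline_runs), ("SSL", ssl_runs)]:
    mean, std = summarize_runs(runs)
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

Reporting the spread makes it clear whether the SSL gain exceeds run-to-run noise.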
Conclusion
Congratulations! In a simple tutorial, we just successfully used SSL, one of the most advanced training methods available, to improve your model's accuracy. Along the way, you avoided the cost and effort of labeling.
SSL isn't just for classification: variants of it apply to nearly any computer vision task. For a deeper dive into the subject and to see SSL in action on object detection, check out the additional tutorial here.