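The first Python script is a helper utility for loading and parsing the ImageNet class labels. The second Python script uses a ResNet model pre-trained on the ImageNet dataset to perform basic image classification (demonstrating "standard" image classification). The final Python script performs an adversarial attack and constructs an adversarial image that deliberately confuses our ResNet model, even though the original and adversarial images look identical to the human eye.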
How to install TensorFlow 2.0 on Ubuntu
How to install TensorFlow 2.0 on macOS
$ tree --dirsfirst
.
├── pyimagesearch
│ ├── __init__.py
│ ├── imagenet_class_index.json
│ └── utils.py
├── adversarial.png
├── generate_basic_adversary.py
├── pig.jpg
└── predict_normal.py
1 directory, 7 files
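imagenet_class_index.json: A JSON file that maps ImageNet class label indexes to human-readable strings. We will use this file to determine the integer index of a given class label, which we need when constructing the adversarial image attack.
utils.py: Contains a simple Python helper function for loading and parsing imagenet_class_index.json.
predict_normal.py: Accepts an input image (pig.jpg), loads the ResNet50 model, and classifies the input image. The script's output is the ImageNet class label index of the predicted class.
generate_basic_adversary.py: Uses the output of predict_normal.py to construct an adversarial attack that fools ResNet. The resulting image (adversarial.png) is saved to disk.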
"0": [
"1": [
"2": [
"3": [
"106": [
Each integer index maps to two values: the unique identifier of the ImageNet class label, and the human-readable class label.
Our helper needs to do two things: accept an input class label, and convert it into the integer class label index of that label.
# import necessary packages
import json
import os

def get_class_idx(label):
    # build the path to the ImageNet class label mappings file
    labelPath = os.path.join(os.path.dirname(__file__),
        "imagenet_class_index.json")
    # open the ImageNet class mappings file and load the mappings as
    # a dictionary with the human-readable class label as the key and
    # the integer index as the value
    with open(labelPath) as f:
        imageNetClasses = {labels[1]: int(idx) for (idx, labels) in
            json.load(f).items()}
    # check to see if the input class label has a corresponding
    # integer index value, and if so return it; otherwise return
    # a None-type value
    return imageNetClasses.get(label, None)
If the label exists in the dictionary, its integer index is returned; otherwise None is returned.
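To see the lookup in isolation, here is a minimal sketch that applies the same dictionary inversion to a tiny in-memory excerpt instead of the real file. The function name build_label_index and the two WordNet ID strings are placeholders invented for this illustration; only the label-to-index pairs follow the ImageNet mapping used in this tutorial.

```python
import json

# Tiny stand-in for imagenet_class_index.json.
# The WordNet IDs below are placeholders, not the real identifiers.
sample_json = """
{
  "106": ["n_placeholder_a", "wombat"],
  "341": ["n_placeholder_b", "hog"]
}
"""

def build_label_index(raw):
    # invert {index: [wnid, label]} into {label: index}
    return {labels[1]: int(idx) for (idx, labels) in json.loads(raw).items()}

label_to_idx = build_label_index(sample_json)
print(label_to_idx.get("hog"))      # 341
print(label_to_idx.get("unicorn"))  # None
```

Using dict.get means an unknown label yields None rather than raising a KeyError, which matches the behavior described above.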
# import necessary packages
from pyimagesearch.utils import get_class_idx
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import decode_predictions
from tensorflow.keras.applications.resnet50 import preprocess_input
import numpy as np
import argparse
import imutils
import cv2

def preprocess_image(image):
    # swap color channels, preprocess the image, and add in a batch
    # dimension
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = preprocess_input(image)
    image = cv2.resize(image, (224, 224))
    image = np.expand_dims(image, axis=0)
    # return the preprocessed image
    return image
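The preprocess_image function converts the image from BGR to RGB channel ordering; calls preprocess_input, which applies the preprocessing and scaling specific to ResNet50; resizes the image to 224×224; and adds a batch dimension.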
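The effect of these steps on the array can be sketched with NumPy alone. This is an approximation under stated assumptions, not the Keras implementation: it assumes the "caffe"-style convention documented for ResNet50 (flip RGB to BGR, then subtract the ImageNet per-channel means), it skips the resize, and caffe_style_preprocess is a hypothetical helper name.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by "caffe"-style preprocessing
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype="float32")

def caffe_style_preprocess(rgb_image):
    # convert RGB -> BGR, then subtract the per-channel mean
    bgr = rgb_image[..., ::-1].astype("float32")
    return bgr - BGR_MEANS

# synthetic mid-gray RGB image standing in for a real photo
rgb = np.full((224, 224, 3), 128, dtype="uint8")

# add the batch dimension the model expects
batch = np.expand_dims(caffe_style_preprocess(rgb), axis=0)
print(batch.shape)  # (1, 224, 224, 3)
```

The leading 1 is the batch dimension: Keras models expect a batch of images even when classifying a single one.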
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
args = vars(ap.parse_args())
# load image from disk and make a clone for annotation
print("[INFO] loading image...")
image = cv2.imread(args["image"])
output = image.copy()
# preprocess the input image
output = imutils.resize(output, width=400)
preprocessedImage = preprocess_image(image)
# load the pre-trained ResNet50 model
print("[INFO] loading pre-trained ResNet50 model...")
model = ResNet50(weights="imagenet")
# make predictions on the input image and parse the top-3 predictions
print("[INFO] making predictions...")
predictions = model.predict(preprocessedImage)
predictions = decode_predictions(predictions, top=3)[0]
# loop over the top three predictions
for (i, (imagenetID, label, prob)) in enumerate(predictions):
    # print the ImageNet class label ID of the top prediction to our
    # terminal (we'll need this label for our next script which will
    # perform the actual adversarial attack)
    if i == 0:
        print("[INFO] {} => {}".format(label, get_class_idx(label)))
    # display the prediction to our screen
    print("[INFO] {}. {}: {:.2f}%".format(i + 1, label, prob * 100))
# draw the top-most predicted label on the image along with the
# confidence score
text = "{}: {:.2f}%".format(predictions[0][1],
    predictions[0][2] * 100)
cv2.putText(output, text, (3, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.8,
    (0, 255, 0), 2)
# show the output image and wait for a key press so the window
# stays open
cv2.imshow("Output", output)
cv2.waitKey(0)
$ python predict_normal.py --image pig.jpg
[INFO] loading image...
[INFO] loading pre-trained ResNet50 model...
[INFO] making predictions...
[INFO] hog => 341
[INFO] 1. hog: 99.97%
[INFO] 2. wild_boar: 0.03%
[INFO] 3. piggy_bank: 0.00%
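The ranked list in this output comes from picking the highest-probability classes. As a rough illustration of that ranking step (not the decode_predictions implementation), the sketch below sorts a hypothetical four-class probability vector and prints the top three. The top_k helper and the probability values are made up for the example, chosen only so that they round to the percentages shown above.

```python
def top_k(probs, labels, k=3):
    # pair each label with its probability, highest probability first
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# hypothetical four-class output (illustrative values, not real model output)
labels = ["wombat", "piggy_bank", "hog", "wild_boar"]
probs = [0.00001, 0.00002, 0.9997, 0.00027]

for rank, (label, prob) in enumerate(top_k(probs, labels), start=1):
    print("{}. {}: {:.2f}%".format(rank, label, prob * 100))
```

A probability of 0.00% in the printed list therefore does not mean exactly zero, only that it rounds to zero at two decimal places.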
# import necessary packages
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.applications.resnet50 import decode_predictions
from tensorflow.keras.applications.resnet50 import preprocess_input
import tensorflow as tf
import numpy as np
import argparse
import cv2

def preprocess_image(image):
    # swap color channels, resize the input image, and add a batch
    # dimension
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))
    image = np.expand_dims(image, axis=0)
    # return the preprocessed image
    return image

def clip_eps(tensor, eps):
    # clip the values of the tensor to a given range and return it
    return tf.clip_by_value(tensor, clip_value_min=-eps,
        clip_value_max=eps)

def generate_adversaries(model, baseImage, delta, classIdx, steps=50):
    # iterate over the number of steps
    for step in range(0, steps):
        # record our gradients
        with tf.GradientTape() as tape:
            # explicitly indicate that our perturbation vector should
            # be tracked for gradient updates
            tape.watch(delta)
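The generate_adversaries function takes the following parameters:
model: The ResNet50 model (you can swap in another pre-trained model, such as VGG16 or MobileNet, if you prefer).
baseImage: The original, unperturbed input image. We deliberately construct an adversarial attack against this image so that model misclassifies it.
delta: The noise vector that is added to baseImage and ultimately causes the misclassification. We update this vector via gradient descent.
classIdx: The integer class label index obtained from the predict_normal.py script.
steps: The number of gradient descent steps to perform (50 by default).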
            # add our perturbation vector to the base image and
            # preprocess the resulting image
            adversary = preprocess_input(baseImage + delta)
            # run this newly constructed image tensor through our
            # model and calculate the loss with respect to the
            # *original* class index
            predictions = model(adversary, training=False)
            loss = -sccLoss(tf.convert_to_tensor([classIdx]),
                predictions)
            # check to see if we are logging the loss value, and if
            # so, display it to our terminal
            if step % 5 == 0:
                print("step: {}, loss: {}...".format(step,
                    loss.numpy()))
        # calculate the gradients of loss with respect to the
        # perturbation vector
        gradients = tape.gradient(loss, delta)
        # update the weights, clip the perturbation vector, and
        # update its value (note: assign_add adds the clipped
        # perturbation to delta rather than replacing delta with it)
        optimizer.apply_gradients([(gradients, delta)])
        delta.assign_add(clip_eps(delta, eps=EPS))
    # return the perturbation vector
    return delta
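The call to model(adversary, training=False) runs the newly constructed adversarial image through the model passed in as the model parameter. The loss is then computed with respect to the original classIdx (the top-1 ImageNet class label index obtained by running predict_normal.py). The loss is negated because the optimizer minimizes its objective, so minimizing the negated loss pushes the prediction away from the original class. The if step % 5 == 0 check prints the loss value every five steps.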
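The idea behind this loop can be tried on a toy problem without TensorFlow. The sketch below is a deliberately simplified stand-in, not the script's method: it uses a hypothetical 2-class linear softmax model, a hand-derived gradient, plain gradient ascent on the cross-entropy loss instead of Adam on the negated loss, and it replaces delta with its clipped value at each step. All names and numbers are illustrative.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D vector of logits
    e = np.exp(z - z.max())
    return e / e.sum()

# hypothetical 2-class linear "model": logits = W @ x
W = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
x = np.array([1.0, 0.0])   # base input, classified as class 0
true_idx = 0               # the class we want the model to stop predicting
eps, lr = 1.0, 0.1         # toy perturbation bound and step size
delta = np.zeros_like(x)

for step in range(50):
    p = softmax(W @ (x + delta))
    # gradient of the cross-entropy loss with respect to the input
    onehot = np.eye(2)[true_idx]
    grad = W.T @ (p - onehot)
    # ascend the loss for the true class, then keep delta in [-eps, eps]
    delta = np.clip(delta + lr * grad, -eps, eps)

print(np.argmax(W @ x), np.argmax(W @ (x + delta)))  # 0 1
```

Each step nudges the input in the direction that most increases the loss for the original class, which is the same mechanism the GradientTape loop relies on, only with the gradient written out by hand.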
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", required=True,
    help="path to original input image")
ap.add_argument("-o", "--output", required=True,
    help="path to output adversarial image")
ap.add_argument("-c", "--class-idx", type=int, required=True,
    help="ImageNet class ID of the predicted label")
args = vars(ap.parse_args())
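--input: The path on disk to the input image (for example, pig.jpg).
--output: The path of the output adversarial image produced by the attack (for example, adversarial.png).
--class-idx: The integer class label index in the ImageNet dataset. We can obtain it by running predict_normal.py, as described in the "Non-adversarial image classification results" section.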
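One detail worth noting is how the --class-idx flag is read back: argparse converts the hyphen in a long option name to an underscore when it derives the dictionary key. A minimal sketch, parsing a hard-coded argument list rather than the real command line:

```python
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-c", "--class-idx", type=int, required=True,
    help="ImageNet class ID of the predicted label")

# parse a hard-coded argument list instead of sys.argv for demonstration
args = vars(ap.parse_args(["--class-idx", "341"]))
print(args)  # {'class_idx': 341}
```

This is why the value is accessed as args["class_idx"], and because of type=int it arrives as an integer rather than a string.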
# define the epsilon and learning rate constants
EPS = 2 / 255.0
LR = 0.1
# load the input image from disk and preprocess it
print("[INFO] loading image...")
image = cv2.imread(args["input"])
image = preprocess_image(image)
# load the pre-trained ResNet50 model for running inference
print("[INFO] loading pre-trained ResNet50 model...")
model = ResNet50(weights="imagenet")
# initialize optimizer and loss function
optimizer = Adam(learning_rate=LR)
sccLoss = SparseCategoricalCrossentropy()
# create a tensor based off the input image and initialize the
# perturbation vector (we will update this vector via training)
baseImage = tf.constant(image, dtype=tf.float32)
delta = tf.Variable(tf.zeros_like(baseImage), trainable=True)
# generate the perturbation vector to create an adversarial example
print("[INFO] generating perturbation...")
deltaUpdated = generate_adversaries(model, baseImage, delta,
    args["class_idx"])
# create the adversarial example, swap color channels, and save the
# output image to disk
print("[INFO] creating adversarial example...")
adverImage = (baseImage + deltaUpdated).numpy().squeeze()
adverImage = np.clip(adverImage, 0, 255).astype("uint8")
adverImage = cv2.cvtColor(adverImage, cv2.COLOR_RGB2BGR)
cv2.imwrite(args["output"], adverImage)
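The last lines of this block clip any values that fall outside the range [0, 255]; convert the image to an unsigned 8-bit integer type (so that OpenCV can operate on it); and swap the channel ordering from RGB to BGR.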
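These three conversions are easy to check on a tiny array. The sketch below uses a synthetic 1×2 "image" with made-up values, and reverses the last axis with NumPy slicing as a stand-in for cv2.cvtColor with COLOR_RGB2BGR, which for a 3-channel image amounts to the same channel reversal.

```python
import numpy as np

# synthetic 1x2 RGB "image" with some out-of-range float values
pixels = np.array([[[-3.2, 127.6, 260.0],
                    [10.0, 20.0, 30.0]]], dtype="float32")

clipped = np.clip(pixels, 0, 255)     # limit values to [0, 255]
as_uint8 = clipped.astype("uint8")    # the cast drops the fractional part
bgr = as_uint8[..., ::-1]             # reverse channel order: RGB -> BGR

print(as_uint8[0, 0].tolist())  # [0, 127, 255]
print(bgr[0, 1].tolist())       # [30, 20, 10]
```

Clipping before the cast matters because converting out-of-range floats directly to an 8-bit type does not saturate; the result is not something to rely on.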
# run inference with this adversarial example, parse the results,
# and display the top-1 predicted result
print("[INFO] running inference on the adversarial example...")
preprocessedImage = preprocess_input(baseImage + deltaUpdated)
predictions = model.predict(preprocessedImage)
predictions = decode_predictions(predictions, top=3)[0]
label = predictions[0][1]
confidence = predictions[0][2] * 100
print("[INFO] label: {} confidence: {:.2f}%".format(label,
    confidence))
# draw the top-most predicted label on the adversarial image along
# with the confidence score
text = "{}: {:.2f}%".format(label, confidence)
cv2.putText(adverImage, text, (3, 20), cv2.FONT_HERSHEY_SIMPLEX, 0.5,
    (0, 255, 0), 2)
# show the output image and wait for a key press so the window
# stays open
cv2.imshow("Output", adverImage)
cv2.waitKey(0)
$ python generate_basic_adversary.py --input pig.jpg --output adversarial.png --class-idx 341
[INFO] loading image...
[INFO] loading pre-trained ResNet50 model...
[INFO] generating perturbation...
step: 0, loss: -0.0004124982515349984...
step: 5, loss: -0.0010656398953869939...
step: 10, loss: -0.005332294851541519...
step: 15, loss: -0.06327803432941437...
step: 20, loss: -0.7707189321517944...
step: 25, loss: -3.4659299850463867...
step: 30, loss: -7.515471935272217...
step: 35, loss: -13.503922462463379...
step: 40, loss: -16.118188858032227...
step: 45, loss: -16.118192672729492...
[INFO] creating adversarial example...
[INFO] running inference on the adversarial example...
[INFO] label: wombat confidence: 100.00%
The input image is now misclassified. Yet to the human eye, the perturbed image still looks the same as the original.
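One way to put a number on "looks the same" is to measure the largest per-pixel change between the two images. The sketch below is only an illustration on synthetic data: max_pixel_change is a hypothetical helper, and the random arrays stand in for the original and adversarial images, which in practice would need to be loaded and resized to the same 224×224 shape before comparing.

```python
import numpy as np

def max_pixel_change(original, adversarial):
    # largest absolute per-pixel difference (the L-infinity distance);
    # cast to a signed type first so the subtraction cannot wrap around
    diff = original.astype("int16") - adversarial.astype("int16")
    return int(np.abs(diff).max())

# synthetic stand-ins for the original and adversarial uint8 images
rng = np.random.default_rng(0)
original = rng.integers(10, 245, size=(224, 224, 3), dtype=np.uint8)
noise = rng.integers(-2, 3, size=original.shape)  # values in [-2, 2]
adversarial = (original.astype("int16") + noise).astype("uint8")

print(max_pixel_change(original, adversarial))
```

A maximum change of a few intensity levels out of 255 is far below what most viewers would notice, which is consistent with the two images appearing identical.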
