Face Detection with OpenCV and Deep Learning

2021-02-14 CVer

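I was a little lazy today and did not get around to compiling the latest papers, so please bear with me. Instead, here is a small demo I put together a while ago; it is a modest effort.

This article uses OpenCV 3.3.1 or later (e.g. OpenCV 3.4), the DNN module, and the face_detector sample to implement simple, real-time face detection.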


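Notes:

[1]: This article is based mainly on the English tutorial "Face detection with OpenCV and deep learning", with some modifications.

[2]: In my own tests, OpenCV 3.3.0 and earlier neither ship the face_detector sample nor support it. To avoid trouble, use OpenCV 3.3.1 or later (e.g. OpenCV 3.4).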

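1 Introduction to face_detector

Link to the face_detector sample:

https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector

Once OpenCV 3.3.1 or later is set up on your machine, the face_detector sample folder can also be found under opencv\samples\dnn, as shown in the figure below:

Using OpenCV's DNN module with a Caffe model requires two kinds of files: .prototxt and .caffemodel. The face_detector folder, however, contains only .prototxt files; the trained .caffemodel is missing. The two files play the following roles: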

The .prototxt file(s) which define the model architecture (i.e., the layers themselves)

The .caffemodel file which contains the weights for the actual layers

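Contents of the face_detector folder:

This tutorial uses the pre-trained .caffemodel directly for face detection, so only two files are needed: the .caffemodel and deploy.prototxt. To train the network on your own dataset, see "how_to_train_face_detector.txt".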
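Since the sample folder ships without the weights, it can help to confirm that both files are in place before trying to load the network. The following is a minimal sketch using only the Python standard library; the helper name missing_model_files and the "face_detector" directory layout are assumptions made for illustration, while the two file names are the ones used in this tutorial.

```python
from pathlib import Path

# File names as used in this tutorial; the directory layout is an assumption.
PROTOTXT = "deploy.prototxt"
CAFFEMODEL = "res10_300x300_ssd_iter_140000.caffemodel"


def missing_model_files(model_dir):
    """Return the names of required model files that are absent from model_dir."""
    model_dir = Path(model_dir)
    return [name for name in (PROTOTXT, CAFFEMODEL)
            if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Hypothetical location: a "face_detector" folder next to the script.
    missing = missing_model_files("face_detector")
    if missing:
        print("Missing model files:", ", ".join(missing))
    else:
        print("Both model files found.")
```

Checking up front gives a clearer message than a failed network load, which is the usual symptom when the .caffemodel has not been downloaded yet.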

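2 ResNet-10 and SSD in brief

This is a hands-on tutorial, so the algorithms are not covered in depth. Readers interested in ResNet and SSD can start from the links below: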

[1]ResNet paper：https://arxiv.org/abs/1512.03385

[2]ResNet in Caffe：https://github.com/soeaver/caffe-model/tree/master/cls/resnet

[3]SSD paper：https://arxiv.org/abs/1512.02325

[4]SSD in Caffe：https://github.com/weiliu89/caffe/tree/ssd

3 Downloading the .caffemodel

Download link for res10_300x300_ssd_iter_140000.caffemodel：https://anonfile.com/W7rdG4d0b1/face_detector.rar

4 C++ code

4.1 Face detection in images

For OpenCV 3.4, you can use resnet_ssd_face.cpp from the opencv-3.4.1\samples\dnn folder directly;

for OpenCV 3.3.1, refer to the code below:

face_detector_image.cpp

#include <iostream>
#include <sstream>
#include <vector>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

using namespace std;
using namespace cv;
using namespace cv::dnn;

// Network input size, scale factor, and per-channel (BGR) mean used by the model
const size_t inWidth = 300;
const size_t inHeight = 300;
const double inScaleFactor = 1.0;
const Scalar meanVal(104.0, 177.0, 123.0);

int main(int argc, char** argv)
{
    Mat img;

#if 0
    // Read the image path from the command line
    if (argc < 2)
    {
        cerr << "Please specify an input image." << endl;
        cerr << "[Format] face_detector_img.exe image.jpg" << endl;
        return -1;
    }
    img = imread(argv[1]);
#else
    // Use a hard-coded test image
    img = imread("iron_chic.jpg");
#endif

    if (img.empty())
    {
        cerr << "Can't read the input image." << endl;
        return -1;
    }

    // Minimum confidence for a detection to be kept
    float min_confidence = 0.5;

    String modelConfiguration = "face_detector/deploy.prototxt";
    String modelBinary = "face_detector/res10_300x300_ssd_iter_140000.caffemodel";

    // Load the Caffe model
    dnn::Net net = readNetFromCaffe(modelConfiguration, modelBinary);

    if (net.empty())
    {
        cerr << "Can't load network by using the following files: " << endl;
        cerr << "prototxt: " << modelConfiguration << endl;
        cerr << "caffemodel: " << modelBinary << endl;
        cerr << "Models are available here:" << endl;
        cerr << "<OPENCV_SRC_DIR>/samples/dnn/face_detector" << endl;
        cerr << "or here:" << endl;
        cerr << "https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector" << endl;
        return -1;
    }

    // Convert the image to a 4D blob, feed it to the network, and run a forward pass
    Mat inputBlob = blobFromImage(img, inScaleFactor, Size(inWidth, inHeight), meanVal, false, false);
    net.setInput(inputBlob, "data");
    Mat detection = net.forward("detection_out");

    // Inference time in milliseconds
    vector<double> layersTimings;
    double freq = getTickFrequency() / 1000;
    double time = net.getPerfProfile(layersTimings) / freq;

    // View the output as an N x 7 matrix, one detection per row
    Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());

    ostringstream ss;
    ss << "FPS: " << 1000 / time << " ; time: " << time << " ms";
    putText(img, ss.str(), Point(20, 20), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 255));

    float confidenceThreshold = min_confidence;
    for (int i = 0; i < detectionMat.rows; ++i)
    {
        // Column 2 holds the confidence; columns 3-6 hold the normalized box corners
        float confidence = detectionMat.at<float>(i, 2);

        if (confidence > confidenceThreshold)
        {
            int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * img.cols);
            int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * img.rows);
            int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * img.cols);
            int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * img.rows);

            Rect object(xLeftBottom, yLeftBottom,
                        xRightTop - xLeftBottom,
                        yRightTop - yLeftBottom);
            rectangle(img, object, Scalar(0, 255, 0));

            // Draw the confidence label on a white background above the box
            ss.str("");
            ss << confidence;
            String conf(ss.str());
            String label = "Face: " + conf;
            int baseLine = 0;
            Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
            // FILLED is the C++ equivalent of the legacy CV_FILLED macro
            rectangle(img, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),
                                Size(labelSize.width, labelSize.height + baseLine)),
                      Scalar(255, 255, 255), FILLED);
            putText(img, label, Point(xLeftBottom, yLeftBottom),
                    FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
        }
    }

    namedWindow("Face Detection", WINDOW_NORMAL);
    imshow("Face Detection", img);
    waitKey(0);

    return 0;
}

Detection result
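The loop in the C++ listing treats the network output as an N x 7 matrix: column 2 is the confidence and columns 3 to 6 are the box corners, normalized to the range [0, 1]. The sketch below reproduces only that decoding step in NumPy so it can be tried without a model; decode_detections is a hypothetical helper name, and the (1, 1, N, 7) layout with [image_id, class_id, confidence, x1, y1, x2, y2] rows is assumed from how the listings index the output.

```python
import numpy as np


def decode_detections(detections, width, height, threshold=0.5):
    """Convert raw detector output of shape (1, 1, N, 7) to pixel boxes.

    Each row is assumed to be [image_id, class_id, confidence, x1, y1, x2, y2]
    with the coordinates normalized to [0, 1].
    """
    rows = np.asarray(detections, dtype=np.float32).reshape(-1, 7)
    scale = np.array([width, height, width, height], dtype=np.float32)
    results = []
    for row in rows:
        confidence = float(row[2])
        if confidence > threshold:
            # Scale normalized corners to pixel coordinates and truncate to int
            x1, y1, x2, y2 = (row[3:7] * scale).astype(int)
            results.append((confidence, (int(x1), int(y1), int(x2), int(y2))))
    return results


if __name__ == "__main__":
    # Synthetic output: one confident detection and one weak one
    fake = np.array([[[[0, 1, 0.75, 0.25, 0.25, 0.75, 0.5],
                       [0, 1, 0.25, 0.0, 0.0, 1.0, 1.0]]]], dtype=np.float32)
    print(decode_detections(fake, width=400, height=200))
```

Only the first synthetic row passes the default 0.5 threshold, which mirrors the min_confidence check in the listing.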

4.2 Face detection from a camera or video

face_detector_video.cpp

#include <iostream>
#include <sstream>
#include <vector>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

using namespace cv;
using namespace cv::dnn;
using namespace std;

// Network input size, scale factor, and per-channel (BGR) mean used by the model
const size_t inWidth = 300;
const size_t inHeight = 300;
const double inScaleFactor = 1.0;
const Scalar meanVal(104.0, 177.0, 123.0);

int main(int argc, char** argv)
{
    float min_confidence = 0.5;
    String modelConfiguration = "face_detector/deploy.prototxt";
    String modelBinary = "face_detector/res10_300x300_ssd_iter_140000.caffemodel";

    // Load the Caffe model
    dnn::Net net = readNetFromCaffe(modelConfiguration, modelBinary);

    if (net.empty())
    {
        cerr << "Can't load network by using the following files: " << endl;
        cerr << "prototxt: " << modelConfiguration << endl;
        cerr << "caffemodel: " << modelBinary << endl;
        cerr << "Models are available here:" << endl;
        cerr << "<OPENCV_SRC_DIR>/samples/dnn/face_detector" << endl;
        cerr << "or here:" << endl;
        cerr << "https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector" << endl;
        return -1;
    }

    // Open the default camera
    VideoCapture cap(0);
    if (!cap.isOpened())
    {
        cerr << "Couldn't open camera." << endl;
        return -1;
    }

    for (;;)
    {
        Mat frame;
        cap >> frame; // grab the next frame from the camera/video

        if (frame.empty())
        {
            waitKey();
            break;
        }

        if (frame.channels() == 4)
            cvtColor(frame, frame, COLOR_BGRA2BGR);

        // Convert the frame to a 4D blob
        Mat inputBlob = blobFromImage(frame, inScaleFactor,
                                      Size(inWidth, inHeight), meanVal, false, false);

        // Set the network input
        net.setInput(inputBlob, "data");

        // Run a forward pass and fetch the detections
        Mat detection = net.forward("detection_out");

        // Inference time in milliseconds
        vector<double> layersTimings;
        double freq = getTickFrequency() / 1000;
        double time = net.getPerfProfile(layersTimings) / freq;

        // View the output as an N x 7 matrix, one detection per row
        Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());

        ostringstream ss;
        ss << "FPS: " << 1000 / time << " ; time: " << time << " ms";
        putText(frame, ss.str(), Point(20, 20), FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 255));

        float confidenceThreshold = min_confidence;
        for (int i = 0; i < detectionMat.rows; i++)
        {
            float confidence = detectionMat.at<float>(i, 2);

            if (confidence > confidenceThreshold)
            {
                int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * frame.cols);
                int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * frame.rows);
                int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * frame.cols);
                int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * frame.rows);

                Rect object(xLeftBottom, yLeftBottom,
                            xRightTop - xLeftBottom,
                            yRightTop - yLeftBottom);

                rectangle(frame, object, Scalar(0, 255, 0));

                // Draw the confidence label on a white background above the box
                ss.str("");
                ss << confidence;
                String conf(ss.str());
                String label = "Face: " + conf;
                int baseLine = 0;
                Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
                // FILLED is the C++ equivalent of the legacy CV_FILLED macro
                rectangle(frame, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),
                                      Size(labelSize.width, labelSize.height + baseLine)),
                          Scalar(255, 255, 255), FILLED);
                putText(frame, label, Point(xLeftBottom, yLeftBottom),
                        FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
            }
        }

        imshow("detections", frame);
        if (waitKey(1) >= 0) break; // any key press exits
    }
    return 0;
}

Detection result
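Both C++ listings overlay an "FPS / time" string computed from net.getPerfProfile and getTickFrequency. The arithmetic is easy to get wrong by a factor of 1000, so here it is isolated as a small sketch. inference_stats is a hypothetical helper; it assumes you pass in the tick count reported by getPerfProfile (in the Python bindings this is, as far as I know, the first element of the returned tuple) and the tick frequency in ticks per second.

```python
def inference_stats(ticks, tick_frequency):
    """Return (elapsed_ms, fps) for one forward pass.

    ticks: tick count reported for the forward pass
    tick_frequency: number of ticks per second
    """
    ticks_per_ms = tick_frequency / 1000.0
    elapsed_ms = ticks / ticks_per_ms
    # Guard against a zero reading so the division cannot fail
    fps = 1000.0 / elapsed_ms if elapsed_ms > 0 else float("inf")
    return elapsed_ms, fps


if __name__ == "__main__":
    # Example numbers only: 50 million ticks at 1 GHz is 50 ms per frame
    ms, fps = inference_stats(50_000_000, 1_000_000_000)
    print("FPS: {:.1f} ; time: {:.1f} ms".format(fps, ms))
```

Note that this figure covers only the network's forward pass, not frame capture or drawing, so the displayed frame rate of the whole loop will usually be lower.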

5 Python code

The easiest way to install OpenCV for Python

For OpenCV 3.4, you can use resnet_ssd_face_python.py from the opencv-3.4.1\samples\dnn folder directly;

for OpenCV 3.3.1, refer to the code below (my own version):

5.1 Face detection in images

detect_faces.py

# Import the required packages
import numpy as np
import argparse
import cv2

# Parse the command-line arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# Load the serialized Caffe model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# Load the input image, resize it to 300x300, and build a blob
# with the per-channel mean subtracted
image = cv2.imread(args["image"])
(h, w) = image.shape[:2]
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
    (300, 300), (104.0, 177.0, 123.0))

# Pass the blob through the network to obtain the detections
print("[INFO] computing object detections...")
net.setInput(blob)
detections = net.forward()

# Loop over the detections
for i in range(0, detections.shape[2]):
    # Confidence (probability) associated with this detection
    confidence = detections[0, 0, i, 2]

    # Keep only detections above the minimum confidence
    if confidence > args["confidence"]:
        # Scale the normalized box back to the original image size
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # Draw the bounding box and the confidence; place the label
        # below the top edge if there is no room above it
        text = "{:.2f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(image, (startX, startY), (endX, endY),
            (0, 0, 255), 2)
        cv2.putText(image, text, (startX, y),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)

# Show the output image
cv2.imshow("Output", image)
cv2.waitKey(0)

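Open a command prompt (cmd), change to the directory containing the script, and run a command of the following form, substituting your own image and model paths:

python detect_faces.py --image iron_chic.jpg --prototxt deploy.prototxt --model res10_300x300_ssd_iter_140000.caffemodel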

Detection result
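The blobFromImage call in the script does the preprocessing in one step: it subtracts the per-channel mean, applies the scale factor, and rearranges the image into the batch-first layout the network expects. As a rough illustration of those steps, here is a NumPy-only sketch. make_blob is a hypothetical stand-in, not OpenCV's implementation; it assumes the image has already been resized to the network input size and does not cover resizing, cropping, or channel swapping.

```python
import numpy as np

MEAN_BGR = (104.0, 177.0, 123.0)  # mean values used throughout this tutorial


def make_blob(image_bgr, scale=1.0, mean=MEAN_BGR):
    """Build a (1, 3, H, W) float32 blob from an H x W x 3 BGR image.

    Assumes the image is already at the network input size (300x300 here).
    """
    pixels = image_bgr.astype(np.float32)
    pixels -= np.array(mean, dtype=np.float32)  # subtract the per-channel mean
    pixels *= scale                             # then apply the scale factor
    chw = pixels.transpose(2, 0, 1)             # HWC -> CHW
    return chw[np.newaxis, ...]                 # add the batch dimension


if __name__ == "__main__":
    dummy = np.zeros((300, 300, 3), dtype=np.uint8)  # black test image
    blob = make_blob(dummy)
    print(blob.shape, blob[0, :, 0, 0])
```

For a black image every channel ends up at the negative of its mean, which is a quick way to confirm the channel order matches the mean tuple.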

5.2 Face detection from a camera or video

detect_faces_video.py

# Import the required packages
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2

# Parse the command-line arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# Load the serialized Caffe model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# Start the video stream and give the camera sensor time to warm up
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)

# Loop over the frames of the video stream
while True:
    # Grab a frame and resize it to a maximum width of 400 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=400)

    # Get the frame dimensions and convert the frame to a blob
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
        (300, 300), (104.0, 177.0, 123.0))

    # Pass the blob through the network to obtain the detections
    net.setInput(blob)
    detections = net.forward()

    # Loop over the detections
    for i in range(0, detections.shape[2]):
        # Confidence (probability) associated with this detection
        confidence = detections[0, 0, i, 2]

        # Skip detections below the minimum confidence
        if confidence < args["confidence"]:
            continue

        # Scale the normalized box back to the frame size
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # Draw the bounding box and the confidence
        text = "{:.2f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(frame, (startX, startY), (endX, endY),
            (0, 0, 255), 2)
        cv2.putText(frame, text, (startX, y),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)

    # Show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # Press "q" to quit
    if key == ord("q"):
        break

# Clean up
cv2.destroyAllWindows()
vs.stop()

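Open a command prompt (cmd), change to the directory containing the script, and run a command of the following form, substituting your own model paths:

python detect_faces_video.py --prototxt deploy.prototxt --model res10_300x300_ssd_iter_140000.caffemodel

If the program fails with an error such as "ImportError: No module named imutils.video", the imutils package is not installed in the current Python environment. Install it with pip:

pip install imutils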

Detection result
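One practical detail the scripts above leave out: the scaled box coordinates are used as they come, and a detector of this kind may report corners slightly outside the frame when a face is near the edge. Drawing tolerates that, but cropping the face region with array slicing does not. Below is a minimal sketch of clamping a box to the frame; clamp_box is a hypothetical helper, not part of OpenCV or imutils.

```python
def clamp_box(box, width, height):
    """Clip (startX, startY, endX, endY) to the valid pixel range of the frame."""
    start_x, start_y, end_x, end_y = (int(v) for v in box)
    start_x = max(0, min(start_x, width - 1))
    end_x = max(0, min(end_x, width - 1))
    start_y = max(0, min(start_y, height - 1))
    end_y = max(0, min(end_y, height - 1))
    return start_x, start_y, end_x, end_y


if __name__ == "__main__":
    # A box that spills over the left, right, and bottom edges of a 400x300 frame
    print(clamp_box((-5, 10, 450, 320), width=400, height=300))
```

In the video script this would be applied to (startX, startY, endX, endY) right after box.astype("int"), before any cropping of the frame.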

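Summary

This tutorial introduced and used the face detector recently added to OpenCV, which is more accurate than OpenCV's Haar cascades.

This OpenCV face detector is based on deep learning; specifically, it uses the SSD framework with a ResNet base network.

Thanks to the work of Aleksandr Rybnikov, the OpenCV dnn module, Adrian Rosebrock, and other contributors, we can use these more accurate OpenCV face detectors in our own applications.

For your convenience, the files needed for this tutorial are available below.

Code download

deep-learning-face-detection.rar：https://anonfile.com/nft4G4d5b1/deep-learning-face-detection.rar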

Reference

[1]Face detection with OpenCV and deep learning：https://www.pyimagesearch.com/2018/02/26/face-detection-with-opencv-and-deep-learning/

[2]face_detector：https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector

[3]OpenCV 3.4 released: dnnFace makes a stunning debut：http://blog.csdn.net/minstyrain/article/details/78907425
