Face Recognition: OpenCV vs. Dlib

小马同学
Face recognition
2021-09-03

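Dr. Vikas Gupta, a blogger at Learn OpenCV, compared four face detection implementations available in OpenCV and Dlib and quantified both their accuracy and their speed. This post summarizes the four methods.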

I. OpenCV

1. OpenCV Haar Cascade Face Detection

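Python:

import cv2

# Load the Haar cascade model and run it on a grayscale frame
faceCascade = cv2.CascadeClassifier('./haarcascade_frontalface_default.xml')
faces = faceCascade.detectMultiScale(frameGray)
for face in faces:
    # Each detection is (x, y, width, height), with (x, y) the top-left corner
    x1, y1, w, h = face
    x2 = x1 + w
    y2 = y1 + h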

C++:

// Load the Haar cascade model
std::string faceCascadePath = "./haarcascade_frontalface_default.xml";
cv::CascadeClassifier faceCascade;
faceCascade.load(faceCascadePath);

// Detect faces in a grayscale frame
std::vector<cv::Rect> faces;
faceCascade.detectMultiScale(frameGray, faces);
for (size_t i = 0; i < faces.size(); i++)
{
    int x1 = faces[i].x;
    int y1 = faces[i].y;
    int x2 = faces[i].x + faces[i].width;
    int y2 = faces[i].y + faces[i].height;
}

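The snippets above load the Haar cascade model file and apply it to a grayscale image. The output is a list of detected faces. Each element of that list is itself a list of 4 values: the (x, y) coordinates of the top-left corner, followed by the width and height of the detected face.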
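To make the output format concrete, here is a minimal sketch that converts (x, y, w, h) rectangles into corner coordinates. The helper name `rects_to_corners` and the sample values are made up for illustration; they are not part of OpenCV.

```python
def rects_to_corners(faces):
    """Convert (x, y, w, h) rectangles to (x1, y1, x2, y2) corner boxes."""
    boxes = []
    for x, y, w, h in faces:
        boxes.append((x, y, x + w, y + h))
    return boxes


# Hypothetical detections, shaped like the rows detectMultiScale returns
sample_faces = [(10, 20, 100, 120), (200, 50, 64, 64)]
corner_boxes = rects_to_corners(sample_faces)
print(corner_boxes)
```

The corner form is convenient because the DNN and Dlib detectors described later also report boxes as two opposite corners.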

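Pros:

1) Works almost in real time on a CPU;

2) Simple architecture;

3) Detects faces at different scales.

Cons:

1) Produces many false positives (non-faces predicted as faces);

2) Does not work on non-frontal face images;

3) Does not handle occlusion.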

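2. OpenCV DNN Face Detection

Available since OpenCV 3.3, this detector is based on the paper "SSD: Single Shot MultiBox Detector" (https://arxiv.org/abs/1512.02325) and uses ResNet-10 as its backbone.

OpenCV provides two models:

1) A 16-bit floating-point version of the original Caffe implementation (5.4 MB);

2) An 8-bit quantized version using TensorFlow (2.7 MB).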

Python:

import cv2

DNN = "TF"
if DNN == "CAFFE":
    # Caffe: config (prototxt) first, then weights
    modelFile = "res10_300x300_ssd_iter_140000_fp16.caffemodel"
    configFile = "deploy.prototxt"
    net = cv2.dnn.readNetFromCaffe(configFile, modelFile)
else:
    # TensorFlow: weights (pb) first, then config
    modelFile = "opencv_face_detector_uint8.pb"
    configFile = "opencv_face_detector.pbtxt"
    net = cv2.dnn.readNetFromTensorflow(modelFile, configFile)

C++:

const std::string caffeConfigFile = "./deploy.prototxt";
const std::string caffeWeightFile = "./res10_300x300_ssd_iter_140000_fp16.caffemodel";
const std::string tensorflowConfigFile = "./opencv_face_detector.pbtxt";
const std::string tensorflowWeightFile = "./opencv_face_detector_uint8.pb";

#ifdef CAFFE
cv::dnn::Net net = cv::dnn::readNetFromCaffe(caffeConfigFile, caffeWeightFile);
#else
cv::dnn::Net net = cv::dnn::readNetFromTensorflow(tensorflowWeightFile, tensorflowConfigFile);
#endif

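The code above loads the required model. To use the floating-point Caffe model, we pass the caffemodel and prototxt files; otherwise, we use the quantized TensorFlow model. Note the difference in how the two networks are read: the Caffe reader takes the config file first and the weights second, while the TensorFlow reader takes the weights first and the config second.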
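The difference in argument order is easy to get wrong, so here is a small sketch that keeps it in one place. The names `MODEL_FILES`, `loader_arguments`, and `load_face_net` are hypothetical helpers, not OpenCV APIs; the file names are the ones used in the snippets above, and `cv2` is imported lazily only so the ordering helper can be tried without OpenCV installed.

```python
MODEL_FILES = {
    # framework: (config file, weights file)
    "CAFFE": ("deploy.prototxt", "res10_300x300_ssd_iter_140000_fp16.caffemodel"),
    "TF": ("opencv_face_detector.pbtxt", "opencv_face_detector_uint8.pb"),
}


def loader_arguments(framework):
    """Return the file paths in the order the matching cv2.dnn reader expects."""
    config, weights = MODEL_FILES[framework]
    if framework == "CAFFE":
        return (config, weights)  # readNetFromCaffe(prototxt, caffeModel)
    return (weights, config)      # readNetFromTensorflow(model, config)


def load_face_net(framework):
    """Load the face detector network for "CAFFE" or "TF" (requires OpenCV)."""
    import cv2  # lazy import: only needed when a network is actually loaded

    args = loader_arguments(framework)
    if framework == "CAFFE":
        return cv2.dnn.readNetFromCaffe(*args)
    return cv2.dnn.readNetFromTensorflow(*args)


print(loader_arguments("CAFFE"))
print(loader_arguments("TF"))
```

Calling `load_face_net("TF")` would then return the network, assuming the model files sit in the working directory.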

Python:

# Convert the frame to a 300x300 blob, subtracting the per-channel mean
blob = cv2.dnn.blobFromImage(frameOpencvDnn, 1.0, (300, 300), [104, 117, 123], False, False)
net.setInput(blob)
detections = net.forward()

bboxes = []
for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > conf_threshold:
        # Coordinates are normalized, so scale them to the frame size
        x1 = int(detections[0, 0, i, 3] * frameWidth)
        y1 = int(detections[0, 0, i, 4] * frameHeight)
        x2 = int(detections[0, 0, i, 5] * frameWidth)
        y2 = int(detections[0, 0, i, 6] * frameHeight)
        bboxes.append([x1, y1, x2, y2])

C++:

#ifdef CAFFE
cv::Mat inputBlob = cv::dnn::blobFromImage(frameOpenCVDNN, inScaleFactor, cv::Size(inWidth, inHeight), meanVal, false, false);
#else
// swapRB is set to true for the TensorFlow model
cv::Mat inputBlob = cv::dnn::blobFromImage(frameOpenCVDNN, inScaleFactor, cv::Size(inWidth, inHeight), meanVal, true, false);
#endif

net.setInput(inputBlob, "data");
cv::Mat detection = net.forward("detection_out");

// View the 4-D output as a 2-D matrix: one row per detection
cv::Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());

for (int i = 0; i < detectionMat.rows; i++)
{
    float confidence = detectionMat.at<float>(i, 2);
    if (confidence > confidenceThreshold)
    {
        // Coordinates are normalized, so scale them to the frame size
        int x1 = static_cast<int>(detectionMat.at<float>(i, 3) * frameWidth);
        int y1 = static_cast<int>(detectionMat.at<float>(i, 4) * frameHeight);
        int x2 = static_cast<int>(detectionMat.at<float>(i, 5) * frameWidth);
        int y2 = static_cast<int>(detectionMat.at<float>(i, 6) * frameHeight);
        cv::rectangle(frameOpenCVDNN, cv::Point(x1, y1), cv::Point(x2, y2), cv::Scalar(0, 255, 0), 2, 4);
    }
}

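In the code above, the image is converted to a blob and passed through the network with the forward() function. The output detections is a 4-D matrix, where:

The third dimension iterates over the detected faces.

The fourth dimension holds the bounding box and score of each face. For example, detections[0,0,0,2] gives the confidence score of the first face, and detections[0,0,0,3:7] (indices 3 through 6) gives its bounding box.

The output coordinates of the bounding box are normalized to [0, 1]. The x coordinates must therefore be multiplied by the width of the original image and the y coordinates by its height to get the correct bounding box on the image.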
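The thresholding and denormalization steps can be sketched without OpenCV on plain lists. The helper `detections_to_boxes`, the sample rows, and the 0.5 default threshold are illustrative assumptions; the row layout [image_id, class_id, confidence, x1, y1, x2, y2] follows the indices used in the snippets above.

```python
def detections_to_boxes(rows, frame_width, frame_height, conf_threshold=0.5):
    """Turn SSD output rows into pixel-space boxes.

    Each row is [image_id, class_id, confidence, x1, y1, x2, y2], with the
    four coordinates normalized to [0, 1].
    """
    boxes = []
    for row in rows:
        confidence = row[2]
        if confidence <= conf_threshold:
            continue  # discard weak detections
        x1 = int(row[3] * frame_width)
        y1 = int(row[4] * frame_height)
        x2 = int(row[5] * frame_width)
        y2 = int(row[6] * frame_height)
        boxes.append((x1, y1, x2, y2, confidence))
    return boxes


# Hypothetical network output for a 640x480 frame
sample_rows = [
    [0, 1, 0.98, 0.25, 0.125, 0.5, 0.75],
    [0, 1, 0.30, 0.625, 0.5, 0.875, 0.875],  # below the threshold, dropped
]
boxes = detections_to_boxes(sample_rows, frame_width=640, frame_height=480)
print(boxes)
```

With a real network, `rows` would correspond to `detections[0, 0]`, one row per candidate face.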

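Pros:

1) The most accurate of the four methods;

2) Runs in real time on a CPU;

3) Works for different face orientations: up, down, left, right, profile, and so on;

4) Works even under heavy occlusion;

5) Detects faces at a wide range of scales.

Cons:

The author sees no major drawback; the only regret is that it cannot use NVIDIA GPUs.

The DNN-based detector overcomes all the drawbacks of the Haar cascade detector without giving up any of the benefits Haar provides. Apart from being slower than the Dlib HoG face detector discussed next, we see no major drawback to this method.

It is safe to say that it is time to bid farewell to the Haar-based face detector; the DNN-based face detector should be the preferred choice in OpenCV.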

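II. Dlib

1. Dlib HoG Face Detection

This is a widely used face detection model based on HoG features and an SVM. The model is built from 5 HoG filters: front looking, left looking, right looking, front looking but rotated left, and front looking but rotated right.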

Python:

import dlib

hogFaceDetector = dlib.get_frontal_face_detector()
# The second argument is the number of times to upsample the image
faceRects = hogFaceDetector(frameDlibHogSmall, 0)
for faceRect in faceRects:
    x1 = faceRect.left()
    y1 = faceRect.top()
    x2 = faceRect.right()
    y2 = faceRect.bottom()

C++:

frontal_face_detector hogFaceDetector = get_frontal_face_detector();
// Convert OpenCV image format to Dlib's image format
cv_image<bgr_pixel> dlibIm(frameDlibHogSmall);
// Detect faces in the image
std::vector<dlib::rectangle> faceRects = hogFaceDetector(dlibIm);
for (size_t i = 0; i < faceRects.size(); i++)
{
    // Coordinates are relative to frameDlibHogSmall; rescale them first
    // if that frame differs in size from frameDlibHog
    int x1 = faceRects[i].left();
    int y1 = faceRects[i].top();
    int x2 = faceRects[i].right();
    int y2 = faceRects[i].bottom();
    cv::rectangle(frameDlibHog, Point(x1, y1), Point(x2, y2), Scalar(0, 255, 0), (int)(frameHeight / 150.0), 4);
}

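In the code above, we first load the face detector and then pass the image through it. The second argument is the number of times the image should be upsampled. The more you upsample, the better the chance of detecting smaller faces; however, upsampling has a substantial impact on computation speed. The output is a list of faces, each given by the (x, y) coordinates of its diagonal corners.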
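The variable names in the snippets suggest that detection runs on a downscaled copy of the frame (frameDlibHogSmall) while the boxes are meant for the full-size frame (frameDlibHog). Under that assumption, the corner coordinates need to be scaled back before drawing. The helper `scale_box` and the frame sizes below are hypothetical and only sketch the idea.

```python
def scale_box(box, small_size, full_size):
    """Map (x1, y1, x2, y2) from a downscaled frame to the full-size frame.

    small_size and full_size are (width, height) tuples.
    """
    scale_x = full_size[0] / small_size[0]
    scale_y = full_size[1] / small_size[1]
    x1, y1, x2, y2 = box
    return (
        int(x1 * scale_x),
        int(y1 * scale_y),
        int(x2 * scale_x),
        int(y2 * scale_y),
    )


# A box found on a 400x300 copy of a 1200x900 frame
full_box = scale_box((40, 30, 100, 90), small_size=(400, 300), full_size=(1200, 900))
print(full_box)
```

Detecting on a smaller frame trades some sensitivity to small faces for speed, which is the opposite lever to the upsampling argument of the detector.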

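Pros:

1) The fastest method on a CPU;

2) Works for frontal and slightly non-frontal faces;

3) A small model compared with the other three;

4) Still works under minor occlusion.

Cons:

1) Cannot detect small faces, because the minimum face size in its training data is 80×80; users can, however, train their own detector on smaller faces;

2) The bounding box often excludes part of the forehead and sometimes part of the chin;

3) Does not work well under heavy occlusion;

4) Does not work for profile faces or extreme non-frontal poses, such as looking down or up.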
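The small-face limitation in drawback 1 interacts with the upsampling argument described for the detector call. As a rough sketch, assuming each upsampling step doubles the image width and height (my understanding of Dlib's default behaviour, not something the source states), the smallest detectable face halves with every step while the number of pixels to scan quadruples. The helper `upsample_tradeoff` is made up for this illustration.

```python
def upsample_tradeoff(upsample_times, trained_min_face=80):
    """Return (smallest face side in original pixels, pixel-count multiplier).

    Assumes every upsampling step doubles both image dimensions.
    """
    scale = 2 ** upsample_times
    return trained_min_face / scale, scale ** 2


for n in range(3):
    min_face, cost = upsample_tradeoff(n)
    print(f"upsample={n}: faces down to about {min_face:.0f} px, {cost}x pixels to scan")
```

This is only arithmetic on the 80×80 figure; actual detection rates and timings depend on the image and hardware.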

2. Dlib CNN Face Detection

The algorithm comes from the paper "Max-Margin Object Detection" (https://arxiv.org/abs/1502.00046).

Python:

import dlib

dnnFaceDetector = dlib.cnn_face_detection_model_v1("./mmod_human_face_detector.dat")
faceRects = dnnFaceDetector(frameDlibHogSmall, 0)
for faceRect in faceRects:
    # Each result wraps the rectangle in a .rect attribute
    x1 = faceRect.rect.left()
    y1 = faceRect.rect.top()
    x2 = faceRect.rect.right()
    y2 = faceRect.rect.bottom()

C++:

std::string mmodModelPath = "./mmod_human_face_detector.dat";
// net_type is the MMOD network definition (see Dlib's dnn_mmod_face_detection example)
net_type mmodFaceDetector;
deserialize(mmodModelPath) >> mmodFaceDetector;

// Convert OpenCV image format to Dlib's image format
cv_image<bgr_pixel> dlibIm(frameDlibMmodSmall);
matrix<rgb_pixel> dlibMatrix;
assign_image(dlibMatrix, dlibIm);

// Detect faces in the image
std::vector<dlib::mmod_rect> faceRects = mmodFaceDetector(dlibMatrix);
for (size_t i = 0; i < faceRects.size(); i++)
{
    // Coordinates are relative to frameDlibMmodSmall; rescale them first
    // if that frame differs in size from frameDlibMmod
    int x1 = faceRects[i].rect.left();
    int y1 = faceRects[i].rect.top();
    int x2 = faceRects[i].rect.right();
    int y2 = faceRects[i].rect.bottom();
    cv::rectangle(frameDlibMmod, Point(x1, y1), Point(x2, y2), Scalar(0, 255, 0), (int)(frameHeight / 150.0), 4);
}

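Pros:

1) Works for different face orientations;

2) Robust to occlusion;

3) Very fast on a GPU;

4) Very simple training process.

Cons:

1) Very slow on a CPU;

2) Cannot detect small faces, because the minimum face size in its training data is 80×80; users can, however, train their own detector on smaller faces;

3) The bounding box is even smaller than that of the Dlib HoG face detector.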

Original article: https://learnopencv.com/face-detection-opencv-dlib-and-deep-learning-c-python/
