[CV] Face Recognition and Detection Datasets

Having worked on face recognition and detection for a while, I am listing here the datasets I have used. These are basically the datasets on which most recent work evaluates its algorithms. There are plenty of summaries of this kind online, but some of them list data that is years old, too small in scale, or behind dead links, and others cover datasets whose test protocols are poorly defined, which makes them less suitable for comparison with other methods.

1. Labeled Faces in the Wild:
For face recognition, or more precisely face verification, UMass's LFW is probably the most heavily used dataset these days, and its test protocol has been adopted by several newer datasets. Face verification means: given two face photos, the algorithm has to decide whether they come from the same person. The latest results (ICCV 2013) reach an accuracy as high as 96% under the least restricted protocol. [Shameless plug ^_^] Under the most restricted protocol, our CVPR 2013 result used to be the best; it was recently surpassed by Fisher Vectors... We will be back...

2. YouTube Faces DB:
YouTube Video Faces is also used for face verification. Unlike LFW, on this dataset the algorithm has to decide whether two video clips show the same person. Quite a few methods that work well on photos are not necessarily effective or efficient on video. [Shameless plug ^_^] Our latest result on this dataset is above 81%, and I have not seen a higher reported accuracy so far.

3. FDDB:
FDDB is also from UMass and is used for face detection. The dataset is fairly large and fairly challenging, and the authors provide a program for evaluating detection results, so comparisons between algorithms on this data are relatively fair. One issue with FDDB is that its annotations are ellipses rather than the usual rectangles, which may make the evaluation of detection results inaccurate; since the annotation standard is at least consistent, though, this is not a big problem (a rough sketch of converting an ellipse to a bounding rectangle is given below). [Shameless plug ^_^] Our ICCV 2013 paper reports good results on this data.
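For reference, here is a minimal sketch (not an official FDDB tool) of how a ground-truth ellipse in the FDDB annotation format (major_axis_radius, minor_axis_radius, angle in radians, center_x, center_y) could be mapped to its axis-aligned bounding rectangle. The helper name is mine, and I assume the angle is measured from the horizontal axis:

#include <opencv2/core/core.hpp>
#include <cmath>

// Axis-aligned bounding rectangle of a rotated ellipse with semi-axes a >= b
// and major axis at `angle` radians from the horizontal.
cv::Rect ellipse_to_rect(double a, double b, double angle, double cx, double cy) {
    const double c = std::cos(angle), s = std::sin(angle);
    const double half_w = std::sqrt(a * a * c * c + b * b * s * s);
    const double half_h = std::sqrt(a * a * s * s + b * b * c * c);
    return cv::Rect(cvRound(cx - half_w), cvRound(cy - half_h),
                    cvRound(2 * half_w), cvRound(2 * half_h));
}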

4. The Gallagher Collection Person Dataset:
This is another face detection dataset, collected from Andrew Gallagher's family photo albums. Although it was not built specifically for this purpose, it is very close to real application scenarios and is well suited for testing your own method.

5. The Annotated Faces in the Wild (AFW) testset:
This is yet another face detection dataset, released along with the CVPR 2012 paper by Xiangxin Zhu of UCI. It is worth noting that their homepage also provides public source code. Although face detection has been studied for a long time, detectors that both work well and can be conveniently obtained online are, apart from OpenCV, not that common.

6. CMU Dataset:
A face detection dataset with quite some history. Although it is rarely used these days, that does not mean this old dataset is easy to deal with: the latest detection algorithms often need to extract fairly complex features at a fairly dense sampling, which is not necessarily feasible on these grayscale, low-resolution images.

7. POS Labeled Faces in the Wild:
I have not used this one yet. It is a recently released, larger-scale counterpart to LFW and can be used for face recognition. It looks quite promising.

These are the datasets I have found most useful.
I was not sure what to write about at first; on reflection, writing about things related to my own field seemed like the more interesting choice.

[OpenCV] detectMultiScale: output detection score

OpenCV provides a quite decent implementation of the Viola-Jones face detector.

A quick example looks like this (tested with OpenCV 2.4.5):

// File: main.cc
#include <opencv2/opencv.hpp>
#include <vector>

using namespace cv;

int main(int argc, char **argv) {

    CascadeClassifier cascade;
    const float scale_factor(1.2f);
    const int min_neighbors(3);

    if (cascade.load("./lbpcascade_frontalface.xml")) {

        for (int i = 1; i < argc; i++) {

            // Run detection on the grayscale, histogram-equalized image.
            Mat img = imread(argv[i], CV_LOAD_IMAGE_GRAYSCALE);
            equalizeHist(img, img);
            std::vector<Rect> objs;
            cascade.detectMultiScale(img, objs, scale_factor, min_neighbors);

            // Draw the detections on a color copy of the image.
            Mat img_color = imread(argv[i], CV_LOAD_IMAGE_COLOR);
            for (int n = 0; n < objs.size(); n++) {
                rectangle(img_color, objs[n], Scalar(255,0,0), 8);
            }
            imshow("VJ Face Detector", img_color);
            waitKey(0);
        }
    }

    return 0;
}
Compile it with something like:

g++ -std=c++0x main.cc -o main `pkg-config --cflags --libs opencv`
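Then run it on one or more images, for example ./main some_photo.jpg (the image name here is just a placeholder); lbpcascade_frontalface.xml ships with OpenCV under data/lbpcascades.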

The detection results are as shown below:
[image: detection result]

For more serious use, it would be nice to also have a detection score for each detected face.
OpenCV provides an overloaded function designed for this purpose, which lacks detailed documentation:

std::vector<int> reject_levels;
std::vector<double> level_weights;
cascade.detectMultiScale(img, objs, reject_levels, level_weights, scale_factor, min_neighbors);

However, reject_levels and level_weights will stay empty unless you call it like this (the whole file):

// File: main.cc
#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

using namespace cv;

int main(int argc, char **argv) {

    CascadeClassifier cascade;
    const float scale_factor(1.2f);
    const int min_neighbors(3);

    if (cascade.load("./lbpcascade_frontalface.xml")) {

        for (int i = 1; i < argc; i++) {

            Mat img = imread(argv[i], CV_LOAD_IMAGE_GRAYSCALE);
            equalizeHist(img, img);
            std::vector<Rect> objs;
            std::vector<int> reject_levels;
            std::vector<double> level_weights;
            // The last argument (outputRejectLevels = true) is required,
            // otherwise reject_levels and level_weights stay empty.
            cascade.detectMultiScale(img, objs, reject_levels, level_weights, scale_factor, min_neighbors, 0, Size(), Size(), true);

            Mat img_color = imread(argv[i], CV_LOAD_IMAGE_COLOR);
            for (int n = 0; n < objs.size(); n++) {
                rectangle(img_color, objs[n], Scalar(255,0,0), 8);
                putText(img_color, std::to_string(level_weights[n]),
                        Point(objs[n].x, objs[n].y), 1, 1, Scalar(0,0,255));
            }
            imshow("VJ Face Detector", img_color);
            waitKey(0);
        }
    }

    return 0;
}

However, this will give you a large number of detected rectangles:
[image: detection result with many overlapping rectangles]

This is because, in this code path, OpenCV skips the step of filtering out the overlapping small rectangles. I have no idea whether this is by design, but output like this would not be helpful, at least in my own case.

So we need to make our own changes in OpenCV's source code.
There are different ways to design a detection score. For example, from http://vis-www.cs.umass.edu/fddb/faq.html:
"In the OpenCV implementation, stage_sum is computed and compared against the stage_threshold of each stage i to accept/reject a candidate window. We define the detection score for a candidate window as K*stage_when_rejected + stage_sum_for_stage_when_rejected. If a window is accepted by the cascade, we just use K*last_stage + stage_sum_for_last_stage. Choosing K as a large value, e.g. 1000, we ensure that windows rejected at stage i have a higher score than those rejected at stage i-1."

Actually, I found that a very straightforward detection score works well in my own work. In the last step of the face detector in OpenCV, the detected rectangles are grouped into clusters to eliminate small overlapping rectangles while keeping the most promising ones. The number of final detected faces is at most the number of clusters, so we can simply use the number of rectangles grouped into a cluster as the detection score of the associated final rectangle. This may not be an accurate score, but it works.

To make this change in OpenCV 2.4.5, find the file modules/objdetect/src/cascadedetect.cpp (around line 200) and edit it as follows:

// modules/objdetect/src/cascadedetect.cpp (around line 200), in groupRectangles()
// int n1 = levelWeights ? rejectLevels[i] : rweights[i]; //< comment out this line
int n1 = rweights[i]; //< the change: always use the cluster size (number of grouped rectangles)

We then modify the file main.cc accordingly:

// File: main.cc
#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

using namespace cv;

int main(int argc, char **argv) {

    CascadeClassifier cascade;
    const float scale_factor(1.2f);
    const int min_neighbors(3);

    if (cascade.load("./lbpcascade_frontalface.xml")) {

        for (int i = 1; i < argc; i++) {

            Mat img = imread(argv[i], CV_LOAD_IMAGE_GRAYSCALE);
            equalizeHist(img, img);
            std::vector<Rect> objs;
            std::vector<int> reject_levels;
            std::vector<double> level_weights;
            // With the patched groupRectangles(), reject_levels now holds the
            // number of raw rectangles merged into each final detection.
            cascade.detectMultiScale(img, objs, reject_levels, level_weights, scale_factor, min_neighbors, 0, Size(), Size(), true);

            Mat img_color = imread(argv[i], CV_LOAD_IMAGE_COLOR);
            for (int n = 0; n < objs.size(); n++) {
                rectangle(img_color, objs[n], Scalar(255,0,0), 8);
                putText(img_color, std::to_string(reject_levels[n]),
                        Point(objs[n].x, objs[n].y), 1, 1, Scalar(0,0,255));
            }
            imshow("VJ Face Detector", img_color);
            waitKey(0);
        }
    }

    return 0;
}

And we can have the detection scores like this:
[image: detections annotated with cluster-count scores]
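As a side note, if patching OpenCV is not an option, a rough approximation of the same cluster-count score can be obtained with the public API alone: call detectMultiScale with min_neighbors set to 0 so that the raw, ungrouped rectangles are returned, then group them yourself with cv::groupRectangles, whose integer weights output is the size of each cluster. I have not checked that this reproduces exactly the same numbers as the patch above, but the idea is the same. A sketch, reusing the variables from main.cc:

// Approximate the cluster-count score without patching OpenCV.
std::vector<Rect> objs;
cascade.detectMultiScale(img, objs, scale_factor, 0 /* minNeighbors = 0: return raw rectangles */);

std::vector<int> scores; // one entry per grouped rectangle: number of raw detections merged into it
groupRectangles(objs, scores, min_neighbors, 0.2 /* eps; OpenCV uses 0.2 internally */);

// objs[n] and scores[n] can now be drawn with rectangle() and putText() as before.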