According to the classic framework, a SLAM system is divided into four parts: the front end, the back end, loop closing, and mapping. The front end is the visual odometry module (the two terms are used interchangeably). Its job is to turn the incoming images into quantities the later stages can compute with; in other words, it estimates the camera's motion from the images, making the subsequent computation possible.
In image processing, feature points are pixels where the gray value changes sharply, or points of high curvature on image edges (e.g. the intersection of two edges). Feature points play a central role in feature-based image matching: they capture the essential characteristics of an image and can identify the target objects in it, so matching feature points allows us to match whole images.
In traditional computer vision, a feature point is a pixel-level description of an image feature, generally composed of the pixel location itself plus a description of its surroundings (also called the feature descriptor). For the visual odometry covered in this chapter, pixel-level methods fall into two broad classes: feature-point methods and direct methods.
2) Feature Points
To estimate the camera's motion accurately, we need to choose suitable feature points. An image is ultimately pixels, and a pixel is ultimately color and brightness; at this level, a feature point is a representative pixel or set of pixels. It must be sufficiently stable and distinctive, otherwise it cannot characterize the image.
Corners, edges, blocks, and so on can all yield representative feature points, where "representative" means they stand out from their surroundings. For corner features, many extraction algorithms have been developed, such as Harris and GFTT (Shi-Tomasi), each with its own strengths. So in a SLAM system, which pixel-level feature points are the most representative?
In academic terms, a feature point consists of a key point plus a descriptor. In SLAM, the most suitable and typical choice is the ORB feature.
3) ORB
ORB is short for Oriented FAST and Rotated BRIEF. It was proposed in: Ethan Rublee, Vincent Rabaud, Kurt Konolige and Gary Bradski, "ORB: an efficient alternative to SIFT or SURF", ICCV 2011.
It consists of two parts: the FAST corner detector and the BRIEF descriptor.
The FAST corner is, as its name suggests, fast. Why? The reason is simple: it only compares pixel brightness values. The detailed method and theory are not repeated here.
How does ORB improve on FAST? First, it uses an image pyramid, which can be understood as a collection of the same image at multiple sizes, to build in scale invariance. What else? Rotation invariance: this is constructed very simply with the intensity centroid method, which yields an orientation vector for each key point. These two additions greatly improve the plain FAST corner.
BRIEF's advantage is, of course, speed; its drawbacks are many: no rotation invariance, sensitivity to noise, no scale invariance, and so on. Seeing this, the improvements described above should now feel rather clever. The BRIEF descriptor is a binary descriptor vector. Its construction is equally simple and direct: sample random pixel pairs around the key point and encode the brightness comparisons as bits, producing a binary descriptor vector for the key point.
2. Feature Code Practice
1) Feature Matching
Features can be matched in several ways. Brute-force (Brute-Force) matching is a descriptor-matching algorithm that compares every feature of the first descriptor set against the second and produces a list of match results. Beyond that, there is k-nearest-neighbor (kNN) matching; of all machine-learning algorithms, kNN may be the simplest.
Let's put a feature-matching algorithm into practice with code:
```cpp
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <chrono>
#include <cassert>

using namespace std;
using namespace cv;

int main(int argc, char **argv) {
  if (argc != 3) {
    cout << "usage: feature_extraction img1 img2" << endl;
    return 1;
  }
  // read the two images (IMREAD_COLOR is the modern name of CV_LOAD_IMAGE_COLOR)
  Mat img_1 = imread(argv[1], IMREAD_COLOR);
  Mat img_2 = imread(argv[2], IMREAD_COLOR);
  assert(img_1.data != nullptr && img_2.data != nullptr);

  // initialization: key points, descriptors, ORB detector/extractor, Hamming matcher
  std::vector<KeyPoint> keypoints_1, keypoints_2;
  Mat descriptors_1, descriptors_2;
  Ptr<FeatureDetector> detector = ORB::create();
  Ptr<DescriptorExtractor> descriptor = ORB::create();
  Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");

  // step 1: detect Oriented FAST corners
  chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
  detector->detect(img_1, keypoints_1);
  detector->detect(img_2, keypoints_2);

  // step 2: compute the BRIEF descriptors from the corners
  descriptor->compute(img_1, keypoints_1, descriptors_1);
  descriptor->compute(img_2, keypoints_2, descriptors_2);
  chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
  chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "extract ORB cost = " << time_used.count() << " seconds." << endl;

  Mat outimg1;
  drawKeypoints(img_1, keypoints_1, outimg1, Scalar::all(-1), DrawMatchesFlags::DEFAULT);
  imshow("ORB features", outimg1);

  // step 3: match the two descriptor sets by Hamming distance
  vector<DMatch> matches;
  t1 = chrono::steady_clock::now();
  matcher->match(descriptors_1, descriptors_2, matches);
  t2 = chrono::steady_clock::now();
  time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "match ORB cost = " << time_used.count() << " seconds." << endl;

  // step 4: filter the matches by distance
  auto min_max = minmax_element(matches.begin(), matches.end(),
                                [](const DMatch &m1, const DMatch &m2) { return m1.distance < m2.distance; });
  double min_dist = min_max.first->distance;
  double max_dist = min_max.second->distance;

  printf("-- Max dist : %f \n", max_dist);
  printf("-- Min dist : %f \n", min_dist);

  // a match is suspect when its distance exceeds twice the minimum;
  // 30 is an empirical floor, since min_dist can be very small
  std::vector<DMatch> good_matches;
  for (int i = 0; i < descriptors_1.rows; i++) {
    if (matches[i].distance <= max(2 * min_dist, 30.0)) {
      good_matches.push_back(matches[i]);
    }
  }

  // step 5: draw the match results
  Mat img_match;
  Mat img_goodmatch;
  drawMatches(img_1, keypoints_1, img_2, keypoints_2, matches, img_match);
  drawMatches(img_1, keypoints_1, img_2, keypoints_2, good_matches, img_goodmatch);
  imshow("all matches", img_match);
  imshow("good matches", img_goodmatch);
  waitKey(0);

  return 0;
}
```

2) ORB Features
For ORB features, let's also deepen our understanding through code:
```cpp
#include <opencv2/opencv.hpp>
#include <string>
#include <nmmintrin.h>  // _mm_popcnt_u32 (SSE4.2)
#include <chrono>

using namespace std;

string first_file = "./1.png";
string second_file = "./2.png";

// a 256-bit descriptor, stored as 8 x 32-bit integers
typedef vector<uint32_t> DescType;

/**
 * compute descriptor of orb keypoints
 * @param img input image
 * @param keypoints detected fast keypoints
 * @param descriptors descriptors
 *
 * NOTE: if a keypoint goes outside the image boundary (8 pixels), descriptors will not be
 * computed and will be left as empty
 */
void ComputeORB(const cv::Mat &img, vector<cv::KeyPoint> &keypoints, vector<DescType> &descriptors);

/**
 * brute-force match two sets of descriptors
 * @param desc1 the first descriptor
 * @param desc2 the second descriptor
 * @param matches matches of two images
 */
void BfMatch(const vector<DescType> &desc1, const vector<DescType> &desc2, vector<cv::DMatch> &matches);

int main(int argc, char **argv) {
  // load the two images as grayscale
  cv::Mat first_image = cv::imread(first_file, 0);
  cv::Mat second_image = cv::imread(second_file, 0);
  assert(first_image.data != nullptr && second_image.data != nullptr);

  // detect FAST keypoints and compute ORB descriptors for both images
  chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
  vector<cv::KeyPoint> keypoints1;
  cv::FAST(first_image, keypoints1, 40);
  vector<DescType> descriptor1;
  ComputeORB(first_image, keypoints1, descriptor1);

  vector<cv::KeyPoint> keypoints2;
  vector<DescType> descriptor2;
  cv::FAST(second_image, keypoints2, 40);
  ComputeORB(second_image, keypoints2, descriptor2);
  chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
  chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "extract ORB cost = " << time_used.count() << " seconds." << endl;

  // brute-force matching
  vector<cv::DMatch> matches;
  t1 = chrono::steady_clock::now();
  BfMatch(descriptor1, descriptor2, matches);
  t2 = chrono::steady_clock::now();
  time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
  cout << "match ORB cost = " << time_used.count() << " seconds." << endl;
  cout << "matches: " << matches.size() << endl;

  // plot and save the matches
  cv::Mat image_show;
  cv::drawMatches(first_image, keypoints1, second_image, keypoints2, matches, image_show);
  cv::imshow("matches", image_show);
  cv::imwrite("matches.png", image_show);
  cv::waitKey(0);

  cout << "done." << endl;
  return 0;
}
```
```cpp
// ORB_pattern holds the 256 BRIEF sampling-point pairs from the ORB paper
// (an array of 256 * 4 coordinates); the full array is omitted here for brevity.

void ComputeORB(const cv::Mat &img, vector<cv::KeyPoint> &keypoints, vector<DescType> &descriptors) {
  const int half_patch_size = 8;
  const int half_boundary = 16;
  int bad_points = 0;
  for (auto &kp: keypoints) {
    // skip keypoints too close to the border to fit the sampling pattern
    if (kp.pt.x < half_boundary || kp.pt.y < half_boundary ||
        kp.pt.x >= img.cols - half_boundary || kp.pt.y >= img.rows - half_boundary) {
      bad_points++;
      descriptors.push_back({});
      continue;
    }

    // intensity centroid: the image moments m01, m10 give the patch orientation
    float m01 = 0, m10 = 0;
    for (int dx = -half_patch_size; dx < half_patch_size; ++dx) {
      for (int dy = -half_patch_size; dy < half_patch_size; ++dy) {
        uchar pixel = img.at<uchar>(kp.pt.y + dy, kp.pt.x + dx);
        m10 += dx * pixel;
        m01 += dy * pixel;
      }
    }

    // angle of the keypoint (the 1e-18 avoids division by zero)
    float m_sqrt = sqrt(m01 * m01 + m10 * m10) + 1e-18;
    float sin_theta = m01 / m_sqrt;
    float cos_theta = m10 / m_sqrt;

    // compute the rotated BRIEF descriptor: 8 x 32 = 256 bits
    DescType desc(8, 0);
    for (int i = 0; i < 8; i++) {
      uint32_t d = 0;
      for (int k = 0; k < 32; k++) {
        int idx_pq = i * 32 + k;
        cv::Point2f p(ORB_pattern[idx_pq * 4], ORB_pattern[idx_pq * 4 + 1]);
        cv::Point2f q(ORB_pattern[idx_pq * 4 + 2], ORB_pattern[idx_pq * 4 + 3]);

        // rotate each sampling pair by the keypoint orientation, then compare
        cv::Point2f pp = cv::Point2f(cos_theta * p.x - sin_theta * p.y,
                                     sin_theta * p.x + cos_theta * p.y) + kp.pt;
        cv::Point2f qq = cv::Point2f(cos_theta * q.x - sin_theta * q.y,
                                     sin_theta * q.x + cos_theta * q.y) + kp.pt;
        if (img.at<uchar>(pp.y, pp.x) < img.at<uchar>(qq.y, qq.x)) {
          d |= 1 << k;
        }
      }
      desc[i] = d;
    }
    descriptors.push_back(desc);
  }
  cout << "bad/total: " << bad_points << "/" << keypoints.size() << endl;
}

// brute-force matching, using the SSE popcount instruction for Hamming distance
void BfMatch(const vector<DescType> &desc1, const vector<DescType> &desc2, vector<cv::DMatch> &matches) {
  const int d_max = 40;  // reject matches with Hamming distance >= 40
  for (size_t i1 = 0; i1 < desc1.size(); ++i1) {
    if (desc1[i1].empty()) continue;
    cv::DMatch m{int(i1), 0, 256};
    for (size_t i2 = 0; i2 < desc2.size(); ++i2) {
      if (desc2[i2].empty()) continue;
      int distance = 0;
      for (int k = 0; k < 8; k++) {
        distance += _mm_popcnt_u32(desc1[i1][k] ^ desc2[i2][k]);
      }
      if (distance < d_max && distance < m.distance) {
        m.distance = distance;
        m.trainIdx = i2;
      }
    }
    if (m.distance < d_max) {
      matches.push_back(m);
    }
  }
}
```

3. Summary
With the above, we have made a simple start on ORB features in visual odometry. Features are among the most basic concepts in imaging and a very important topic in traditional image algorithms, and ORB is a classic feature descriptor of recent years. We hope this helps your understanding.