Original article link
https://zhuanlan.zhihu.com/p/53172090

This article describes how to use OpenCV to detect lane lines offline in a simple scene. It lays out the logic of the whole detection process and introduces the relevant concepts used along the way. The main text is implemented in C++; a Python implementation is attached at the end, so readers can fully reproduce this project from the article.
Overall approach
Simple lane line detection can be completed in the following steps:
read video - grayscale conversion - Gaussian blur - edge detection - region of interest - Hough transform - lane fitting - image blending
The sections below implement these steps one by one to arrive at lane detection. As everyone knows, a video is just a sequence of frames, so detecting lane lines in a video essentially means detecting lane lines in images.
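Before going step by step, here is a minimal road-map sketch (my addition, not code from the original article) showing how the stages fit into one function; each marked stage is filled in by the sections that follow.
Mat detect_lanes(const Mat& image) {
    Mat gray, blurred, edges;
    cvtColor(image, gray, CV_BGR2GRAY);            // grayscale conversion
    GaussianBlur(gray, blurred, Size(5,5), 0, 0);  // Gaussian smoothing
    Canny(blurred, edges, 100, 200, 3);            // edge detection
    // ... mask the region of interest, run HoughLinesP,
    // fit one line per side, then blend with addWeighted ...
    return edges;  // placeholder until the later stages are added
}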
Lane detection on a single image
Include the required libraries
#include <iostream>
#include <opencv2/opencv.hpp>
#include<vector>
#include<numeric>
#include<string>
using namespace std;
using namespace cv;
Reading the image
//*************reading image******************
Mat image;
image = imread("/home/project1/test1.jpg");
if(image.empty()){
cout <<"reading error"<<endl;
return -1;
}
In OpenCV the image data type is Mat, which is essentially a matrix. This step is simple, but two things deserve attention. First, the direction of the slashes in the path passed to imread may differ between Linux and Windows; using an absolute path is the least error-prone. Second, after loading, it is good practice to check image.empty() to confirm the image was actually read.
Grayscale conversion
//***************gray image*******************
Mat image_gray;
cvtColor(image,image_gray, CV_BGR2GRAY);
OpenCV's cvtColor converts the color image (loaded by imread in BGR order) directly to grayscale. It takes three parameters: the input image, the output image, and the conversion type.
Gaussian blur
Mat image_gau;
GaussianBlur(image_gray, image_gau, Size(5,5),0,0);
Gaussian filtering, also called Gaussian blur, removes some of the noise from the original image. Without it, insignificant features in the raw image are hard to avoid and interfere with later processing; after blurring, the less distinct noise points are suppressed.
GaussianBlur takes five parameters: the input image, the output image, the Gaussian kernel size, the kernel's standard deviation in the X direction, and its standard deviation in the Y direction.
The kernel size has a width and a height; the two may differ, but each must be a positive odd number or zero. The standard deviations in X and Y are usually set to 0, in which case OpenCV derives them from the kernel size.
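As a side note (my addition, based on the behaviour documented for OpenCV's getGaussianKernel): when the standard deviation is passed as 0, OpenCV derives it from the kernel size as sigma = 0.3*((ksize-1)*0.5 - 1) + 0.8, so for a 5x5 kernel the two calls below should be equivalent.
// for ksize = 5: sigma = 0.3*((5-1)*0.5 - 1) + 0.8 = 1.1
GaussianBlur(image_gray, image_gau, Size(5,5), 0, 0);     // sigma derived from kernel size
GaussianBlur(image_gray, image_gau, Size(5,5), 1.1, 1.1); // explicit equivalent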
Edge detection
//******************canny*********************
Mat image_canny;
Canny(image_gau, image_canny,100, 200, 3);
The Canny edge detector takes five parameters: the input image, the output image, threshold 1, threshold 2, and the aperture size of the Sobel operator.
Thresholds: pixels below threshold 1 are rejected as non-edges and pixels above threshold 2 are accepted as edges; pixels between the two thresholds are accepted only if they are adjacent to a pixel above threshold 2. The Sobel aperture size defaults to 3, i.e. a 3x3 kernel. The Sobel operator and the Laplacian of Gaussian are both commonly used edge operators.
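To get a feel for the thresholds, a quick exploratory sketch (my addition, not part of the original pipeline) is to sweep the lower threshold while keeping roughly a 1:2 low-to-high ratio and compare the saved results:
// sweep Canny thresholds and write each result out for inspection
for (int low : {50, 100, 150}) {
    Mat edges;
    Canny(image_gau, edges, low, low * 2, 3);  // keep a 1:2 low:high ratio
    imwrite("canny_low_" + to_string(low) + ".jpg", edges);
}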
Region of interest (ROI)
Mat dstImg;
Mat mask = Mat::zeros(image_canny.size(), CV_8UC1);
Point PointArray[4];
PointArray[0] = Point(0, mask.rows);
PointArray[1] = Point(400,330);
PointArray[2] = Point(570,330);
PointArray[3] = Point(mask.cols, mask.rows);
fillConvexPoly(mask,PointArray,4,Scalar(255));
bitwise_and(mask,image_canny,dstImg);
The edge image above still contains a lot of scene information we are not interested in, so the relevant part has to be extracted. Looking at the original frame, the lane lines generally lie inside a trapezoidal region in the lower part of the image, so four points are set manually as the trapezoid's vertices. fillConvexPoly draws a filled convex polygon; its four arguments are: a blank image the same size as the original, the vertex array, the number of vertices, and the fill color.
AND-ing the trapezoidal mask with the edge image keeps only the edges inside the region of interest; the result contains essentially nothing but the lane lines. bitwise_and performs a per-pixel AND of two images and takes three arguments, here the mask, the edge image, and the output image; note that all three images must have the same size and number of color channels.
Hough transform
The steps so far yield the pixels that make up the lane lines, but they are isolated pixels rather than connected lines. The Hough transform finds the straight lines running through such pixels. It comes in three variants: the standard Hough transform, the multi-scale Hough transform, and the probabilistic Hough transform; the first two use the HoughLines function, the last uses HoughLinesP. The probabilistic version is more efficient, so it is generally the preferred choice.
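For contrast, a minimal sketch (my addition) of the standard variant: HoughLines returns (rho, theta) pairs describing infinite lines, so the endpoints still have to be derived by hand, whereas HoughLinesP returns segment endpoints directly.
vector<Vec2f> std_lines;
HoughLines(dstImg, std_lines, 1, CV_PI/180, 100);  // rho step 1 px, theta step 1 degree
for (const Vec2f& l : std_lines) {
    float rho = l[0], theta = l[1];  // one infinite line per entry
}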
The Hough transform maps lines from Cartesian coordinates into polar coordinates: the set of all lines through a single Cartesian point (x, y) becomes one sinusoid rho = x*cos(theta) + y*sin(theta) in polar space. Where several sinusoids intersect, the points they represent lie on the same straight line, so by locating these intersections the transform determines which pixels are collinear.
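To make the voting concrete, here is a small illustrative sketch (my addition; the pixel coordinates are made up): each edge pixel contributes one vote along its sinusoid, and accumulator bins that collect more than the threshold number of votes are reported as lines.
// one edge pixel (x, y) votes along rho = x*cos(theta) + y*sin(theta)
double x = 400, y = 330;  // hypothetical edge pixel
for (int t = 0; t < 180; t++) {
    double theta = t * CV_PI / 180;
    double rho = x * cos(theta) + y * sin(theta);
    // accumulator[bin(rho)][t] += 1;  // collinear pixels pile votes into the same bin
}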
vector<Vec4i> lines; // each element packs 4 ints: x1, y1, x2, y2
int rho = 1;
double theta = CV_PI/180;
int threshold = 30;
int min_line_len = 30;
int max_line_gap = 20;
HoughLinesP(dstImg,lines,rho,theta,threshold,min_line_len,max_line_gap);
OpenCV's HoughLinesP takes seven parameters: the input image (a single-channel binary image, here the Canny result); the output vector of segment endpoints (x1, y1, x2, y2); rho, the distance step of the search in pixels; theta, the angle step in radians; threshold, the minimum number of votes needed to report a line (int); min_line_len, the minimum segment length (default 0); and max_line_gap, the maximum gap between collinear segments before they are treated as separate lines (default 0).
Lane line fitting
// simple lane fitting: draw every Hough segment directly
Mat image_draw = Mat::zeros(image_canny.size(),CV_8UC3);
for(size_t i= 0;i<lines.size();i++){
Vec4i L = lines[i];
line(image_draw, Point(L[0],L[1]),Point(L[2],L[3]),Scalar(0,0,255),3,LINE_AA);
}
The simplest form of lane fitting just draws every segment found by the Hough transform. For solid lane markings this works, but dashed markings come out as a series of disconnected segments, as the figure below shows.
To close the gaps between dashes, the segments returned by the Hough transform need further processing. One frame yields many segments, and they can be split by slope into two groups, left lane and right lane. When classifying, keep the image coordinate system in mind: the origin is at the top-left corner, x increases to the right, and y increases downward, so the right lane line has positive slope and the left lane line negative slope.
Each entry in lines holds two points, from which a slope and an intercept can be computed. Averaging the slopes and intercepts on each side of the frame gives one set of parameters per side, which can then be used directly to draw a single continuous line.
//***************draw line update********************************
Mat image_draw = Mat::zeros(image_canny.size(),CV_8UC3);
vector<int> right_x, right_y, left_x, left_y;
double slope_right_sum;
double b_right_sum ;
double slope_left_sum ;
double b_left_sum ;
double slope_right_mean;
double slope_left_mean;
double b_right_mean;
double b_left_mean;
vector<double> slope_right, slope_left,b_right, b_left;
for(size_t i= 0;i<lines.size();i++){
Vec4i L = lines[i];
if (L[2] == L[0]) continue; // skip vertical segments: the slope is undefined
double slope = (L[3]-L[1])*1.0/(L[2]-L[0]);
double b = L[1]-L[0]*slope;
if (slope >=0.2){ // positive slope: right lane
slope_right.push_back(slope);
b_right.push_back(b);
}
else if (slope <= -0.2){ // negative slope: left lane; near-horizontal segments are discarded, as in the Python version below
slope_left.push_back(slope);
b_left.push_back(b);
}
}
//accumulate sums the values in a vector; the result type matches that of the last argument.
slope_right_sum = accumulate(slope_right.begin(), slope_right.end(),0.0);
b_right_sum = accumulate(b_right.begin(), b_right.end(),0.0);
slope_left_sum = accumulate(slope_left.begin(),slope_left.end(),0.0);
b_left_sum = accumulate(b_left.begin(),b_left.end(),0.0);
slope_right_mean = slope_right_sum/slope_right.size();
slope_left_mean = slope_left_sum/slope_left.size();
b_right_mean = b_right_sum/b_right.size();
b_left_mean = b_left_sum/b_left.size();
cout <<"slope_right: "<<slope_right_sum<<endl;
double x1r = 550;
double x2r = 850;
double x1l = 120;
double x2l = 425;
int y1r = slope_right_mean * x1r + b_right_mean;
int y2r = slope_right_mean * x2r + b_right_mean;
int y1l = slope_left_mean * x1l + b_left_mean;
int y2l = slope_left_mean * x2l + b_left_mean;
line(image_draw, Point(x1r,y1r),Point(x2r,y2r),Scalar(0,0,255),5,LINE_AA);
line(image_draw, Point(x1l,y1l),Point(x2l,y2l),Scalar(0,0,255),5,LINE_AA);
Image blending
The drawn lines are overlaid on the original frame with addWeighted, which blends two images as output = image1*weight1 + image2*weight2 + gamma. Its six parameters are: image 1, the weight of image 1, image 2, the weight of image 2, the scalar gamma added to the weighted sum (usually 0), and the output image.
//*************mix two image*************************
Mat image_mix = Mat::zeros(image_canny.size(),CV_8UC3);
addWeighted(image_draw,1,image,1,0.0,image_mix);
With the eight steps above, lane line detection on a single image is complete.
Lane detection on video
Refactor the eight steps above into a class image_process
The refactored image_process has two member variables, the source image and the result image, and one member function that performs lane detection on the image, plus a constructor and a destructor. The code is as follows:
//image_process.h
#ifndef PROJECT1_IMAGE_PROCESS_H
#define PROJECT1_IMAGE_PROCESS_H
#include<iostream>
#include<opencv2/opencv.hpp>
using namespace std;
using namespace cv;
class image_process {
public:
Mat image_src;
Mat image_dst;
image_process(Mat image);
Mat process();
~image_process();
};
//image_process.cpp
#include "image_process.h"
#include<iostream>
#include<opencv2/opencv.hpp>
using namespace std;
using namespace cv;
// constructor
image_process::image_process(Mat image):image_src(image){}
// member function
Mat image_process::process(){
//*************reading image******************
Mat image;
image = image_src ;
if(image.empty()){
cout <<"reading error"<<endl;
}
//***************gray image*******************
Mat image_gray;
cvtColor(image,image_gray, CV_BGR2GRAY);
//************gaussian smoothing**************
Mat image_gau;
GaussianBlur(image_gray, image_gau, Size(5,5),0,0);
//******************canny*********************
Mat image_canny;
Canny(image_gau, image_canny,100, 200, 3);
//**************interesting area*************
Mat dstImg;
Mat mask = Mat::zeros(image_canny.size(), CV_8UC1);
Point PointArray[4];
PointArray[0] = Point(0, mask.rows);
PointArray[1] = Point(400,330);
PointArray[2] = Point(570,330);
PointArray[3] = Point(mask.cols, mask.rows);
fillConvexPoly(mask,PointArray,4,Scalar(255));
bitwise_and(mask,image_canny,dstImg);
//************************houghline*******************
vector<Vec4i> lines;
int rho = 1;
double theta = CV_PI/180;
int threshold = 30;
int min_line_len = 100;
int max_line_gap = 100;
HoughLinesP(dstImg,lines,rho,theta,threshold,min_line_len,max_line_gap);
//cout<<lines[1]<<endl;
//***************draw line update********************************
Mat image_draw = Mat::zeros(image_canny.size(),CV_8UC3);
vector<int> right_x, right_y, left_x, left_y;
double slope_right_sum;
double b_right_sum ;
double slope_left_sum ;
double b_left_sum ;
double slope_right_mean;
double slope_left_mean;
double b_right_mean;
double b_left_mean;
vector<double> slope_right, slope_left,b_right, b_left;
for(size_t i= 0;i<lines.size();i++){
Vec4i L = lines[i];
if (L[2] == L[0]) continue; // skip vertical segments: the slope is undefined
double slope = (L[3]-L[1])*1.0/(L[2]-L[0]);
double b = L[1]-L[0]*slope;
if (slope >=0.2){ // positive slope: right lane
slope_right.push_back(slope);
b_right.push_back(b);
}
else if (slope <= -0.2){ // negative slope: left lane
slope_left.push_back(slope);
b_left.push_back(b);
}
}
slope_right_sum = accumulate(slope_right.begin(), slope_right.end(),0.0);
b_right_sum = accumulate(b_right.begin(), b_right.end(),0.0);
slope_left_sum = accumulate(slope_left.begin(),slope_left.end(),0.0);
b_left_sum = accumulate(b_left.begin(),b_left.end(),0.0);
slope_right_mean = slope_right_sum/slope_right.size();
slope_left_mean = slope_left_sum/slope_left.size();
b_right_mean = b_right_sum/b_right.size();
b_left_mean = b_left_sum/b_left.size();
cout <<"slope_right: "<<slope_right_sum<<endl;
double x1r = 550;
double x2r = 850;
double x1l = 120;
double x2l = 425;
int y1r = slope_right_mean * x1r + b_right_mean;
int y2r = slope_right_mean * x2r + b_right_mean;
int y1l = slope_left_mean * x1l + b_left_mean;
int y2l = slope_left_mean * x2l + b_left_mean;
line(image_draw, Point(x1r,y1r),Point(x2r,y2r),Scalar(0,0,255),5,LINE_AA);
line(image_draw, Point(x1l,y1l),Point(x2l,y2l),Scalar(0,0,255),5,LINE_AA);
//*************mix two image*************************
Mat image_mix = Mat::zeros(image_canny.size(),CV_8UC3);
addWeighted(image_draw,1,image,1,0.0,image_mix);
//**************output****************************
return image_mix;
}
// destructor
image_process::~image_process() {}
Main function
The video can be read directly with VideoCapture, and each frame is copied into frame. The frame is handed to the single-image class through its constructor, processed by the member function, and finally displayed.
About waitKey(): called without an argument, it blocks until a key is pressed; with an argument, it returns automatically after the given time, e.g. waitKey(30) waits 30 ms and then lets the loop display the next frame.
#include <iostream>
#include <opencv2/opencv.hpp>
#include<vector>
#include <opencv2/highgui/highgui.hpp>
#include"image_process.h"
#include<string>
using namespace std;
using namespace cv;
int main(){
Mat image;
Mat image_result;
VideoCapture capture("/home/solidYellowLeft.mp4");
Mat frame;
if(!capture.isOpened()) {
cout << "can not open video" << endl;
return -1;
}
while(capture.isOpened()){
capture>>frame;
if(frame.empty()) break; // stop when the video runs out of frames
image_process image2(frame);
Mat image_result2;
image_result2 = image2.process();
imshow("result_video",image_result2);
waitKey(30);
}
}
Python implementation:
import math
import cv2
import numpy as np
def grayscale(img):
    """Applies the Grayscale transform
    This will return an image with only one color channel
    but NOTE: to see the returned image as grayscale
    (assuming your grayscaled image is called 'gray')
    you should call plt.imshow(gray, cmap='gray')"""
    return cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    # Or use BGR2GRAY if you read an image with cv2.imread()
    # return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
def canny(img, low_threshold, high_threshold):
    """Applies the Canny transform"""
    return cv2.Canny(img, low_threshold, high_threshold)

def gaussian_blur(img, kernel_size):
    """Applies a Gaussian Noise kernel"""
    return cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)
def region_of_interest(img, vertices):
    """
    Applies an image mask.
    Only keeps the region of the image defined by the polygon
    formed from `vertices`. The rest of the image is set to black.
    """
    # defining a blank mask to start with
    mask = np.zeros_like(img)
    # defining a 3 channel or 1 channel color to fill the mask with depending on the input image
    if len(img.shape) > 2:
        channel_count = img.shape[2]  # i.e. 3 or 4 depending on your image
        ignore_mask_color = (255,) * channel_count
    else:
        ignore_mask_color = 255
    # filling pixels inside the polygon defined by "vertices" with the fill color
    cv2.fillPoly(mask, vertices, ignore_mask_color)
    # returning the image only where mask pixels are nonzero
    masked_image = cv2.bitwise_and(img, mask)
    return masked_image
def draw_lines(img, lines, color=[255, 0, 0], thickness=8):
    """
    NOTE: this is the function you might want to use as a starting point once you want to
    average/extrapolate the line segments you detect to map out the full
    extent of the lane (going from the result shown in raw-lines-example.mp4
    to that shown in P1_example.mp4).
    Think about things like separating line segments by their
    slope ((y2-y1)/(x2-x1)) to decide which segments are part of the left
    line vs. the right line. Then, you can average the position of each of
    the lines and extrapolate to the top and bottom of the lane.
    This function draws `lines` with `color` and `thickness`.
    Lines are drawn on the image inplace (mutates the image).
    If you want to make the lines semi-transparent, think about combining
    this function with the weighted_img() function below
    """
    # for line in lines:
    #     for x1,y1,x2,y2 in line:
    #         cv2.line(img, (x1, y1), (x2, y2), color, thickness)
    right_x = []
    right_y = []
    left_x = []
    left_y = []
    for line in lines:
        for x1, y1, x2, y2 in line:
            if x2 == x1:
                continue  # skip vertical segments: the slope is undefined
            slope = ((y2 - y1) / (x2 - x1))
            if slope >= 0.2:
                right_x.extend((x1, x2))
                right_y.extend((y1, y2))
            elif slope <= -0.2:
                left_x.extend((x1, x2))
                left_y.extend((y1, y2))
    right_fit = np.polyfit(right_x, right_y, 1)
    right_line = np.poly1d(right_fit)
    x1R = 550
    y1R = int(right_line(x1R))
    x2R = 850
    y2R = int(right_line(x2R))
    cv2.line(img, (x1R, y1R), (x2R, y2R), color, thickness)
    left_fit = np.polyfit(left_x, left_y, 1)
    left_line = np.poly1d(left_fit)
    x1L = 120
    y1L = int(left_line(x1L))
    x2L = 425
    y2L = int(left_line(x2L))
    cv2.line(img, (x1L, y1L), (x2L, y2L), color, thickness)
def hough_lines(img, rho, theta, threshold, min_line_len, max_line_gap):
    """
    `img` should be the output of a Canny transform.
    Returns an image with hough lines drawn.
    """
    lines = cv2.HoughLinesP(img, rho, theta, threshold, np.array([]), minLineLength=min_line_len, maxLineGap=max_line_gap)
    line_img = np.zeros((img.shape[0], img.shape[1], 3), dtype=np.uint8)
    draw_lines(line_img, lines)
    return line_img
# Python 3 has support for cool math symbols.
def weighted_img(img, initial_img, α=0.8, β=1., λ=0.):
    """
    `img` is the output of the hough_lines(), An image with lines drawn on it.
    Should be a blank image (all black) with lines drawn on it.
    `initial_img` should be the image before any processing.
    The result image is computed as follows:
    initial_img * α + img * β + λ
    NOTE: initial_img and img must be the same shape!
    """
    return cv2.addWeighted(initial_img, α, img, β, λ)

def pipeline(input_image):
    image = input_image
    gray = grayscale(image)
    # Gaussian smoothing
    kernel_size = 5
    gau = gaussian_blur(gray, kernel_size)
    # Canny
    low_threshold = 100
    high_threshold = 200
    edges = canny(gau, low_threshold, high_threshold)
    imshape = image.shape
    vertices = np.array([[(0, imshape[0]), (400, 330), (570, 330), (imshape[1], imshape[0])]], dtype=np.int32)
    region = region_of_interest(edges, vertices)
    rho = 1  # distance resolution in pixels of the Hough grid
    theta = np.pi/180  # angular resolution in radians of the Hough grid
    threshold = 30  # minimum number of votes (intersections in Hough grid cell)
    min_line_len = 20  # minimum number of pixels making up a line
    max_line_gap = 20  # maximum gap in pixels between connectable line segments
    line_img = hough_lines(region, rho, theta, threshold, min_line_len, max_line_gap)
    line_last = weighted_img(line_img, image, α=0.8, β=1., λ=0.)
    return line_last

from moviepy.editor import VideoFileClip
from IPython.display import HTML

def process_image(image):
    # NOTE: The output you return should be a color image (3 channel) for processing video below
    # TODO: put your pipeline here,
    # you should return the final output (image where lines are drawn on lanes)
    result = pipeline(image)
    return result
white_output = 'test_videos_output/solidWhiteRight.mp4'
## To speed up the testing process you may want to try your pipeline on a shorter subclip of the video
## To do so add .subclip(start_second,end_second) to the end of the line below
## Where start_second and end_second are integer values representing the start and end of the subclip
## You may also uncomment the following line for a subclip of the first 5 seconds
#clip1 = VideoFileClip("test_videos/solidWhiteRight.mp4").subclip(0,5)
clip1 = VideoFileClip("test_videos/solidWhiteRight.mp4")
white_clip = clip1.fl_image(process_image) #NOTE: this function expects color images!!
%time white_clip.write_videofile(white_output, audio=False)