Monocular cameras are the most common cameras in everyday life, although dual- and triple-camera phones have now largely replaced single-camera phones. Let's start with the main shortcoming of a monocular camera: scale ambiguity. Take a look at the following picture first.
Looking at this picture alone, we cannot tell whether the people standing on the palm are real. They could be a group of very large people far away, or a set of small models very close to the camera. This problem is called scale uncertainty (scale ambiguity). A monocular camera cannot resolve scale ambiguity from vision alone, which is why more and more people use binocular (stereo) cameras.
The picture above shows a binocular fisheye camera during corner extraction. A binocular camera consists of two monocular cameras placed side by side horizontally; the distance between them is called the baseline. The baseline makes it possible to estimate the distance to the point behind each pixel, which roughly mimics the structure of the human eyes. We first introduce the binocular camera model and then carry out the calibration.
In the figure above, $O_L$ and $O_R$ are the centers of the left and right cameras, the blue line represents the imaging plane, $f$ is the focal length, $b$ is the baseline of the binocular camera, and $z$ is the depth, that is, the distance of the space point $P$ from the cameras. The imaging points of $P$ in the left and right cameras are denoted $p_L$ and $p_R$, with horizontal image coordinates $u_L$ and $u_R$, each measured from its own camera's optical axis. Because the two cameras are separated only along the baseline, under ideal conditions the two imaging points of $P$ differ only along the $x$ axis. By similar triangles ($\triangle P p_L p_R$ and $\triangle P O_L O_R$) we get the following relationship:

$$\frac{z - f}{z} = \frac{b - u_L + u_R}{b}$$

Simplifying gives:

$$z = \frac{f b}{d}, \qquad d = u_L - u_R$$

Here $d$ is the disparity (parallax). Disparity is inversely proportional to depth: the larger the disparity, the closer the point; the smaller the disparity, the farther away it is.
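To make the inverse relationship concrete, here is a minimal Python sketch of the rule depth = focal length × baseline / disparity. The function name and the numbers (a focal length of about 291 px and a baseline of about 0.119 m) are illustrative assumptions, and the formula only holds for an ideal, rectified stereo pair.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return depth z = f * b / d in metres for an ideal rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


FOCAL_PX = 291.25    # assumed focal length in pixels
BASELINE_M = 0.119   # assumed baseline in metres

# Larger disparity -> smaller depth, and vice versa.
for disparity in (5.0, 20.0, 80.0):
    depth = depth_from_disparity(FOCAL_PX, BASELINE_M, disparity)
    print(f"disparity {disparity:5.1f} px -> depth {depth:.3f} m")
```

Because depth scales with 1/d, a one-pixel disparity error matters far more for distant points (small d) than for near ones.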
As the previous section shows, the binocular model assumes that the imaging planes of the left and right cameras are coplanar and that the two camera centers are aligned horizontally. In reality, because of manufacturing tolerances, no binocular camera meets these conditions exactly, so we need to calibrate and rectify it. The goal is to align the left and right images strictly in the horizontal direction so that corresponding epipolar lines lie on the same image row, which lets left and right corner points be matched quickly. For this we only need to calibrate the relative pose of the left and right cameras, that is, the rotation matrix and the translation vector.
Recall the previous article on monocular calibration, where we introduced the homography matrix. The homography gives the pose of the checkerboard relative to the camera. It carries a hidden constraint: the checkerboard points must lie strictly on one plane. The stereo derivation, however, is built on non-coplanar points, so the stereo pair cannot be calibrated through the homography matrix. Instead, we use the epipolar constraint.
The figure above is a schematic of the epipolar constraint. The two black boxes on the left and right represent the left and right image frames. $O_L$ and $O_R$ are the centers of the two cameras, and the segment $O_L O_R$ is the baseline. The space point $P$ projects to $p_L$ in the left camera and to $p_R$ in the right camera. The green plane $P O_L O_R$ is called the epipolar plane. $e_L$ is the projection of $O_R$ onto the left image plane and $e_R$ is the projection of $O_L$ onto the right image plane; these two points are called the epipoles. The lines $p_L e_L$ and $p_R e_R$ are the intersections of the epipolar plane with the image planes and are called the epipolar lines. All epipolar lines of one camera intersect at its epipole, and the two epipolar lines of the same epipolar plane correspond one to one.
Given a point in the left image, we do not need to search the whole right image for its match. The epipolar constraint reduces the search space to a single line, which greatly reduces the computation. Next we derive the formula for the epipolar constraint.
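Let $P_L$ and $P_R$ denote the coordinates of the space point $P$ in the left and right camera coordinate systems, and let the baseline be the translation vector $T$. The transformation from left to right is:

$$P_R = R\,(P_L - T)$$

Because the rotation matrix is orthogonal ($R^{-1} = R^T$), this can be rewritten as:

$$R^T P_R = P_L - T$$

The three vectors $P_L$, $T$ and $P_L - T$ all lie in the epipolar plane. For coplanar vectors, taking the cross product of two of them and then the inner product with the third gives 0, so:

$$(P_L - T)^T \,(T \times P_L) = 0$$

Substituting $P_L - T = R^T P_R$ gives:

$$(R^T P_R)^T \,(T \times P_L) = 0$$

Expanding the cross product as a determinant gives:

$$T \times P_L = S\,P_L$$

where $S$ is the antisymmetric (skew-symmetric) matrix of $T$, namely:

$$S = \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix}$$

Substituting again gives:

$$P_R^T \, R \, S \, P_L = 0$$

Writing $E = R\,S$, we finally get:

$$P_R^T \, E \, P_L = 0$$

$E$ is the Essential Matrix, which depends only on the rotation matrix and the translation vector. Let $p_L$ and $p_R$ now denote the homogeneous pixel coordinates of the two imaging points and $K_L$, $K_R$ the intrinsic matrices, so that $P_L = z_L K_L^{-1} p_L$ and $P_R = z_R K_R^{-1} p_R$. Substituting these and dividing by the nonzero depths $z_L z_R$ gives:

$$p_R^T \, K_R^{-T} \, E \, K_L^{-1} \, p_L = 0$$

$F = K_R^{-T} E K_L^{-1}$ is the Fundamental Matrix, which depends not only on the pose but also on the intrinsic parameter matrices.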
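The derivation can be checked numerically. The NumPy sketch below assumes the convention that a point maps from the left to the right camera frame as P_R = R (P_L - T), builds the essential matrix as R times the skew-symmetric matrix of T, and forms the fundamental matrix from two pinhole intrinsic matrices. All numeric values (rotation angle, translation, intrinsics, test point) are made up for illustration.

```python
import numpy as np


def skew(t):
    """Skew-symmetric matrix S such that S @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def rotation_y(angle):
    """Rotation by `angle` radians about the y axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])


# Assumed relative pose and intrinsics (illustrative values only).
R = rotation_y(0.02)
T = np.array([0.12, 0.0, 0.0])
K_L = np.array([[291.0, 0.0, 312.0], [0.0, 291.0, 207.0], [0.0, 0.0, 1.0]])
K_R = np.array([[290.0, 0.0, 314.0], [0.0, 290.0, 206.0], [0.0, 0.0, 1.0]])

E = R @ skew(T)                                   # essential matrix
F = np.linalg.inv(K_R).T @ E @ np.linalg.inv(K_L)  # fundamental matrix

# One space point expressed in both camera frames.
P_L = np.array([0.3, -0.2, 2.0])
P_R = R @ (P_L - T)

# Homogeneous pixel coordinates of its two projections.
p_L = K_L @ P_L / P_L[2]
p_R = K_R @ P_R / P_R[2]

essential_residual = P_R @ E @ P_L
fundamental_residual = p_R @ F @ p_L

# Epipolar line in the right image: a*u + b*v + c = 0.
line_R = F @ p_L
distance_px = abs(line_R @ p_R) / np.hypot(line_R[0], line_R[1])

print(essential_residual, fundamental_residual, distance_px)
```

All three printed values should be zero up to floating-point rounding. The last one is the distance of the right-image projection from the epipolar line of the left-image point, which is exactly the one-dimensional search space that the epipolar constraint provides.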
Therefore, in stereo calibration we first calibrate the intrinsic and distortion parameters of the left and right cameras. Then, using the corner points extracted from both cameras, we search for and match corresponding corners under the epipolar constraint, compute the fundamental matrix, and finally recover $R$ and $T$.
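As an illustration of the "compute the matrix from matched points" step, the sketch below uses the classical linear eight-point algorithm on synthetic, noise-free correspondences. This is a swapped-in textbook technique, not Kalibr's actual implementation, and every name and value in it is hypothetical. It works on normalized image coordinates (intrinsics already removed), so it estimates the essential matrix, assuming the convention P_R = R (P_L - T).

```python
import numpy as np


def skew(t):
    """Skew-symmetric matrix S such that S @ v equals np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]])


def estimate_essential(x_L, x_R):
    """Linear eight-point estimate of E from N >= 8 matches with x_R^T E x_L = 0.

    x_L, x_R: (N, 3) arrays of normalized homogeneous image coordinates.
    The result is defined only up to scale and sign.
    """
    # Each match gives one linear equation in the 9 entries of E (row-major).
    A = np.stack([np.kron(xr, xl) for xl, xr in zip(x_L, x_R)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Enforce the essential-matrix structure: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt


# Synthetic ground truth (illustrative values only).
angle = 0.03
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
T_true = np.array([0.12, 0.01, -0.005])

# Random non-coplanar points in front of both cameras.
rng = np.random.default_rng(0)
P_L = np.column_stack([rng.uniform(-1, 1, 20),
                       rng.uniform(-1, 1, 20),
                       rng.uniform(2, 5, 20)])
P_R = (P_L - T_true) @ R_true.T        # row-wise R (P_L - T)

x_L = P_L / P_L[:, 2:3]                # normalized image coordinates
x_R = P_R / P_R[:, 2:3]

E_est = estimate_essential(x_L, x_R)
E_true = R_true @ skew(T_true)

# Epipolar residual of every match under the estimated matrix.
residuals = np.einsum("ni,ij,nj->n", x_R, E_est, x_L)

# Compare with ground truth up to scale and sign.
E_est_n = E_est / np.linalg.norm(E_est)
E_true_n = E_true / np.linalg.norm(E_true)
scale_free_error = min(np.linalg.norm(E_est_n - E_true_n),
                       np.linalg.norm(E_est_n + E_true_n))
print(np.abs(residuals).max(), scale_free_error)
```

Recovering the rotation and translation from the estimated matrix requires a further decomposition with a four-fold ambiguity, resolved by requiring points to lie in front of both cameras; that step is omitted here. With real, noisy corners, the linear estimate would normally serve only as a starting point for nonlinear refinement.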
We again use Kalibr for stereo calibration. In Kalibr's stereo calibration, Dijkstra's algorithm, a greedy algorithm, is used for the epipolar search. This is not the focus of my work, so I will not cover it in detail here; interested readers can look it up.
Wave the camera in front of the calibration board and record about 2 minutes of video. The binocular image is shown above. The camera I used runs at 30 Hz. Calibration is then run from the command line; the relevant commands and file parameters were introduced in the previous article and are not repeated here.
kalibr_calibrate_cameras --target 2april_6x6.yaml --bag someone.bag --models pinhole-equi pinhole-equi --topics /cam0/image_raw /cam1/image_raw
During calibration, the left and right cameras are first initialized individually, together with their intrinsic and distortion parameters, as described above. Invalid points are then removed using the epipolar constraint, and finally a nonlinear optimization iteratively refines the result. The final calibration results are as follows:
cam0:
  cam_overlaps:
  camera_model: pinhole
  distortion_coeffs: [-0.004409832074633388, -0.028437047407405637, 0.047809470007966905, -0.019523614941648112]
  distortion_model: equidistant
  intrinsics: [291.25351307096554, 291.4512970525461, 312.1777832831178, 207.2118864947473]
  resolution: [640, 400]
  rostopic: /cam0/image_raw
cam1:
  T_cn_cnm1:
  - [0.9999855873598229, -0.003937660912322113, 0.003649643705021668, -0.11901593055536226]
  - [0.003962421582737153, 0.9999690142275357, -0.0068021908246687045, -0.0026895627717159953]
  - [-0.003622745897063787, 0.006816554214125997, 0.9999702047065279, 0.001454727262296312]
  - [0.0, 0.0, 0.0, 1.0]
  cam_overlaps:
  camera_model: pinhole
  distortion_coeffs: [0.0033687943075567796, -0.025984158273620172, 0.030334620264045077, -0.009876257637354948]
  distortion_model: equidistant
  intrinsics: [289.9578326809571, 289.88607463958084, 313.7645462282471, 206.31419561407202]
  resolution: [640, 400]
  rostopic: /cam1/image_raw
The calibrated baseline is 11.9 cm, which is close to the physical mounting baseline of our binocular camera, so we consider the calibration result correct. This completes the calibration of the binocular camera. The next article will cover calibration of the binocular-inertial (VIO) system.
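The baseline figure can be read directly from the T_cn_cnm1 block of the result. The short NumPy sketch below copies that 4x4 matrix from the output above, takes the length of its translation column as the baseline, and checks that the rotation block is orthogonal. Interpreting T_cn_cnm1 as the transform that maps points from the cam0 frame into the cam1 frame is an assumption stated here, not something the text above spells out.

```python
import numpy as np

# T_cn_cnm1 copied from the calibration result above.
T_cn_cnm1 = np.array([
    [0.9999855873598229, -0.003937660912322113, 0.003649643705021668, -0.11901593055536226],
    [0.003962421582737153, 0.9999690142275357, -0.0068021908246687045, -0.0026895627717159953],
    [-0.003622745897063787, 0.006816554214125997, 0.9999702047065279, 0.001454727262296312],
    [0.0, 0.0, 0.0, 1.0],
])

R = T_cn_cnm1[:3, :3]   # rotation block
t = T_cn_cnm1[:3, 3]    # translation column, in metres

baseline_m = np.linalg.norm(t)

# Under the assumed convention P_cam1 = R @ P_cam0 + t, the cam1 center
# expressed in the cam0 frame is -R^T t.
cam1_center_in_cam0 = -R.T @ t

# A valid rotation matrix satisfies R @ R.T = I.
orthogonality_error = np.abs(R @ R.T - np.eye(3)).max()

print(f"baseline: {baseline_m * 100:.2f} cm")
print("cam1 center in cam0 frame:", cam1_center_in_cam0)
print("orthogonality error:", orthogonality_error)
```

The rotation block is close to the identity, which fits two cameras mounted nearly parallel, and almost all of the translation lies along the x axis.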