Keep your Eyes on the Lane: Real-time Attention-guided Lane Detection
Introduction and related work
In this blog post, we present "Keep your Eyes on the Lane", a new deep learning algorithm for lane detection in the context of autonomous driving.
Lane detection is a core problem in autonomous driving. In a real-world scenario, the lane detection algorithm must cope with adverse factors such as extreme lighting or difficult weather conditions. It should also keep recognizing the car's surroundings when a lane is blocked by an object, such as another car.
Recent work in this field uses various architectures built on Convolutional Neural Networks (CNNs). However, many of them are complex and slow, which makes them unsuitable for real-world use, where detection has to run in real time.
Proposed architecture
The paper proposes a new architecture, called LaneATT, which the authors report to be both faster and more accurate than existing state-of-the-art methods.
The architecture is composed of multiple parts:
1) An RGB image captured by the car's front camera is fed to a backbone network that computes its features. The backbone can be any CNN architecture. Usually, a network from the ResNet family is used.
2) The features output by the backbone are passed to the Feature Pooling component. Here, LaneATT uses a predefined set of lines called "anchors". Each anchor is projected onto the feature map, and the features it passes over form that anchor's local features.
3) The attention mechanism computes a global feature for each anchor, in addition to the local features already computed at the previous stage. The global feature of an anchor is a learnable linear combination of the local features of all the other anchors, with the combination weights produced by the fully-connected layer L_att. Global features are needed because the local features of a given anchor are sometimes not enough to decide whether there is a lane in that area (for example, when the lane is hidden by another object). In such cases, the information carried by other anchors can complement the local features.
4) The prediction stage. Using its local features concatenated with the global ones, each anchor predicts whether there is a lane close to it and, if there is, the lane's coordinates. This is done by two parallel fully-connected layers: one for classification (whether there is a lane or not) and one for regression (the position of the lane, if it exists).
5) Non-Maximum Suppression (NMS). This stage eliminates duplicate lanes. Because the set of anchors is chosen to cover nearly every position a lane could take, several anchors commonly predict the same lane. NMS is therefore applied to discard predictions of lanes that lie very close to each other.
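To make steps 3 to 5 more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: all function names, the thresholds and the toy sizes are hypothetical, random numbers stand in for trained weights, the softmax normalization of the attention weights is an assumption, and each lane is simplified to a plain list of x-coordinates compared by mean absolute distance.

```python
import math
import random


def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Fully-connected layer; `weights` holds one row per output value."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def global_features(local, att_w, att_b):
    """Step 3: combine the local features of all *other* anchors."""
    n, dim = len(local), len(local[0])
    result = []
    for i in range(n):
        # The attention layer maps anchor i's local features to one weight
        # per other anchor (n - 1 weights in total).
        weights = softmax(linear(att_w, att_b, local[i]))
        others = [local[j] for j in range(n) if j != i]
        result.append([sum(w * o[d] for w, o in zip(weights, others))
                       for d in range(dim)])
    return result


def predict(local, glob, cls_w, cls_b, reg_w, reg_b):
    """Step 4: two parallel layers on the concatenated features."""
    preds = []
    for loc, g in zip(local, glob):
        feat = loc + g  # list concatenation: local features, then global
        score = softmax(linear(cls_w, cls_b, feat))[1]  # P(lane)
        lane = linear(reg_w, reg_b, feat)  # simplified: x-coordinates
        preds.append((score, lane))
    return preds


def lane_distance(a, b):
    """Mean absolute difference between two lanes (an assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def nms(preds, score_thr=0.5, dist_thr=1.0):
    """Step 5: keep the best-scoring lane of each group of near-duplicates."""
    kept = []
    for score, lane in sorted(preds, key=lambda p: p[0], reverse=True):
        if score < score_thr:
            continue  # not confident enough to count as a lane
        if all(lane_distance(lane, k) > dist_thr for _, k in kept):
            kept.append((score, lane))
    return kept


def random_matrix(rows, cols):
    """Stand-in for trained weights."""
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    random.seed(0)
    n_anchors, dim, n_points = 4, 3, 5
    local = random_matrix(n_anchors, dim)
    glob = global_features(local, random_matrix(n_anchors - 1, dim),
                           [0.0] * (n_anchors - 1))
    preds = predict(local, glob,
                    random_matrix(2, 2 * dim), [0.0, 0.0],
                    random_matrix(n_points, 2 * dim), [0.0] * n_points)
    print(len(nms(preds)), "lane(s) kept out of", len(preds), "predictions")
```

Note that the attention weights depend only on the anchor's own local features, while the values being mixed come from the other anchors. This is what lets an anchor whose own view is blocked borrow evidence from anchors that see the road clearly.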
Comparison with other models
Conclusion
LaneATT is a robust model for lane detection in autonomous driving. It runs in real time and, thanks to its attention mechanism, it can detect lanes even under difficult conditions, such as a lane being hidden by another object.