Problem:
Stitching together multiple images provides a wider field of view, but traditional stitching methods suffer from irregular boundaries and distortions. Cropping the stitched image defeats the purpose of the operation by restricting the field of view, while completion methods don’t reproduce the scenery correctly. For high content fidelity, image rectangling has been used, but these methods struggle when the images contain few straight lines, because their mesh deformation relies on line detection.
Proposed solution:
To address the problem, a one-stage learning baseline is proposed. A rigid target mesh is predefined, and only the initial mesh is predicted: a fully convolutional network estimates this content-aware mesh from the stitched image using a residual progressive regression strategy.
Methodology:
Feature Extractor:
A stack of simple convolution-pooling blocks extracts high-level semantic features from the input.
Mesh motion regressor:
After feature extraction, an adaptive pooling layer is utilized to fix the resolution of feature maps. Subsequently, we design a fully convolutional structure as the mesh motion regressor to predict the horizontal and vertical motions of every vertex based on the regular mesh.
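To make the regressor's output concrete, here is a minimal sketch of a regular mesh and of how per-vertex horizontal and vertical motions turn it into a predicted mesh. The function names, the mesh resolution, and the image size are illustrative assumptions, not values taken from the paper, and the motions are set by hand instead of coming from a network.

```python
def regular_mesh(width, height, cols, rows):
    """Return the (rows + 1) x (cols + 1) vertices of an evenly spaced mesh.

    Each vertex is an (x, y) tuple; cols and rows count mesh cells.
    """
    return [
        [(c * width / cols, r * height / rows) for c in range(cols + 1)]
        for r in range(rows + 1)
    ]


def apply_motions(mesh, motions):
    """Offset every vertex by its (dx, dy) motion and return the new mesh."""
    return [
        [(x + dx, y + dy) for (x, y), (dx, dy) in zip(mesh_row, motion_row)]
        for mesh_row, motion_row in zip(mesh, motions)
    ]


# Hypothetical example: a mesh of 2 x 2 cells on a 4 x 4 image.
mesh = regular_mesh(width=4, height=4, cols=2, rows=2)

# Stand-in for the regressor output: one (dx, dy) pair per vertex.
motions = [[(0.0, 0.0)] * 3 for _ in range(3)]
motions[1][1] = (0.5, -0.25)  # move only the centre vertex

predicted = apply_motions(mesh, motions)
```

Because the regressor outputs one motion pair per vertex, its output size depends only on the mesh resolution, which is why the feature maps are brought to a fixed resolution first.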
Residual progressive regression:
Accurate mesh motions are estimated progressively. We design two regressors with the same structure: the first predicts primary mesh motions and the second predicts residual mesh motions. Between the two, we warp the intermediate feature maps, which improves performance with only a slight increase in computation.
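The control flow of this strategy can be sketched as follows. This is a toy illustration under stated assumptions: the final motions are taken to be the sum of the primary and residual motions, and the two regressors and the warping step are hypothetical stand-ins passed in as plain functions, not the paper's networks.

```python
def add_motions(a, b):
    """Element-wise sum of two grids of (dx, dy) motions."""
    return [
        [(ax + bx, ay + by) for (ax, ay), (bx, by) in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


def progressive_regression(features, primary_regressor, residual_regressor, warp):
    """Predict primary motions, warp the features, then predict residuals."""
    primary = primary_regressor(features)
    warped = warp(features, primary)
    residual = residual_regressor(warped)
    return primary, add_motions(primary, residual)


# Toy stand-ins (assumptions): here the "features" directly hold the motion
# still needed at each vertex, so the effect of each stage is easy to follow.
def toy_primary(features):
    # The first pass recovers only half of the required motion.
    return [[(0.5 * dx, 0.5 * dy) for dx, dy in row] for row in features]


def toy_warp(features, motions):
    # Warping removes the motion already explained by the first pass.
    return [
        [(fx - mx, fy - my) for (fx, fy), (mx, my) in zip(f_row, m_row)]
        for f_row, m_row in zip(features, motions)
    ]


def toy_residual(features):
    # The second pass recovers whatever motion is left.
    return [list(row) for row in features]


features = [[(4.0, -2.0)]]
primary, final = progressive_regression(features, toy_primary, toy_residual, toy_warp)
```

In this toy setup the primary pass gets halfway to the required motion and the residual pass closes the gap, which mirrors the idea of refining a coarse estimate without running a second full feature extractor.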
The motivation for image rectangling is that users are dissatisfied with the irregular boundaries in stitched images. Therefore, the goal is to produce rectangular images that please most users.
Data:
As there is no suitable dataset of paired stitched and rectangular images, we build a deep image rectangling dataset (DIR-D) with a wide range of irregular boundaries and scenes.
Results:
Stitched images are rectangled using different algorithms. The results are shown below, where the proposed solution produces fewer distortions than the alternatives.
More cross-dataset results are displayed in the figure below, which shows the superiority of rectangling over other solutions such as cropping and completion.
The proposed learning solution is significantly better than the traditional solution on every metric on DIR-D. This improvement is attributed to its content-preserving property, which keeps both linear and non-linear structures intact.
Bibliography:
- https://arxiv.org/pdf/2203.03831v1.pdf