Abstract: Fisheye cameras, valued for their wide field of view, play a crucial role in perceiving the environment surrounding a vehicle. However, little research specifically addresses how to handle the severe distortion of fisheye images in semantic segmentation. In addition, fisheye datasets for autonomous driving are scarce, which can cause overfitting and hinder a model's generalization ability.
For the semantic segmentation task, we propose a method for transforming normal (rectilinear) images into fisheye images, which expands the fisheye image dataset. By employing a Transformer network together with Across Feature Map Attention, segmentation performance is further improved, reaching 55.6% mIoU on Woodscape. Finally, leveraging the idea of knowledge distillation, the network achieves strong generalization through dual-domain learning without compromising performance on Woodscape (54% mIoU).
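The abstract does not detail the normal-to-fisheye transformation itself. The sketch below is one plausible instance, not the paper's implementation: it assumes an equidistant fisheye model (r = f·θ) and a pinhole source image with the principal point at the image centre; the function name `rectilinear_to_fisheye` and its parameters are illustrative.

```python
import cv2
import numpy as np

def rectilinear_to_fisheye(img, f_fish=300.0, f_pin=None):
    """Warp a rectilinear image into an equidistant fisheye projection.

    Assumed model (not confirmed by the paper): r_fish = f_fish * theta,
    with the principal point at the image centre.
    """
    h, w = img.shape[:2]
    if f_pin is None:
        f_pin = w / 2.0  # assumed pinhole focal length (~90 deg horizontal FOV)
    cx, cy = w / 2.0, h / 2.0

    # Output (fisheye) pixel grid in centred coordinates.
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    dx, dy = xs - cx, ys - cy
    r_fish = np.sqrt(dx**2 + dy**2)

    # Equidistant model: theta = r_fish / f_fish.
    theta = np.clip(r_fish / f_fish, 0.0, np.pi / 2 - 1e-3)
    # Pinhole model: r_pin = f_pin * tan(theta).
    r_pin = f_pin * np.tan(theta)

    # Inverse mapping: where each fisheye pixel samples the source image.
    scale = np.where(r_fish > 1e-8, r_pin / np.maximum(r_fish, 1e-8), 1.0)
    map_x = (cx + dx * scale).astype(np.float32)
    map_y = (cy + dy * scale).astype(np.float32)

    return cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT, borderValue=0)
```

For dataset expansion, the same inverse mapping would be applied to the ground-truth label map, using nearest-neighbour interpolation so that class indices stay valid and annotations remain aligned with the warped image.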