Abstract: Person re-identification networks typically remove the last spatial down-sampling operation in the backbone to increase the resolution of the final output feature map and preserve more fine-grained features. However, removing this operation substantially shrinks the receptive field, and a larger receptive field provides more contextual information for person re-identification. Moreover, in the biological visual cortex, neurons in the same region have receptive fields of different sizes, a property that current person re-identification network designs largely ignore. To address these problems, this article proposes a novel adaptive receptive field network whose design is inspired by the visual system of living organisms. The network sets receptive fields of different sizes on the branches of a multi-branch module and uses an attention mechanism to select the appropriate receptive field features, which makes the receptive field of the network adaptive. Grouped convolution keeps the adaptive receptive field module lightweight. In addition, each branch uses dilated convolution to enlarge its receptive field, which compensates for the reduction caused by removing the last down-sampling operation. Experiments on publicly available large-scale datasets show that the proposed algorithm improves significantly over the baseline: with ResNet-50 as the backbone, Rank-1 accuracy and mAP reach 89.2% and 76.0% on DukeMTMC-reID and 95.2% and 87.2% on Market-1501. The proposed algorithm also achieves higher accuracy than existing methods.
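The receptive field argument above can be illustrated with a short calculation. The following is a minimal sketch, not the network proposed in this article: it uses the standard formula for the effective kernel size of a dilated convolution, k + (k - 1)(d - 1), and the standard parameter count of a grouped convolution. The function names, the two-layer toy stack, the dilation rates 1, 2, 3, and the channel and group counts are all illustrative assumptions, since the abstract does not state the actual values.

```python
def effective_kernel(kernel_size, dilation):
    """Effective kernel size of a dilated convolution."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of conv layers.

    layers: list of (kernel_size, stride, dilation), ordered input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


def conv_params(kernel_size, in_channels, out_channels, groups=1):
    """Weight count of a 2D convolution (bias ignored)."""
    return kernel_size * kernel_size * (in_channels // groups) * out_channels


# Hypothetical toy stack: a strided 3x3 conv followed by a 3x3 conv.
with_stride = [(3, 2, 1), (3, 1, 1)]
# Same stack with the down-sampling stride removed.
no_stride = [(3, 1, 1), (3, 1, 1)]
# Stride removed, second conv dilated to compensate.
no_stride_dilated = [(3, 1, 1), (3, 1, 2)]

if __name__ == "__main__":
    for name, stack in [("with stride", with_stride),
                        ("stride removed", no_stride),
                        ("stride removed + dilation", no_stride_dilated)]:
        print(name, receptive_field(stack))

    # Assumed branch dilations 1, 2, 3: same 3x3 weights, different coverage.
    print([effective_kernel(3, d) for d in (1, 2, 3)])

    # Assumed 256 channels: grouped conv cuts weights by the group count.
    print(conv_params(3, 256, 256, groups=1), conv_params(3, 256, 256, groups=32))
```

In this toy stack the receptive field drops from 7 to 5 when the stride is removed and returns to 7 once the second layer uses dilation 2. The dilated branches cover 3, 5, and 7 pixels with the same number of weights, and grouping with 32 groups reduces the weight count by a factor of 32.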