Abstract: Human action is a coordinated process governed by the direction of limb movement, the order in which joints move, and the amplitude of motion. Existing methods, however, tend to model the raw 3D skeletal joint coordinates directly, and so easily overlook the sequential relationship between limb joint activities, the directionality of motion, and variation in movement amplitude. This paper therefore proposes a sequence-driven and direction-driven skeletal convolutional neural network based on point-bone features, which recognizes human actions by characterizing the order of joint movements, inter-frame distances, and bone direction vectors. The network consists of a sequence-driven unit and a direction-driven unit. The sequence-driven unit models the joints at the ends of the bones and uses the joint arrangement and inter-frame distance information to characterize the order of joint movements and the magnitude of limb changes. The direction-driven unit uses the bone direction vectors to characterize the directionality of limb movement. Finally, the feature maps of the two units are fused to classify human daily actions. Experiments on two large datasets, NTU-RGB+D60 and NTU-RGB+D120, show improvements over the benchmark method of 2.6% and 3.5% on the former and 5.9% and 6.1% on the latter. By exploiting the complementarity between multiple features, the proposed method characterizes daily human actions more thoroughly and improves recognition accuracy.
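To make the two kinds of input cue concrete, the following is a minimal sketch of how bone direction vectors and inter-frame distances could be computed from a skeleton sequence. It is an illustration under stated assumptions, not the network's actual preprocessing: the 5-joint chain in `BONES`, the function names, and the representation of a frame as a list of `(x, y, z)` tuples are all hypothetical.

```python
import math

# Hypothetical 5-joint chain of (start joint, end joint) pairs.
# A real skeleton (e.g. from NTU-RGB+D) would supply its own bone list.
BONES = [(0, 1), (1, 2), (2, 3), (3, 4)]


def bone_directions(frame, bones=BONES, eps=1e-8):
    """Unit direction vector of each bone in one frame.

    frame: list of (x, y, z) joint coordinates.
    """
    directions = []
    for start, end in bones:
        vec = [b - a for a, b in zip(frame[start], frame[end])]
        length = max(math.hypot(*vec), eps)  # guard against zero-length bones
        directions.append(tuple(c / length for c in vec))
    return directions


def inter_frame_distances(seq, bones=BONES):
    """Distance each bone-end joint travels between consecutive frames.

    seq: list of frames; returns one row per pair of consecutive frames.
    """
    return [
        [math.dist(prev[end], cur[end]) for _, end in bones]
        for prev, cur in zip(seq, seq[1:])
    ]
```

For a sequence of T frames and B bones, `bone_directions` applied per frame gives T x B unit vectors (a direction cue), while `inter_frame_distances` gives (T - 1) x B scalars (an amplitude cue). How such features are arranged and fed to the two units is not specified in the abstract and is left open here.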