Fault diagnosis technology detects potential equipment faults by analyzing measured signals, which helps ensure operational safety and improves operating efficiency. Bearings are widely used in rotating machinery, and a bearing fault can seriously disrupt normal operation, cause economic loss, and even endanger personnel. It is therefore of great theoretical and practical significance to monitor bearing health, locate faults, and assess their severity in a timely manner. Under actual engineering conditions, the operating environment and working conditions of mechanical equipment are complex and variable. Intelligent fault diagnosis methods based on Artificial Neural Networks (ANNs) can effectively identify the health status of equipment, but a traditional ANN requires a large number of labeled samples for training and adapts poorly to different working conditions, which greatly limits its application in equipment fault diagnosis. To address these problems, this article proposes a bearing fault diagnosis model based on transfer learning theory. The model consists of a stacked sparse autoencoder (SAE) and Softmax regression. High-order KL divergence (HKL) is used to train the model's domain adaptation ability, so that knowledge from a working condition with a large amount of known data can be transferred to a similar condition with only a small amount of data. Consequently, only a small amount of data is needed to adapt the model to a new working condition. The bearing data set from Case Western Reserve University was used to verify the effectiveness of the model.
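The architecture named above (a stacked encoder feeding Softmax regression, with a divergence term that pulls source-domain and target-domain features together) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration: the layer sizes, the random weights, the synthetic data, and all function names. The abstract does not define the high-order KL divergence, so the sketch substitutes a plain symmetric KL divergence between the mean hidden activations of the two domains; it is a stand-in, not the HKL formulation of the article.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def bernoulli_kl(p, q, eps=1e-8):
    # KL divergence between Bernoulli distributions, summed over hidden units.
    p = np.clip(p, eps, 1 - eps)
    q = np.clip(q, eps, 1 - eps)
    return float(np.sum(p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))))

def encode(x, layers):
    # Forward pass through the stacked encoder layers.
    h = x
    for W, b in layers:
        h = sigmoid(h @ W + b)
    return h

def domain_gap(h_src, h_tgt):
    # Assumed stand-in for HKL: symmetric KL between mean activations.
    m_src, m_tgt = h_src.mean(axis=0), h_tgt.mean(axis=0)
    return bernoulli_kl(m_src, m_tgt) + bernoulli_kl(m_tgt, m_src)

rng = np.random.default_rng(0)
sizes = [64, 32, 16]            # hypothetical input and hidden sizes
n_classes = 4                   # hypothetical number of health states
layers = [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]
W_out, b_out = rng.normal(0, 0.1, (sizes[-1], n_classes)), np.zeros(n_classes)

x_src = rng.normal(0.0, 1.0, (200, 64))   # many source-condition samples
x_tgt = rng.normal(0.5, 1.0, (20, 64))    # few target-condition samples

h_src, h_tgt = encode(x_src, layers), encode(x_tgt, layers)
probs = softmax(h_src @ W_out + b_out)    # class probabilities per sample
gap = domain_gap(h_src, h_tgt)            # term to minimise during adaptation
sparsity = bernoulli_kl(np.full(sizes[-1], 0.05), h_src.mean(axis=0))
```

In a training loop, `gap` would be added to the classification loss (and `sparsity` to the autoencoder reconstruction loss) with weighting factors chosen by the practitioner; those weights and the optimiser are left out of this sketch.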