Inspecting water conveyance tunnels is an important part of operating water diversion projects, and siltation is a major threat to tunnel safety. Identifying siltation accurately and efficiently reduces risk and improves the safety and reliability of these projects. A remotely operated vehicle (ROV) can detect siltation, but the image data it collects still lack intelligent, automated recognition. This paper addresses that gap with ensemble deep learning. Based on the VGG16 network, a compact convolutional neural network (CNN) called Silt-net is designed as the primary learner for identifying siltation images. A fully-connected network serves as the meta-learner, and stacking ensemble learning combines the outputs of the primary classifiers to produce the final classification. Several evaluation metrics are used to measure the performance of the proposed method. Experimental results on the siltation dataset show that the proposed method reaches a classification accuracy of 97.2%, substantially higher than that of the other classifiers compared. Furthermore, the proposed method balances accuracy against model complexity, which makes it suitable for platforms with limited computing resources.
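
The stacking step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the primary learners are replaced by hand-written held-out probability outputs, the task is assumed to be two-class (0 = no siltation, 1 = siltation), and the meta-learner is reduced to a single fully-connected softmax layer trained by batch gradient descent. All names (`stack_features`, `train_meta_learner`, `predict`) and all numbers are hypothetical.

```python
import math

def stack_features(primary_outputs):
    """Concatenate each primary learner's class probabilities per sample.

    primary_outputs: list over learners; each entry is a list over samples
    of per-class probability lists.
    """
    n_samples = len(primary_outputs[0])
    return [[p for learner in primary_outputs for p in learner[i]]
            for i in range(n_samples)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def scores(W, b, x):
    """Output of one fully-connected layer: W x + b."""
    return [sum(w * v for w, v in zip(W[k], x)) + b[k] for k in range(len(b))]

def train_meta_learner(features, labels, n_classes, lr=0.5, epochs=2000):
    """Fit the softmax layer with batch gradient descent on cross-entropy."""
    n, n_feat = len(features), len(features[0])
    W = [[0.0] * n_feat for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        gW = [[0.0] * n_feat for _ in range(n_classes)]
        gb = [0.0] * n_classes
        for x, y in zip(features, labels):
            probs = softmax(scores(W, b, x))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)
                gb[k] += err
                for j in range(n_feat):
                    gW[k][j] += err * x[j]
        for k in range(n_classes):
            b[k] -= lr * gb[k] / n
            for j in range(n_feat):
                W[k][j] -= lr * gW[k][j] / n
    return W, b

def predict(W, b, x):
    s = scores(W, b, x)
    return max(range(len(s)), key=s.__getitem__)

# Hypothetical held-out outputs [P(class 0), P(class 1)] of three primary
# learners on six images. Each learner misclassifies exactly one image.
labels = [0, 0, 0, 1, 1, 1]
learner_a = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7], [0.1, 0.9]]
learner_b = [[0.7, 0.3], [0.6, 0.4], [0.8, 0.2], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
learner_c = [[0.6, 0.4], [0.9, 0.1], [0.7, 0.3], [0.3, 0.7], [0.55, 0.45], [0.2, 0.8]]

meta_X = stack_features([learner_a, learner_b, learner_c])
W, b = train_meta_learner(meta_X, labels, n_classes=2)
preds = [predict(W, b, x) for x in meta_X]
```

In this toy setting the meta-learner recovers the correct label for all six images even though each primary learner is wrong on one of them. In practice the meta-learner would be trained on out-of-fold predictions of the primary learners and evaluated on a separate test set.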