Download article:
https://www.ijert.org/real-time-video...
IJERTV10IS120054
Real Time Video Captioning Using Deep Learning
Sandeep Samleti, Ashish Mishra, Alok Jhajhria, Shivam Kumar Rai, Gaurav Malik
In today's world of rapidly developing technology, video processing has become extremely important. Videos of many kinds, including surveillance, social, and informational videos, are now part of day-to-day life and of our environment. With video captioning, objects can be identified, videos can be summarized and described, and data can be searched. It can also help blind people by describing the events happening around them, and it can support military operations and surveillance by detecting threats and helping weapons and soldiers to destroy them. A video caption generator uses a video encoder and caption decoder framework. In this research paper we discuss two models: the first is the Hierarchical model and the second is the Multi-stream Hierarchical Boundary model. The Hierarchical model is combined with steered captioning. The Hierarchical model captures clip-level temporal features from clips taken at fixed time steps to represent a video. The Multi-stream Hierarchical Boundary model combines a fixed hierarchy model with a soft hierarchy model, using intrinsic feature boundary cuts to define the clips in a video. The Steered captioning model is an attention model in which visual parameters are used to lead the attention to appropriate locations in the video. This research also discusses a parametric Gaussian attention. Soft attention techniques require fixed-length video streams, a limitation that Gaussian attention techniques remove.
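To illustrate why a parametric Gaussian attention does not need a fixed-length video stream, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the parameterization, so the normalized center `mu`, the width `sigma`, and the function names `gaussian_attention_weights` and `attend` are all assumptions made for illustration. The idea shown is only that two parameters define a weight for every frame, whatever the number of frames.

```python
import math


def gaussian_attention_weights(num_frames, mu, sigma):
    """Return one attention weight per frame, summing to 1.

    Assumed parameterization (hypothetical, not from the paper):
    frame positions are normalized to [0, 1], `mu` is the center of
    attention on that axis, and `sigma` > 0 is its width.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if num_frames == 1:
        return [1.0]
    # Normalized position of each frame, independent of video length.
    positions = [t / (num_frames - 1) for t in range(num_frames)]
    # Log of the unnormalized Gaussian score for each frame.
    log_scores = [-((p - mu) ** 2) / (2.0 * sigma ** 2) for p in positions]
    # Subtract the maximum before exponentiating to avoid underflow to all zeros.
    peak = max(log_scores)
    scores = [math.exp(s - peak) for s in log_scores]
    total = sum(scores)
    return [s / total for s in scores]


def attend(features, mu, sigma):
    """Weighted sum of per-frame feature vectors (lists of floats)."""
    weights = gaussian_attention_weights(len(features), mu, sigma)
    dim = len(features[0])
    return [
        sum(w * frame[d] for w, frame in zip(weights, features))
        for d in range(dim)
    ]


if __name__ == "__main__":
    # The same two parameters apply to videos of different lengths.
    for length in (5, 12):
        weights = gaussian_attention_weights(length, mu=0.5, sigma=0.2)
        print(length, [round(w, 3) for w in weights])
```

In a captioning model the values of `mu` and `sigma` would presumably be predicted by the decoder at each step rather than fixed by hand; that part is left out of this sketch.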