Video Segmentation Techniques for Instructional Videos – Survey


  • Jyoti Parsola Department of Computer Application
  • Durgaprasad Gangodkar Department of Computer Science and Engineering, Graphic Era (Deemed to be University), Dehradun, Uttarakhand, India
  • Ankush Mittal Department of Computer Science and Engineering, Graphic Era (Deemed to be University), Dehradun, Uttarakhand, India


Video Segmentation, E-Learning Applications


Low-cost smartphones and easy internet access have increased the viewership of e-learning videos.
Because mobile phones usually have limited storage, it becomes extremely important to reduce the size of
these instructional videos. Video segmentation is a fundamental step in reducing the size of e-learning videos.
This paper gives an overview of existing techniques for video segmentation of e-learning videos. Most of
the methods used so far for segmenting instructional videos fall broadly into two categories: (i) feature-extraction-based
segmentation and (ii) motion-based segmentation. The performance, comparative merits, and limitations of
each approach are thoroughly examined and contrasted. The analysis is useful for choosing appropriately among
existing methods, for enhancing their performance, and for forming new methods by combining
existing ones.





Journal of Graphic Era University

Vol. 7, Issue 2, 90-107, 2019

ISSN: 0975-1416 (Print), 2456-4281 (Online)





How to Cite

Parsola, J., Gangodkar, D., & Mittal, A. (2023). Video Segmentation Techniques for Instructional Videos – Survey. Journal of Graphic Era University, 7(2), 90–107. Retrieved from



