Abstract: Research has increasingly been directed toward human-computer interaction, and this work aims to establish an end-to-end real-time video emotion detection system based on Vision Transformers (ViT) ...