🚀🤖🧠 Vision transformer for video interpolation. Take your video to another level with AI

nagarajRPoojari/video-interpolation-AI

Vision Transformer for video interpolation 🤖📷✌️

This project implements a Vision Transformer model for video frame interpolation, which synthesizes intermediate frames to increase a video's frame rate (FPS), and compares its performance with earlier approaches such as Deep Voxel Flow and Super SloMo.

Introduction

I implemented the VIFT and Super SloMo models as described in their respective papers (vift, slomo). The three approaches compared are:

  • Deep Voxel Flow
    • Combines optical flow with a CNN.
    • Struggles with complex motion.
  • Super SloMo
    • Replaces explicit optical flow with a U-Net-like flow interpolation architecture.
    • Computationally expensive.
  • Video Frame Interpolation Transformer (VIFT)
    • Uses Swin Transformer blocks (Shifted Window Transformer), which restrict self-attention to local windows and so reduce its time complexity from quadratic to linear in the number of tokens.
    • Much smaller than Super SloMo (7M vs. 38M parameters) while still achieving better PSNR and SSIM.
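The quadratic-to-linear claim can be checked with simple counting. The sketch below is not taken from this repository: it assumes a 2D grid of tokens and a fixed 8x8 window (the actual model may use different window sizes and spatio-temporal windows), and the function names are hypothetical. It counts query-key pairs only, ignoring the channel dimension.

```python
def global_attention_pairs(height: int, width: int) -> int:
    """Query-key pairs when every token attends to every other token."""
    tokens = height * width
    return tokens * tokens


def window_attention_pairs(height: int, width: int, window: int) -> int:
    """Query-key pairs when attention is restricted to non-overlapping
    window x window blocks (the window size is an assumed hyperparameter)."""
    if height % window or width % window:
        raise ValueError("height and width must be divisible by window")
    tokens_per_window = window * window
    num_windows = (height // window) * (width // window)
    return num_windows * tokens_per_window ** 2


# Doubling each side quadruples the token count: the global cost grows 16x,
# the windowed cost only 4x.
for side in (32, 64, 128):
    g = global_attention_pairs(side, side)
    w = window_attention_pairs(side, side, window=8)
    print(f"{side}x{side} tokens: global={g:,} windowed={w:,}")
```

With the window size held constant, the windowed count equals the number of tokens times the number of tokens per window, so it scales linearly with resolution. Shifting the windows between successive blocks, as Swin does, lets information cross window boundaries without changing this count.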

Demo

Results

| Model | Parameters (M) | PSNR (peak signal-to-noise ratio, dB) | SSIM (structural similarity index) |
|---|---|---|---|
| Deep Voxel Flow | - | 27.6 | 0.92 |
| Super SloMo | 38 | 31.4 | 0.94 |
| VIFT | 7 | 35.1 | 0.96 |
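For readers unfamiliar with the PSNR column, here is a minimal sketch of how the metric is commonly computed between a ground-truth frame and an interpolated frame. It is not the evaluation code used for the numbers above: it assumes 8-bit pixel values (peak 255) passed as flat sequences, and the `psnr` helper is a hypothetical name.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], estimate: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are given as flat sequences of pixel values; max_value is the
    largest possible pixel value (255 is assumed here for 8-bit images).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


truth = [10, 20, 30, 40]
predicted = [11, 21, 31, 41]  # every pixel is off by one intensity level
print(f"PSNR: {psnr(truth, predicted):.2f} dB")
```

Higher is better: PSNR is logarithmic in the mean squared error, so each 10 dB gain corresponds to a tenfold reduction in MSE.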

References

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
