Generative adversarial networks (GANs) have demonstrated impressive image generation quality and semantic editing capabilities for real images, e.g., changing object classes, modifying attributes, or transferring styles. However, applying these GAN-based edits to a video independently for each frame inevitably results in temporal flickering artifacts. We present a simple yet effective method to facilitate temporally coherent video editing. Our core idea is to minimize the temporal photometric inconsistency by optimizing both the latent code and the pre-trained generator. We evaluate the quality of our editing on different domains and GAN inversion techniques and show favorable results against the baselines.
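The core idea can be sketched in a few lines. The toy below jointly optimizes per-frame latent codes and the generator weights to reduce photometric inconsistency between consecutive generated frames. It is a minimal sketch under loud assumptions: a linear "generator" stands in for a pre-trained StyleGAN, and frames are compared directly rather than after optical-flow warping, so this illustrates only the joint-optimization idea, not the actual method.

```python
import numpy as np

def photometric_inconsistency(frames):
    """Mean squared difference between consecutive frames."""
    return float(np.mean((frames[1:] - frames[:-1]) ** 2))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))   # toy "generator" weights (fine-tuned, stand-in for StyleGAN)
Z = rng.normal(size=(3, 4))   # per-frame latent codes for 3 frames (optimized)

loss_before = photometric_inconsistency(Z @ W.T)

lr = 0.002
for _ in range(500):
    frames = Z @ W.T                   # (T, 8) toy "images", one row per frame
    d = frames[1:] - frames[:-1]       # temporal differences
    grad_f = np.zeros_like(frames)     # gradient of sum_t ||d_t||^2 w.r.t. frames
    grad_f[:-1] -= 2 * d
    grad_f[1:] += 2 * d
    Z -= lr * (grad_f @ W)             # chain rule through frames = Z @ W.T
    W -= lr * (grad_f.T @ Z)

loss_after = photometric_inconsistency(Z @ W.T)
```

Updating both `Z` and `W` mirrors the paper's choice of optimizing the latent code and fine-tuning the generator together, rather than adjusting latents alone.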
Our method can be applied to diverse videos from the Internet, edited with off-the-shelf in-domain editing methods, e.g., InterfaceGAN and StyleCLIP.
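For context, an InterfaceGAN-style in-domain edit is a linear step along a semantic direction in latent space. The sketch below uses a random vector as a stand-in for the learned attribute boundary (an assumption; in practice the direction is fit from attribute-labeled latents):

```python
import numpy as np

def edit_latent(w, direction, alpha):
    """Move latent code w by alpha along a normalized semantic direction."""
    unit = direction / np.linalg.norm(direction)
    return w + alpha * unit

rng = np.random.default_rng(1)
w = rng.normal(size=512)           # a StyleGAN W-space code (512-dim)
direction = rng.normal(size=512)   # stand-in for a learned attribute direction

w_edit = edit_latent(w, direction, alpha=3.0)  # larger alpha = stronger edit
```

Applying such an edit per frame is exactly the setting that causes flickering, which our optimization then removes.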
Our method can also be applied to out-of-domain editing. We show results on the RAVDESS dataset below, using StyleGAN-NADA for the out-of-domain edits.
Please note that we do not apply stitching here.
Our method is built upon prior work. We share some useful links below.
@article{xu2022videoeditgan,
  author  = {Xu, Yiran and AlBahar, Badour and Huang, Jia-Bin},
  title   = {Temporally Consistent Semantic Video Editing},
  journal = {arXiv preprint arXiv:2206.10590},
  year    = {2022},
}