pix2pix : Image Translation with GAN (2)

Image-to-Image Translation with Conditional Adversarial Networks (pix2pix)

Published at CVPR 2017 by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros

pix2pix learns from paired images drawn from a source domain $S$ and a target domain $T$, such as:

  • BW & Color image
  • Street Scene & Label
  • Facade & Label
  • Aerial & Map
  • Day & Night
  • Edges & Photo

A source image $x \in S$ and a target image (label) $y \in T$ are given as pairs, so pix2pix is supervised learning.
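
The pairing makes data loading straightforward. As one common convention (the released pix2pix datasets store each pair as a single side-by-side image), here is a minimal PyTorch `Dataset` sketch; the directory layout and which half is source vs. target are assumptions:

```python
from pathlib import Path

from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class PairedImageDataset(Dataset):
    """Loads (x, y) pairs stored as one side-by-side image per file, the
    format used by the released pix2pix datasets. The directory layout and
    which half is source vs. target are assumptions for this sketch."""
    def __init__(self, root):
        self.paths = sorted(Path(root).glob("*.jpg"))
        self.to_tensor = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),  # scale to [-1, 1]
        ])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        ab = Image.open(self.paths[i]).convert("RGB")
        w, h = ab.size
        x = self.to_tensor(ab.crop((0, 0, w // 2, h)))   # left half: source (assumed)
        y = self.to_tensor(ab.crop((w // 2, 0, w, h)))   # right half: target (assumed)
        return x, y
```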

Generator of pix2pix

The generator is $G(x, z)$, where $x$ is a source image and $z$ is noise.

A U-Net-shaped network is used (see the sketch after this list):

  • U-Net is known to be powerful at segmentation tasks
  • skip connections carry spatial information from the encoder (bottom) layers to the decoder
  • dropout in the decoder part serves as the noise $z$
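
Below is a minimal PyTorch sketch of the U-Net generator idea, not the paper's exact 8-down/8-up architecture: encoder features are concatenated into the decoder through skip connections, and dropout in the decoder plays the role of $z$. Layer sizes here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TinyUNetGenerator(nn.Module):
    """Sketch of a U-Net-style generator (much shallower than the paper's).

    Skip connections concatenate encoder features into the decoder so that
    low-level spatial detail survives the bottleneck; dropout in the decoder
    acts as the noise z.
    """
    def __init__(self, in_ch=3, out_ch=3, nf=64):
        super().__init__()
        # Encoder: stride-2 convs halve the spatial resolution at each step.
        self.down1 = nn.Sequential(nn.Conv2d(in_ch, nf, 4, 2, 1), nn.LeakyReLU(0.2))
        self.down2 = nn.Sequential(nn.Conv2d(nf, nf * 2, 4, 2, 1),
                                   nn.BatchNorm2d(nf * 2), nn.LeakyReLU(0.2))
        self.down3 = nn.Sequential(nn.Conv2d(nf * 2, nf * 4, 4, 2, 1),
                                   nn.BatchNorm2d(nf * 4), nn.LeakyReLU(0.2))
        # Decoder: transposed convs upsample; dropout injects stochasticity.
        self.up1 = nn.Sequential(nn.ConvTranspose2d(nf * 4, nf * 2, 4, 2, 1),
                                 nn.BatchNorm2d(nf * 2), nn.Dropout(0.5), nn.ReLU())
        self.up2 = nn.Sequential(nn.ConvTranspose2d(nf * 4, nf, 4, 2, 1),
                                 nn.BatchNorm2d(nf), nn.Dropout(0.5), nn.ReLU())
        self.up3 = nn.Sequential(nn.ConvTranspose2d(nf * 2, out_ch, 4, 2, 1), nn.Tanh())

    def forward(self, x):
        d1 = self.down1(x)                           # (B, nf,  H/2, W/2)
        d2 = self.down2(d1)                          # (B, 2nf, H/4, W/4)
        d3 = self.down3(d2)                          # (B, 4nf, H/8, W/8)
        u1 = self.up1(d3)                            # (B, 2nf, H/4, W/4)
        u2 = self.up2(torch.cat([u1, d2], dim=1))    # skip connection from d2
        return self.up3(torch.cat([u2, d1], dim=1))  # skip connection from d1

# TinyUNetGenerator()(torch.randn(1, 3, 256, 256)).shape  # -> (1, 3, 256, 256)
```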

Discriminator of pix2pix
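
The discriminator is conditional: it scores pairs, seeing either a real pair $(x, y)$ or a fake pair $(x, G(x, z))$, as in the loss below. The paper uses a PatchGAN, which classifies each local patch as real or fake instead of emitting one score per image. A simplified sketch (fewer layers than the paper's 70×70 PatchGAN; sizes are assumptions):

```python
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """Sketch of a conditional PatchGAN-style discriminator (simplified depth).

    The source x and a candidate target (real y or fake G(x, z)) are
    concatenated along channels, so D judges pairs rather than lone images.
    The output is a grid of logits, one per receptive-field patch.
    """
    def __init__(self, src_ch=3, tgt_ch=3, nf=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(src_ch + tgt_ch, nf, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(nf, nf * 2, 4, 2, 1), nn.BatchNorm2d(nf * 2), nn.LeakyReLU(0.2),
            nn.Conv2d(nf * 2, 1, 4, 1, 1),   # per-patch real/fake logit map
        )

    def forward(self, x, y):
        return self.net(torch.cat([x, y], dim=1))  # (B, 1, H', W') patch logits
```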

Loss function


$x$: source image, $y$: target image, $z$: noise

pix2pix uses an adversarial (cGAN) loss together with an L1 loss:

\begin{equation}
\mathcal{L}_{cGAN}(G,D) = \mathbb{E}_{x,y \sim p_{data}(x,y)}[\log D(x,y)] + \mathbb{E}_{x \sim p_{data}(x), z \sim p_z(z)}[\log (1-D(x,G(x,z)))]
\end{equation}

\begin{equation}
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y \sim p_{data}(x,y),z \sim p_z(z)}[||y-G(x,z)||_1]
\end{equation}
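
The two terms are combined into the final objective, where $\lambda$ weights the L1 term (the paper uses $\lambda = 100$):

\begin{equation}
G^* = \arg \min_G \max_D \; \mathcal{L}_{cGAN}(G,D) + \lambda \mathcal{L}_{L1}(G)
\end{equation}

A minimal sketch of computing these losses in one training step, assuming generator/discriminator modules like the sketches above and BCE-with-logits for the adversarial term (function names are illustrative):

```python
import torch
import torch.nn.functional as F

def generator_loss(D, x, y, fake, lam=100.0):
    """Adversarial term + lambda * L1, matching the combined objective above."""
    logits = D(x, fake)                       # D scores the fake pair (x, G(x, z))
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return adv + lam * F.l1_loss(fake, y)     # L1 pulls outputs toward the target

def discriminator_loss(D, x, y, fake):
    """Real pairs (x, y) pushed toward 1, fake pairs (x, G(x, z)) toward 0."""
    real_logits = D(x, y)
    fake_logits = D(x, fake.detach())         # detach: don't backprop into G here
    return (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
            + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
```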

Results

Try the demo!
https://affinelayer.com/pixsrv/