Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation
Bowen Li, Xiaojuan Qi, Philip H. S. Torr and Thomas Lukasiewicz
Abstract
We propose a novel lightweight generative adversarial network for image manipulation using natural language descriptions, which allows users to efficiently modify the parts of an image that match a given text describing desired visual attributes, while preserving other text-irrelevant content. In addition, we propose a new word-level discriminator that provides the generator with fine-grained training feedback at the word level, facilitating the training of a generator that can correctly focus on specific visual attributes of an image and edit them without affecting other content not described in the text. Compared to the state of the art, our method produces higher-quality modified results that are better aligned with the given descriptions, and is an order of magnitude faster, making it better suited to memory-limited devices. Extensive experimental results on two benchmark datasets demonstrate that our method better disentangles different visual attributes, correctly maps these attributes to the corresponding semantic words, and thus achieves more accurate image modification using natural language descriptions.