Neural style transfer aims to generate an image that retains the semantic content of a photograph while adopting the visual appearance of an artwork.
Traditional methods, such as the optimization-based approach introduced by Gatys et al. [1], rely on iterative updates during inference. While effective, these methods are computationally intensive and unsuitable for real-time applications. To address this limitation, feedforward networks trained with perceptual loss have been proposed, enabling style transfer in a single forward pass.
This project implements a fast neural style transfer pipeline based on the framework introduced by Johnson et al. [2], with minor modifications.
In the following sections, stylization results are presented first, followed by installation instructions and technical details. For an in-depth explanation, please see the project report.
Below are sample outputs from five different pre-trained models, each showing content images rendered in a specific artistic style:
A Gradio web application has been built and deployed on Hugging Face Spaces, allowing users to upload content images and apply stylization using the models trained in this study.
🚀 Try it out on the live application.
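As a rough illustration, a minimal Gradio app of this form is sketched below; the checkpoint path and the `stylize` helper are placeholders, not the exact code used in this repository:

```python
import gradio as gr
import torch
from torchvision import transforms

# Hypothetical: load a trained transformer network checkpoint.
# The actual model class and checkpoint path live in this repository.
model = torch.load("models/mosaic.pth", map_location="cpu")  # placeholder path
model.eval()

to_tensor = transforms.ToTensor()
to_image = transforms.ToPILImage()

def stylize(content):
    """Run one forward pass of the transformer network on a PIL image."""
    with torch.no_grad():
        x = to_tensor(content).unsqueeze(0)   # PIL image -> 1x3xHxW tensor
        y = model(x).clamp(0, 1).squeeze(0)   # stylized output in [0, 1]
    return to_image(y)

demo = gr.Interface(fn=stylize, inputs=gr.Image(type="pil"), outputs=gr.Image(type="pil"))
demo.launch()
```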
- Clone the repository

  ```bash
  git clone https://github.com/ebylmz/fast-neural-style-transfer.git
  cd fast-neural-style-transfer
  ```

- Create and activate a Conda environment

  ```bash
  conda create -n nst python=3.9
  conda activate nst
  ```

- Install dependencies

  ```bash
  pip install -e .
  ```
**Transformer Network**

- Input: RGB content image
- Architecture: convolutional layers with instance normalization and ReLU activations
- Residual blocks (ResNet-style) at the core
- Upsampling via nearest-neighbor interpolation followed by convolution
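For concreteness, a minimal PyTorch sketch of such a network is shown below; the channel counts and number of residual blocks follow Johnson et al. [2] and may differ from the exact configuration in this repository:

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two 3x3 convs with instance norm; the input is added back to the output."""
    def __init__(self, channels):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels, affine=True),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels, affine=True),
        )

    def forward(self, x):
        return x + self.block(x)

class UpsampleConv(nn.Module):
    """Nearest-neighbor upsampling followed by a conv (avoids checkerboard artifacts)."""
    def __init__(self, in_ch, out_ch, scale):
        super().__init__()
        self.scale = scale
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)

    def forward(self, x):
        x = nn.functional.interpolate(x, scale_factor=self.scale, mode="nearest")
        return self.conv(x)

class TransformerNet(nn.Module):
    """Downsample -> residual blocks -> upsample, as described above."""
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Conv2d(3, 32, 9, padding=4), nn.InstanceNorm2d(32, affine=True), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.InstanceNorm2d(64, affine=True), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.InstanceNorm2d(128, affine=True), nn.ReLU(),
            *[ResidualBlock(128) for _ in range(5)],
            UpsampleConv(128, 64, 2), nn.InstanceNorm2d(64, affine=True), nn.ReLU(),
            UpsampleConv(64, 32, 2), nn.InstanceNorm2d(32, affine=True), nn.ReLU(),
            nn.Conv2d(32, 3, 9, padding=4),
        )

    def forward(self, x):
        return self.model(x)
```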
**Perceptual Loss (VGG16-based)**

- Content Loss: feature reconstruction loss from layer `relu2_2` of VGG16.
- Style Loss: Gram matrix loss across layers `relu1_2`, `relu2_2`, and `relu3_3`.
- Total Variation Loss: regularizer to enforce spatial smoothness and suppress noise.
The optimization objective:
$$L_{total} = \lambda_c \, L_{content} + \lambda_s \, L_{style} + \lambda_{tv} \, L_{TV}$$
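A rough sketch of how these loss terms can be computed with torchvision's pre-trained VGG16 follows; the layer indices are for `vgg16().features`, and the default weight values are illustrative placeholders rather than the settings used in this project:

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

# Slice VGG16 so the forward pass exposes relu1_2, relu2_2, relu3_3.
# Indices into vgg16().features: relu1_2 -> 3, relu2_2 -> 8, relu3_3 -> 15.
vgg = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

STYLE_LAYERS = {3: "relu1_2", 8: "relu2_2", 15: "relu3_3"}
CONTENT_LAYER = "relu2_2"

def extract_features(x):
    """Collect activations at the layers above. Note: inputs should be
    ImageNet-normalized before this call; omitted here for brevity."""
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS:
            feats[STYLE_LAYERS[i]] = x
    return feats

def gram_matrix(f):
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def perceptual_loss(output, content, style_grams, lc=1.0, ls=1e5, ltv=1e-6):
    """L_total = lc * L_content + ls * L_style + ltv * L_TV (weights are placeholders)."""
    out_f, content_f = extract_features(output), extract_features(content)
    l_content = F.mse_loss(out_f[CONTENT_LAYER], content_f[CONTENT_LAYER])
    l_style = sum(F.mse_loss(gram_matrix(out_f[k]), style_grams[k]) for k in style_grams)
    l_tv = (output[..., 1:, :] - output[..., :-1, :]).abs().mean() + \
           (output[..., :, 1:] - output[..., :, :-1]).abs().mean()
    return lc * l_content + ls * l_style + ltv * l_tv
```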
The transformation network was trained using the Adam optimizer with a fixed learning rate. Each style model was trained with fixed content and total variation loss weights, so only the style weight was tuned per model.
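Putting the pieces together, one training step looks roughly like the following; `TransformerNet`, `perceptual_loss`, and `style_grams` refer to the sketches above, and `loader` plus the learning rate are placeholders (see the report for the actual settings):

```python
import torch

# Assumed setup: TransformerNet and perceptual_loss as sketched above,
# style_grams precomputed from the style image, loader yielding content batches.
net = TransformerNet()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)  # lr value is a placeholder

for content in loader:  # `loader` is an assumed DataLoader of content images
    optimizer.zero_grad()
    output = net(content)
    loss = perceptual_loss(output, content, style_grams)
    loss.backward()
    optimizer.step()
```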
Below are the training curves and snapshots for each trained style model:
This project demonstrates how fast neural style transfer can be achieved using feedforward convolutional networks trained with perceptual loss functions. By training separate models for different artistic styles, stylized images can be generated in real time with a single forward pass.
Through careful tuning of style weights and leveraging high-level VGG features, the models strike a balance between preserving content structure and capturing the aesthetics of the reference artwork.
[1] A Neural Algorithm of Artistic Style (Gatys et al., 2015)
[2] Perceptual Losses for Real-Time Style Transfer and Super-Resolution (Johnson et al., 2016)
This project is licensed under the MIT License - see the LICENSE file for details.