diff --git a/paddle2.0_docs/dcgan_face/dcgan_face.ipynb b/paddle2.0_docs/dcgan_face/dcgan_face.ipynb
index 1846926e5ba..81f5d9cb691 100644
--- a/paddle2.0_docs/dcgan_face/dcgan_face.ipynb
+++ b/paddle2.0_docs/dcgan_face/dcgan_face.ipynb
@@ -2,9 +2,7 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "# 通过DCGAN实现人脸图像生成\n",
     "\n",
@@ -16,9 +14,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 1 简介\n",
     "本项目基于paddlepaddle,结合生成对抗网络(DCGAN),通过弱监督学习的方式,训练生成真实人脸照片\n",
@@ -57,9 +53,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 2 环境设置及数据集\n",
     "\n",
@@ -80,9 +74,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "### 2.1 数据集预处理\n",
     "多线程处理,以裁切坐标(0,10)和(64,74),将输入网络的图片裁切到64*64。"
@@ -91,9 +83,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "from PIL import Image\n",
@@ -189,9 +179,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 3 模型组网\n",
     "### 3.1 定义数据预处理工具-Paddle.io.Dataset\n",
@@ -201,9 +189,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "import os\n",
@@ -261,9 +247,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "### 3.2 测试Paddle.io.DataLoader并输出图片"
    ]
@@ -271,9 +255,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "\n",
@@ -311,9 +293,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "### 3.3 权重初始化\n",
     "在 DCGAN 论文中,作者指定所有模型权重应从均值为0、标准差为0.02的正态分布中随机初始化。 \n",
@@ -323,9 +303,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "conv_initializer=paddle.nn.initializer.Normal(mean=0.0, std=0.02)\n",
@@ -334,9 +312,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "### 3.4 判别器\n",
     "如上文所述,生成器$D$是一个二进制分类网络,它以图像作为输入,输出图像是真实的(相对应$G$生成的假样本)的概率。输入$Shape$为[3,64,64]的RGB图像,通过一系列的$Conv2d$,$BatchNorm2d$和$LeakyReLU$层对其进行处理,然后通过全连接层输出的神经元个数为2,对应两个标签的预测概率。\n",
@@ -351,9 +327,7 @@
   {
    "cell_type": "code",
    "execution_count": 9,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "import paddle\n",
@@ -415,15 +389,13 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "### 3.5 生成器\n",
     "生成器$G$旨在映射潜在空间矢量$z$到数据空间。由于我们的数据是图像,因此转换$z$到数据空间意味着最终创建具有与训练图像相同大小[3,64,64]的RGB图像。在网络设计中,这是通过一系列二维卷积转置层来完成的,每个层都与$BatchNorm$层和$ReLu$激活函数。生成器的输出通过$tanh$函数输出,以使其返回到输入数据范围[−1,1]。值得注意的是,在卷积转置层之后存在$BatchNorm$函数,因为这是DCGAN论文的关键改进。这些层有助于训练过程中的梯度更好地流动。 \n",
     "\n",
     "**生成器网络结构** \n",
-    "![](https://ai-studio-static-online.cdn.bcebos.com/ca0434dd681849338b1c0c46285616f72add01ab894b4e95848daecd5a72e3cb)\n",
+    "![models](./images/models.png)\n",
     "\n",
     "* 将$BatchNorm$批归一化中$momentum$参数设置为0.5\n",
     "\n",
@@ -434,9 +406,7 @@
   {
    "cell_type": "code",
    "execution_count": 1,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "\n",
@@ -501,9 +471,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "### 3.6 损失函数\n",
     "选用BCELoss,公式如下:\n",
@@ -515,9 +483,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "###损失函数\n",
@@ -526,9 +492,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 4 模型训练\n",
     " 设置的超参数为:\n",
@@ -545,9 +509,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "import IPython.display as display\n",
@@ -650,9 +612,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "plt.figure(figsize=(15, 6))\n",
@@ -668,26 +628,22 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 5 模型评估\n",
     "### 生成器$G$和判别器$D$的损失与迭代变化图\n",
-    "![](https://ai-studio-static-online.cdn.bcebos.com/0c8cff8bf3a540bcb601bd24012f67dda622b0391f1d412ea2372693b67b4541)\n",
+    "![loss](./images/loss.png)\n",
     "\n",
     "### 对比真实人脸图像(图一)和生成人脸图像(图二)\n",
     "#### 图一\n",
-    "![](https://ai-studio-static-online.cdn.bcebos.com/622fff7b67a240ff8a1fceb209fc27445c929099365c44e8bd006569bf26e0a6)\n",
+    "![face_image1](./images/face_image1.jpeg)\n",
     "### 图二\n",
-    "![](https://ai-studio-static-online.cdn.bcebos.com/ed33cf82762d4ae79feef82b77604273303cc38b8f5d4728a632b492f2665ed4)\n"
+    "![face_image2](./images/face_image2.jpeg)\n"
    ]
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 6 模型预测\n",
     "### 输入随机数让生成器$G$生成随机人脸\n",
@@ -697,9 +653,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "device = paddle.set_device('gpu')\n",
@@ -726,9 +680,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 7 项目总结\n",
     "简单介绍了一下DCGAN的原理,通过对原项目的改进和优化,一步一步依次对生成器和判别器以及训练过程进行介绍。\n",
@@ -737,9 +689,7 @@
   },
   {
    "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
+   "metadata": {},
    "source": [
     "## 8 参考文献\n",
     "[1] Goodfellow, Ian J.; Pouget-Abadie, Jean; Mirza, Mehdi; Xu, Bing; Warde-Farley, David; Ozair, Sherjil; Courville, Aaron; Bengio, Yoshua. Generative Adversarial Networks. 2014. arXiv:1406.2661 [stat.ML].\n",
@@ -756,9 +706,9 @@
  ],
  "metadata": {
   "kernelspec": {
-   "display_name": "PaddlePaddle 2.0.0b0 (Python 3.5)",
+   "display_name": "Python 3",
    "language": "python",
-   "name": "py35-paddle1.2.0"
+   "name": "python3"
   },
   "language_info": {
    "codemirror_mode": {
@@ -770,7 +720,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.7.4"
+   "version": "3.8.2"
   }
  },
  "nbformat": 4,
diff --git a/paddle2.0_docs/dcgan_face/images/face_image1.jpeg b/paddle2.0_docs/dcgan_face/images/face_image1.jpeg
new file mode 100644
index 00000000000..5106ea520c9
Binary files /dev/null and b/paddle2.0_docs/dcgan_face/images/face_image1.jpeg differ
diff --git a/paddle2.0_docs/dcgan_face/images/face_image2.jpeg b/paddle2.0_docs/dcgan_face/images/face_image2.jpeg
new file mode 100644
index 00000000000..d6e826af68f
Binary files /dev/null and b/paddle2.0_docs/dcgan_face/images/face_image2.jpeg differ
diff --git a/paddle2.0_docs/dcgan_face/images/loss.png b/paddle2.0_docs/dcgan_face/images/loss.png
new file mode 100644
index 00000000000..1bef2dfe858
Binary files /dev/null and b/paddle2.0_docs/dcgan_face/images/loss.png differ
diff --git a/paddle2.0_docs/dcgan_face/images/models.png b/paddle2.0_docs/dcgan_face/images/models.png
new file mode 100644
index 00000000000..bbe1c8fb700
Binary files /dev/null and b/paddle2.0_docs/dcgan_face/images/models.png differ