Wasserstein GAN critic training ambiguity

I'm running a DCGAN-based GAN and am experimenting with WGANs, but I'm a bit confused about how the WGAN critic should be trained.



In the official Wasserstein GAN PyTorch implementation, the discriminator/critic is trained Diters times (usually 5) for each generator update.



Does this mean the critic/discriminator trains on Diters batches, or on the whole dataset Diters times? If I'm not mistaken, the official implementation suggests the critic is trained on the whole dataset Diters times, while other WGAN implementations (in PyTorch, TensorFlow, etc.) do the opposite.



Which is correct? The WGAN paper (to me, at least) indicates Diters batches; training on the whole dataset would obviously be orders of magnitude slower.



Thanks in advance!

Tags: python-3.x, tensorflow, machine-learning, deep-learning, pytorch

Asked Nov 20 '18 at 20:58 by krustybek

1 Answer

The correct interpretation is that each iteration works on a single batch.
In the original paper, each critic/discriminator iteration samples a batch of size m of real data and a batch of size m of prior samples from p(z) to work with. Once the critic has been trained for Diters such iterations, the generator is trained, which likewise starts by sampling a batch of prior samples from p(z).
So each iteration operates on one batch.
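
To make that concrete, here is a minimal sketch of one critic iteration following Algorithm 1 of the WGAN paper. This is not the official code: netD, netG, optimizerD, data_iter, and nz are placeholder names, and the clip value 0.01 follows the paper's default.

import torch

# One critic iteration, per Algorithm 1 of the WGAN paper.
# netD (critic), netG (generator), optimizerD, and nz (latent size) are
# assumed to exist; data_iter yields (images, labels) batches.
def critic_iteration(netD, netG, optimizerD, data_iter, nz, clip=0.01):
    real, _ = next(data_iter)               # one batch of size m of real data
    noise = torch.randn(real.size(0), nz)   # one batch of size m from the prior p(z)
    with torch.no_grad():
        fake = netG(noise)                  # generator is not updated during critic steps
    optimizerD.zero_grad()
    # The critic maximizes E[f(real)] - E[f(fake)], so minimize the negative.
    loss_d = netD(fake).mean() - netD(real).mean()
    loss_d.backward()
    optimizerD.step()
    for p in netD.parameters():             # weight clipping for the Lipschitz constraint
        p.data.clamp_(-clip, clip)
    return loss_d.item()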



The official implementation does the same. What may be confusing is that it uses the variable name niter for the number of epochs to train the model. It does, however, use a different scheme to set Diters, at lines 162-166:



# train the discriminator Diters times
if gen_iterations < 25 or gen_iterations % 500 == 0:
    Diters = 100
else:
    Diters = opt.Diters


Either way, they are, as in the paper, training the critic on Diters batches.
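
Putting it together, the outer loop looks roughly like this. Again a sketch, not the official script: it reuses the hypothetical critic_iteration helper above, and dataloader, netD, netG, optimizerG, nz, and batch_size are assumed to be defined.

gen_iterations = 0
for epoch in range(niter):                  # niter counts epochs, as in the official repo
    data_iter = iter(dataloader)
    i = 0
    while i < len(dataloader):
        # Train the critic longer early on and periodically (cf. lines 162-166).
        if gen_iterations < 25 or gen_iterations % 500 == 0:
            Diters = 100
        else:
            Diters = 5                      # the usual opt.Diters default
        # The critic sees Diters *batches*, not Diters passes over the dataset.
        j = 0
        while j < Diters and i < len(dataloader):
            critic_iteration(netD, netG, optimizerD, data_iter, nz)
            j += 1
            i += 1
        # One generator update on a single batch of prior samples.
        optimizerG.zero_grad()
        loss_g = -netD(netG(torch.randn(batch_size, nz))).mean()
        loss_g.backward()
        optimizerG.step()
        gen_iterations += 1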

Answered Nov 20 '18 at 23:36 by K. Bogdan