Multiprocessing for Tensorflow model does not return outputs and halts
I am using multiprocessing to train a TensorFlow model: I split x_train and y_train into chunks of equal size so I can use 16 processors, and then combine the results into an ensemble.
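The splitting step itself isn't shown in the question; a minimal sketch of what it could look like (the array names `x_train`/`y_train`/`X0`/`Y0` come from the question, the data here is made up) is:

```python
import numpy as np

# Hypothetical training data: 160 samples, 4 features each.
x_train = np.random.rand(160, 4)
y_train = np.random.randint(0, 2, size=160)

# Split into 16 roughly equal chunks, one per worker process.
# np.array_split (unlike np.split) tolerates sizes not divisible by 16.
X0 = np.array_split(x_train, 16)
Y0 = np.array_split(y_train, 16)
```

Each worker then indexes into `X0`/`Y0` with its chunk number.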
I am using this train function for TensorFlow:
```python
def train(parte):
    with tf.Session(graph=graph, config=config) as sess:
        sess.run(init)
        for step in range(1, training_epochs + 1):
            batch_x, batch_y = next_batch(batch_size, X0[parte], Y0[parte])
            sess.run(train_op, feed_dict={X: batch_x, Y: batch_y, is_training: True})
            if step % display_step == 0 or step == 1:
                loss, acc = sess.run([loss_op, accuracy],
                                     feed_dict={X: batch_x, Y: batch_y, is_training: False})
                test_len = Y0_test[parte].shape[0]
                test_data = X0_test[parte]
                test_label = Y0_test[parte]
                val_acc = sess.run(accuracy,
                                   feed_dict={X: test_data, Y: test_label, is_training: False})
                if loss < 0.4:
                    print("Step " + str(step) + ", Minibatch Loss= " +
                          "{:.4f}".format(loss) + ", Training Accuracy= " +
                          "{:.3f}".format(acc) + ", Test Accuracy= " +
                          "{:.3f}".format(val_acc))
        pred00 = sess.run([prediction], feed_dict={X: X0_test[parte], is_training: False})
        return loss
```
Then I map each chunk index to the train function via the pool:
```python
if __name__ == '__main__':
    pool = multiprocessing.Pool(16)
    # np.arange(16) yields the 16 chunk indices 0..15;
    # np.arange(0, 15) would stop at 14 and skip the last chunk.
    results = pool.map(train, np.arange(16))
    pool.close()
    pool.join()
```
The model saves its weights properly, but there are two problems: 1. the log line that should appear every `display_step` epochs is only printed sometimes, and 2. the pool is never actually closed, so the Jupyter cell hangs and never finishes evaluating.
Is there a way I can make this work?
tensorflow multiprocessing python-multiprocessing
asked Nov 13 at 12:39, edited Nov 13 at 14:32, by Rubens_Z