Setting the device for a model trained on a GPU so it can make predictions on a CPU
I trained a model on the GPU and saved it like this (export_path is my output directory):
builder = tf.saved_model.builder.SavedModelBuilder(export_path)
tensor_info_x = tf.saved_model.utils.build_tensor_info(self.Xph)
tensor_info_y = tf.saved_model.utils.build_tensor_info(self.predprob)
tensor_info_it = tf.saved_model.utils.build_tensor_info(self.istraining)
tensor_info_do = tf.saved_model.utils.build_tensor_info(self.dropout)
prediction_signature = (
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs={'myx': tensor_info_x, 'istraining': tensor_info_it, 'dropout': tensor_info_do},
        outputs={'ypred': tensor_info_y},
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))
builder.add_meta_graph_and_variables(
    net, [tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
            prediction_signature},)
builder.save()
Now I’m trying to load it and run a prediction. On a machine with a GPU it works fine, but on a machine without a GPU I get:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'rnn/while/rnn/multi_rnn_cell/cell_0/cell_0/layer_norm_basic_lstm_cell/dropout/add/Enter': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.
I have read about tf.train.import_meta_graph and its clear_devices option, but I can’t get it to work. This is how I’m loading my model:
predict_fn = predictor.from_saved_model(modelname)
The error above is thrown at this point. modelname is the full file name of the .pb file. Is there a way to go through the nodes of the graph and set the device manually (or do something similar)?
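On walking the nodes by hand, here is a minimal sketch. It assumes the graph is available as a GraphDef-style protobuf whose `node` entries each carry a string `device` field. The helper name strip_device_assignments is hypothetical, and the SimpleNamespace objects (with made-up node names) are only stand-ins so the snippet runs without TensorFlow installed.

```python
from types import SimpleNamespace


def strip_device_assignments(graph_def):
    """Clear the explicit device on every node; return how many were changed."""
    changed = 0
    for node in graph_def.node:
        if node.device:
            # An empty device string leaves placement to TensorFlow at run time.
            node.device = ""
            changed += 1
    return changed


# Stand-in for a real GraphDef (hypothetical node names), just to show the effect.
demo_graph = SimpleNamespace(node=[
    SimpleNamespace(name="dropout/add/Enter", device="/device:GPU:0"),
    SimpleNamespace(name="Placeholder", device=""),
])
print(strip_device_assignments(demo_graph))  # prints 1
```

With a real model the same loop would presumably be applied to the graph_def of the loaded MetaGraphDef before it is imported into a session. As far as I understand, this is roughly what the clear_devices option does.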
I’m using TensorFlow 1.8.0.
I have seen the question "Can a model trained on gpu used on cpu for inference and vice versa?", but I don’t think mine is a duplicate.
Unlike that question, I want to know what can be done after training, with the model already saved.
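For the after-training case, one possible approach is to bypass predictor.from_saved_model and load the SavedModel with tf.saved_model.loader.load. That function forwards extra keyword arguments to tf.train.import_meta_graph, so clear_devices=True can be passed at load time. This is a sketch that has not been run against this exact model. The helper names load_saved_model_on_cpu and predict are hypothetical, the TensorFlow 1.x API is assumed, and the default output key 'ypred' comes from the export code in the question.

```python
def load_saved_model_on_cpu(export_dir, tags=("serve",)):
    """Load a SavedModel into a fresh session, dropping pinned devices."""
    # Imported lazily so this file can be imported where TensorFlow is absent.
    import tensorflow as tf  # assumes the TensorFlow 1.x API (e.g. 1.8)

    graph = tf.Graph()
    # Soft placement is an extra safety net for ops without a CPU/GPU match.
    config = tf.ConfigProto(allow_soft_placement=True)
    sess = tf.Session(graph=graph, config=config)
    # clear_devices is forwarded to tf.train.import_meta_graph and removes
    # the explicit '/device:GPU:0' assignments stored in the graph.
    meta_graph = tf.saved_model.loader.load(
        sess, list(tags), export_dir, clear_devices=True)
    return sess, meta_graph


def predict(sess, meta_graph, feeds, output_key="ypred",
            signature_key="serving_default"):
    """Run one prediction; `feeds` maps signature input names to values."""
    sig = meta_graph.signature_def[signature_key]
    # Translate signature input names (e.g. 'myx') into tensor names.
    feed_dict = {sig.inputs[name].name: value for name, value in feeds.items()}
    return sess.run(sig.outputs[output_key].name, feed_dict=feed_dict)
```

Usage would look like `sess, mg = load_saved_model_on_cpu(export_path)` followed by `predict(sess, mg, feeds)`, where `feeds` has the keys 'myx', 'istraining' and 'dropout'. The values to pass for the last two depend on how the model defines those placeholders.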