Does nghttpx ingress intercept errors? #82

Open
ingridgoh opened this issue Mar 7, 2018 · 4 comments

@ingridgoh

Hello,

I currently have TensorFlow Serving deployed in a container, and I've noticed that when there is any prediction error, the actual error stack is not returned to the client when the nghttpx ingress is used. Here are my observations, followed by a client-side sketch of the difference (everything in the environment is kept constant except for the intermediate ingress):

1. Client Request --> Load Balancer --> Ingress --> Container (TensorFlow Serving)
Observation: The error is obscured from the client; only a generic error message is received.
Error received:
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INTERNAL, details="Received RST_STREAM with error code 2")

2. Client Request --> Load Balancer --> Container (TensorFlow Serving)
Observation: The detailed error stack is returned to the client.
Error received:
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Matrix size-incompatible: In[0]: [3592,10], In[1]: [3592,10]
[[Node: MatMul = MatMul[T=DT_FLOAT, _output_shapes=[[?,10]], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_x_0_0, Variable/read)]]")
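
To illustrate the difference from the client side, here is a minimal sketch using the modern grpc API (my actual client uses the older grpc.beta layer; stub and request are assumed to be set up as in the TF-Serving example clients):

import grpc

try:
    response = stub.Predict(request, timeout=120)
except grpc.RpcError as e:
    # Behind the ingress: e.code() is StatusCode.INTERNAL and e.details()
    # only says "Received RST_STREAM with error code 2".
    # Direct to the container: e.code() is StatusCode.INVALID_ARGUMENT and
    # e.details() carries the full TensorFlow error text.
    print(e.code(), e.details())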

Thank you!

@tatsuhiro-t
Contributor

Could you provide a way to reproduce this, for example using https://github.com/tensorflow/serving/tree/master/tensorflow_serving/example ?

@ingridgoh
Author

The error mentioned was a failed inference query against a DNN model. However, you do not need to replicate the exact error I received, since any error thrown by the TF-Serving server results in the "Received RST_STREAM with error code 2" error when the ingress is used. You could take https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/inception_client.py as an example and tweak the script so that a random matrix is sent in the request instead of an image (please note that I'm a complete novice at this):

e.g.:

import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2

# A deliberately mis-shaped input to trigger a server-side error.
rand_array = np.random.rand(10, 3592)
request = predict_pb2.PredictRequest()
request.model_spec.name = MODEL_NAME  # the name the model was served under
request.model_spec.signature_name = 'predict_images'
request.inputs['inputs'].CopyFrom(
    tf.contrib.util.make_tensor_proto(rand_array, dtype=tf.float32)
)
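
To actually send the request, something like the following should work (a sketch using the current tensorflow_serving gRPC stubs rather than the grpc.beta API from the example; the address is a placeholder for your load balancer or ingress endpoint):

import grpc
from tensorflow_serving.apis import prediction_service_pb2_grpc

channel = grpc.insecure_channel('ingress.example.com:80')  # placeholder address
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
print(stub.Predict(request, 120))  # raises grpc.RpcError on server-side errors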

Here's a simple architectural diagram for my setup:
[image: architecture diagram of the setup]

@tatsuhiro-t
Contributor

I tried to reproduce the issue with the following patch to tensorflow_serving/example/mnist_client.py:

diff --git a/tensorflow_serving/example/mnist_client.py b/tensorflow_serving/example/mnist_client.py
index 947f7c4..93f1e91 100644
--- a/tensorflow_serving/example/mnist_client.py
+++ b/tensorflow_serving/example/mnist_client.py
@@ -146,8 +146,9 @@ def do_inference(hostport, work_dir, concurrency, num_tests):
     request.model_spec.name = 'mnist'
     request.model_spec.signature_name = 'predict_images'
     image, label = test_data_set.next_batch(1)
+    rand_array = numpy.random.rand(10, 3592)
     request.inputs['images'].CopyFrom(
-        tf.contrib.util.make_tensor_proto(image[0], shape=[1, image[0].size]))
+        tf.contrib.util.make_tensor_proto(rand_array, dtype=tf.float32))
     result_counter.throttle()
     result_future = stub.Predict.future(request, 5.0)  # 5 seconds
     result_future.add_done_callback(

But I got the same error message with or without the proxy:

AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Matrix size-incompatible: In[0]: [10,3592], In[1]: [784,10]
[[Node: MatMul = MatMul[T=DT_FLOAT, _output_shapes=[[?,10]], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_x_0_0, Variable/read)]]")

Which version of the Ingress controller are you using? It is worth trying the latest version.

@ingridgoh
Author

I was using v0.28.0. I've updated the controller to v0.31.0, but the same behaviour still occurs. The following are the exact errors I received:

With ingress:

$ python 1_non_mlpkit_our_data.py
Traceback (most recent call last):
  File "1_non_mlpkit_our_data.py", line 94, in <module>
    print stub.Predict(request, 120)
  File "/Users/setup/virtualenv/lib/python2.7/site-packages/grpc/beta/_client_adaptations.py", line 309, in __call__
    self._request_serializer, self._response_deserializer)
  File "/Users/setup/virtualenv/lib/python2.7/site-packages/grpc/beta/_client_adaptations.py", line 195, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INTERNAL, details="Received RST_STREAM with error code 2")

Without ingress:

$ python 1_non_mlpkit_our_data.py
Traceback (most recent call last):
  File "1_non_mlpkit_our_data.py", line 94, in <module>
    print stub.Predict(request, 120)
  File "/Users/setup/virtualenv/lib/python2.7/site-packages/grpc/beta/_client_adaptations.py", line 309, in __call__
    self._request_serializer, self._response_deserializer)
  File "/Users/setup/virtualenv/lib/python2.7/site-packages/grpc/beta/_client_adaptations.py", line 195, in _blocking_unary_unary
    raise _abortion_error(rpc_error_call)
grpc.framework.interfaces.face.face.AbortionError: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Matrix size-incompatible: In[0]: [3592,10], In[1]: [3592,10]
	 [[Node: MatMul = MatMul[T=DT_FLOAT, _output_shapes=[[?,10]], transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/cpu:0"](_arg_x_0_0, Variable/read)]]")
