
TFLite GPUv2: ADD(x, 1e-5) results in severely wrong output #67216

Open
gustavla opened this issue May 9, 2024 · 1 comment
Assignees
Labels
comp:lite TF Lite related issues TF 2.16 TFLiteGpuDelegate TFLite Gpu delegate issue type:bug Bug

Comments

Contributor

gustavla commented May 9, 2024

System information

  • Samsung Galaxy S23 / Android 13 / Snapdragon® 8 Gen 2 | SM8550
  • GPUv2 delegate
  • TFLite 2.16.1

Assets:

Please take a look at two outputs in particular of this network:

  • key = "model_13/featurefusion_network/encoder/query_layer/norm/LayerNormalization/moments/variance" (variance)
  • key2 = "model_13/featurefusion_network/encoder/query_layer/norm/LayerNormalization/batchnorm/add" (add)

The tensor variance is fed into ADD(x, 0.000009999999747378752) (that constant is 1e-5 rounded to float32) and comes out as add.
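As a sanity check (my own, not part of the model): the ADD op should shift every variance element by almost exactly 1e-5, which is what the CPU output below shows. A minimal sketch using the first few values from the debugger session:

```python
import numpy as np

# The ADD constant in the graph is 1e-5 rounded to float32
eps = np.float32(1e-5)
assert np.isclose(float(eps), 0.000009999999747378752)

# First four CPU "variance" values from the debugger output below
variance = np.array([0.8386347, 0.8353483, 0.83554685, 0.8366282],
                    dtype=np.float32)

# First four CPU "add" values from the debugger output below
cpu_add = np.array([0.83864474, 0.8353583, 0.83555686, 0.8366382],
                   dtype=np.float32)

# The CPU result matches variance + eps elementwise, as expected
assert np.allclose(variance + eps, cpu_add, atol=1e-7)
```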


I ran this on the CPU (xnnpack) and the GPU (GPUv2) and got totally different results.

variance looks like this across CPU and GPU (so far consistent):

(Pdb) p cpu[key][0].ravel()
array([0.8386347 , 0.8353483 , 0.83554685, 0.8366282 , 0.8377434 ,
       0.8369055 , 0.8419936 , 0.8433927 , 0.83845955, 0.83644855,
       0.8404068 , 0.8368349 , 0.8335228 , 0.8401757 , 0.83619094,
       0.8386446 ], dtype=float32)
(Pdb) p gpu[key][0].ravel()
array([0.8378906 , 0.83496094, 0.8354492 , 0.8364258 , 0.83691406,
       0.8364258 , 0.8417969 , 0.84375   , 0.83740234, 0.8354492 ,
       0.84033203, 0.83691406, 0.8334961 , 0.83984375, 0.8359375 ,
       0.8383789 ], dtype=float32)

add looks like this across CPU and GPU:

(Pdb) p cpu[key2][0].ravel()
array([0.83864474, 0.8353583 , 0.83555686, 0.8366382 , 0.8377534 ,
       0.8369155 , 0.8420036 , 0.84340274, 0.83846956, 0.83645856,
       0.8404168 , 0.8368449 , 0.8335328 , 0.8401857 , 0.83620095,
       0.83865464], dtype=float32)
(Pdb) p gpu[key2][0].ravel()
array([-0.78222656,  0.20910645, -0.72802734,  0.32421875, -0.78125   ,
        0.20947266, -0.72753906,  0.3244629 , -0.7817383 ,  0.2097168 ,
       -0.7265625 ,  0.3251953 , -0.78125   ,  0.2097168 , -0.72802734,
        0.3244629 ], dtype=float32)

Here, the values on the GPU have gone completely off the rails. They do not look random, though: there is a periodicity to the output (the error alternates between roughly 1.6 and 0.6).
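To make that periodicity concrete, here is a quick offline check (array literals copied from the debugger output above; only the first eight elements shown):

```python
import numpy as np

# First eight CPU and GPU "add" values from the debugger output above
cpu_add = np.array([0.83864474, 0.8353583, 0.83555686, 0.8366382,
                    0.8377534, 0.8369155, 0.8420036, 0.84340274],
                   dtype=np.float32)
gpu_add = np.array([-0.78222656, 0.20910645, -0.72802734, 0.32421875,
                    -0.78125, 0.20947266, -0.72753906, 0.3244629],
                   dtype=np.float32)

err = cpu_add - gpu_add
# Even positions are off by ~1.6, odd positions by ~0.5-0.6: the error
# has a regular structure rather than looking like random noise.
assert np.all(err[0::2] > 1.5)
assert np.all(err[1::2] > 0.4)
assert np.all(err[1::2] < 0.7)
```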

Standalone code to reproduce the issue
This should be simple to set up through the benchmark tool or any other way of running GPUv2 directly. I ran it through Qualcomm's AI Hub (https://aihub.qualcomm.com), so I'm attaching the script I used as a reference. It also shows how the example inputs can be loaded in Python.

import numpy as np
import qai_hub as hub

# Example inputs attached to this issue
inputs = np.load("67216_post_add_numerical_issues_inputs.npz")

model = hub.upload_model("67216_post_add_numerical_issues.tflite")
device = hub.Device("Samsung Galaxy S23")
input_data = hub.upload_dataset({
    "image": [inputs["image"]],
    "feature_template": [inputs["feature_template"]],
    "pos_template": [inputs["pos_template"]],
    "pos_search": [inputs["pos_search"]],
})

# Same model, same inputs: once on CPU (XNNPACK) ...
job_cpu = hub.submit_inference_job(
    model,
    device=device,
    inputs=input_data,
    options="--compute_unit cpu",
)

# ... and once on GPU (GPUv2 delegate)
job_gpu = hub.submit_inference_job(
    model,
    device=device,
    inputs=input_data,
    options="--compute_unit gpu",
)

cpu = job_cpu.download_output_data()
gpu = job_gpu.download_output_data()

# Intermediate tensors of interest: the variance and variance + epsilon
key = "model_13/featurefusion_network/encoder/query_layer/norm/LayerNormalization/moments/variance"
key2 = "model_13/featurefusion_network/encoder/query_layer/norm/LayerNormalization/batchnorm/add"

print(gpu[key2][0].ravel())
@gustavla gustavla added the comp:lite TF Lite related issues label May 9, 2024
@tilakrayal tilakrayal added TFLiteGpuDelegate TFLite Gpu delegate issue TF 2.16 type:bug Bug labels May 9, 2024
@sawantkumar

Hi @gustavla ,

I replicated your issue using Qualcomm AI Hub and got the same results as you. Let me verify the same through an Android app and I will get back to you.


4 participants