Add ResNet preproc version 2 (with image decoding) #627

Draft
wants to merge 1 commit into
base: main
from

Conversation

jantonguirao
Contributor

ResNet preprocessing model version 2 (with image decoding)

Description

Adds a new version of the ResNet preprocessing model that also performs image decoding. This is the result of work by the ONNX preprocessing working group (see https://github.com/onnx/working-groups/tree/main/preprocessing).

Model

| Model | Download | Download (with sample test data) | ONNX version | Opset version | Accuracy |
| ResNet-preproc-v2 | 4.0 KB | 868 KB | 1.15.0 | 20 | N/A |

Source

Created via ONNX parser:

import onnx
from onnx import parser
from onnx import checker

model_f = './resnet-preproc-v2-20/resnet-preproc-v2-20.onnx'

resnet_preproc = parser.parse_model('''
<
  ir_version: 8,
  opset_import: [ "" : 20, "local" : 1 ],
  metadata_props: [ "preprocessing_fn" : "local.preprocess"]
>
resnet_preproc_g (seq(uint8[?]) images) => (float[B, 3, 224, 224] preproc_data)
{
    preproc_data = local.preprocess(images)
}

<
  opset_import: [ "" : 20 ],
  domain: "local",
  doc_string: "Preprocessing function."
>
preprocess (input_batch) => (output_tensor) {
    tmp_seq = SequenceMap <
        body = sample_preprocessing(uint8[?] sample_in) => (float[3, 224, 224] sample_out) {
            image = ImageDecoder (sample_in)
            target_size = Constant <value = int64[2] {256, 256}> ()
            image_resized = Resize <mode = \"linear\",
                                    antialias = 1,
                                    axes = [0, 1],
                                    keep_aspect_ratio_policy = \"not_smaller\"> (image, , , target_size)

            target_crop = Constant <value = int64[2] {224, 224}> ()
            image_sliced = CenterCropPad <axes = [0, 1]> (image_resized, target_crop)

            kMean = Constant <value = float[3] {123.675, 116.28, 103.53}> ()
            kStddev = Constant <value = float[3] {58.395, 57.12, 57.375}> ()
            im_norm_tmp1 = Cast <to = 1> (image_sliced)
            im_norm_tmp2 = Sub (im_norm_tmp1, kMean)
            im_norm = Div (im_norm_tmp2, kStddev)

            sample_out = Transpose <perm = [2, 0, 1]> (im_norm)
        }
    > (input_batch)
    output_tensor = ConcatFromSequence < axis = 0, new_axis = 1 >(tmp_seq)
}

''')
checker.check_model(resnet_preproc)

with open(model_f, "wb") as f:
    f.write(resnet_preproc.SerializeToString())
print("OK")
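
The size arithmetic behind the Resize and CenterCropPad steps above can be sketched in plain Python. This is only an illustration, not part of the model: the function names are hypothetical, the resize rule assumes the "not_smaller" policy scales by the larger of the two per-axis ratios and rounds halfway cases up, and the crop rule assumes the window start is rounded down when the excess is odd.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with halfway cases rounded up (assumed Resize behavior)."""
    return int(math.floor(x + 0.5))


def resized_shape(height, width, target=(256, 256)):
    """Sketch of the output (height, width) for keep_aspect_ratio_policy="not_smaller".

    Both axes share one scale factor, chosen so that neither output
    dimension ends up smaller than its target.
    """
    scale = max(target[0] / height, target[1] / width)
    return round_half_up(scale * height), round_half_up(scale * width)


def center_crop_window(height, width, crop=(224, 224)):
    """Sketch of the (top, left, bottom, right) crop window.

    Assumes the input is at least as large as the crop on both axes,
    which holds here because the resize target (256) exceeds the crop (224).
    """
    top = (height - crop[0]) // 2
    left = (width - crop[1]) // 2
    return top, left, top + crop[0], left + crop[1]


h, w = resized_shape(480, 640)
print((h, w))                    # (256, 341)
print(center_crop_window(h, w))  # (16, 58, 240, 282)
```

For a 480x640 input, the shorter side is scaled to 256 and the longer side follows the same ratio, so the 224x224 crop always fits inside the resized image.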

Input

Sequence of encoded images, each stored as a 1D uint8 tensor
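
As an illustration of what one element of the input sequence looks like, the sketch below views the raw bytes of an encoded image as a 1D uint8 array. The helper name and the placeholder bytes are assumptions made for this example; in practice the bytes would come from an encoded image file.

```python
import numpy as np


def encoded_bytes_to_tensor(data: bytes) -> np.ndarray:
    """View the raw bytes of an encoded image as a 1D uint8 array (hypothetical helper)."""
    return np.frombuffer(data, dtype=np.uint8)


# Placeholder bytes for illustration only; this is not a valid image file.
fake_encoded = bytes([0x42, 0x4D, 0x00, 0x01])
tensor = encoded_bytes_to_tensor(fake_encoded)
print(tensor.shape, tensor.dtype)  # (4,) uint8
```

Each image in a batch may have a different encoded length, which is why the model takes a sequence of tensors rather than a single stacked tensor.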

Preprocessing

This is a preprocessing model

Output

Single float32 tensor of shape [N, 3, 224, 224], where N is the number of elements in the input sequence
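
A NumPy sketch of how the output shape arises, assuming each sample has already been decoded, resized, and cropped to 224x224x3 (height, width, channels). It mirrors the Cast, Sub, Div, Transpose, and ConcatFromSequence steps of the model using the same mean and standard deviation constants. The function names are hypothetical and this is not the model's actual execution path.

```python
import numpy as np

# Per-channel constants taken from the model definition.
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
STDDEV = np.array([58.395, 57.12, 57.375], dtype=np.float32)


def normalize_and_transpose(image_hwc):
    """Cast to float32, normalize per channel, and move channels first (HWC -> CHW)."""
    x = image_hwc.astype(np.float32)
    x = (x - MEAN) / STDDEV  # broadcasts over the trailing channel axis
    return np.transpose(x, (2, 0, 1))


def to_batch(cropped_images):
    """Stack per-sample CHW tensors along a new leading axis, giving [N, 3, 224, 224]."""
    return np.stack([normalize_and_transpose(im) for im in cropped_images], axis=0)


images = [
    np.zeros((224, 224, 3), dtype=np.uint8),
    np.full((224, 224, 3), 255, dtype=np.uint8),
]
batch = to_batch(images)
print(batch.shape, batch.dtype)  # (2, 3, 224, 224) float32
```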

Postprocessing

N/A

Model Creation

Dataset (Train and validation)

N/A

Training

N/A

Validation accuracy

N/A

Test Data Creation

Test data was created by encoding (as BMP) the input images from the first ResNet preprocessing model:

import numpy as np
import onnx
from onnx import numpy_helper
from onnx.onnx_data_pb2 import SequenceProto
import cv2


decoded_input_f = './resnet-preproc-v1-18/test_data_set_0/input_0.pb'
encoded_input_f = './resnet-preproc-v2-20/test_data_set_0/input_0.pb'


def read_sequenceproto_pb_file(filename):
    """Return tuple of sequence name and list of numpy.ndarray of the data from a pb file containing a SequenceProto."""
    seq = SequenceProto()
    with open(filename, 'rb') as f:
        seq.ParseFromString(f.read())
    list_of_arrays = numpy_helper.to_list(seq)
    return seq.name, list_of_arrays

name, list_of_arrays = read_sequenceproto_pb_file(decoded_input_f)

encoded_array_list = []
for arr in list_of_arrays:
    encoded = cv2.imencode('.bmp', arr)[1]
    encoded_array_list.append(encoded)

with open(encoded_input_f, "wb") as f:
    f.write(numpy_helper.from_list(encoded_array_list, name).SerializeToString())
print("OK")


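# Sanity check: read the encoded data back and decode each image
name2, list_of_arrays2 = read_sequenceproto_pb_file(encoded_input_f)
decoded_array_list = []
for arr in list_of_arrays2:
    decoded = cv2.imdecode(arr, cv2.IMREAD_COLOR)
    decoded_array_list.append(decoded)
    print(decoded.shape)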

References

Link to paper or references.

Contributors

Joaquin Anton (NVIDIA)

License

Add license information - on default, Apache 2.0


Signed-off-by: Joaquin Anton <janton@nvidia.com>