ENH: Computer Vision pipelines #560

radheyakale · 2022-06-21T07:45:36Z

No description provided.

* TODO: Need a robust way to find if the MLHandler expects image data for training/testing
* TODO: Separate class is required for training and testing CV models
* TODO: Code has to be made generic to dynamically select Resnet50/VGG, etc. models
* TODO: Model training parameters viz. epochs, batch size have to be taken from the user with defaults set
* TODO: setup and get_model functions have to be made more scalable
* TODO: ModelStore __init__ has to be made scalable

radheyakale · 2022-06-21T07:47:45Z

TODO: Need a robust way to find if the MLHandler expects image data for training/testing
TODO: Separate class is required for training and testing CV models
TODO: Code has to be made generic to dynamically select Resnet50/VGG, etc. models
TODO: Model training parameters viz. epochs, batch size have to be taken from the user with defaults set
TODO: setup and get_model functions have to be made more scalable
TODO: ModelStore __init__ has to be made scalable

@jaidevd @sanand0 Please add/edit/delete any more TODOs that you feel may be required.

jaidevd · 2022-06-21T08:29:52Z

@radheyakale

Start by creating a wrapper class for Keras apps which inherits from this. Refer to the HFTransformer class for an example.
Move as much code away from MLHandler as possible. It's okay if the example only supports ResNet50 for now.
Provide a sample gramex.yaml which shows the usage.

radheyakale · 2022-06-24T11:58:57Z

> @radheyakale
>
> Start by creating a wrapper class for Keras apps which inherits from this. Refer to the HFTransformer class for an example.
>
> Move as much code away from MLHandler as possible. It's okay if the example only supports ResNet50 for now.
>
> Provide a sample gramex.yaml which shows the usage.

@jaidevd Please look at the recent commit fe2c85e. Is this the correct approach to implement it?

* Predict is working with default VGG16 and Resnet models.
* TODO: Test and add all other models supported by Keras
* TODO: Implement training functionality
* TODO: Implement functionality to predict from trained models provided by user/trained in gramex

jaidevd · 2022-06-25T05:09:03Z

gramex/handlers/mlhandler.py

-        if op.exists(cls.store.model_path):  # If the pkl exists, load it
-            if op.isdir(cls.store.model_path):
-                mclass, wrapper = ml.search_modelclass(mclass)
-                cls.model = locate(wrapper).from_disk(mclass, cls.store.model_path)


This condition is required for loading transformers. Why is this removed?

jaidevd · 2022-06-25T05:12:05Z

gramex/ml_api.py

@@ -46,6 +46,10 @@
        "statsmodels.tsa.statespace.sarimax",
    ],
    "gramex.ml_api.HFTransformer": ["gramex.transformers"],
+    "gramex.ml_api.KerasApplications": [
+        "tensorflow.keras.applications.vgg16",
+        "tensorflow.keras.applications.resnet50"


You can just leave this at tensorflow.keras.applications, because

from tensorflow.keras.applications import *

covers everything
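A minimal sketch of why one package-level entry can be enough, assuming the model class is found by looking its name up in each listed module. The helper `find_class` and the registry key `example.Wrapper` are hypothetical and are not Gramex's `search_modelclass`. The standard library's `json` package stands in for `tensorflow.keras.applications`: `json` re-exports `JSONDecoder` from `json.decoder`, much as `tensorflow.keras.applications` exposes `VGG16` and `ResNet50` at package level.

```python
import importlib

# Hypothetical registry in the style of the diff above:
# wrapper name -> modules to search for the requested class.
# The stdlib `json` package stands in for `tensorflow.keras.applications`.
SEARCH_MODULES = {
    "example.Wrapper": ["json"],
}


def find_class(name, registry=SEARCH_MODULES):
    """Return (class, wrapper_name) for the first listed module exposing `name`."""
    for wrapper, modules in registry.items():
        for module_name in modules:
            module = importlib.import_module(module_name)
            if hasattr(module, name):
                return getattr(module, name), wrapper
    raise ImportError(f"{name} not found in any registered module")


# Listing only the package finds a class defined in its submodule json.decoder.
decoder_cls, wrapper = find_class("JSONDecoder")
```

Under this assumption, listing each submodule separately adds nothing as long as the package namespace already exposes the class.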

jaidevd · 2022-06-25T05:15:30Z

gramex/ml_api.py

@@ -426,3 +435,49 @@ def _predict(
    ):
        text = X["text"]
        return self.model.predict(text)
+
+
+class KerasApplications(AbstractModel):


Call this KerasApplication - singular.

jaidevd · 2022-06-25T05:42:36Z

gramex/handlers/mlhandler.py

+                    data = imutils.resize(cv2.imdecode(np.fromstring(
+                        self.request.files['image'][0].body, np.uint8), cv2.IMREAD_UNCHANGED),
+                        width=224)
+                    data = cv2.resize(data, (224, 224))


All of this logic should be in the wrapper class.

OpenCV should not be a dependency. For loading images from files / streams, use PIL.Image.open

For resizing, use skimage.transform.resize or tf.image.resize or PIL.Image.resize.

The target size after resizing should not be hardcoded to [224, 224] - there are other sizes in Keras apps too. For this, you can check the shape of the input tensor in the corresponding model with model.input_shape.
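As a rough illustration of the last point, the resize target can be derived from the model's input shape instead of being hardcoded. The helper name `target_size` is hypothetical, and the sketch assumes a channels-last shape such as `(None, 224, 224, 3)`, which is what `model.input_shape` typically looks like for Keras applications with default settings.

```python
def target_size(input_shape):
    """Return (height, width) from a Keras-style input shape.

    Assumes channels-last layout, e.g. (None, 224, 224, 3), where the
    leading None is the batch dimension.
    """
    if len(input_shape) != 4:
        raise ValueError(f"expected a 4-D input shape, got {input_shape!r}")
    _, height, width, _ = input_shape
    if height is None or width is None:
        raise ValueError("model accepts variable-sized input; choose a size explicitly")
    return height, width


# ResNet50 and VGG16 default to 224x224 inputs; InceptionV3 defaults to 299x299.
print(target_size((None, 224, 224, 3)))
print(target_size((None, 299, 299, 3)))
```

Note that `PIL.Image.resize` takes its size as `(width, height)`, so the pair returned here would need to be swapped before being passed to it.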

jaidevd · 2022-06-25T05:47:55Z

gramex/handlers/mlhandler.py

                self.set_header('Content-Disposition',
                                f'attachment; filename={op.basename(self.store.model_path)}')
                with open(self.store.model_path, 'rb') as fout:
                    self.write(fout.read())
            elif '_model' in self.args:
                self.write(json.dumps(self.model.get_params(), indent=2))
+            elif len(self.request.files.keys()) and \
+                self.request.files['image'][0].content_type in \
+                    ['image/jpeg', 'image/jpg', 'image/png']:


This check should not happen here. As far as possible, send the request payload blindly to the _fit or _predict methods. The wrapper class should take care of everything if it is written correctly.

Also note that there are two ways one can send images in a request.

1. Send files (multipart form data) - in which case you look at the mimetypes of the files received and open them accordingly. For this, take a look at the definition of self._parse_multipart_form_data

2. Send the raw bytestream of an image with a Content-Type: image/whatever header. In this case, write functions called _parse_image_jpeg, _parse_image_png, etc. The handler will automatically call them.

Basically the handler knows how to parse data given the content type of the request.
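A sketch of what such content-type dispatch might look like. This is not the handler's actual implementation: the class `ImageParserMixin` and the method `parse_body` are hypothetical, and only the `_parse_image_jpeg` / `_parse_image_png` names come from the comment above. Each parser wraps the raw bytes in an `io.BytesIO`, which is a file-like object that `PIL.Image.open` can read.

```python
import io


class ImageParserMixin:
    """Hypothetical sketch: pick a parser method from the request's Content-Type."""

    def parse_body(self, content_type, body):
        # "image/jpeg; charset=binary" -> "image/jpeg" -> "_parse_image_jpeg"
        mime = content_type.split(";")[0].strip().lower()
        method_name = "_parse_" + mime.replace("/", "_").replace("-", "_")
        parser = getattr(self, method_name, None)
        if parser is None:
            raise ValueError(f"unsupported Content-Type: {mime}")
        return parser(body)

    def _parse_image_jpeg(self, body):
        # Wrap raw bytes in a file-like object for an image library to open.
        return io.BytesIO(body)

    def _parse_image_png(self, body):
        return io.BytesIO(body)


handler = ImageParserMixin()
stream = handler.parse_body("image/png", b"\x89PNG...")
```

With this shape, supporting another image type means adding one more `_parse_image_*` method, and the request-handling code does not need a list of allowed MIME types.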

jaidevd · 2022-06-25T05:48:33Z

gramex/handlers/mlhandler.py

+                    if 'training_data' in data.keys():
+                        training_results = yield gramex.service.threadpool.submit(
+                            self._train, data=data['training_data'].iloc[0])
+                        self.write(json.dumps(training_results, indent=2, cls=CustomJSONEncoder))


What does this do?

jaidevd · 2022-06-25T05:49:36Z

gramex/handlers/mlhandler.py

+            json.dump(class_names, fout)
+        keras_model.save(config_dir)
+        return class_names
+


Remove this and move it to the wrapper class.

* Training keras models would be done in ml_api in KerasApplication wrapper
* _parse_multipart_form_data used for parsing images

* Removed unnecessary code from ModelStore

jaidevd · 2022-08-01T08:54:10Z

gramex/handlers/mlhandler.py

@@ -184,7 +186,10 @@ def _transform(self, data, **kwargs):
        return data

    def _predict(self, data=None, score_col=''):
-        self._check_model_path()
+        import io
+        if type(data) == io.BytesIO:


Use isinstance for checking types.
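A short illustration of the difference, using a hypothetical `NamedBytesIO` subclass: an exact `type(...) ==` comparison rejects subclasses, while `isinstance` accepts them.

```python
import io


class NamedBytesIO(io.BytesIO):
    """A BytesIO subclass that also remembers a filename (hypothetical)."""

    def __init__(self, data, name):
        super().__init__(data)
        self.name = name


buf = NamedBytesIO(b"\x89PNG", name="cat.png")

exact_match = type(buf) == io.BytesIO         # False: the exact type is NamedBytesIO
subclass_match = isinstance(buf, io.BytesIO)  # True: subclasses are accepted
```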

jaidevd · 2022-08-01T08:57:39Z

gramex/handlers/mlhandler.py

+                    data = self.args['training_data']
+                    training_results = yield gramex.service.threadpool.submit(
+                        self._train, data=data)
+                    self.write(json.dumps(training_results, indent=2, cls=CustomJSONEncoder))


Training should not happen in GET. Only in POST or PUT.

jaidevd · 2022-08-01T09:00:16Z

gramex/handlers/mlhandler.py

+                self.store.load('class'), self.store.load('params'),
+                data=data, target_col=target_col,
+                nums=self.store.load('nums'), cats=self.store.load('cats')
+            )


@jaidevd - find a way to remove dataframe-specific code from MLHandler - it should deal only with train / predict semantics.

jaidevd · 2022-08-01T09:02:10Z

gramex/ml_api.py

+                       input_tensor=None,
+                       input_shape=None,
+                       pooling=None,
+                       classes=1000)


Move this to __init__.

jaidevd · 2022-08-01T09:02:25Z

gramex/ml_api.py

+                       input_tensor=None,
+                       input_shape=None,
+                       pooling=None,
+                       classes=1000)


This should do only self.model.predict.
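The two comments above could be combined roughly as follows. This is a sketch under stated assumptions, not the PR's code: the `model_factory` argument and the `FakeModel` stand-in are hypothetical and exist only so the example runs without TensorFlow, and the sketch omits the `AbstractModel` base class that the real wrapper inherits from.

```python
class KerasApplication:
    """Hypothetical wrapper: build the model once, keep _predict minimal."""

    def __init__(self, model_factory, **kwargs):
        # Constructor arguments (input_tensor, input_shape, pooling, classes, ...)
        # are consumed here, once, rather than on every prediction.
        self.model = model_factory(**kwargs)

    def _predict(self, X):
        # Nothing but delegation to the already-built model.
        return self.model.predict(X)


class FakeModel:
    """Stand-in for a Keras model so the sketch runs without TensorFlow."""

    def __init__(self, classes=1000, **kwargs):
        self.classes = classes

    def predict(self, X):
        # Uniform scores: one row per input, one column per class.
        return [[1.0 / self.classes] * self.classes for _ in X]


app = KerasApplication(FakeModel, classes=4, pooling=None)
scores = app._predict(["img-1", "img-2"])
```

Keeping construction in `__init__` means the model is built once per wrapper instance rather than on each call to `_predict`.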

* Model initialisation code moved to __init__
* Training added to POST request

radheyakale added 2 commits June 21, 2022 13:07

Merge branch 'master' of https://github.com/gramener/gramex into rdk-…

d97c32e

…mlhandler

radheyakale requested review from sanand0 and jaidevd June 21, 2022 07:45

radheyakale changed the title ~~Computer Vision pipelines~~ ENH: Computer Vision pipelines Jun 21, 2022

radheyakale force-pushed the rdk-mlhandler branch from d7791e6 to fe2c85e Compare June 24, 2022 12:06

jaidevd requested changes Jun 25, 2022

radheyakale added 2 commits June 29, 2022 15:48

WIP: Code refactoring

fcc597a

WIP: Code optimisation

c479f08

radheyakale force-pushed the rdk-mlhandler branch from 970a8cc to c479f08 Compare June 29, 2022 14:22

radheyakale added 2 commits July 6, 2022 13:30

Code optimisation

b35a446

Merge branch 'master' of https://github.com/gramener/gramex into rdk-…

76a7f44

…mlhandler

jaidevd requested changes Aug 1, 2022

radheyakale added 2 commits August 1, 2022 15:30

WIP: Code refactoring

f23b98a

WIP

856be6b
