Product Engineering Review: Recommendations_product_search_Opensource_Python #585

chetan-hirapara · 2024-04-08T07:24:34Z

Reviewer 1 comments:

Warnings are ignored
Plotly is used ploting.
Query used instead of
DataFrame manipulation functions
Section 6, query is used for VectorDistance instead of teradataml function
In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe.
Reviewer 1 suggestions:
Warnings should not be ignored as they give additional insights
We can leverage teradataml inbuilt plot functionality for product recommendations. in 3.1.1
In section 3.1.1, for select and join, query is used, can use teradataml functions like join and select on dataframe itself.
In section 6, teradataml VectorDistance function can be used instead of query.
In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe, we can do that on teradataml
Dataframe so no need to convert it to pandas dataframe.

Reviewer 2 comments:

Looks like notebook is not runnable.
Use fastexport to export data to CSV.

Reviewer 2 suggestions:

Function

copy_emb_to_sql does not have any definition. Please make sure to add definition for this function.
notebook exports data to CSV using pandas. teradataml supports fastexport to export data to CSV.
section 5 uses SQL Query to check the records. Use DataFrame.shape

The text was updated successfully, but these errors were encountered:

chetan-hirapara · 2024-04-08T12:11:59Z

Reviewer 1 comments:

Warnings are ignored --- warnings are hidden in all the notebooks as a standard followed to keep the notebook clean.
Plotly is used ploting. --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap.
Query used instead of DataFrame manipulation functions --- Fixed in pass6
Section 6, query is used for VectorDistance instead of teradataml function --- Fixed in pass6
In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe. --- Few code is fixed in pass6. remaining is not able to fix using teradataml dataframe. As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.
Reviewer 1 suggestions:
Warnings should not be ignored as they give additional insights --- warnings are hidden in all the notebooks as a standard followed to keep the notebook clean.
We can leverage teradataml inbuilt plot functionality for product recommendations. in 3.1.1 --- Fixed in pass6
In section 3.1.1, for select and join, query is used, can use teradataml functions like join and select on dataframe itself. --- Fixed in pass6
In section 6, teradataml VectorDistance function can be used instead of query. --- Fixed in pass6
In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe, we can do that on teradataml --- Few code is fixed in pass6. remaining is not able to fix using teradataml dataframe. As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.
Dataframe so no need to convert it to pandas dataframe. --- As most of the places it is fixed. Thanks

Reviewer 2 comments:

Looks like notebook is not runnable. --- It is runnable. We know it will take more time than usual one. We have written clear instruction at each places where it will be taking more than 15 sec
Use fastexport to export data to CSV. --- Fixed in pass6

Reviewer 2 suggestions:

Function

copy_emb_to_sql does not have any definition. Please make sure to add definition for this function. --- It is already exists inside helper function sql_helper_func.py
notebook exports data to CSV using pandas. teradataml supports fastexport to export data to CSV. --- Fixed in pass6
section 5 uses SQL Query to check the records. Use DataFrame.shape --- Section 4 and 5 is condtional step. So at section 5 I won't have dataframe if user skip Section 4. So it is mandatory to check table

chetan-hirapara added the enhancement Adding graphs and improving output label Apr 8, 2024

chetan-hirapara self-assigned this Apr 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Product Engineering Review: Recommendations_product_search_Opensource_Python #585

Product Engineering Review: Recommendations_product_search_Opensource_Python #585

chetan-hirapara commented Apr 8, 2024

chetan-hirapara commented Apr 8, 2024 •

edited

Product Engineering Review: Recommendations_product_search_Opensource_Python #585

Product Engineering Review: Recommendations_product_search_Opensource_Python #585

Comments

chetan-hirapara commented Apr 8, 2024

chetan-hirapara commented Apr 8, 2024 • edited

chetan-hirapara commented Apr 8, 2024 •

edited