Product Engineering Review: Recommendations_product_search_OpenAI_Python #584

chetan-hirapara · 2024-04-08T07:16:23Z

Reviewer 1 comments:

Reviewer 1 suggestions:

tdml supports all fundamental plots, Which can be used for plotting Scatter and bar graph.
VectorDistane function is supported in teradataml.
Lot of data movements
are present, If created
table are not required
in future, then volatile
feature can be used.

Reviewer 2 comments:

Reviewer 2 suggestions:

teradataml supports groupby and join.Create two dataframes and do a join instead of doing a SQL Query
One of the cell explains about products for snacks but uses dataframe.sort . Instead use dataframe.drop_duplicate which shows all
the distinct products.

Reviewer 3 comments:

Reviewer 3 suggestions:

chetan-hirapara · 2024-04-08T08:43:07Z

Reviewer 1 comments:

sqle queries are directly executed. --- Fixed in pass6
Plotting use plotty. --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap._

Reviewer 1 suggestions:

tdml supports all fundamental plots, Which can be used for plotting Scatter and bar graph. --- _Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions.
VectorDistane function is supported in teradataml. --- Fixed in pass6
Lot of data movements are present, If created table are not required in future, then volatile feature can be used. --- Thanks, fixed

Reviewer 2 comments:

Reviewer 2 suggestions:

teradataml supports groupby and join. Create two dataframes and do a join instead of doing a SQL Query --- Fixed in pass6
One of the cell explains about products for snacks but uses dataframe.sort . Instead use dataframe.drop_duplicate which shows all
the distinct products. --- There was not a duplicate products, sort is used just for display proeucts in proper sequential IDs, instead of random numbers.

Reviewer 3 comments:

In section 6, TD_VECTORDISTANCE is run using SQL query. --- Fixed in pass6
In section 7.1, scatter plot is plotted using plotly.express --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap.
In section7.1, pandas merge is used to merge dfs --- As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.
Pandas read_csv is used at multiple places --- This is for batch/chunking approach. we have to generate the embeddings in batches

Reviewer 3 suggestions:

In section 6, pythonic function VectorDistance can be used. --- We have VectorDistance in teradataml so no need to use python one
In section 7.1, tdml scatter plots can be used --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap.
In section 7.1, merging of dataframes can be done by using tdml DF’s merge --- As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.
read_csv of tdml can be used wherever possible. --- This is for batch/chunking approach. we have to generate the embeddings in batches

chetan-hirapara self-assigned this Apr 8, 2024

chetan-hirapara added the enhancement Adding graphs and improving output label Apr 8, 2024

Provide feedback