Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Product Engineering Review: Recommendations_product_search_OpenAI_Python #584

Open
chetan-hirapara opened this issue Apr 8, 2024 · 1 comment
Assignees
Labels
enhancement Adding graphs and improving output

Comments

@chetan-hirapara
Copy link
Collaborator

chetan-hirapara commented Apr 8, 2024

Reviewer 1 comments:

  • sqle queries are directly executed.
  • Plotting use plotty.

Reviewer 1 suggestions:

  • tdml supports all fundamental plots, Which can be used for plotting Scatter and bar graph.
  • VectorDistane function is supported in teradataml.
  • Lot of data movements
    are present, If created
    table are not required
    in future, then volatile
    feature can be used.

Reviewer 2 comments:

  • SQL Queries are used.

Reviewer 2 suggestions:

  • teradataml supports groupby and join.Create two dataframes and do a join instead of doing a SQL Query
  • One of the cell explains about products for snacks but uses dataframe.sort . Instead use dataframe.drop_duplicate which shows all
    the distinct products.

Reviewer 3 comments:

  • In section 6, TD_VECTORDISTANCE is run using SQL query.
  • In section 7.1, scatter plot is plotted using plotly.express
  • In section7.1, pandas merge is used to merge dfs
  • Pandas read_csv is used at multiple places

Reviewer 3 suggestions:

  • In section 6, pythonic function VectorDistance can be used.
  • In section 7.1, tdml scatter plots can be used
  • In section 7.1, merging of dataframes can be done by using tdml DF’s merge
  • read_csv of tdml can be used wherever possible.
  • read_csv of tdml can be used wherever possible.
@chetan-hirapara chetan-hirapara self-assigned this Apr 8, 2024
@chetan-hirapara chetan-hirapara added the enhancement Adding graphs and improving output label Apr 8, 2024
@chetan-hirapara
Copy link
Collaborator Author

chetan-hirapara commented Apr 8, 2024

Reviewer 1 comments:

  • sqle queries are directly executed. --- Fixed in pass6
  • Plotting use plotty. --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap._

Reviewer 1 suggestions:

  • tdml supports all fundamental plots, Which can be used for plotting Scatter and bar graph. --- _Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions.

  • VectorDistane function is supported in teradataml. --- Fixed in pass6

  • Lot of data movements are present, If created table are not required in future, then volatile feature can be used. --- Thanks, fixed

Reviewer 2 comments:

  • SQL Queries are used. --- Fixed in pass6

Reviewer 2 suggestions:

  • teradataml supports groupby and join. Create two dataframes and do a join instead of doing a SQL Query --- Fixed in pass6
  • One of the cell explains about products for snacks but uses dataframe.sort . Instead use dataframe.drop_duplicate which shows all
    the distinct products. --- There was not a duplicate products, sort is used just for display proeucts in proper sequential IDs, instead of random numbers.

Reviewer 3 comments:

  • In section 6, TD_VECTORDISTANCE is run using SQL query. --- Fixed in pass6
  • In section 7.1, scatter plot is plotted using plotly.express --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap.
  • In section7.1, pandas merge is used to merge dfs --- As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.
  • Pandas read_csv is used at multiple places --- This is for batch/chunking approach. we have to generate the embeddings in batches

Reviewer 3 suggestions:

  • In section 6, pythonic function VectorDistance can be used. --- We have VectorDistance in teradataml so no need to use python one
  • In section 7.1, tdml scatter plots can be used --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap.
  • In section 7.1, merging of dataframes can be done by using tdml DF’s merge --- As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.
  • read_csv of tdml can be used wherever possible. --- This is for batch/chunking approach. we have to generate the embeddings in batches

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adding graphs and improving output
Projects
None yet
Development

No branches or pull requests

1 participant