Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Product Engineering Review: Recommendations_product_search_Opensource_Python #585

Open
chetan-hirapara opened this issue Apr 8, 2024 · 1 comment
Assignees
Labels
enhancement Adding graphs and improving output

Comments

@chetan-hirapara
Copy link
Collaborator

Reviewer 1 comments:

  • Warnings are ignored

  • Plotly is used ploting.

  • Query used instead of
    DataFrame manipulation functions

  • Section 6, query is used for VectorDistance instead of teradataml function

  • In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe.

  • Reviewer 1 suggestions:

  • Warnings should not be ignored as they give additional insights

  • We can leverage teradataml inbuilt plot functionality for product recommendations. in 3.1.1

  • In section 3.1.1, for select and join, query is used, can use teradataml functions like join and select on dataframe itself.

  • In section 6, teradataml VectorDistance function can be used instead of query.

  • In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe, we can do that on teradataml
    Dataframe so no need to convert it to pandas dataframe.

Reviewer 2 comments:

  • Looks like notebook is not runnable.
  • Use fastexport to export data to CSV.

Reviewer 2 suggestions:

Function

  • copy_emb_to_sql does not have any definition. Please make sure to add definition for this function.
  • notebook exports data to CSV using pandas. teradataml supports fastexport to export data to CSV.
  • section 5 uses SQL Query to check the records. Use DataFrame.shape
@chetan-hirapara chetan-hirapara added the enhancement Adding graphs and improving output label Apr 8, 2024
@chetan-hirapara chetan-hirapara self-assigned this Apr 8, 2024
@chetan-hirapara
Copy link
Collaborator Author

chetan-hirapara commented Apr 8, 2024

Reviewer 1 comments:

  • Warnings are ignored --- warnings are hidden in all the notebooks as a standard followed to keep the notebook clean.

  • Plotly is used ploting. --- Scatterplot with bubble and hover details are note supported in teradataml hence used plotly functions. Same for heatmap.

  • Query used instead of DataFrame manipulation functions --- Fixed in pass6

  • Section 6, query is used for VectorDistance instead of teradataml function --- Fixed in pass6

  • In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe. --- Few code is fixed in pass6. remaining is not able to fix using teradataml dataframe. As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.

  • Reviewer 1 suggestions:

  • Warnings should not be ignored as they give additional insights --- warnings are hidden in all the notebooks as a standard followed to keep the notebook clean.

  • We can leverage teradataml inbuilt plot functionality for product recommendations. in 3.1.1 --- Fixed in pass6

  • In section 3.1.1, for select and join, query is used, can use teradataml functions like join and select on dataframe itself. --- Fixed in pass6

  • In section 6, teradataml VectorDistance function can be used instead of query. --- Fixed in pass6

  • In section 7, dataframe manipulation functions like select, merge are done on pandas dataframe, we can do that on teradataml --- Few code is fixed in pass6. remaining is not able to fix using teradataml dataframe. As this dataframes has more than 3000 columns, which is not able to handle by teradataml dataframe. teradataml dataframe is throwing error: OperationalError: [Version 17.20.0.0] [Session 1281] [Teradata Database] [Error 3919] Table has too many columns.
    Dataframe so no need to convert it to pandas dataframe. --- As most of the places it is fixed. Thanks

Reviewer 2 comments:

  • Looks like notebook is not runnable. --- It is runnable. We know it will take more time than usual one. We have written clear instruction at each places where it will be taking more than 15 sec
  • Use fastexport to export data to CSV. --- Fixed in pass6

Reviewer 2 suggestions:

Function

  • copy_emb_to_sql does not have any definition. Please make sure to add definition for this function. --- It is already exists inside helper function sql_helper_func.py
  • notebook exports data to CSV using pandas. teradataml supports fastexport to export data to CSV. --- Fixed in pass6
  • section 5 uses SQL Query to check the records. Use DataFrame.shape --- Section 4 and 5 is condtional step. So at section 5 I won't have dataframe if user skip Section 4. So it is mandatory to check table

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adding graphs and improving output
Projects
None yet
Development

No branches or pull requests

1 participant