UV Computation #535

Open
qingyuanxingsi opened this issue Dec 11, 2020 · 2 comments

Comments

@qingyuanxingsi

Given a log (a DataFrame) containing users' purchase records, how can I compute the number of distinct goods purchased by a given user? (A user may purchase the same good multiple times.) Could you share a short demo? Many thanks!

@tovbinm
Collaborator

tovbinm commented Dec 11, 2020

It's a classic case for aggregation. Simply aggregate the DataFrame by (user + item), so that repeated purchases of the same item collapse into one row, and then count the rows per user, e.g. https://mungingdata.com/apache-spark/aggregations/
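To make the aggregation idea concrete, here is a minimal sketch in plain Python. It is not Spark code and not this project's API; the `(user, item)` record layout, the sample data, and the function name `distinct_items_per_user` are all assumptions made for illustration.

```python
from collections import defaultdict


def distinct_items_per_user(purchases):
    """Count the distinct items each user bought.

    `purchases` is an iterable of (user, item) pairs (a hypothetical layout).
    Repeated purchases of the same item by the same user count once.
    """
    items_by_user = defaultdict(set)
    for user, item in purchases:
        # A set keeps one entry per (user, item), which is the deduplication
        # that grouping by (user + item) achieves in a DataFrame.
        items_by_user[user].add(item)
    return {user: len(items) for user, items in items_by_user.items()}


# Hypothetical purchase log: alice buys "apple" twice.
purchases = [
    ("alice", "apple"),
    ("alice", "apple"),
    ("alice", "pear"),
    ("bob", "apple"),
]
counts = distinct_items_per_user(purchases)
print(counts)
```

The same shape carries over to a DataFrame: deduplicate on (user, item) first, then count per user.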

@qingyuanxingsi
Author

@tovbinm Oh, I was wondering whether there is any built-in support. I want to use it with the window operator. I found a sum aggregator in the code, but could not find a count(distinct) operator.
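For reference, the following plain-Python sketch shows what a count(distinct) over a window would be expected to compute. It does not use the project's window operator or aggregators, which are not shown in this thread; the `(user, day, item)` record layout, the trailing-window definition, and the names `windowed_distinct_counts` and `window_days` are assumptions made for illustration.

```python
from collections import defaultdict


def windowed_distinct_counts(records, window_days):
    """Distinct items per user over a trailing window.

    `records` is an iterable of (user, day, item) tuples, with `day` an integer
    (a hypothetical layout). For every (user, day) that appears, the result holds
    the number of distinct items that user bought in [day - window_days + 1, day].
    """
    events_by_user = defaultdict(list)
    for user, day, item in records:
        events_by_user[user].append((day, item))

    result = {}
    for user, events in events_by_user.items():
        for day in sorted({d for d, _ in events}):
            start = day - window_days + 1
            # Rescan the user's events for each day: simple, but quadratic
            # in the number of events per user.
            in_window = {item for d, item in events if start <= d <= day}
            result[(user, day)] = len(in_window)
    return result


# Hypothetical log: alice buys "apple" on days 1 and 2.
records = [
    ("alice", 1, "apple"),
    ("alice", 2, "apple"),
    ("alice", 2, "pear"),
    ("alice", 5, "plum"),
    ("bob", 1, "apple"),
]
windowed = windowed_distinct_counts(records, window_days=3)
print(windowed)
```

Unlike a sum, a distinct count cannot be updated by subtracting the values that leave the window, so the sketch rebuilds the set of items for each window.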
