Feature/expectation maximization #1424
base: dev
Conversation
I would be happy to consolidate with #1420. My implementation is a bit different in that it iterates over each CPD in series (one at a time), rather than updating all of them per iteration. I think this can be adjusted pretty simply.
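To make the difference between the two approaches concrete, here is a hedged sketch of the two update schedules; this is illustrative pseudocode, not pgmpy's actual API, and `e_step` and `m_step_for` are hypothetical stand-ins (the E-step computes expected sufficient statistics from the current parameters, the M-step re-estimates one named parameter from them):

```python
def em_joint(params, e_step, m_step_for, n_iter=20):
    """One E-step per iteration, then re-estimate every parameter from it
    (the 'update all CPDs per iteration' schedule)."""
    for _ in range(n_iter):
        stats = e_step(params)
        params = {name: m_step_for(name, stats) for name in params}
    return params

def em_sequential(params, e_step, m_step_for, n_iter=20):
    """Re-run the E-step after each single-parameter update, so later
    updates within the same sweep already see the earlier ones
    (the 'one CPD at a time' schedule)."""
    for _ in range(n_iter):
        for name in list(params):
            stats = e_step(params)
            params = {**params, name: m_step_for(name, stats)}
    return params
```

Both variants aim for the same stationary points; the sequential one just propagates each update immediately, which is why the two implementations should be straightforward to reconcile.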
Codecov Report
@@            Coverage Diff             @@
##              dev    #1424      +/-   ##
==========================================
- Coverage   93.93%   93.50%    -0.44%
==========================================
  Files         134      135        +1
  Lines       14150    14227       +77
==========================================
+ Hits        13292    13303       +11
- Misses        858      924       +66
Continue to review full report at Codecov.
@kyjohnso Thanks for the PR. Yes, I think it would be a great idea to merge our two implementations. I had a quick look at your code and I have a few comments. Please correct me if I am misunderstanding anything.
In my opinion, the best way forward would be to use my implementation (as I have some optimizations implemented: inference and value caching) and add the extra features from your implementation on top of that. What do you think?
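The "value caching" optimization mentioned above can be sketched roughly as follows; this is a guess at the idea rather than the PR's actual code, and `run_inference` is a hypothetical callback. The point is that duplicate evidence rows in the dataset need only one inference call:

```python
def make_cached_query(run_inference):
    """Wrap an inference callback so repeated evidence configurations
    (duplicate rows in the dataset) hit a cache instead of re-running
    inference. `run_inference` takes a dict of {variable: value}."""
    cache = {}

    def query(evidence):
        # Sorted items make the key hashable and order-independent.
        key = tuple(sorted(evidence.items()))
        if key not in cache:
            cache[key] = run_inference(evidence)
        return cache[key]

    return query
```

Since EM repeats the E-step over the same dataset every iteration, datasets with many repeated rows can save a large fraction of the inference calls this way.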
Hi @ankurankan, yes, I agree that your approach is the best way forward. Here are my thoughts:
Let's move forward with your implementation and then add the sample-by-sample latent variables. Do you want me to wait until your pull request is approved, and then I can help update it? Also, I can help with unit tests and example ipynbs, or certainly help with the core EM work as well. Thanks for the quick response; I look forward to working with you.
@kyjohnso I think we are on the same page about the implementation then. I agree with your idea of having the option to specify a prior distribution, falling back to random distributions otherwise. If you would like, could you have a look at my PR? There is some bug that I am not able to figure out: in my implementation, the values always converge after the 2nd iteration, which I don't think is correct. I will briefly explain the steps of my implementation to make it easier to follow if you go through it:
Also, I am using the following two models for testing the implementation:
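The step-by-step explanation and the two test models did not survive in this thread, so as a generic illustration of the E- and M-steps being discussed, here is the classic two-coin mixture worked end to end. This is a stand-alone toy, not the PR's implementation and not either of the models mentioned above:

```python
import math

# Stand-alone toy (not pgmpy code): EM for a mixture of two biased coins.
# Each entry in `data` is the number of heads in n=10 flips of one coin;
# which coin was flipped is the latent variable.
data = [9, 8, 9, 1, 2, 8, 1, 2]
n = 10

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

pi, pA, pB = 0.5, 0.6, 0.4  # initial guesses: mixing weight, head probs
for _ in range(50):
    # E-step: posterior responsibility of coin A for each trial
    r = [pi * binom_pmf(k, n, pA)
         / (pi * binom_pmf(k, n, pA) + (1 - pi) * binom_pmf(k, n, pB))
         for k in data]
    # M-step: re-estimate all parameters from the expected counts
    pi = sum(r) / len(data)
    pA = sum(ri * k for ri, k in zip(r, data)) / (n * sum(r))
    pB = sum((1 - ri) * k for ri, k in zip(r, data)) / (n * sum(1 - ri for ri in r))

# pA and pB should land near the two groups' empirical head rates
# (roughly 0.85 and 0.15), with pi near 0.5.
```

The "converges after the 2nd iteration" symptom described above would correspond, in this toy, to the responsibilities `r` freezing after one sweep, which is a useful thing to check first when debugging.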
@ankurankan Yep, I am happy to look at your PR; I will probably have some time in the next couple of days. If I find something, or figure out how to add the features we wrote about, would you prefer that I fork your fork, or would you rather incorporate my code another way?
@kyjohnso Thanks a lot. I have pushed a new branch.
@kyjohnso Hi, I don't know if you got a chance to look at my implementation yet. But I was going through it again, and it seems like it was working fine all along. For fully observed datasets, the learned parameters are quite accurate. With latent variables, it's not very accurate, but I think your implementation also gives similar results, if I am not mistaken? I was just expecting it to be more accurate in the latent-variable case. I have pushed my latest code; if you would like, you can maybe implement your features on top of that? I will work on optimizing the implementation in the meantime.
Ignore my last comment, I finally found the bug. It should be working as expected now. |
Hi @ankurankan, yes, I have been taking a look at your implementation and got a quick start running the example model you sent in the comments. I will pull the new code and start implementing my features on that.
@ankurankan are you still using the models that you posted above to test your implementation? |
@kyjohnso Yes, with an extra