Download comments? #16

Open
ipkpjersi opened this issue Jul 15, 2023 · 9 comments
Labels
enhancement New feature or request

Comments

@ipkpjersi

Hi,

I was wondering if it would be possible to add the ability to download comments made by a user? For example, if I log in and want to download my subscribed communities and preferences etc, would it also be possible to download my comments?

Cheers.

@CMahaff
Owner

CMahaff commented Jul 15, 2023

Download would be possible, but upload would not. So it would only let you read your old comments.

CMahaff added the enhancement label on Jul 15, 2023
@ipkpjersi
Author

Good to know, thanks. Downloading is still worth it IMO.

@ipkpjersi
Author

ipkpjersi commented Jul 29, 2023

If I put a bounty on this feature, would that help prioritize development of it?

Today helped me realize that this is actually an important feature to me. It seems that lemmy.one is down, and it may be gone for good since the API says it is having database issues (and if they don't have working backups it's probably just gone). If it really is gone for good, that means my comment history is gone too, which makes me realize how important this feature is in terms of having an archive.

@CMahaff
Owner

CMahaff commented Jul 30, 2023

That's very kind of you, but in all honesty probably not, simply because my time to work on this is very limited, and the more I think about it, the more I think this request might be out of scope of this tool.

I have to consider that if I add this, downloads will take a lot longer, it's more APIs I have to support, it's more data in the profile versions I have to transfer, it's more things I have to test each release, etc. All of these things make it harder to get new releases out when the Lemmy API changes, which right now, is a lot - and I think most people get more value out of timely releases than an archive feature.

That said I do think there is value to such a service, so if I have the time / energy (which I can't promise) I might fork LASIM and use the software base to make such an archiving tool. It sounds counterintuitive at first, but I think that would actually be less work overall since an archiving tool wouldn't need all the profile/uploading stuff, but would need some new APIs. If I did make such a tool it would fetch your posts, comments, and maybe saved posts, while LASIM would continue to focus on settings, subscriptions, and blocks. In conjunction, you'd basically have everything.

But again I can't promise I'll have the bandwidth since my personal life has gotten very busy, but I'll be sure to let you know if I pick it up - or if I hear anyone else has for that matter.

Could also be something to PR Lemmy about, especially since I think some kind of account export is supposed to be required to meet the GDPR?

@ipkpjersi
Author

It's pretty disappointing to hear that it probably won't happen, honestly, especially since it does seem like I have lost my own comment history from lemmy.one going down.

I looked into doing it myself, because honestly if you want something done, that's the best way to do it. This, unfortunately, isn't looking much better.

I couldn't find any API endpoint for downloading my own comments from an instance. Lemmy's API documentation (https://join-lemmy.org/api/) isn't really API documentation: it only describes a JavaScript/TypeScript HTTP client and doesn't explain any of the API endpoints.

I assumed this would literally just be one endpoint for comments by user, then maybe one endpoint for posts by user, etc but it seems like maybe it's not that simple?

@CMahaff
Owner

CMahaff commented Jul 31, 2023

Yeah the documentation is pretty poor unfortunately.

I usually look at the Rust API directly.

This file shows all the endpoints: https://github.com/LemmyNet/lemmy/blob/0.18.3/src/api_routes_http.rs

Then you can find all the types here: https://github.com/LemmyNet/lemmy/tree/0.18.3/crates/api_common/src

In your case, I think you want to make a GET request to <instance>/api/v3/user.

You would pass a JSON object containing the fields shown in GetPersonDetails (see link above) and receive a GetPersonDetailsResponse JSON object (see link above).

But yeah, I'm not sure whether setting limit in GetPersonDetails high enough lets you get them all in one go, or whether you'll have to write a loop that pulls them down in pages until there are no more.
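
A minimal sketch of the page-by-page loop described above, in Python. It makes several assumptions that should be checked against the linked 0.18.3 source: that the parameters are sent as a query string (as the URL tried later in this thread does), that `username`, `page`, and `limit` are the relevant GetPersonDetails fields, that pages start at 1, and that the response carries the comments under a `comments` key. The function names and the `max_pages` safety cap are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_user_url(instance, username, page, limit=50):
    """Build the GET URL for one page of a user's details (assumed query fields)."""
    query = urllib.parse.urlencode({"username": username, "page": page, "limit": limit})
    return f"https://{instance}/api/v3/user?{query}"


def fetch_json(url):
    """Fetch a URL and decode its JSON body. This performs a real HTTP request."""
    request = urllib.request.Request(url, headers={"User-Agent": "comment-archive-sketch"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def fetch_all_comments(instance, username, fetch=fetch_json, limit=50, max_pages=1000):
    """Collect comments page by page until a page comes back empty."""
    comments = []
    for page in range(1, max_pages + 1):  # assumption: pages are 1-indexed
        data = fetch(build_user_url(instance, username, page, limit))
        batch = data.get("comments", [])  # assumption: response key is "comments"
        if not batch:
            break
        comments.extend(batch)
    return comments
```

Passing a different `fetch` function makes it possible to try the loop against canned data before pointing it at a real instance. The result could then be written to disk with `json.dump`.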

@ipkpjersi
Author

ipkpjersi commented Aug 12, 2023

I whipped up a quick PHP script this evening for scraping user data (comments, posts, etc). It's not ideal and it could be a lot better, but it's at least something, and the pagination works nicely: https://gist.github.com/ipkpjersi/3ce754391ab8390f23d9b9f80fca3994

I'm still hoping a gui-based approach will be added, or maybe something officially through Lemmy itself, but at least this is a decent stopgap for now.

I think one of the cool things about being able to export your own comments is that you could, in theory, create a web application that reads the comments from JSON and displays them in a readable manner as a web page.
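
As a rough illustration of that idea, here is a small Python sketch that turns an exported list of comments into an HTML list. It assumes each exported entry has the shape of a comment view with a nested `comment` object (holding `content` and `published`) and a nested `post` object (holding `name`); those key names are assumptions to verify against a real export, and `render_comments` is a hypothetical helper.

```python
import html


def render_comments(comments):
    """Render exported comment entries as a simple HTML unordered list."""
    items = []
    for view in comments:
        comment = view.get("comment", {})  # assumed nested comment object
        post = view.get("post", {})        # assumed nested post object
        items.append(
            "<li><strong>{}</strong> <em>{}</em><p>{}</p></li>".format(
                html.escape(post.get("name", "")),
                html.escape(comment.get("published", "")),
                html.escape(comment.get("content", "")),
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Every field is escaped before it is placed in the page, so markup inside a comment is shown as text rather than interpreted by the browser. Rendering the Markdown in the comment body is left out of this sketch.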

edit: Also, it only seems to work with lemmy.ml and not instances like lemmy.one or lemmy.world? For example, this API call returns nothing: https://lemmy.one/api/v3/user/?username=ipkpjersi&limit=50

@CMahaff
Owner

CMahaff commented Aug 12, 2023

If your account is on lemmy.world you can probably just pass your username, but if you want to access that profile from another instance you probably need to pass username@lemmy.world.
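
The naming rule suggested above can be sketched as a small Python helper. The function name and its arguments are hypothetical, and whether a given instance actually accepts the `username@instance` form is something to confirm against that instance.

```python
def qualify_username(username, home_instance, query_instance):
    """Return the name to pass as `username` when asking query_instance
    about an account that lives on home_instance."""
    name = username.lstrip("@")
    if "@" in name:
        # Already in name@instance form, so leave it unchanged.
        return name
    if home_instance.lower() == query_instance.lower():
        # Asking the account's own instance: the bare name should be enough.
        return name
    # Asking some other instance: append the home instance.
    return f"{name}@{home_instance}"
```

For example, `qualify_username("ipkpjersi", "lemmy.world", "lemmy.ml")` gives `ipkpjersi@lemmy.world`, while asking lemmy.world itself gives the bare `ipkpjersi`.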

@ipkpjersi
Author

Yeah, but if I'm making a GET request to https://lemmy.one/api/v3/user/?username=ipkpjersi&limit=50, presumably it would just return all of that account's local posts on that instance, but it literally returns a 404 error instead. (I have an account on each of those 3 instances I listed, with posts on all of them IIRC.)

Maybe different instances have different API versions or something? But they're all on roughly the same backend version. Maybe some of them require some sort of authentication like a token?
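
One way to check the version question is to ask each instance what it reports. The Python sketch below assumes that `GET /api/v3/site` returns a JSON object with a `version` field, which should be verified against the linked source. The helper names are hypothetical, and this does not by itself explain the 404.

```python
import json
import urllib.request


def site_url(instance):
    """URL of the site-info endpoint (assumed to report the backend version)."""
    return f"https://{instance}/api/v3/site"


def fetch_json(url):
    """Fetch a URL and decode its JSON body. This performs a real HTTP request."""
    request = urllib.request.Request(url, headers={"User-Agent": "version-check-sketch"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def backend_versions(instances, fetch=fetch_json):
    """Map each instance to its reported version, or to an error description."""
    versions = {}
    for instance in instances:
        try:
            versions[instance] = fetch(site_url(instance)).get("version")
        except (OSError, ValueError) as error:
            # urllib's HTTP and network errors are OSError subclasses;
            # a body that is not valid JSON raises a ValueError subclass.
            versions[instance] = f"error: {error}"
    return versions
```

Calling `backend_versions(["lemmy.ml", "lemmy.one", "lemmy.world"])` would show side by side whether the three instances report the same version or fail in different ways.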
