Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add dynamic avatar view for project avatars #181

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

agjohnson
Copy link
Contributor

@agjohnson agjohnson commented Jun 24, 2023

I waffled on this one a bit, but did an initial pass inside this
application instead of the core application. I'm impartial here though.

This adds a URL and view to a dynamic view that:

  • Redirects to the project's remote repository avatar URL if it exists
  • Shows a generated avatar SVG from a template otherwise

This replaces the use of Dicebear, a third party API and site. Dicebear
was added as a placeholder for some nicer images.

It also uses a different style for the avatars, using a reimplementation
of the Dicebear "initials" API instead of the "shapes" API. The shapes
look pretty but they don't feel identifiable either. The project avatar
breaks up the project name/slug into words and makes and acronym from
that.

Closes #131

Screencast.from.2023-06-23.18-57-17.webm

I waffled on this one a bit, but did an initial pass inside this
application instead of the core application. I'm impartial here though.

This adds a URL and view to a dynamic view that:

- Redirects to the project's remote repository avatar URL if it exists
- Shows a generated avatar SVG from a template otherwise

This replaces the use of Dicebear, a third party API and site. Dicebear
was added as a placeholder for some nicer images.

It also uses a different style for the avatars, using a reimplementation
of the Dicebear "initials" API instead of the "shapes" API. The shapes
look pretty but they don't feel identifiable either. The project avatar
breaks up the project name/slug into words and makes and acronym from
that.
@agjohnson
Copy link
Contributor Author

@humitos @stsewd Pinging you all for some input on this one. For now, I'm wondering if there is anything obviously wrong with this approach, or if there are opinions on where/how this is placed in our code.

I do have reservations about hitting the application for a list of images like this, but with CDN caching these, I'm not too worried.

@stsewd
Copy link
Member

stsewd commented Jun 26, 2023

I do have reservations about hitting the application for a list of images like this, but with CDN caching these, I'm not too worried.

So, we won't be able to use the CDN here, at least not that easily. We disabled caching on our main app to prevent any cache poisoning attacks. If we want to enable caching here, we would need to have per-view logic for caching, similar to what we have for serving docs. But I prefer if we don't enable caching in our main app, maybe use just two/three letters + three colors, and have them serve from another domain (assets.readthedocs.org).

@agjohnson
Copy link
Contributor Author

Ah good point. I'd like to avoid generating these, as it would be hard to do this well -- there would be several thousand files to generate at least.

But perhaps I'll look into SVG parameterization instead, I think that might work here. Let me try that next. If that works, we'd have a single static asset and the HTML would be a little more complex.

@agjohnson
Copy link
Contributor Author

So, parameterization is not a good solution still and CSS doesn't quite have enough reach here either, not that customizing with CSS seems like a particularly good idea anyways.

I'm not considering pre-generating these images as an option anymore either. It's too hacky for what I want to accomplish, and I want to extend this pattern to team, organizations, and users.

Also, I wanted to move this logic into the application. Similar to how HTMX and Django templates push logic back into application code, I think we shouldn't be writing frontend specific logic whenever we can. If I can't write this with application code, it's going to be purely JS driven and will be a bit more complicated.

Cache poisoning wouldn't be a concern here, as there is no request information that goes into generating the images. What needs to happen in order to mark this view able to be cached?

We don't need the CDN necessarily either. Worst case, the browser cache might be enough, but it won't be seeded and will be prone to synchronized cache invalidation for the user.

@stsewd
Copy link
Member

stsewd commented Jun 28, 2023

Cache poisoning wouldn't be a concern here, as there is no request information that goes into generating the images. What needs to happen in order to mark this view able to be cached?

The issue with enabling caching is if we can do it only for the responses we know are okay to cache, for example, if we enable a rule for anything that ends in /avatar/project/\w+/, then that can be used to cache any other views that allow you to end in that string (like in GHSA-7fcx-wwr3-99jv). And we should also don't cache 404s, since our new templates include the current username and csrf token in then. So, if cache is enabled, would be better for the application to decide if the view should be cached or not, or write a very specific rule for it in CF.

Also, the query being used takes the current user into account Project.objects.public(self.request.user).

@agjohnson
Copy link
Contributor Author

for example, if we enable a rule for anything that ends in /avatar/project/\w+/, then that can be used to cache any other views that allow you to end in that string

Maybe this is a separate discussion, but why use an ends with match for a URL like this? Does a full regex match not work for some reason?

Also, the query being used takes the current user into account Project.objects.public(self.request.user).

Good point, I forgot this was part of that query. That definitely makes this not cacheable, but perhaps it's only important to check for authorization?

It seems like the path of least resistance is maybe doing this on the front end side and removing all the template code. That way at least I can reuse the same logic in Django templates and Knockout templates.

@stsewd
Copy link
Member

stsewd commented Jun 28, 2023

Maybe this is a separate discussion, but why use an ends with match for a URL like this? Does a full regex match not work for some reason?

I think it's possible to write a very specific rule, but we need to be sure to cache that path on 200 only.

Comment on lines +53 to +54
random.seed(self.object.pk)
return random.choice(self.COLORS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I forget, this should make a separate instance https://docs.python.org/3/library/random.html#random.Random, instead of setting the seed globally.

@agjohnson
Copy link
Contributor Author

Right now, I'm considering if it is possible to do this on the front end code side. I'm worried that I'll need authorization information however, and I'll be back to just wanting this as a back end view. But, I suspect if the templates just output the information I need for each project, and the front end code is the part that is responsible for showing the correct image, that shouldn't need any additional information from the back end.

@agjohnson
Copy link
Contributor Author

I've also opened up readthedocs/readthedocs.org#10513 because one of the main limitations on the project creation page is that I don't have all of the fields that I need and can't search on the API v3 remote repository endpoint. With a full project API response on the remote repo lookup, this is maybe less of an issue to handle this with front end logic.

Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would simplify this endpoint.

I'd remove the authorization completely from this endpoint and always return an avatar even if the project doesn't exist. Basically, the only thing we want to do here is just: "given a slug (valid or invalid, existent or not existent) return an image of the first letter of each word".

We are not leaking anything in that way and we can cache the response forever without worries.

Comment on lines +81 to +90
def get_queryset(self):
return Project.objects.public(self.request.user)

def get_object(self):
queryset = self.get_queryset()
project_slug = self.kwargs.get("project_slug")
return get_object_or_404(
queryset,
slug=project_slug,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to perform authorization here and also query the database? I'd remove this and do not perform any validation against the DB on the slug received.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's doing a db query to get the remote_repo.avatar_url for projects that have one. That is, this view isn't outputting a generated image for every project, only projects without a connected remote repo.

So it seems like authorization would be required too, or a malicious user could scan for project slugs.

The reason for this is because API driven views don't have the same data as template driven views -- namely project remote repo details and project permissions. I would still need to maintain duplicate logic in templates and JS to show the correct avatar, but I think my comment above would at least make this easier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that the frontend only calls this endpoint when it doesn't have the remote_repo.avatar_url. So, when trying to display the avatar for a repository, if it's not available, it calls this endpoint and the "two letters avatar" is generated -- without authentication. Isn't it that possible?

Copy link
Contributor Author

@agjohnson agjohnson Jul 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That possible, and it's also not hard to do this on the front end side either.

However, the reason I wanted this view was because we don't always have project.remote_repo.avatar_url in API responses, so our front end code can't make the same decision that our backend code can. Consolidating this on the backend solves that, but @stsewd noted security/caching issues.

So, I'm probably leaning towards just doing this on the front end side and saving 10-20 requests per page load. I just need readthedocs/readthedocs.org#10513, which would be a good place to jump in.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, the reason I wanted this view was because we don't always have project.remote_repo.avatar_url in API responses

Can we make our API to return this attribute instead to simplify this work and also allow CDN caching without leaking anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have clarified. readthedocs/readthedocs.org#10513 would allow me to use the v3 API, which has project expansion and would therefore would give me the avatar_url.

We could make the call whether to output the SVG on the frontend or backend, but I'd still probably lean towards a frontend implementation as the SVG that is generated is only ~600B. The request time for all of those images would probably be less than parameterizing the SVG on the frontend side.

But caching an endpoint like you're describing would be an option still.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can easily do it in the frontend side, I'd prefer that 👍🏼

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aye, I'll try this implementation again when I'm back and we can compare the load timing, etc. I think the front end implementation is looking most likely though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace Dicebear with static generated images
3 participants