Should Vimeo fuzzy rules be adapted? #169

Open
benoit74 opened this issue Apr 18, 2024 · 0 comments

FYI, in warc2zim2 we had to slightly adapt the Vimeo fuzzy rules so that they support more scenarios. I'm not sure this has to be reflected in wabac, but I prefer to share the findings ^^

I did not take the time to test a WARC with your replay solution.

Change the video rewriting

See openzim/warc2zim@47e104c

What I've observed is that in our test on https://website.test.openzim.org/vimeo.html, our adaptation of the fuzzy rule at

"match": /\/\/.*(?:gcs-vimeo|vod|vod-progressive)\.akamaized\.net.*?\/([\d/]+\.mp4)/,
wasn't matching at all, both because the domain did not match (134vod-adaptive.akamaized.net) and because there were query parameters (I am not sure whether this second point is a bug in our adaptation of the fuzzy rule).

I've decided for now to add support for the new domain and to keep the range parameter (which seems to be the only important one from a replay perspective).
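To make the change concrete, here is a minimal sketch of the idea, not the actual warc2zim or wabac code. It extends the rule above with a `vod-adaptive` alternative and keeps only the `range` query parameter in the fuzzy key. The function name `vimeoVideoFuzzyKey`, the `vimeo-cdn.fuzzy.example` key prefix and the sample URL are assumptions made up for illustration.

```javascript
// Sketch only: the rule quoted above, plus a "vod-adaptive" alternative.
const VIMEO_VIDEO_RE =
  /\/\/.*(?:gcs-vimeo|vod|vod-progressive|vod-adaptive)\.akamaized\.net.*?\/([\d/]+\.mp4)/;

// Hypothetical helper: returns a fuzzy key for a Vimeo video URL,
// or null when the URL is not matched by the rule.
function vimeoVideoFuzzyKey(url) {
  const match = VIMEO_VIDEO_RE.exec(url);
  if (match === null) {
    return null;
  }
  // Placeholder prefix (assumption), followed by the numeric path of the video.
  const key = "vimeo-cdn.fuzzy.example/" + match[1];
  // Keep only the "range" query parameter; every other parameter is dropped.
  const range = /[?&]range=([^&#]*)/.exec(url);
  return range === null ? key : key + "?range=" + range[1];
}

// Made-up URL on the domain mentioned above.
console.log(
  vimeoVideoFuzzyKey(
    "https://134vod-adaptive.akamaized.net/exp=1/~hmac=abc/123/456/789.mp4?foo=bar&range=0-1000"
  )
);
```

With this shape, two requests for the same file that differ only in signing or tracking parameters map to the same key, while requests for different byte ranges stay distinct.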

Rewrite preview image from the CDN

The preview image (the one displayed before the user starts the video) comes from the i.vimeocdn.com domain. Query parameters are added to request a size / quality matching the player's needs. From our experience, these query parameters are dynamically adapted, most probably based on viewport size or maybe other factors.

For instance, on my laptop there are two queries issued for the test video on https://website.test.openzim.org/vimeo.html:

But this is not what Browsertrix crawler got with --mobileDevice "Pixel 2":

We hence had to rewrite these URLs as well. For now, we decided to simply drop the query parameters. It is far from perfect, but from our experience there are just too many conditions to know which query parameter values would be present in the WARC and which will be requested at replay time.

Ideally we would benefit from using the "greatest resolution available" ... but I failed to find how to do it easily. I considered rewriting only when the mh parameter is present, but that seems pretty fragile.
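Dropping the query parameters could look roughly like the sketch below. It is an illustration under assumptions, not the code from the warc2zim commit: the function name `stripVimeoPreviewQuery`, the sample image path and the `mw` parameter are made up, and only the `mh` parameter name comes from this issue.

```javascript
// Hypothetical helper: removes the whole query string from preview image
// URLs served by i.vimeocdn.com and leaves every other URL untouched.
function stripVimeoPreviewQuery(url) {
  const parsed = new URL(url);
  if (parsed.hostname !== "i.vimeocdn.com") {
    return url;
  }
  // Setting search to the empty string also removes the "?" separator.
  parsed.search = "";
  return parsed.toString();
}

// Two made-up requests for the same image at different sizes...
const atCrawlTime = stripVimeoPreviewQuery("https://i.vimeocdn.com/video/123-abc?mh=360");
const atReplayTime = stripVimeoPreviewQuery("https://i.vimeocdn.com/video/123-abc?mw=1280&mh=720");
// ...end up with the same key, so the archived image can be served for both.
console.log(atCrawlTime === atReplayTime);
```

The trade-off is the one described above: whatever size was captured is served for every request, whether or not it matches what the player asked for.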

benoit74 changed the title from "Should Vimeo rules be adapted?" to "Should Vimeo fuzzy rules be adapted?" on Apr 18, 2024