Trying to fix ytsearch:, here's where I've got to #210

cloudrac3r · 2020-10-23T11:15:47Z

Test command: youtube-dlc ytsearch:lol --flat-playlist -J --verbose

For searches, youtube-dl/c tries to download some representation of the search page encoded as JSON which contains HTML strings, visible around youtube.py:3289:

data = self._download_json( ...
html_content = data[1]['body']['content']

However, when this code runs, the _download_json call fails because it tries to parse HTML as JSON. The query parameter that youtube-dl/c was using, spf=navigate, is now ignored by YouTube, so YouTube just returns an ordinary HTML page of results.

There may now be a different query parameter that gets the results in the same format, but if there is, I don't know what it is.

Otherwise we'll have to request the data from YouTube in a different format. Here's what I've got to on that:

post_data = {
    'context': {
        'client': {
            'clientName': 'WEB',
            'clientVersion': '2.20201022.01.01',
        }
    },
    'query': query # the search query goes here
}
result_url = 'https://www.youtube.com/youtubei/v1/search?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8' # this key is the same globally

Then pass result_url as the URL and add these parameters to the _download_json call:

data=json.dumps(post_data).encode('utf-8'),
headers={
    'content-type': 'application/json'
}
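
Putting the two fragments above together, here is a minimal standalone sketch of the request. It uses the standard library's urllib.request rather than youtube-dl's own _download_json, and the function name build_search_request is just something I made up for illustration. The URL, key and clientVersion are the ones quoted above and may well stop working at some point.

```python
import json
import urllib.request

# URL and key copied from the post above; assumed to still be accepted.
SEARCH_URL = ('https://www.youtube.com/youtubei/v1/search'
              '?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8')


def build_search_request(query, client_version='2.20201022.01.01'):
    """Build (but do not send) a POST request for a search query.

    Hypothetical helper, not part of youtube-dl/c.
    """
    post_data = {
        'context': {
            'client': {
                'clientName': 'WEB',
                'clientVersion': client_version,
            }
        },
        'query': query,
    }
    return urllib.request.Request(
        SEARCH_URL,
        # Passing data makes urllib send this as a POST.
        data=json.dumps(post_data).encode('utf-8'),
        headers={'content-type': 'application/json'},
    )
```

To actually fetch results you would pass the request to urllib.request.urlopen and run json.load on the response; that part is left out here so the sketch does not hit the network.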

Now you have a pure JSON representation of the results, which you can step into with:

data.contents.twoColumnSearchResultsRenderer.primaryContents.sectionListRenderer.contents[1].itemSectionRenderer.contents

Depending on the search terms, sometimes the 1 index is a 0.
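
Since the index can be 0 or 1, one way around it is to not hard-code the index at all and just look through every section. A rough sketch follows; the videoRenderer and videoId keys are my assumption about what the individual result entries look like (they are not shown above), and extract_video_ids is a made-up name.

```python
def extract_video_ids(data):
    """Collect video IDs from a search response, whichever section they are in.

    Assumes each video result is a dict with a 'videoRenderer' key that
    contains a 'videoId'; entries without one are skipped.
    """
    sections = (
        data.get('contents', {})
        .get('twoColumnSearchResultsRenderer', {})
        .get('primaryContents', {})
        .get('sectionListRenderer', {})
        .get('contents', [])
    )
    video_ids = []
    for section in sections:
        # Sections that are not itemSectionRenderer yield an empty list here.
        items = section.get('itemSectionRenderer', {}).get('contents', [])
        for item in items:
            video = item.get('videoRenderer')
            if video and 'videoId' in video:
                video_ids.append(video['videoId'])
    return video_ids
```

This returns an empty list rather than raising if the response has a different shape, which seems safer given how often the layout changes.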

I don't have the energy to continue arranging the data into a format that the rest of the code likes. Hopefully someone can pick up from my work.

Peace.


blackjack4494 · 2020-10-23T13:53:38Z

YouTube rolls out updates gradually: not everyone gets the new version at once, and not all components are updated at the same time, which is what is happening with search now. Search now uses the continuation method, as most feeds already do.
Prior to that, youtube-dl relied on buttons/links embedded in the page.
The new way is actually much easier and cleaner.

Just look out for this

{
  "continuationItemRenderer": {
    "trigger": "CONTINUATION_TRIGGER_ON_ITEM_SHOWN",
    "continuationEndpoint": {
      "clickTrackingParams": "CBwQui8iEwjmxNfj58rsAhXJgt4KHf0cDIc=",
      "commandMetadata": {
        "webCommandMetadata": {
          "url": "/service_ajax",
          "sendPost": true,
          "apiUrl": "/youtubei/v1/search"
        }
      },
      "continuationCommand": {
        "token": "Eps......",
        "request": "CONTINUATION_REQUEST_TYPE_SEARCH"
      }
    }
  }
}

Especially the "continuationCommand":{"token": part.
You will get this token on almost every page, and you have to use it with the youtubei/v1 API.

For search you would use
https://www.youtube.com/youtubei/v1/search?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8
and a request payload of the following form:

{"context":<DICT>,
"continuation":<KEY>}

where <KEY> is the token from continuationCommand.
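
As a rough illustration, the token can be pulled out by walking the response recursively, so it does not matter where the continuationItemRenderer ends up. Both function names here are made up for the sketch, and the context dict is whatever you already send with the first request.

```python
def find_continuation_token(node):
    """Return the first continuationCommand token found in node, or None."""
    if isinstance(node, dict):
        command = node.get('continuationCommand')
        if isinstance(command, dict) and 'token' in command:
            return command['token']
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        # Strings, numbers, booleans and None cannot contain a token.
        return None
    for child in children:
        token = find_continuation_token(child)
        if token is not None:
            return token
    return None


def build_continuation_payload(context, token):
    """Build the request payload for the next page of results."""
    return {'context': context, 'continuation': token}
```

When find_continuation_token returns None there is no further page to request.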

The response is JSON. Each response contains another (new) token until there are no more results.
Keep in mind that you should use some rate limiter (e.g. a 2-3 second delay between requests), since too many requests will lead to YouTube returning errors.
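
A sketch of what that loop could look like, with the delay built in. The fetching and token extraction are passed in as functions (fetch_page and get_token are placeholder names, not anything from youtube-dl), so this only shows the paging and rate-limiting logic; the 2 second default is just the lower end of the range suggested above.

```python
import time


def iter_pages(fetch_page, get_token, first_token, delay=2.0,
               sleep=time.sleep, max_pages=None):
    """Yield result pages until no continuation token is left.

    fetch_page(token) must return the parsed JSON for that token and
    get_token(page) must return the next token or None. Both are
    supplied by the caller.
    """
    token = first_token
    count = 0
    while token:
        if count:
            # Wait between requests, but not before the first one.
            sleep(delay)
        page = fetch_page(token)
        yield page
        count += 1
        if max_pages is not None and count >= max_pages:
            break
        token = get_token(page)
```

The max_pages argument is there because a search can have a very large number of result pages and you usually only want the first few.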

It's quite an easy fix to implement. However, I don't currently have the time to do so.
I have a fix for feeds like watch history that uses the exact same approach, but it is still on a local branch and not yet pushed to GitHub. I may have more time in the next few days or week to polish those fixes and incorporate them.

There is a screenshot of history feed progress on Gitter (link)

blackjack4494 · 2020-10-23T14:38:42Z

Seems youtube-dl just implemented this fix. Nice, less work for me then :D
