attribute age range added to movie #413

Open
wants to merge 1 commit into
base: master
from

Conversation

zahra-ash0uri

No description provided.

@alberanid alberanid self-assigned this Nov 22, 2022
@alberanid alberanid added enhancement http parsers of IMDb web pages labels Nov 22, 2022
@alberanid
Collaborator

Thanks for the PR!

I'm still undecided about it: the same information can be found in the 'certificates' key, in a list of "country:certificate" strings; would that be enough for your use case?

The value you extract is probably the one for the USA. I'm mostly worried that not all movies have the same entries in that banner (e.g. "PG-13 | 2h 22min | Drama, Romance | 06 Oct 1994 (Italy) | Movie" for Forrest Gump, accessed via browser), so it could be difficult to assign the correct meaning to the first element. What happens for movies that were not rated, for example?

An example:

#!/usr/bin/env python

import imdb

ia = imdb.Cinemagoer('http')

# Forrest Gump (IMDb ID 0109830)
fg = ia.get_movie('0109830')
# print(sorted(fg.keys()))  # uncomment to list every available key
print(fg['certificates'])

will print:

['Argentina:13', 'Australia:M', 'Brazil:14', 'Canada:PG::(British Columbia/Manitoba/Nova Scotia/Ontario)', 'Canada:G::(Quebec)', 'Canada:PG::(Alberta)', 'Denmark:7', 'Ecuador:12::(self-applied)', 'Ecuador:7+::(self-applied)', 'Egypt:Not Rated::(self-applied)', 'Finland:K-14', 'Finland:K-12', 'France:Tous publics', 'Germany:12', 'Germany:6::(video rating)', 'Greece:K-12', 'Hong Kong:16+::(self-applied)', 'Hungary:12', 'Iceland:L', 'India:UA', 'Indonesia:21::(self-applied)', 'Ireland:15', 'Ireland:12', 'Israel:PG', 'Italy:T', 'Japan:PG12', 'Malaysia:P13', 'Mexico:13', 'Netherlands:12', 'New Zealand:M', 'Nigeria:PG', 'Norway:11', 'Norway:12::(TV rating)', 'Peru:14', 'Philippines:18+::(self-applied)', 'Poland:15', 'Portugal:M/12::(Qualidade)', 'Russia:0+', 'Saudi Arabia:PG', 'Singapore:PG', 'Singapore:NC-16::(TV rating)', 'Singapore:PG13', 'South Africa:PG::(self-applied)', 'South Korea:12', 'South Korea:15::(DVD rating)', 'Spain:A', 'Sweden:11', 'Taiwan:7+::(self-applied)', 'Thailand:u 13+::(self-applied)', 'Turkey:7+::(DVD Rating)', 'United Kingdom:12', 'United Kingdom:12A', 'United States:TV-PG', 'United States:PG-13', 'Ukraine:ZA', 'United Arab Emirates:15+::(self-applied)']

What's your opinion?

Thanks!

@zahra-ash0uri
Author

@alberanid
You are welcome!
I actually need that standard value (such as PG-13, R, GP, NC-17, etc.) so that I can decide the age number according to my country.
For example, I use this mapping:
AGE_RANGES = {
    "G": 0, "PG": 3, "GP": 3, "PG-13": 13, "R": 17, "NC-17": 18,
    "TV-Y": 0, "TV-Y7": 7, "TV-G": 0, "TV-PG": 7, "TV-14": 14, "TV-MA": 17,
    "Not Rated": -1,
    "TV_G": 0, "TV_PG": 7, "TV_14": 14, "NC_17": 18, "PG_13": 13,
    "TV_Y": 0, "TV_Y7": 7, "TV_MA": 17,
    "Unrated": -1, "NOT RATED": -1,
    "Approved": 13, "Passed": 3, "APPROVED": 13, "UNRATED": -1,
    "TV-Y7-FV": 7, "PG-15": 15,
}
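
The mapping lists several spellings of the same rating ("TV-PG" and "TV_PG", "Not Rated" and "NOT RATED"). Below is a minimal sketch of a lookup that normalises the label first, so that one entry per rating is enough. The helper name `certificate_to_age` and the trimmed-down table are illustrative assumptions, not part of this PR or of Cinemagoer:

```python
# Hypothetical, shortened table: one canonical (upper-case, hyphenated)
# key per rating, with ages taken from the mapping above.
CANONICAL_AGES = {
    "G": 0, "PG": 3, "GP": 3, "PG-13": 13, "R": 17, "NC-17": 18,
    "TV-Y": 0, "TV-Y7": 7, "TV-G": 0, "TV-PG": 7, "TV-14": 14, "TV-MA": 17,
    "NOT RATED": -1, "UNRATED": -1, "APPROVED": 13, "PASSED": 3,
}


def certificate_to_age(label, default=None):
    """Return the age for a rating label, ignoring case and '_' vs '-'."""
    key = label.strip().upper().replace("_", "-")
    return CANONICAL_AGES.get(key, default)


print(certificate_to_age("PG_13"))      # same entry as "PG-13"
print(certificate_to_age("Not Rated"))  # same entry as "NOT RATED"
```

Unknown labels return `default` instead of raising, which leaves the decision about unrated or unexpected values to the caller.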

@alberanid
Collaborator

I see; yes, probably the best way to do it is to check whether your country is present in the 'certificates' list and, if not, fall back to the value for the USA or any other available one. After that you can map that value to an age, like you are already doing.

In any case, let's keep this PR open for a while; maybe it can be useful to parse the values in that banner, but we need to better understand whether it's possible to identify the information unambiguously.

Plus, right now everything is a mess since some recent changes in IMDb broke half of our parsers (see #419 and #421; more will follow).
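
A rough sketch of that lookup, assuming the entries keep the "Country:Rating::(note)" shape seen in the output above. The function names `parse_certificate` and `pick_certificate` are hypothetical helpers, not part of Cinemagoer, and taking the first entry for a country is an arbitrary choice, since a country can appear more than once:

```python
def parse_certificate(entry):
    """Split 'Country:Rating::(note)' into (country, rating, note)."""
    main, _, note = entry.partition("::")
    country, _, rating = main.partition(":")
    return country, rating, note


def pick_certificate(certificates, preferred, fallback="United States"):
    """Return (country, rating) for the preferred country, else the
    fallback country, else any available entry; None if the list is empty."""
    by_country = {}
    for entry in certificates:
        country, rating, _note = parse_certificate(entry)
        by_country.setdefault(country, []).append(rating)

    for country in (preferred, fallback):
        if country in by_country:
            # Several ratings may exist per country; take the first one.
            return country, by_country[country][0]

    if by_country:
        country = next(iter(by_country))
        return country, by_country[country][0]
    return None


# A few entries copied from the Forrest Gump output above.
sample = ['Canada:G::(Quebec)', 'Italy:T',
          'United States:TV-PG', 'United States:PG-13']
print(pick_certificate(sample, "Italy"))
print(pick_certificate(sample, "Iran"))
```

The returned rating can then be passed through a rating-to-age mapping. Which of several entries for one country is the right one (theatrical, TV, DVD) would still need a rule based on the note part.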

Thanks!

3 participants