Python supported versions GPLv3 license PRs Welcome

Football Players Statistics WebCrawler

This project is a sub-module for Multiplayer Football Draft Simulator.

About

A web crawler that scrapes all football players' information from Sofifa and exports it to JSON format. The obtained data is then cleaned and analysed.

  • Crawler: Built on scrapy using python3
  • Analytics: IPython notebook using python3

The data is further exported to the Football Draft Backend, which serves it from an endpoint.

Steps to run the project

Easy Run

chmod +x ./run.sh
./run.sh

Manual Setup and Run

  • Setup virtualenv (optional, but recommended)

    virtualenv -p python3.8 env
    source env/bin/activate
    
  • Install project dependencies

    pip install -r requirements.txt
  • Change into the fifa_crawler directory (this is the main scrapy crawler directory)

    cd fifa_crawler
    
  • First run the URL spider (to get all players' URLs)

    scrapy crawl players_urls
  • After it completes successfully, run the stats spider (to get the players' statistics from the URLs collected above)

    scrapy crawl players_stats
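
The two crawl steps above can also be driven from a small Python script. This is a minimal sketch, assuming `scrapy` is installed in the active environment and the script is started from the repository root; the helper names `spider_commands` and `run_spiders` are hypothetical and not part of this project.

```python
import subprocess

# Spider names taken from the manual steps above. Order matters: the
# stats spider works from the URLs collected by the URL spider.
SPIDERS = ["players_urls", "players_stats"]


def spider_commands(spiders=SPIDERS):
    """Build one `scrapy crawl <name>` argument list per spider."""
    return [["scrapy", "crawl", name] for name in spiders]


def run_spiders(crawler_dir="fifa_crawler"):
    """Run each spider in order, stopping at the first failure."""
    for command in spider_commands():
        # check=True raises CalledProcessError on a non-zero exit code,
        # so the stats spider never runs after a failed URL crawl.
        subprocess.run(command, cwd=crawler_dir, check=True)
```

Calling `run_spiders()` should be roughly equivalent to the manual steps, with the difference that a failed first crawl stops the run.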

Scope/Aim as an individual project

Future features

  • Add analysis projects on the crawled data.
  • Extend the crawler to scrape teams data (currently only player data)
  • Improve speed of the crawler

Metadata

Click here to expand meta view, or go-here for a detailed view
id
  • type: string

  • example: "158023"

name
  • type: string

  • example: "Lionel Andrés Messi Cuccittini"

short_name
  • type: string

  • example: "L. Messi"

photo_url
primary_position
  • type: string

  • example: "RW"

positions
  • type: string[]

  • example: ["RW", "ST", "CF"]

age
  • type: string

  • example: "33"

birth_date
  • type: string (DateFormat is YYYY/MONTH_NAME_SHORT/DD)

  • example: "1987/Jun/24"
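
Since `birth_date` is stored as a string, consumers may want to convert it to a real date. The following is a minimal sketch, assuming the month part is always a three-letter English abbreviation as in the example; `parse_birth_date` is a hypothetical helper, not part of the crawler.

```python
from datetime import date

# English month abbreviations, mapped by hand so parsing does not depend
# on the locale of the running process.
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}


def parse_birth_date(value):
    """Convert a string such as "1987/Jun/24" into a datetime.date."""
    year, month_name, day = value.split("/")
    return date(int(year), MONTHS[month_name], int(day))
```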

height
  • type: integer (in cms)

  • example: 170

weight
  • type: integer (in kg)

  • example: 72

Overall Rating
  • type: integer

  • example: 93

Potential
  • type: integer

  • example: 93

Value
  • type: string (in euros)

  • example: "€103.5M"

Wage
  • type: string (in euros)

  • example: "€560K"

Preferred Foot
  • type: enum["Left", "Right"]

  • example: "Left"

Weak Foot
  • type: integer (range 1-5)

  • example: 4

Skill Moves
  • type: integer (range 1-5)

  • example: 4

International Reputation
  • type: integer (range 0-5)

  • example: 5

Work Rate
  • type: enum["Medium/Low"]

  • example: "Medium/Low"

Body Type
  • type: enum["Unique", "Normal (170-185)", "Normal (185+)", "Lean (170-185)", "Lean (185+)", "Stocky (170-185)", "Normal (170-)", "Stocky (185+)", "Stocky (170-)"]

  • example: "Unique"

Real Face
  • type: enum["Yes", "No"]

  • example: "Yes"

Release Clause
  • type: string (in euros)

  • example: "€212.2M"
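
Value, Wage and Release Clause are exported as formatted strings, so they need converting before any numeric analysis. Below is a minimal sketch, assuming only the `K` and `M` suffixes seen in the examples occur and that a missing suffix means plain euros; `parse_euros` is a hypothetical helper.

```python
from decimal import Decimal

# Suffix multipliers seen in the examples above. Treating a missing
# suffix as plain euros is an assumption.
MULTIPLIERS = {"K": 1_000, "M": 1_000_000}


def parse_euros(value):
    """Convert a string such as "€103.5M" into an integer number of euros."""
    amount = value.strip().lstrip("€")
    multiplier = 1
    if amount and amount[-1] in MULTIPLIERS:
        multiplier = MULTIPLIERS[amount[-1]]
        amount = amount[:-1]
    # Decimal avoids binary floating-point rounding on values like 212.2.
    return int(Decimal(amount) * multiplier)
```

With the examples above, `parse_euros("€103.5M")` gives `103500000` and `parse_euros("€560K")` gives `560000`.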

teams
  • type: map<string, integer> (including international and domestic clubs)

  • example:

{
    "FC Barcelona": 84,
    "Argentina": 83
}
attacking
  • type: map<attackOptions, integer>
attackOptions
  • type: enum["Crossing", "Finishing", "HeadingAccuracy", "ShortPassing", "Volleys"]
  • example:
{
    "Crossing": 85,
    "Finishing": 95,
    "HeadingAccuracy": 70,
    "ShortPassing": 91,
    "Volleys": 88
}
skill
  • type: map<skillOptions, integer>
skillOptions
  • type: enum["Dribbling", "Curve", "FKAccuracy", "LongPassing", "BallControl"]
  • example:
{
    "Dribbling": 96,
    "Curve": 93,
    "FKAccuracy": 94,
    "LongPassing": 91,
    "BallControl": 96
}
movement
  • type: map<movementOptions, integer>
movementOptions
  • type: enum["Acceleration", "SprintSpeed", "Agility", "Reactions", "Balance"]
  • example:
{
    "Acceleration": 91,
    "SprintSpeed": 80,
    "Agility": 91,
    "Reactions": 94,
    "Balance": 95
}
power
  • type: map<powerOptions, integer>
powerOptions
  • type: enum["ShotPower", "Jumping", "Stamina", "Strength", "LongShots"]
  • example:
{
    "ShotPower": 86,
    "Jumping": 68,
    "Stamina": 72,
    "Strength": 69,
    "LongShots": 94
}
mentality
  • type: map<mentalityOptions, integer>
mentalityOptions
  • type: enum["Aggression", "Interceptions", "Positioning", "Vision", "Penalties", "Composure"]
  • example:
{
    "Aggression": 44,
    "Interceptions": 40,
    "Positioning": 93,
    "Vision": 95,
    "Penalties": 75,
    "Composure": 96
}
defending
  • type: map<defendingOptions, integer>
defendingOptions
  • type: enum["DefensiveAwareness", "StandingTackle", "SlidingTackle"]
  • example:
{
    "DefensiveAwareness": 32,
    "StandingTackle": 35,
    "SlidingTackle": 24
}
goalkeeping
  • type: map<goalkeepingOptions, integer>
goalkeepingOptions
  • type: enum["GKDiving", "GKHandling", "GKKicking", "GKPositioning", "GKReflexes"]
  • example:
{
    "GKDiving": 6,
    "GKHandling": 11,
    "GKKicking": 15,
    "GKPositioning": 14,
    "GKReflexes": 8
}
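
The attribute groups above share the same shape (a map from attribute name to integer rating), so they can be summarised with one helper. This is a minimal sketch using values from the examples; `group_averages` is a hypothetical helper, and skipping groups that are missing from a record is an assumption.

```python
from statistics import mean

# Attribute group keys as listed in the metadata above.
GROUPS = [
    "attacking", "skill", "movement", "power",
    "mentality", "defending", "goalkeeping",
]


def group_averages(player):
    """Return the mean rating of each attribute group present in the record."""
    return {
        group: mean(player[group].values())
        for group in GROUPS
        if player.get(group)  # skip missing or empty groups
    }


# Partial record built from the examples above.
sample = {
    "attacking": {
        "Crossing": 85, "Finishing": 95, "HeadingAccuracy": 70,
        "ShortPassing": 91, "Volleys": 88,
    },
    "goalkeeping": {
        "GKDiving": 6, "GKHandling": 11, "GKKicking": 15,
        "GKPositioning": 14, "GKReflexes": 8,
    },
}
averages = group_averages(sample)
```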
player_traits
  • type: string["Technical Dribbler (AI)","Long Shot Taker (AI)","Flair","Speed Dribbler (AI)","Injury Prone","Long Passer (AI)","Playmaker (AI)","Power Header","Dives Into Tackles (AI)","Outside Foot Shot","Team Player","Finesse Shot","Leadership","Solid Player","Early Crosser","Long Throw-in","Comes For Crosses","Power Free-Kick","GK Long Throw","Cautious With Crosses","Rushes Out Of Goal","Saves with Feet","Chip Shot (AI)","Giant Throw-in","One Club Player"]

  • example:

[
    "Finesse Shot",
    "Long Shot Taker (AI)",
    "Speed Dribbler (AI)",
    "Playmaker (AI)",
    "Outside Foot Shot",
    "One Club Player",
    "Team Player",
    "Chip Shot (AI)"
]
player_hashtags
  • type: string["#Strength","#Acrobat","#Engine","#Speedster","#Dribbler","#Aerial Threat","#Tactician","#FK Specialist","#Crosser","#Distance Shooter","#Clinical Finisher","#Playmaker","#Tackling","#Complete Midfielder","#Complete Forward","#Poacher","#Complete Defender"] (Each tag starts with #)

  • example:

[
    "#Dribbler",
    "#Distance Shooter",
    "#FK Specialist",
    "#Acrobat",
    "#Clinical Finisher",
    "#Complete Forward"
]
logos
  • type: map<groupNames, logoAttributes>
groupNames
  • type: enum["country", "club", "nationalClub"]
logoAttributes
  • type: map<enum["name", "url"], string>

  • logoAttributes examples:

{
    "name": "Argentina",
    "url": "https://cdn.sofifa.com/flags/ar.png"
}
  • examples:
{
    "country": {
        "name": "Argentina",
        "url": "https://cdn.sofifa.com/flags/ar.png"
    },
    "club": {
        "name": "FC Barcelona",
        "url": "https://cdn.sofifa.com/teams/241/60.png"
    },
    "nationalClub": {
        "name": "Argentina",
        "url": "https://cdn.sofifa.com/teams/1369/60.png"
    }
}
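
Once exported, the records described above can be loaded and filtered with the standard library. This is a minimal sketch, assuming the export is a single JSON array of player objects; the file path passed to `load_players` and both helper names are hypothetical.

```python
import json


def load_players(path):
    """Load the exported JSON file (assumed to be an array of records)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def players_by_position(players, position):
    """Return the records whose `positions` list contains `position`."""
    return [
        player for player in players
        if position in player.get("positions", [])
    ]
```

For example, `players_by_position(load_players(path), "RW")` would return every record that lists `"RW"` among its positions.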

Contributing to the Project

We love your input! We want to make contributing to this project as easy and transparent as possible, whether it's:

  • Reporting a bug
  • Discussing the current state of the code
  • Submitting a fix
  • Proposing new features

Making a PR

  • Fork the repo and clone it on your machine.

  • Add an upstream remote pointing to the main repository in your cloned repo

    git remote add upstream https://github.com/sauravhiremath/fifa-stats-crawler.git
    
    
  • Keep your cloned repo up to date by pulling from upstream (this also helps avoid merge conflicts when committing new changes)

    git pull upstream master
    
  • Create your feature branch

    git checkout -b <feature-name>
    
  • Commit all the changes

    git commit -am "Meaningful commit message"
    
  • Push the changes for review

    git push origin <feature-name>
    
  • Create a PR against our repo on GitHub.

Additional Notes

  • Code should be properly commented to ensure its readability.
  • If you've added code that should be tested, add tests as comments.
  • In Python, use docstrings to provide tests.
  • Make sure your code is properly formatted.
  • Issue that pull request!

Issue suggestions/Bug reporting

When you are creating an issue, make sure it's not already present. Furthermore, provide a proper description of the changes. If you are suggesting any code improvements, provide thorough details about the improvements.

Great Issue suggestions tend to have:

  • A quick summary of the changes.
  • In case of a bug, provide steps to reproduce
    • Be specific!
    • Give sample code if you can.
    • What you expected would happen
    • What actually happens
    • Notes (possibly including why you think this might be happening, or stuff you tried that didn't work)

Additional References:

A more detailed step-by-step guide with pictures for creating a pull request can be found here