Generate Plotly graphs of the most viewed Wikipedia pages of the year

addshore/wikipedia-year-in-plotly

Wikipedia year in plotly

Generates Plotly graphs for a given year from the Wikimedia pageview API.

The process starts from the most viewed pages of each month in the year, which are then processed into the graphs described below.
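As a rough illustration of that first step, the sketch below builds the request URL for one month of top page views and merges several monthly results into one series per article. The helper names `topViewsUrl` and `collectMonthlyViews` are hypothetical and not taken from main.js; the URL layout follows the public Wikimedia REST "top" pageviews endpoint, and the choice of `all-access` is an assumption.

```javascript
// Sketch only: helper names are hypothetical, not taken from main.js.
const API_ROOT = "https://wikimedia.org/api/rest_v1/metrics/pageviews/top";

// Build the URL for one month's most viewed articles.
// "all-days" asks the API for the whole month rather than a single day.
function topViewsUrl(project, year, month) {
  const mm = String(month).padStart(2, "0");
  return `${API_ROOT}/${project}/all-access/${year}/${mm}/all-days`;
}

// Merge per-month article lists into a Map of title -> views per month.
// `monthlyArticles[i]` is the list of { article, views } objects for month i.
// Months in which an article did not appear in the top list stay at 0.
function collectMonthlyViews(monthlyArticles) {
  const byTitle = new Map();
  monthlyArticles.forEach((articles, monthIndex) => {
    for (const { article, views } of articles) {
      if (!byTitle.has(article)) {
        byTitle.set(article, new Array(monthlyArticles.length).fill(0));
      }
      byTitle.get(article)[monthIndex] = views;
    }
  });
  return byTitle;
}
```

For example, `topViewsUrl("en.wikipedia", 2020, 3)` produces a URL ending in `/en.wikipedia/all-access/2020/03/all-days`. Fetching those URLs and handling errors is left out of the sketch.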

You can find some of the generated plots here:

A description of these graphs can be found below:

  • Overview: A mixture of the "top" articles from the other graphs listed (peaks, change, total).
  • Peaks: Articles that had the highest monthly page view values in the year.
  • Change: Articles that had the largest change between their high and low in the year.
  • Total: Articles that had the most views overall in the year.
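The four graph types above could be derived from per-article monthly view counts roughly as follows. This is a minimal sketch, not the script's actual logic: the function names are hypothetical, "change" is assumed to mean the absolute difference between an article's highest and lowest month, and the overview is assumed to be the de-duplicated union of the other three lists.

```javascript
// Sketch only: names and exact metric definitions are assumptions.
// `views` maps an article title to an array of monthly view counts.
const peak = (months) => Math.max(...months);
const change = (months) => Math.max(...months) - Math.min(...months);
const total = (months) => months.reduce((sum, v) => sum + v, 0);

// Return the titles of the `limit` articles scoring highest on `metric`.
function rankBy(views, metric, limit) {
  return Object.entries(views)
    .map(([title, months]) => ({ title, score: metric(months) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.title);
}

// Overview: the top articles from peaks, change and total, without duplicates.
function overview(views, limit) {
  const picks = [peak, change, total].flatMap((metric) =>
    rankBy(views, metric, limit)
  );
  return [...new Set(picks)];
}
```

With this definition, an article with steady high traffic ranks well on total but poorly on change, while a one-month spike ranks well on peaks and change.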

The 2020 overview plot looks like this:

Other interesting links:

Running the code:

Install the dependencies using npm

npm install

To run this code you need a Plotly account, and you must create the .plotly_user and .plotly_token files containing your account details.
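A script could load those two files along these lines. This is a sketch under assumptions: `readPlotlyCredentials` is a hypothetical helper, and each file is assumed to hold only the value (username or API token), possibly followed by a newline.

```javascript
const fs = require("fs");
const path = require("path");

// Sketch only: hypothetical helper, not taken from main.js.
// Assumes each dotfile contains just the value, optionally with a trailing newline.
function readPlotlyCredentials(dir = ".") {
  const read = (name) => fs.readFileSync(path.join(dir, name), "utf8").trim();
  return { user: read(".plotly_user"), token: read(".plotly_token") };
}
```

If either file is missing, `fs.readFileSync` throws, which surfaces the setup problem before any requests are made.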

To run for different years you currently need to alter the code.

Then run the script with a project and a year as arguments, such as "en.wikipedia" and "2020".

node main.js <project> <year>

Note: 2016 is the first year this will work for, due to the limited data contained in the pageview API.

If you want to dump the data as it passes through the script you can do something like:

DUMP_DATA=1 node main.js en.wikipedia 2020
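An environment-variable switch like this is commonly implemented with a small guard around the logging call. The sketch below shows one way it might look; `dumpStage` is a hypothetical helper, and the output format is an assumption rather than what the script actually prints.

```javascript
// Sketch only: `dumpStage` is a hypothetical helper, not the script's own.
// Prints `data` as JSON when DUMP_DATA is set to a non-empty value.
function dumpStage(label, data, env = process.env, log = console.log) {
  if (!env.DUMP_DATA) return false; // unset or empty: stay quiet
  log(`--- ${label} ---`);
  log(JSON.stringify(data, null, 2));
  return true;
}
```

Passing `env` and `log` as parameters keeps the helper easy to exercise without touching the real environment or console.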
