Generate extended stats from your Last.fm scrobbles using your local music library, MusicBrainz and YouTube.

simongoricar/lastfm-extended-scrobbles

lastfm-extended-scrobbles

Uses Last.fm | With data from MusicBrainz | Python 3.8+ | Poetry

1. Installation

To run lastfm-extended-scrobbles, download the latest release or clone/download this repository to some directory.

Then, if you have Poetry installed, run poetry install. Otherwise, install the dependencies from the generated requirements.txt file with pip install -r requirements.txt.

2. Usage

2.1. Setup

Copy the example configuration file at data/config.EXAMPLE.toml to data/config.toml and fill in the Last.fm API key and secret, the path to the scrobbles JSON file (see below) and the path to your local music library. Other settings can be left at their defaults.

Before running the script, you need to have a JSON file with your Last.fm scrobbles. This script can process a list of pages returned by the Last.fm API.

The recommended way to save your scrobbles in the correct JSON format is the provided script at data/download-scrobbles.py. Run it with python download-scrobbles.py --username myusername to download your scrobbles into a JSON file in the data directory (the configuration file must already be filled out and the dependencies installed for the script to work).
An alternative is the JSON output of a site like ghan.nl/scrobbles, but the loved tracks column will always be 0 this way.
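
To illustrate what "a list of pages" can look like, here is a minimal sketch that flattens such pages into a single list of scrobbles. It assumes each page has the shape of a Last.fm user.getRecentTracks JSON response (recenttracks, then track); the exact layout written by download-scrobbles.py is not described in this README, so the structure, the sample data and the flatten_pages helper are all illustrative assumptions.

```python
from typing import Any, Dict, List


def flatten_pages(pages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flatten a list of API pages into one list of simplified scrobbles."""
    scrobbles: List[Dict[str, Any]] = []
    for page in pages:
        tracks = page.get("recenttracks", {}).get("track", [])
        # Defensive: a page holding one track may carry a dict, not a list.
        if isinstance(tracks, dict):
            tracks = [tracks]
        for track in tracks:
            # Entries without a date (e.g. "now playing") are not scrobbles.
            if "date" not in track:
                continue
            scrobbles.append({
                "artist": track["artist"]["#text"],
                "title": track["name"],
                "album": track["album"]["#text"],
                # An empty MBID string is treated as "no MBID".
                "track_mbid": track.get("mbid") or None,
                "timestamp": int(track["date"]["uts"]),
            })
    return scrobbles


# Made-up sample data in the assumed page shape.
example_pages = [
    {"recenttracks": {"track": [
        {"artist": {"mbid": "", "#text": "Artist A"}, "name": "Song 1",
         "album": {"mbid": "", "#text": "Album X"}, "mbid": "",
         "date": {"uts": "1600000000", "#text": "13 Sep 2020, 12:26"}},
        # A "now playing" entry has no date and is skipped.
        {"artist": {"mbid": "", "#text": "Artist A"}, "name": "Song 3",
         "album": {"mbid": "", "#text": "Album X"}, "mbid": ""},
    ]}},
    {"recenttracks": {"track": {
        "artist": {"mbid": "", "#text": "Artist B"}, "name": "Song 2",
        "album": {"mbid": "", "#text": "Album Y"}, "mbid": "example-mbid",
        "date": {"uts": "1600000300", "#text": "13 Sep 2020, 12:31"}}}},
]

for scrobble in flatten_pages(example_pages):
    print(scrobble["timestamp"], scrobble["artist"], "-", scrobble["title"])
```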

lastfm-extended-scrobbles has several search methods, tried in this order:

  • Local music library lookup via track MBID (extremely fast once indexed)
  • Local music library lookup via track metadata (both exact and fuzzy matching, still very fast)
  • MusicBrainz track MBID lookup (not too bad)
  • YouTube search (slowest, but also only if previous methods fail)
  • If everything above fails, an entry with just the scrobble data is created (but some columns will be empty)

If the first method fails to find a match, the second one is attempted, and so on. It is therefore highly recommended that you link your local music library (which should be properly tagged) with this script if you have one.
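
The fallback order above can be sketched as a simple chain of lookup functions. This is not the script's actual code: the resolve function, the stub lookups and the result fields are hypothetical and stand in for the real library, MusicBrainz and YouTube searches.

```python
from typing import Callable, Dict, List, Optional, Tuple

Scrobble = Dict[str, str]
Match = Dict[str, object]
Lookup = Callable[[Scrobble], Optional[Match]]


def resolve(scrobble: Scrobble, lookups: List[Tuple[str, Lookup]]) -> Match:
    """Try each lookup in order and return the first match found."""
    for source, lookup in lookups:
        match = lookup(scrobble)
        if match is not None:
            return {"source": source, **match}
    # Nothing matched: keep only the scrobble data, other fields stay empty.
    return {"source": "scrobble only", "title": scrobble["title"], "length": None}


# Hypothetical stand-ins for the real search methods.
LOCAL_BY_MBID = {"mbid-1": {"title": "Song A", "length": 201.3}}


def local_by_mbid(scrobble: Scrobble) -> Optional[Match]:
    return LOCAL_BY_MBID.get(scrobble.get("track_mbid", ""))


def no_match(scrobble: Scrobble) -> Optional[Match]:
    return None


LOOKUPS: List[Tuple[str, Lookup]] = [
    ("local library (MBID)", local_by_mbid),
    ("local library (metadata)", no_match),
    ("MusicBrainz", no_match),
    ("YouTube", no_match),
]

print(resolve({"title": "Song A", "track_mbid": "mbid-1"}, LOOKUPS))
print(resolve({"title": "Unknown song"}, LOOKUPS))
```

Because the chain stops at the first hit, the slower network lookups only run for scrobbles that the local library cannot resolve.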

Genres are looked up via Last.fm tags and filtered against the large list of genres provided by Beets' LastGenre plugin. The number of genres to output is configurable via max_genre_count in the configuration file and defaults to 4. Track genres have the highest priority, then album genres, then artist genres.
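
As a rough illustration of the genre selection described above, the sketch below keeps only tags that appear in a list of known genres and fills the result in track, album, artist priority order. It is a simplification under assumptions: the pick_genres helper, the tiny known_genres set and the sample tags are made up, and tag weights are ignored here.

```python
from typing import Iterable, List, Set


def pick_genres(
    track_tags: Iterable[str],
    album_tags: Iterable[str],
    artist_tags: Iterable[str],
    known_genres: Set[str],
    max_genre_count: int = 4,
) -> List[str]:
    """Pick up to max_genre_count genres, preferring track over album over artist."""
    picked: List[str] = []
    for tags in (track_tags, album_tags, artist_tags):
        for tag in tags:
            if len(picked) >= max_genre_count:
                return picked
            genre = tag.strip().lower()
            # Drop tags that are not real genres, and avoid duplicates.
            if genre in known_genres and genre not in picked:
                picked.append(genre)
    return picked


# A tiny stand-in for the much larger genre list.
known_genres = {"rock", "indie rock", "shoegaze", "dream pop", "alternative"}

genres = pick_genres(
    track_tags=["Shoegaze", "seen live", "dream pop"],
    album_tags=["indie rock", "shoegaze", "favourites"],
    artist_tags=["rock", "alternative", "british"],
    known_genres=known_genres,
)
print(genres)  # ['shoegaze', 'dream pop', 'indie rock', 'rock']
```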

Important (TL;DR): Before running, the Last.fm API key/secret, music library location and scrobbles file path must be filled out and saved in the configuration file at data/config.toml (use the data/config.EXAMPLE.toml file as a template).

2.2. Run the script

If you used Poetry for the install, run the script with poetry run python analyse.py. If not, use python analyse.py.

The script will first index your music library and then proceed to generate a spreadsheet (xlsx extension) with the extended scrobble information. The resulting spreadsheet will be (by default) saved to data/output-timestamp.xlsx.

2.3. Maintenance and troubleshooting

If the content of your music library changes (or its path does), you must delete the music library cache at data/cache/library_cache.json, otherwise the script will not work properly.
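
If you prefer to script this step, here is a minimal sketch that removes the cache file. The clear_library_cache helper is not part of the project; the default path is the one given above and is assumed to be relative to the repository root.

```python
from pathlib import Path


def clear_library_cache(
    cache_path: Path = Path("data/cache/library_cache.json"),
) -> bool:
    """Delete the library cache file; return True if a file was removed."""
    if cache_path.is_file():
        cache_path.unlink()
        return True
    return False


if __name__ == "__main__":
    if clear_library_cache():
        print("Library cache removed; it will be rebuilt on the next run.")
    else:
        print("No library cache found, nothing to do.")
```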

If unexpected errors pop up during the analysis and they aren't caused by something like a configuration issue, please open a GitHub issue with details of the problem.

3. Extended Data

This tool outputs the original data from each scrobble and attempts to improve columns like artist names and track titles. It also extends the data with the track length and with genres derived from Last.fm tags.

All columns:

  • Track source: local music library / MusicBrainz / YouTube
  • Scrobble time: unix epoch
  • Artist name and MusicBrainz ID: uses scrobbled info unless there's a local library match
  • Album title and MusicBrainz ID: same as above
  • Track title and MusicBrainz ID: same as above
  • Track length: in seconds, with one decimal place for local library files
  • Track love: based on loved tracks from Last.fm
  • Genre: merged track/album/artist genres, sorted by weight
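
When working with the resulting spreadsheet, the scrobble time and track length columns are often easier to read after conversion. The helpers below are a small sketch, assuming the scrobble time is a unix epoch in seconds (as Last.fm reports it); the function names are made up for this example.

```python
from datetime import datetime, timezone


def scrobble_time_utc(epoch_seconds: int) -> datetime:
    """Convert a unix epoch (seconds) into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def format_length(seconds: float) -> str:
    """Format a track length in seconds as m:ss, rounded to whole seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}:{secs:02d}"


print(scrobble_time_utc(1600000000).isoformat())  # 2020-09-13T12:26:40+00:00
print(format_length(201.3))  # 3:21
```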

A note on expected speed

The speed of this script varies quite a bit. If most of your scrobbles are present in your local music library (for example around 90%), a fresh run without any cache should take around 2 minutes per 1000 scrobbles. If you are processing scrobbles from a longer period of time and you listened to the same tracks multiple times, expect it to be faster, as the cache will be hit much more often. The slowest operations are requests to MusicBrainz, YouTube and/or Last.fm, which are bound by their rate limits.
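
For a rough idea of what the figure above means for your own history, the arithmetic can be sketched as follows. The estimate_minutes helper is only an illustration; it applies the 2 minutes per 1000 scrobbles figure linearly, which holds only for a fresh run with most tracks found locally.

```python
def estimate_minutes(scrobble_count: int, minutes_per_thousand: float = 2.0) -> float:
    """Rough duration estimate for a fresh run without any cache."""
    return scrobble_count / 1000 * minutes_per_thousand


print(estimate_minutes(25000))  # 50.0
```

Repeated listens and a warm cache should bring the real duration below this estimate, while a library with few local matches will push it higher.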