mdezube/sms-analysis

sms-analysis

Python/IPython code to analyze one's text messages. Intended to work out of the box.

Author: Michael Dezube <michael dezube at gmail dot com>

For further discussions: Join the chat at https://gitter.im/mdezube/sms-analysis

Overview of code

This code will:

  1. Find your latest iPhone sync. This is automatic only on Macs; on PCs, edit table_connector.py so it can find the file
  2. Load up the messages database and address book database locally
  3. Merge the databases together into fully_merged_messages_df which you can freely play with
  4. Visualize a word tree of your text messages with a specific contact, see word tree screenshot
  5. Show you who you text the most
  6. Create an interactive streamgraph to visualize how your texting with people has trended over time, see streamgraph screenshot
  7. Create a word cloud of the words you use, and those used by your contacts, see word cloud screenshot
  8. Use TFIDF to understand what words identify your contacts' verbiage
  9. Use TFIDF to understand what words identify the difference between contacts' verbiage. For example, how do high school friends talk differently from college friends? See tfidf contact comparison
  10. Use TFIDF to show you what topics were popular in texts you sent, or texts sent to you, and how this progressed over the years
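
The TFIDF steps above can be illustrated with a minimal pure-Python sketch. It assumes each contact's messages are pooled into one document and uses plain term frequency times log inverse document frequency. The function name `tfidf_by_contact` and the sample data are hypothetical, and the notebook itself may use a different weighting or a library implementation.

```python
import math
from collections import Counter


def tfidf_by_contact(messages_by_contact):
    """Return {contact: {word: score}}, treating each contact as one document.

    Hypothetical helper for illustration; not taken from the repository.
    """
    # Term counts per contact.
    counts = {
        contact: Counter(word for text in texts for word in text.lower().split())
        for contact, texts in messages_by_contact.items()
    }
    n_docs = len(counts)
    # Document frequency: for each word, how many contacts' texts contain it.
    doc_freq = Counter(word for c in counts.values() for word in c)

    scores = {}
    for contact, c in counts.items():
        total = sum(c.values())
        scores[contact] = {
            word: (n / total) * math.log(n_docs / doc_freq[word])
            for word, n in c.items()
        }
    return scores


# Made-up sample data standing in for real messages.
sample = {
    "Alice": ["see you at the gym", "gym at six"],
    "Bob": ["see you at the office"],
}
scores = tfidf_by_contact(sample)
top_alice = max(scores["Alice"], key=scores["Alice"].get)
```

Words shared by every contact score zero, so the highest-scoring words are the ones that set a contact apart.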

Note: none of your data is modified nor sent anywhere during execution

Dependencies easy install

If you don't have pip, see https://pip.pypa.io/en/stable/installing/, or if using a Mac run sudo easy_install pip

Then run pip install -r requirements.txt and pip install "matplotlib>=1.4"

If the second command fails, you'll have to follow these detailed Matplotlib install instructions

Dependencies with details

  1. Pandas
  2. IPython
  3. Matplotlib
    • The majority of the code will work without this, but certain graphs will fail
  4. An iPhone, having synced with this computer
  5. If running on a Mac, code will work out of the box. If running on a PC, change the variable BASE_DIR in table_connector.py to the directory of your backups
    • This post seems to specify the location of backups on Windows.
  6. An internet connection to load the Google Visualization API (it's a very small file)
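
For the BASE_DIR setting in item 5, here is a rough sketch of choosing a platform-dependent backup directory and picking the most recently modified backup inside it. The macOS and Windows paths are commonly cited default locations and are assumptions here, so verify them on your machine. The names `default_backup_dir` and `latest_backup` are hypothetical and are not functions from table_connector.py.

```python
import os
import sys
import tempfile
import time
from pathlib import Path


def default_backup_dir():
    """Guess the backup directory for the current platform (assumed paths)."""
    if sys.platform == "darwin":
        return Path.home() / "Library/Application Support/MobileSync/Backup"
    if sys.platform.startswith("win"):
        appdata = os.environ.get("APPDATA", "")
        return Path(appdata) / "Apple Computer/MobileSync/Backup"
    raise RuntimeError("No known default; set the directory by hand.")


def latest_backup(base_dir):
    """Return the most recently modified subdirectory of base_dir, or None."""
    subdirs = [p for p in Path(base_dir).iterdir() if p.is_dir()]
    return max(subdirs, key=lambda p: p.stat().st_mtime, default=None)


# Demonstration on a temporary directory rather than real backups.
demo_root = Path(tempfile.mkdtemp())
old, new = demo_root / "old_backup", demo_root / "new_backup"
old.mkdir()
new.mkdir()
now = time.time()
os.utime(old, (now - 3600, now - 3600))  # pretend this one is an hour older
os.utime(new, (now, now))
newest = latest_backup(demo_root)
```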

Quick Start - Jupyter Notebook

  1. Start the IPython notebook like so: jupyter notebook sms_analysis.ipynb
  2. Under the menu choose Cell --> Run All
  3. Edit CONTACT_NAME and ROOT_WORD in the last cell to alter the visualization, then re-run that cell: under the menu choose Cell --> Run Cell
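
To give a feel for what CONTACT_NAME and ROOT_WORD control, the sketch below counts which words directly follow a root word in one contact's texts, which is roughly the first level of branches a word tree displays. The data, the values assigned to the two variables, and the helper `next_words` are all made up for illustration; the notebook builds its visualization differently.

```python
from collections import Counter

CONTACT_NAME = "Alice"  # made-up contact
ROOT_WORD = "love"      # made-up root word

# Made-up (contact, text) pairs standing in for the merged messages.
messages = [
    ("Alice", "I love pizza"),
    ("Alice", "we love pizza and love hiking"),
    ("Bob", "love you"),
]


def next_words(messages, contact, root):
    """Count the word that immediately follows `root` in one contact's texts."""
    follow = Counter()
    for sender, text in messages:
        if sender != contact:
            continue
        words = text.lower().split()
        # Pair every word with its successor and keep successors of the root.
        for current, following in zip(words, words[1:]):
            if current == root.lower():
                follow[following] += 1
    return follow


branches = next_words(messages, CONTACT_NAME, ROOT_WORD)
```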

Quick Start - Command Line

  • Run python table_connector.py to see a sample of the messages and address book data
  • Run python table_connector.py --full to see a sample of the messages and address book data with all of their columns
  • Run python table_connector.py <output directory> to output the messages and address book data into CSV files
  • Run python table_connector.py --full <output directory> to output the messages and address book data into CSV files with all of their columns
  • Run python table_connector.py --help to see the arguments and their options
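
The invocations above imply one optional flag and one optional positional argument. As a sketch only, such an interface could be declared with argparse as follows. This is not the parser from table_connector.py; the destination name `output_directory` and the help strings are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented invocations (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Sample or export the messages and address book data."
    )
    parser.add_argument(
        "--full", action="store_true", help="Include all columns."
    )
    parser.add_argument(
        "output_directory",
        nargs="?",          # optional: omit it to print a sample instead
        default=None,
        help="Directory to write CSV files into.",
    )
    return parser


parser = build_parser()
sample_args = parser.parse_args([])
export_args = parser.parse_args(["--full", "out"])
```

With no arguments both values stay at their defaults, which corresponds to the "show a sample" case in the list above.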

Screenshots from running the code

Example word tree

Example streamgraph

Example word cloud

Example TFIDF contact comparison

Example of Clustering
