Happiness and Altruism in the United States

This repository contains all of the files necessary for an investigation of happiness and altruism in the United States using data from NORC's General Social Survey (GSS). The aim of this study was to determine whether happiness leads to altruistic behavior.

Note
The research paper associated with this study is available here.

Getting Started

Requirements

This project requires both the R programming language and Quarto. If you do not have these tools in your development environment, please install them now. You will also need an integrated development environment (IDE) capable of running R scripts. I recommend RStudio (local) or Posit Cloud (cloud-based).

Once your environment is set up, you must install several packages that handle various tasks, like graphing data, creating tables, and general organization and processing. You will find a complete list of these packages in the file scripts/00-install_dependencies.r. You only need to run this file once to install the required dependencies.
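
As a rough illustration of what a one-time dependency install can look like, here is a minimal sketch. It assumes the package list given under Acknowledgments; the helper name find_missing is hypothetical, and the actual contents of scripts/00-install_dependencies.r may differ.

```r
# Packages named in this README's Acknowledgments section (assumed list;
# the real install script may include others).
required_packages <- c(
  "dplyr", "ggplot2", "here", "janitor", "kableExtra", "knitr",
  "lubridate", "opendatatoronto", "readr", "RColorBrewer", "scales",
  "tidyverse"
)

# Hypothetical helper: return the packages that are not yet installed.
find_missing <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- find_missing(required_packages)

# Only install from an interactive session (such as the RStudio console),
# so sourcing this sketch non-interactively has no side effects.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking for missing packages first keeps a second run from reinstalling everything.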

Download the data

Note
A step-by-step guide for how to download this data is available here.

The first step in working with this project is to download the following three data sets from the General Social Survey.

Once you download the data from GSS, place the GSS.dat and GSS.dct files in inputs/data/raw and run scripts/01-data_covert.r to convert the data to a .csv file.

Clean the data

Before moving to data analysis, we must clean the generated .csv files so that the relevant data points are easier to filter, use, and understand. The scripts/02-data_cleaning.r file handles all of the data cleaning, including fixing column names (many contain characters that cannot be used or are insufficient descriptors), selecting the appropriate columns, and filtering out any rows that contain null data.

Run the file to fetch the raw data sets, clean them, and then create new .csv files with the clean data. At the end of this process, you should have six new files in inputs/data/clean:

  • directions_negative_data.csv
  • directions_positive_data.csv
  • happy_negative_data.csv
  • happy_positive_data.csv
  • homeless_negative_data.csv
  • homeless_positive_data.csv
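
The three cleaning steps described above (fixing column names, selecting columns, and dropping rows with missing values) can be sketched in base R as follows. The data frame, its column names, and the clean_column_names helper are hypothetical stand-ins, not the contents of scripts/02-data_cleaning.r. The janitor and dplyr packages listed under Acknowledgments provide comparable functions such as clean_names(), select(), and filter().

```r
# Toy stand-in for a raw GSS extract; column names and values are hypothetical.
raw_data <- data.frame(
  "YEAR" = c(2002, 2004, 2012, 2014),
  "HAPPY " = c(1, 2, NA, 3),
  "HELP-HOMELESS" = c(2, NA, 4, 6),
  check.names = FALSE
)

# Hypothetical helper: lower-case the names, replace runs of characters other
# than letters and digits with "_", and trim leading/trailing underscores.
clean_column_names <- function(names) {
  names <- tolower(trimws(names))
  names <- gsub("[^a-z0-9]+", "_", names)
  gsub("^_+|_+$", "", names)
}

clean_data <- raw_data
names(clean_data) <- clean_column_names(names(clean_data))

# Select the relevant columns, then drop rows with any missing value.
clean_data <- clean_data[, c("year", "happy", "help_homeless")]
clean_data <- clean_data[complete.cases(clean_data), ]

print(clean_data)
```

A call such as write.csv(clean_data, "some_file.csv", row.names = FALSE) would then save the result; the file name here is only a placeholder.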

Analyze the data

The core data analysis of this project occurs in the outputs/paper/paper.qmd file, a Quarto document. Once you render paper.qmd, Quarto will generate a paper.pdf file in the same folder. The raw references used in paper.qmd are available in the same folder in the references.bib file.
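
If you prefer to render from the R console rather than from your IDE's render button, one possible approach is to call the Quarto command-line tool from R. This is a sketch under assumptions: render_paper is a hypothetical helper, and it assumes the quarto executable is on your PATH and that your working directory is the project root.

```r
# Hypothetical helper: render a Quarto document via the Quarto CLI.
render_paper <- function(path = "outputs/paper/paper.qmd") {
  if (!file.exists(path)) {
    stop("Cannot find ", path, "; run this from the project root.")
  }
  if (!nzchar(Sys.which("quarto"))) {
    stop("The quarto command was not found on the PATH.")
  }
  # Equivalent to running `quarto render <path>` in a terminal.
  status <- system2("quarto", c("render", shQuote(path)))
  invisible(status)
}
```

Calling render_paper() with no arguments targets the default path shown above and returns the exit status of the command invisibly.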

Debugging

Test the data

If you're experiencing problems with the data, I've compiled a document that tests the data against several parameters, like data types, number ranges, and data ranges. This testing document is available in the scripts/03-data_testing.r file. The file contains a number of tests to run on the six .csv files.

Before running these tests, you must first download and clean the data following the steps outlined above. All of these tests should return TRUE. If they do not, feel free to create an issue.
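
To give a sense of what type and range tests of this kind look like, here is a minimal sketch on a small, made-up data frame. The column names, the year bounds, and the 1 to 3 coding of the happy column are assumptions for illustration; the real tests live in scripts/03-data_testing.r.

```r
# Made-up stand-in for one of the cleaned files; columns are hypothetical.
happy_data <- data.frame(
  year = c(2002L, 2004L, 2012L),
  happy = c(1L, 2L, 3L)
)

# Each check evaluates to a single TRUE or FALSE.
checks <- c(
  year_is_integer = is.integer(happy_data$year),
  happy_is_integer = is.integer(happy_data$happy),
  # Assumed bounds: survey years between 1972 and 2023.
  year_in_range = all(happy_data$year >= 1972 & happy_data$year <= 2023),
  # Assumed coding: happiness responses take the values 1, 2, or 3.
  happy_in_range = all(happy_data$happy %in% 1:3),
  no_missing = !any(is.na(happy_data))
)

print(checks)

# TRUE only if every individual check passed.
all(checks)
```

Naming each check makes it easy to see which one failed when the final result is FALSE.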

Simulate the data

If you'd like to debug the problem yourself, or if you'd like to use a service like Stack Overflow for help, it's important to have some simulated data to reproduce the problem. The scripts/04-data_simulation.r file generates random, fake data based on the information initially downloaded from GSS.
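
For a sense of how simulated data can be produced, here is a minimal base R sketch. The column names, survey years, and value ranges are assumptions chosen for illustration and are not taken from scripts/04-data_simulation.r.

```r
# Fix the random seed so the simulated data is reproducible.
set.seed(2023)

n <- 100  # number of simulated respondents

# Hypothetical columns: a survey year, a happiness response coded 1 to 3,
# and an altruism response coded 1 to 6.
simulated_data <- data.frame(
  year = sample(c(2002L, 2004L, 2012L, 2014L), n, replace = TRUE),
  happy = sample(1:3, n, replace = TRUE),
  help_homeless = sample(1:6, n, replace = TRUE)
)

# Preview the first few rows.
print(head(simulated_data))
```

Because the seed is fixed, anyone who runs this sketch gets the same rows, which helps when sharing a reproducible example.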

Acknowledgments

Created by Sebastian Rodriguez, Laura Lee-Chu, and Iz Leitch © 2023, licensed under the BSD 3-Clause License. Contains information from General Social Survey (GSS), a project of the independent research organization NORC at the University of Chicago, with principal funding from the National Science Foundation. Created using R, an open-source statistical programming language.

This project uses a number of R packages, including: dplyr, ggplot2, here, janitor, kableExtra, knitr, lubridate, opendatatoronto, readr, RColorBrewer, scales, and tidyverse.

Much of this project's development was informed by Rohan Alexander's book Telling Stories with Data.