soxoj/bellingcat-hackathon-watchcats


Adana: Analytical DAshboard (for NArratives)

📊 1-click analytical dashboard for OSINT researchers

The idea

An analytical tool that extracts insights from social media posts and shows them on a simple dashboard: narratives, sentiments, initiators, influencers, and clusters of accounts. It is intended for studying disinformation campaigns, analysing public opinion, and assessing risks related to specific topics.

The project was created by team Watch Cats at Bellingcat's First In-person Hackathon.

Inspired by 4CAT and Twitter Explorer. The development process is documented in this Google document.

MVP

Available at: https://bellingcat-hackathon-watchcats-uearyc7iggn84xznppgq5k.streamlit.app/

Team members

@soxoj, @dizvyagintsev

Datasets

Twitter posts on various topics (1-20K), including datasets enriched with topics and sentiments.

Instructions:

How can I get topics and sentiments for my dataset? Because this is a resource- and time-consuming operation, we implemented it as a Jupyter Notebook available on our GitHub. Tweets are vectorized with the hkunlp/instructor-large model, clustered with MiniBatchKMeans, labelled with topics via the GPT-4-Turbo API, and scored for sentiment with the cardiffnlp/twitter-roberta-base-sentiment-latest model. All steps are reproducible.
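The clustering step can be illustrated with a minimal sketch. It is not the notebook's code: the embeddings here are synthetic stand-ins for the vectors the embedding model would produce, and the helper names `cluster_embeddings` and `sample_per_cluster` are hypothetical. The second helper only shows one plausible way to pick a few example tweets per cluster before asking an LLM to name the topic.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans


def cluster_embeddings(embeddings, n_clusters, seed=0):
    """Assign each embedding row to one of n_clusters groups (hypothetical helper)."""
    model = MiniBatchKMeans(
        n_clusters=n_clusters,
        random_state=seed,  # fixed seed so repeated runs give the same clusters
        n_init=3,
        batch_size=256,
    )
    return model.fit_predict(embeddings)


def sample_per_cluster(texts, labels, k=5):
    """Collect up to k example texts per cluster, e.g. as input for topic naming."""
    samples = {}
    for text, label in zip(texts, labels):
        bucket = samples.setdefault(int(label), [])
        if len(bucket) < k:
            bucket.append(text)
    return samples


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for tweet embeddings: two well-separated groups.
    embeddings = np.vstack([
        rng.normal(0.0, 0.1, (50, 8)),
        rng.normal(5.0, 0.1, (50, 8)),
    ])
    texts = [f"tweet {i}" for i in range(len(embeddings))]
    labels = cluster_embeddings(embeddings, n_clusters=2)
    for cluster_id, examples in sample_per_cluster(texts, labels, k=3).items():
        print(cluster_id, examples)
```

With a real dataset, `embeddings` would be the matrix of tweet vectors and `n_clusters` a value chosen for the dataset at hand; both are assumptions of this sketch rather than settings taken from the notebook.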

Installation

For local installation you need Python and pip installed.

pip install -r requirements.txt
streamlit run dashboard.py

For private cloud installation, you need:

  1. Log in to (or register on) GitHub
  2. Fork this repository
  3. Log in to (or register on) Streamlit with your GitHub account
  4. Create a new project in Streamlit from the forked repository
  5. Deploy (no payment method required!)
  6. Voila!

Utils

The utils folder contains:

  • CSV tweet datasets formatter (to Twitwi)
  • cluster_n_sentiments.ipynb: ML stuff (enrichment of datasets with sentiments and topics)
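As a rough illustration of what a CSV formatter of this kind does, here is a minimal sketch using only the standard library. It is not the script from the utils folder. The input column names (`tweet_id`, `username`, `content`, `created_at`) are hypothetical, and the output names are assumed to be a small subset of Twitwi's tweet fields, so check both against your data and the Twitwi documentation.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical mapping: input CSV column -> assumed Twitwi-style column.
COLUMN_MAP = {
    "tweet_id": "id",
    "username": "user_screen_name",
    "content": "text",
}


def to_twitwi_rows(reader, date_column="created_at"):
    """Yield rows with renamed columns plus a UTC Unix timestamp."""
    for row in reader:
        out = {new: row[old] for old, new in COLUMN_MAP.items()}
        created = datetime.fromisoformat(row[date_column])
        if created.tzinfo is None:
            # Assumption: dates without a timezone are already in UTC.
            created = created.replace(tzinfo=timezone.utc)
        out["timestamp_utc"] = int(created.timestamp())
        yield out


def convert(src, dst):
    """Read a CSV from src and write the reformatted CSV to dst (file-like objects)."""
    fields = list(COLUMN_MAP.values()) + ["timestamp_utc"]
    writer = csv.DictWriter(dst, fieldnames=fields)
    writer.writeheader()
    writer.writerows(to_twitwi_rows(csv.DictReader(src)))


if __name__ == "__main__":
    sample = io.StringIO(
        "tweet_id,username,content,created_at\n"
        "1,alice,hello world,2023-11-04T12:00:00\n"
    )
    result = io.StringIO()
    convert(sample, result)
    print(result.getvalue())
```

Working on file-like objects keeps the sketch easy to try in memory; for real files, open them with `newline=""` as the `csv` module recommends.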

SOWEL classification

This tool uses the following OSINT techniques:

Some other results

An example of a hashtag network built with Twitter Explorer using one of the datasets

[Figure: HashtagNetwork]