Skip to content

Project developed for distributed systems classes. The main idea is to collect tweets, insert into a queue, classify into topics and create a queue for each topic, then an client (subscriber) can watch any topic he wants.

Notifications You must be signed in to change notification settings

alescrocaro/tweets-message-service

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

16 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project developed for distributed systems classes. The main idea is to collect tweets, insert into a queue, classify into topics and create a queue for each topic, then an client (subscriber) can watch any topic he wants.

Authors:
Alexandre Aparecido Scrocaro Junior
Pedro Klayn

How to run

RabbitMQ

First you will need to download and install it.
Here is an example:

sudo apt-get update
sudo apt-get upgrade
sudo apt-get install erlang # dependency package
sudo apt-get install rabbitmq-server
sudo systemctl enable rabbitmq-server
sudo systemctl start rabbitmq-server
sudo systemctl status rabbitmq-server
pip install -r requirements.txt

Then you can unzip the used database

unzip tweets.csv.zip

To list all existing queues:

sudo rabbitmqctl list_queues

To clear all existing queues (BE CAREFUL!):

sudo rabbitmqctl list_queues | awk '{print $1}' | xargs -I % sudo rabbitmqctl delete_queue %

To consume all tweets from database:

python collector.py # this should get all tweets and insert into a single queue

To split tweets into topics:

python classifier.py

To start a client:

python subscriber.py barcelona ajax bayern
python subscriber.py realmadrid

This is the common flux, but you should be able to run it in any order.

Libraries

  • RabbitMQ

    • Provides advanced Message Queuing Protocol (AMQP), here we use it to build message queues, and a publish-subscriber application.
  • Pika

    • Python client for RabbitMQ. We will use it to create a connection with RabbitMQ and publish tweets to queues.
  • csv

    • Used for reading the csv file to collect tweets
  • json

    • Made operations to convert dict into string and vice versa
  • time

    • We used the sleep function to simulate a timeline of twitter
  • sys

    • Get argv in subscriber to subscribe in queues
  • random

    • Select a random queue from the subscribed queues to show in timeline

About

Project developed for distributed systems classes. The main idea is to collect tweets, insert into a queue, classify into topics and create a queue for each topic, then an client (subscriber) can watch any topic he wants.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages