Scientific testing has found no evidence to support the premises or purported effects outlined in astrological traditions. Continued belief in astrology despite its lack of credibility is often seen as a sign of low scientific literacy, although some scientifically literate people believe in it too. Let's make fun of it!
Astrology-Bot/
│
├── data/
│ ├── horoscope.csv
│ ├── tarot.csv
│ ├── horoscope_webscraping.ipynb
│ └── tarot_webscraping.ipynb
│
├── interface/
│ ├── get_response.py
│ ├── inference.py
│ └── UI.py
│
├── model/
│ ├── embedding_model.py
│ └── inference_model.py
│
└── RAG/
  ├── chunk_data.py
  ├── index_data.py
  ├── main.py
  └── utils.py
Scrape the plain-text horoscopes from Astrology.com on a daily basis.
For each zodiac sign (aries, taurus, gemini, cancer, leo, virgo, libra, scorpio, sagittarius, capricorn, aquarius, pisces), I scraped the love, daily, and work horoscopes.
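The daily scrape covers every combination of sign and horoscope type. The snippet below is a minimal sketch of how those jobs could be enumerated; the names `SIGNS`, `CATEGORIES`, and `scrape_jobs` are hypothetical, and the actual fetching and page URLs are left out because the source does not specify them.

```python
from itertools import product

# The twelve signs and three horoscope types listed above.
SIGNS = [
    "aries", "taurus", "gemini", "cancer", "leo", "virgo",
    "libra", "scorpio", "sagittarius", "capricorn", "aquarius", "pisces",
]
CATEGORIES = ["love", "daily", "work"]


def scrape_jobs():
    """Return every (sign, category) pair to fetch in one daily run."""
    return list(product(SIGNS, CATEGORIES))


jobs = scrape_jobs()
print(len(jobs))  # 12 signs x 3 categories = 36
```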
Scrape the plain-text meaning of each tarot card in its different positions from biddytarot.com.
- Chunking: The cleaned text is split into chunks with a sliding window, using a window size of 200 words and a step size of 50.
- Embed Text: The text is embedded with the BGE-Large model, which was selected based on the MTEB leaderboard.
- Index Embedding: The embeddings are indexed in Pinecone. The retriever uses cosine similarity to retrieve relevant embeddings from the database.
- Prompting: The query is embedded with the same encoder, and the retrieved text is then added to the prompt.
- Inference: The LLaMA-2-7B model is used to generate results. Because generation is autoregressive, the generated text is post-processed and only the first answer is extracted as the final response.
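The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in `RAG/` or `interface/`: a bag-of-words counter stands in for the BGE-Large encoder, an in-memory list stands in for the Pinecone index, the step size is assumed to be measured in words, and the function names, the prompt template, and the `Question:` stop marker are all hypothetical.

```python
import math
from collections import Counter


def chunk_words(text, window=200, step=50):
    """Split text into overlapping chunks of `window` words, moving `step` words at a time."""
    words = text.split()
    if len(words) <= window:
        return [" ".join(words)] if words else []
    # The last start offset is the first one whose window reaches the end of the text.
    return [
        " ".join(words[start:start + window])
        for start in range(0, len(words) - window + step, step)
    ]


def embed(text):
    """Toy stand-in for a real encoder: lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Embed the query with the same encoder and return the top_k most similar chunks."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, retrieved):
    """Add the retrieved text to the prompt (hypothetical template)."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


def first_answer(generated, prompt, stop="Question:"):
    """Drop the echoed prompt and keep only the text before the model starts a new question."""
    completion = generated[len(prompt):] if generated.startswith(prompt) else generated
    return completion.split(stop, 1)[0].strip()


docs = [
    "venus brings love to leo today",
    "the tower card means sudden change",
    "work is busy for virgo",
]
query = "what does the tower card mean"
prompt = build_prompt(query, retrieve(query, docs, top_k=1))
# Simulated model output: the prompt is echoed, then the model keeps generating.
raw = prompt + " Sudden change.\nQuestion: and the sun card?\nAnswer: Joy."
print(first_answer(raw, prompt))
```

In a real deployment the `embed` and `retrieve` stand-ins would be replaced by calls to the encoder and the vector database, while the chunking and post-processing logic could stay largely the same.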
Try it yourself! The app is deployed on Streamlit Community Cloud.
As shown in the results, the generated context information is more readable and more coherent after fine-tuning.