Swahili-tweet-sentiment

By: Kelvin Shirima | Last updated: March 10, 2020

Context

A new Swahili tweet dataset for sentiment analysis.

Content

The dataset consists of one CSV file with two columns:

  • The tweet text
  • The sentiment of the tweet
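As a minimal sketch of how the two columns might be read, the snippet below parses a small in-memory CSV with Python's standard csv module and tallies the labels. The header names Tweets and Labels are taken from the dataset preview and are an assumption about the actual file; the two sample rows are copied from that preview. The helper name read_rows is hypothetical.

```python
import csv
import io
from collections import Counter

# Assumption: the CSV header uses the column names shown in the dataset
# preview ("Tweets" and "Labels"); adjust them if the real file differs.
SAMPLE_CSV = """Tweets,Labels
Leo nimepata kitambulisho changu cha taifa Asante sana,0
Mgema akisifiwa tembo hulitia maji,1
"""


def read_rows(handle):
    """Return a list of (tweet_text, label) pairs from an open CSV file."""
    reader = csv.DictReader(handle)
    return [(row["Tweets"], int(row["Labels"])) for row in reader]


rows = read_rows(io.StringIO(SAMPLE_CSV))

# Count how many tweets carry each label value.
label_counts = Counter(label for _, label in rows)

for text, label in rows:
    print(label, text)
print(dict(label_counts))
```

To read a file on disk instead, the same function could be called with open(path, newline="", encoding="utf-8"), where path points at the dataset's CSV file.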

Inspiration

  • Can we predict the sentiment of tweets based on this data?
  • Can we perform language modeling for Swahili?
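One possible starting point for the first question is a bag-of-words Naive Bayes baseline. The sketch below is written in plain Python and trained only on the four rows from the dataset preview, so it illustrates the mechanics rather than giving a meaningful accuracy figure. The function names, the whitespace tokenizer, and the add-one smoothing are illustrative assumptions, not a method described by the dataset authors, and the source does not state what the labels 0 and 1 stand for.

```python
import math
from collections import Counter, defaultdict

# Toy training data: the four rows shown in the dataset preview.
# The source does not say what the labels 0 and 1 stand for.
TRAIN = [
    ("So chuga si tunakutana kesho kwenye Nyamachoma festival nanenane "
     "mnajif anyaga mnakula mtungi mje kesho niwakalishe", 0),
    ("Asante sana watu wa Sirari jimbo la Tarime vijijini Huu ni Upendo "
     "usio na Mashaka kwa wote wanaojaribu kufanya kazi kwa bidii", 1),
    ("Leo nimepata kitambulisho changu cha taifa Asante sana", 0),
    ("Mgema akisifiwa tembo hulitia maji", 1),
]


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Collect per-label word counts, per-label document counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def predict(text, model):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        # Log prior from the share of training tweets with this label.
        score = math.log(doc_counts[label] / total_docs)
        label_total = sum(word_counts[label].values())
        for token in tokenize(text):
            count = word_counts[label][token]
            score += math.log((count + 1) / (label_total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
print(predict("Mgema akisifiwa tembo hulitia maji", model))
print(predict("kesho kwenye festival", model))
```

With the full dataset, the same functions could be trained on one portion of the rows and evaluated on a held-out portion; a library classifier would likely be a better choice beyond this illustration.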

Download

Using the datasets library

from datasets import load_dataset
dataset = load_dataset("Davis/Swahili-tweet-sentiment")

Or just clone the repo

git lfs install
git clone https://huggingface.co/datasets/Davis/Swahili-tweet-sentiment
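The name of the CSV file inside the repository is not stated here, so after cloning it may help to search for it rather than guess a path. The sketch below does that with the standard pathlib module; the helper name find_csv_files is hypothetical, and the directory name assumes the clone was made in the current working directory.

```python
from pathlib import Path


def find_csv_files(repo_dir):
    """Return all CSV files under repo_dir, sorted by path (empty if the directory is missing)."""
    root = Path(repo_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("*.csv"))


# Assumption: the clone command above created this directory in the
# current working directory.
for csv_path in find_csv_files("Swahili-tweet-sentiment"):
    print(csv_path)
```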

Dataset Preview

Tweets (string) | Labels (int)
So chuga si tunakutana kesho kwenye Nyamachoma festival nanenane mnajif anyaga mnakula mtungi mje kesho niwakalishe | 0
Asante sana watu wa Sirari jimbo la Tarime vijijini Huu ni Upendo usio na Mashaka kwa wote wanaojaribu kufanya kazi kwa bidii | 1
Leo nimepata kitambulisho changu cha taifa Asante sana | 0
Mgema akisifiwa tembo hulitia maji | 1

Licensing Information

More Information Needed

Contributions

Davis David, Zephania Reuben & Eliya Masesa