tweeta

tweeta.text

This module contains utilities related to processing tweets

tweeta.text.extract_hashtags(text)

Extract hashtags from the text

tweeta.text.extract_mentions(text)

Extract mentions from the text
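As a rough illustration of what hashtag and mention extraction can look like, here is a minimal regex-based sketch. The patterns and the `*_sketch` function names are assumptions made for this example; the library's actual expressions and return types are not documented here and may differ.

```python
import re

# Assumed patterns: '#' or '@' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")
MENTION_RE = re.compile(r"@(\w+)")


def extract_hashtags_sketch(text):
    """Return hashtag names (without '#') in order of appearance."""
    return HASHTAG_RE.findall(text)


def extract_mentions_sketch(text):
    """Return mentioned screen names (without '@') in order of appearance."""
    return MENTION_RE.findall(text)
```

A pattern this simple will also match the part after '@' in an email address, so a production version would likely need stricter boundaries.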

tweeta.text.fix_text(text)

Apply ftfy.fix_text to the text and remove line breaks

tweeta.text.has_url(text)
tweeta.text.lang(text)

Detect the language of the text

tweeta.text.remove_lb(text)

Remove linebreaks

tweeta.text.remove_mentions(text)

Remove mentions from the text

tweeta.text.remove_url(text)

Remove URLs from the text

tweeta.text.replace_slash(text, sub=' ')
tweeta.text.sanitize_nofunccall(text)

This is equivalent to replace_slash(remove_url(fix_text(text)))
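The composition order can be sketched with standard-library stand-ins. Everything below is an assumption for illustration: the `*_sketch` helpers are hypothetical, the stand-in for fix_text only collapses line breaks (it does not call ftfy), the URL pattern is a guess, and "slash" is taken to mean `/`.

```python
import re

URL_RE = re.compile(r"https?://\S+")  # assumed URL pattern


def fix_text_sketch(text):
    # Stand-in: joins lines with a space; the documented fix_text also runs ftfy.fix_text.
    return " ".join(text.splitlines())


def remove_url_sketch(text):
    # Delete every substring that matches the assumed URL pattern.
    return URL_RE.sub("", text)


def replace_slash_sketch(text, sub=" "):
    # Assumes "slash" means the forward slash character.
    return text.replace("/", sub)


def sanitize_sketch(text):
    # Innermost call runs first: fix text, then drop URLs, then replace slashes.
    return replace_slash_sketch(remove_url_sketch(fix_text_sketch(text)))
```

In this sketch the order matters: URLs are removed before slashes are replaced, because replacing slashes first would break each URL into fragments that the URL pattern no longer matches.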


tweeta.tweet

This module contains functionality for extracting various data elements from a Tweet object

class tweeta.tweet.TweetaTweet(in_data)

Bases: object

created_at(output_time_format='YMD')

The raw created_at value is stored in the format given by constants.PARSE_TIME_FORMAT. This method converts it to another format, either a predefined one (e.g., YMD) or a user-defined one (e.g., %Y-%m-%d)
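A minimal sketch of this conversion using only the standard library follows. The value of `PARSE_TIME_FORMAT` shown here is an assumption based on Twitter's classic created_at layout, and the `PREDEFINED_FORMATS` table (including the mapping of YMD to `%Y-%m-%d`) is hypothetical; the real constants live in tweeta.constants.

```python
from datetime import datetime

# Assumed value of constants.PARSE_TIME_FORMAT,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
PARSE_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Hypothetical table of predefined names; the mapping for "YMD" is a guess.
PREDEFINED_FORMATS = {"YMD": "%Y-%m-%d"}


def format_created_at(raw, output_time_format="YMD"):
    """Parse a raw created_at string and re-render it in the requested format."""
    parsed = datetime.strptime(raw, PARSE_TIME_FORMAT)
    # Fall back to treating the argument as a user-defined strftime pattern.
    fmt = PREDEFINED_FORMATS.get(output_time_format, output_time_format)
    return parsed.strftime(fmt)
```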

fixed_text()

Fix some Unicode issues and remove line breaks, i.e., ftfy.fix_text(remove_lb(text))

get(field_name)

Get the value of an arbitrary field

has_quoted_status()

Whether the quoted_status is not null. Note that a quote is different from a retweet; is_quote == has_quoted_status

has_retweeted_status()

Whether the retweeted_status is not null

has_url()

Whether the tweet contains urls (check entities first)

hashtags()

Use hashtags from entities first; otherwise extract them from the text

is_deleted()

Whether the tweet has been deleted. If the tweet is deleted, none of the other attributes will be populated

is_en()

Whether the tweet is written in English. Use the lang attribute first if it exists; otherwise use langid

is_geotagged()

Whether the tweet has been geo-tagged (either has geo or coordinates or place)
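The geo-tag check described above can be sketched as follows, assuming the tweet is a plain dict parsed from JSON and that "has" means the field is present and not null. The function name is hypothetical.

```python
def is_geotagged_sketch(tweet):
    """True if any of 'geo', 'coordinates', or 'place' is present and not None."""
    return any(tweet.get(key) is not None for key in ("geo", "coordinates", "place"))
```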

is_quote()

Whether the tweet is a quote of another tweet (quoted_status is not null)

is_retweet()

Whether the tweet is a retweet: either the text starts with ‘RT @’ in any letter case (RT, Rt, rT, or rt) or retweeted_status is not null. A tweet can be a retweet (its text starts with ‘RT @’) even when retweeted_status is not filled. Only the ‘RT @’ prefix marks a true retweet; some tweets start with ‘RT’ but are not retweets, e.g., “RT IF U NOT FRIENDLY..”
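The retweet rule can be sketched like this, assuming the tweet is a plain dict with a 'text' key. The function name and the exact regular expression are assumptions for illustration, not the library's code.

```python
import re

# Case-insensitive 'RT @' prefix; a bare 'RT' without '@' does not match.
RT_PREFIX = re.compile(r"^RT @", re.IGNORECASE)


def is_retweet_sketch(tweet):
    """True if retweeted_status is set or the text starts with 'RT @'."""
    if tweet.get("retweeted_status") is not None:
        return True
    return bool(RT_PREFIX.match(tweet.get("text", "")))
```

Requiring the '@' after 'RT' is what excludes texts such as “RT IF U NOT FRIENDLY..”.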

is_user_en()

Whether the user is English speaking

is_valid()

Whether the tweet contains all the root elements: ‘text’, ‘id’, ‘created_at’, and ‘user’
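As a sketch, the validity check amounts to a key-membership test on the parsed tweet. The names `REQUIRED_KEYS` and `is_valid_sketch` are hypothetical.

```python
REQUIRED_KEYS = ("text", "id", "created_at", "user")


def is_valid_sketch(tweet):
    """True only if every required root key is present in the tweet dict."""
    return all(key in tweet for key in REQUIRED_KEYS)
```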

json()

Get the raw JSON

mentions()

Use mentions from entities first; otherwise extract them from the text

text()

Take full_text from the extended tweet: either the extended tweet object delivered in the default compatibility mode of the streaming API, or ‘full_text’ in the tweet itself, which replaces ‘text’ in extended mode. See https://developer.twitter.com/en/docs/tweets/tweet-updates
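The lookup order can be sketched as below, assuming a plain dict in which compatibility mode nests the full text under an 'extended_tweet' object and extended mode places 'full_text' at the root. The final fallback to 'text' and the function name are assumptions for this example.

```python
def tweet_text_sketch(tweet):
    """Return the longest available form of the tweet text."""
    # Compatibility mode: full text is nested under 'extended_tweet'.
    extended = tweet.get("extended_tweet") or {}
    if "full_text" in extended:
        return extended["full_text"]
    # Extended mode: 'full_text' replaces 'text' at the root.
    if "full_text" in tweet:
        return tweet["full_text"]
    # Assumed fallback: the classic 'text' field, or an empty string.
    return tweet.get("text", "")
```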

tweet()

Return the raw tweet object

tweet_id()

Get the tweet id (from ‘id_str’ first if available; otherwise use ‘id’)
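A minimal sketch of the id lookup, assuming a plain dict. The function name is hypothetical, and returning None when neither key exists is an assumption.

```python
def tweet_id_sketch(tweet):
    """Prefer the string id; fall back to the numeric id."""
    if "id_str" in tweet:
        return tweet["id_str"]
    return tweet.get("id")
```

Note that in this sketch the return type depends on which key is found: a string for 'id_str' and whatever type 'id' holds otherwise.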

user_description()

Get user description, return empty string if it doesn’t exist

user_id()

Get user id (from tweet[‘user’][‘id_str’] first)

user_location()

Get user location, return empty string if it doesn’t exist

user_name()

Get user name, return empty string if it doesn’t exist

user_screen_name()

Get user screen name, return empty string if it doesn’t exist
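The user accessors above share one pattern: read a field from the nested 'user' object and return an empty string when it is missing. A hypothetical helper illustrating that pattern follows; treating a null value the same as a missing one is an assumption.

```python
def user_field_sketch(tweet, field):
    """Return tweet['user'][field], or '' if the user or the field is absent."""
    user = tweet.get("user") or {}
    value = user.get(field)
    return value if value is not None else ""
```

For example, `user_field_sketch(tweet, "screen_name")` mirrors the documented behavior of user_screen_name() under these assumptions.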


tweeta.constants

Various constants related to Tweet


tweeta.exceptions

This module contains tweeta specific Exception classes.

exception tweeta.exceptions.TweetaError

Bases: Exception

Generic error class, catch-all for most tweeta issues.