DataPlug module
- class DataPlug.DataPlug(name)
Bases:
object
DataPlug class. Used for data acquisition from reddit.
- Attributes:
- name:
name of the user agent
- reddit_raw_data:
raw data from reddit
- df:
pandas dataframe
- user_agent:
user agent for reddit
- client_id:
client id for reddit
- client_secret:
client secret for reddit
- priceDF:
price dataframe of GME stock price
- mergedDF:
merged dataframe of reddit and price data
- aggregate_reddit_posts_daily()
Aggregate reddit posts by date. Collapses the per-post dataframe into a single dataframe that holds, for each day, the average results across all posts from that day.
- Returns:
pandas dataframe of aggregated reddit posts
- Return type:
pandas dataframe
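The daily averaging described for aggregate_reddit_posts_daily() can be illustrated with a small pandas sketch. This is not the DataPlug implementation; the function name aggregate_posts_daily and the column names timestamp, score and comms_num are assumptions made for the example.

```python
import pandas as pd


def aggregate_posts_daily(posts: pd.DataFrame) -> pd.DataFrame:
    """Average the numeric columns over all posts that share a calendar day."""
    out = posts.copy()
    # Reduce each (assumed) timestamp to its calendar date so posts group by day.
    out["date"] = pd.to_datetime(out["timestamp"]).dt.date
    # numeric_only=True drops non-numeric columns such as the raw timestamp text.
    return out.groupby("date", as_index=False).mean(numeric_only=True)


# Made-up sample rows: two posts on one day, one post on the next.
posts = pd.DataFrame(
    {
        "timestamp": [
            "2021-01-28 09:00:00",
            "2021-01-28 17:30:00",
            "2021-01-29 08:15:00",
        ],
        "score": [10, 30, 50],
        "comms_num": [4, 6, 8],
    }
)
daily = aggregate_posts_daily(posts)
print(daily)
```

The result has one row per day, with each numeric column replaced by its mean over that day's posts.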
- data_to_pandas(data)
Convert reddit data to pandas dataframe.
- Parameters:
data – data to convert
- Returns:
pandas dataframe
- Return type:
pandas dataframe
- get_data(subreddit, limit)
Get data from reddit.
- Parameters:
subreddit – subreddit to get data from
limit – number of posts to get
- Returns:
data: data from reddit
- Return type:
dict
- get_data_pushshift(subreddit, limit, before, after)
Get data from reddit via Pushshift, restricted to a time window.
- Parameters:
subreddit – subreddit to get data from
limit – number of posts to get
before – time to get posts before
after – time to get posts after
- Returns:
data from reddit
- Return type:
dict
- get_date(epoch_time)
Convert epoch time to a readable date.
- Parameters:
epoch_time – time to convert
- Returns:
date: readable date
- Return type:
str
- get_epoch(date)
Convert readable date to epoch time.
- Parameters:
date – date string to convert
- Returns:
epoch_time: epoch time
- Return type:
float
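The two conversions documented for get_date() and get_epoch() are inverses of each other. A minimal sketch with the standard library follows. The date format string and the use of UTC are assumptions, since the documentation does not state which format or time zone DataPlug uses, and the helper names to_date and to_epoch are hypothetical.

```python
from datetime import datetime, timezone

# Assumed date format; the real DataPlug format is not documented here.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_date(epoch_time: float) -> str:
    """Convert seconds since the Unix epoch to a readable UTC date string."""
    return datetime.fromtimestamp(epoch_time, tz=timezone.utc).strftime(DATE_FORMAT)


def to_epoch(date: str) -> float:
    """Convert a date string (interpreted as UTC) to seconds since the Unix epoch."""
    parsed = datetime.strptime(date, DATE_FORMAT).replace(tzinfo=timezone.utc)
    return parsed.timestamp()


epoch = to_epoch("2021-01-28 00:00:00")
print(epoch)
print(to_date(epoch))
```

Pinning both directions to UTC keeps the round trip independent of the machine's local time zone.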
- get_price_dataframe(filename='./../data/GME.csv')
Load price data from a CSV file into a dataframe.
- Parameters:
filename – name of CSV file; defaults to ‘./../data/GME.csv’
- Returns:
pandas dataframe of price data. Saved to self.priceDF attribute.
- Return type:
pandas dataframe
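Loading a price CSV into a dataframe might look like the sketch below. The column names Date, Close and Volume and the sample values are made up for illustration; the actual layout of GME.csv is not described in this reference. The function load_price_dataframe is hypothetical and reads from an in-memory buffer so the example runs without the data file.

```python
import io

import pandas as pd

# Made-up CSV content standing in for the real price file.
CSV_TEXT = "Date,Close,Volume\n2021-01-28,100.0,1000\n2021-01-29,110.0,2000\n"


def load_price_dataframe(source) -> pd.DataFrame:
    """Read price data from a path or file-like object, parsing the Date column."""
    return pd.read_csv(source, parse_dates=["Date"])


price_df = load_price_dataframe(io.StringIO(CSV_TEXT))
print(price_df.dtypes)
```

Parsing the date column on load means later steps can compare or merge on dates without converting strings again.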
- get_reddit_dataframe(filename='./../data/reddit_wsb.csv')
Load reddit data from a CSV file into a dataframe.
- Parameters:
filename – name of CSV file; defaults to ‘./../data/reddit_wsb.csv’
- Returns:
pandas dataframe of reddit data. Saved to self.df attribute.
- Return type:
pandas dataframe
- merge_dataframes()
Merge the price and reddit dataframes.
- Returns:
pandas dataframe
- Return type:
pandas dataframe
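Merging daily reddit aggregates with price data can be sketched as a join on a shared date column. The column names, the inner join, and all values below are assumptions for illustration; the reference does not say which key or join type merge_dataframes() uses.

```python
import pandas as pd

# Made-up daily reddit aggregates, including one day with no trading.
reddit_daily = pd.DataFrame(
    {
        "date": ["2021-01-28", "2021-01-29", "2021-01-30"],
        "score": [20.0, 50.0, 5.0],
    }
)

# Made-up price rows; there is no row for 2021-01-30 (a Saturday).
price = pd.DataFrame(
    {
        "date": ["2021-01-28", "2021-01-29"],
        "close": [100.0, 110.0],
    }
)

# An inner join keeps only the days present in both dataframes.
merged = pd.merge(reddit_daily, price, on="date", how="inner")
print(merged)
```

An inner join drops days without price data, such as weekends. Passing how="left" instead would keep them with missing price values.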
- new_aggregate_reddit_posts_daily()
Aggregate reddit posts by date. Collapses the per-post dataframe into a single dataframe that holds, for each day, the average results across all posts from that day.
- Returns:
Aggregated reddit posts daily
- Return type:
pandas dataframe