DataPlug module

class DataPlug.DataPlug(name)

Bases: object

DataPlug class. Used for data acquisition from reddit.

Attributes:

  • name:

    name of the user agent

  • reddit_raw_data:

    raw data from reddit

  • df:

    pandas dataframe

  • user_agent:

    user agent for reddit

  • client_id:

    client id for reddit

  • client_secret:

    client secret for reddit

  • priceDF:

    price dataframe of GME stock price

  • mergedDF:

    merged dataframe of reddit and price data

aggregate_reddit_posts_daily()

Aggregate reddit posts by date. Turns dataframe for each reddit post into one dataframe with average results across all posts for each day.

Returns:

pandas dataframe of aggregated reddit posts

Return type:

pandas dataframe
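
The daily aggregation described above can be sketched with a pandas group-by. This is a minimal illustration, not the class's actual implementation: the helper name `aggregate_daily` and the column names `date`, `score`, and `num_comments` are assumptions, and the values are made up.

```python
import pandas as pd

# Hypothetical per-post data; column names and values are assumptions.
posts = pd.DataFrame({
    "date": ["2021-01-28", "2021-01-28", "2021-01-29"],
    "score": [10, 30, 50],
    "num_comments": [4, 6, 8],
})


def aggregate_daily(df):
    """Average every numeric column across all posts sharing a date."""
    return df.groupby("date", as_index=False).mean(numeric_only=True)


daily = aggregate_daily(posts)
print(daily)
```

Each date ends up as a single row, so the two posts from 2021-01-28 collapse into one row holding their mean score and mean comment count.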

data_to_pandas(data)

Convert reddit data to pandas dataframe.

Parameters:

data – data to convert

Returns:

pandas dataframe

Return type:

pandas dataframe
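
As a rough sketch of such a conversion, assuming the raw data arrives as a list of per-post dictionaries (the documentation does not specify the actual structure, and the field names below are hypothetical):

```python
import pandas as pd

# Hypothetical raw posts; the real structure of the data is not documented.
raw_posts = [
    {"title": "post A", "score": 12, "created_utc": 1611792000},
    {"title": "post B", "score": 7, "created_utc": 1611878400},
]

# Each dictionary becomes one row; the keys become the column names.
posts_df = pd.DataFrame(raw_posts)
print(posts_df.columns.tolist())
```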

get_data(subreddit, limit)

Get data from reddit.

Parameters:
  • subreddit – subreddit to get data from

  • limit – number of posts to get

Returns:

data: data from reddit

Return type:

dict

get_data_pushshift(subreddit, limit, before, after)

Get data from reddit via Pushshift, restricted to a time window.

Parameters:
  • subreddit – subreddit to get data from

  • limit – number of posts to get

  • before – time to get posts before

  • after – time to get posts after

Returns:

data from reddit

Return type:

dict

get_date(epoch_time)

Convert epoch time to a readable date.

Parameters:

epoch_time – time to convert

Returns:

date: readable date

Return type:

str
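
One way such a conversion might look, using only the standard library. The helper name `epoch_to_date`, the UTC time zone, and the `YYYY-MM-DD` output format are assumptions; the method's actual format is not documented here.

```python
from datetime import datetime, timezone


def epoch_to_date(epoch_time):
    """Format seconds since the Unix epoch as a UTC date string."""
    moment = datetime.fromtimestamp(epoch_time, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d")


print(epoch_to_date(1611792000))  # 2021-01-28
```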

get_epoch(date)

Convert readable date to epoch time.

Parameters:

date – date string to convert

Returns:

epoch_time: epoch time

Return type:

float
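
A minimal sketch of the conversion to epoch time, assuming the date string is in `YYYY-MM-DD` form and is interpreted as midnight UTC (both are assumptions; the helper name `date_to_epoch` is hypothetical):

```python
from datetime import datetime, timezone


def date_to_epoch(date):
    """Parse a YYYY-MM-DD string as midnight UTC and return epoch seconds."""
    moment = datetime.strptime(date, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return moment.timestamp()


print(date_to_epoch("2021-01-28"))  # 1611792000.0
```

Attaching an explicit time zone keeps the result independent of the machine's local settings.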

get_price_dataframe(filename='./../data/GME.csv')

A function to get a dataframe of price data from a CSV file.

Parameters:

filename – name of CSV file; defaults to ‘./../data/GME.csv’

Returns:

pandas dataframe of price data. Saved to self.priceDF attribute.

Return type:

pandas dataframe
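
Loading a price CSV into a dataframe might look roughly like the sketch below. It reads from an in-memory string so that it runs without the data file; the `Date` and `Close` column names and the values are placeholders, not the contents of GME.csv.

```python
import io

import pandas as pd

# Placeholder CSV text standing in for the file on disk; values are made up.
csv_text = "Date,Close\n2021-01-28,100.0\n2021-01-29,110.0\n"

# parse_dates converts the Date column from strings to timestamps.
price_df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(price_df.dtypes)
```

With a real file, the path would be passed to `pd.read_csv` in place of the `StringIO` object.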

get_reddit_dataframe(filename='./../data/reddit_wsb.csv')

A function to get a dataframe of reddit data from a CSV file.

Parameters:

filename – name of CSV file; defaults to ‘./../data/reddit_wsb.csv’

Returns:

pandas dataframe of reddit data. Saved to self.df attribute.

Return type:

pandas dataframe

merge_dataframes()

Merge the price and reddit dataframes.

Returns:

pandas dataframe

Return type:

pandas dataframe
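
The merge could be sketched as a join on a shared date column. The join key `date`, the inner join, and all values below are assumptions for illustration; the method's actual key and join type are not documented here.

```python
import pandas as pd

# Hypothetical daily reddit aggregates and price rows; values are made up.
reddit_daily = pd.DataFrame({
    "date": ["2021-01-28", "2021-01-29", "2021-01-30"],
    "score": [20.0, 50.0, 5.0],
})
prices = pd.DataFrame({
    "date": ["2021-01-28", "2021-01-29"],
    "close": [100.0, 110.0],
})

# An inner join keeps only the dates present in both dataframes.
merged = reddit_daily.merge(prices, on="date", how="inner")
print(merged)
```

An inner join drops 2021-01-30 here because it has no price row, which is the usual situation for weekends and market holidays.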

new_aggregate_reddit_posts_daily()

Aggregate reddit posts by date. Turns the dataframe of individual reddit posts into one dataframe with average results across all posts for each day.

Returns:

Aggregated reddit posts daily

Return type:

pandas dataframe