Increase in Columns #146

Open
isaactyj opened this issue Sep 19, 2023 · 2 comments

Comments

@isaactyj

After running raw_to_Xy on multi-indexed data, how do I amend the code in the getting started Python notebook so that it takes the change in matrix size into account? Sorry, I am still trying to figure it out and am quite new to this.

@jankrepl
Owner

Hey @isaactyj

Do you think you could share a minimal reproducible example of your issue? Without leaking any private information, feel free to create a minimal dataset and upload it if necessary.

@isaactyj
Author

The data I use is just from Yahoo Finance, so I can just upload the code that builds the multi-index DataFrame:

import pandas as pd
import yfinance as yf
from deepdow.utils import raw_to_Xy

stocks = ['AAPL', 'GOOGL', 'MSFT', 'AMZN', 'TSLA']
start_date = '2013-01-01'
end_date = '2019-01-01'

# Download one ticker first, only to obtain the trading dates for the index
data = yf.download("AAPL", start=start_date, end=end_date)
dates = data.index

columns_multi_index = pd.MultiIndex.from_product([stocks, ['Close', 'Volume']], names=['Stock', 'Metric'])
all_data = pd.DataFrame(columns=columns_multi_index, index=dates)

# Fill in the close price and volume of every ticker
for i in stocks:
    print(i)
    data = yf.download(i, start=start_date, end=end_date)
    close = data['Close']
    volume = data['Volume']
    all_data[(i, 'Close')] = close.tolist()
    all_data[(i, 'Volume')] = volume.tolist()

n_timesteps = len(all_data)
n_assets = len(all_data.columns.levels[0])  # level 0 is 'Stock' (5 tickers)
n_channels = len(all_data.columns.levels[1])  # level 1 is 'Metric' (Close, Volume)
lookback, gap, horizon = 5, 2, 4
n_samples = n_timesteps - lookback - horizon - gap + 1
X, timestamps, y, asset_names, indicators = raw_to_Xy(all_data,
                                                      lookback=lookback,
                                                      gap=gap,
                                                      freq="B",
                                                      horizon=horizon,
                                                      use_log=True)
assert X.shape == (n_samples, n_channels, lookback, n_assets)
assert timestamps[0] == all_data.index[lookback]

I instantly get an assertion error for both assertions. I am actually using 18 stocks, but for simplicity I'll just change it to 5 stocks.
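
For anyone checking the expected shape by hand, here is a minimal sketch that uses only pandas and synthetic numbers (no Yahoo Finance download and no deepdow call). It shows which column level holds the tickers and which holds the metrics when the MultiIndex is built with from_product([stocks, metrics]), and how the row count changes if the dates are placed on a gap-free business-day grid. The sample-count formula is copied from the post. Whether raw_to_Xy aligns the data to such a grid when freq="B" is an assumption here and worth confirming in the deepdow source.

```python
import numpy as np
import pandas as pd

stocks = ["AAPL", "GOOGL", "MSFT", "AMZN", "TSLA"]
metrics = ["Close", "Volume"]

# Synthetic stand-in for the downloaded data: 30 business days with one day
# removed to mimic a market holiday (random values, not real prices).
full_index = pd.date_range("2013-01-02", periods=30, freq="B")
trading_days = full_index.delete(12)

columns = pd.MultiIndex.from_product([stocks, metrics], names=["Stock", "Metric"])
rng = np.random.default_rng(0)
all_data = pd.DataFrame(
    rng.uniform(1.0, 100.0, size=(len(trading_days), len(columns))),
    index=trading_days,
    columns=columns,
)

# Level 0 is 'Stock' (assets), level 1 is 'Metric' (channels)
n_assets = len(all_data.columns.levels[0])
n_channels = len(all_data.columns.levels[1])

# Rows on the raw index versus rows on a gap-free business-day grid
n_rows_raw = len(all_data)
n_rows_grid = len(pd.date_range(all_data.index[0], all_data.index[-1], freq="B"))

# Shape tuple in the layout used by the assertion in the post
lookback, gap, horizon = 5, 2, 4
n_samples = n_rows_raw - lookback - horizon - gap + 1
expected_shape = (n_samples, n_channels, lookback, n_assets)

print(n_assets, n_channels, n_rows_raw, n_rows_grid, expected_shape)
```

With real Yahoo Finance data the raw row count and the business-day grid count will generally differ, because exchange holidays are missing from the download. That makes it worth printing both next to X.shape when the assertion fails.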
