Matplotlib: Plot Data and then Time Series Predictions

bclayman

I'm using matplotlib to display a stock's price movements over time. I want to focus on the last 90 days and then predict the next 14 days. I have the last 90 days of data and my predictions, but I want to graph my predictions in a different color, so it's clear they're different.

How would I do this?

If I just add a second plot() call to my code, the predictions will start from the same point as my 90 days of data and be overlaid, which isn't what I want.

Right now I'm doing this:

df[-90:]["price"].plot()
plt.show()

Thanks!

python pandas matplotlib

Answers

answered 3 months ago yuji #1


import matplotlib.pyplot as plt
import numpy as np

last90days = np.random.rand(90)
next14days = np.random.rand(14)

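# Pass x and y positionally; plt.plot() does not take them as x= / y= keywords
plt.plot(np.arange(90), last90days)
# Start the second line at x=90 so it continues where the first one ends
plt.plot(np.arange(90, 90+14), next14days)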
plt.show()
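
A variation on the above, as a rough sketch: the same idea wrapped in a small helper with explicit colors and a legend. The function name plot_history_and_forecast and the labels are made up for illustration, and the random arrays stand in for real data.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_history_and_forecast(history, forecast):
    """Draw history, then forecast further along the x-axis in another color."""
    fig, ax = plt.subplots()
    x_hist = np.arange(len(history))
    # Forecast x-values begin where the history x-values stop
    x_fc = np.arange(len(history), len(history) + len(forecast))
    ax.plot(x_hist, history, color="tab:blue", label="history")
    ax.plot(x_fc, forecast, color="tab:red", linestyle="--", label="prediction")
    ax.legend()
    return ax

rng = np.random.default_rng(0)  # placeholder data, not real prices
ax = plot_history_and_forecast(rng.random(90), rng.random(14))
```

Call plt.show() afterwards to display the figure.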

answered 3 months ago Y. Luo #2

Hopefully this is what you want:

import pandas as pd
import numpy as np; np.random.seed(1)
import matplotlib.pyplot as plt

# pd.datetime was removed in pandas 2.0; a date string works in all versions
datelist = pd.date_range('2018-01-01', periods=104)
df = pd.DataFrame(np.cumsum(np.random.randn(104)),
                  columns=['price'], index=datelist)

plt.plot(df[:90].index, df[:90].values)
plt.plot(df[90:].index, df[90:].values)
# If you don't like the break in the graph, change 90 to 89 in the above line
plt.gcf().autofmt_xdate()
plt.show()
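
If the predictions live in a separate array rather than in the same DataFrame, one possible approach is to build the future dates yourself. This is only a sketch: it assumes a daily DatetimeIndex, and the names preds, future_idx and forecast are placeholders, as is the random data.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
# Stand-in for the real data: 90 days of prices on a daily DatetimeIndex
df = pd.DataFrame({"price": np.cumsum(np.random.randn(90))},
                  index=pd.date_range("2018-01-01", periods=90))
# Placeholder forecast values; substitute the output of your own model
preds = df["price"].iloc[-1] + np.cumsum(np.random.randn(14))

# The forecast dates start one day after the last observed date
future_idx = pd.date_range(df.index[-1] + pd.Timedelta(days=1),
                           periods=len(preds))
forecast = pd.Series(preds, index=future_idx, name="prediction")

# Draw both series on the same axes; each gets its own color
ax = df["price"].plot(label="price")
forecast.plot(ax=ax, label="prediction")
ax.legend()
```

Because the forecast carries its own dates, it is drawn to the right of the history instead of on top of it. Call plt.show() (after importing matplotlib.pyplot as plt) to display the figure.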


answered 3 months ago vestland #3

Short answer:

Use pd.merge() and take advantage of the missing values in two different series to get two lines with different colors. This approach is flexible with regard to the type of dataframe index you're using (dates, integers or strings). The result is a single chart with the price and the predictions drawn as two differently colored lines.


Long answer:

About the detail regarding...

I want to focus on the last 90 days and then predict the next 14 days.

... I'm going to assume that you are using a dataframe with a daily index. I'm also assuming that you know the index values of your dataset with 90 days and your dataset with 14 days.

Here's a dataframe with 104 observations (random data):

Snippet 1:

import pandas as pd
import numpy as np

np.random.seed(12)
rows = 104
df = pd.DataFrame(np.random.randint(-4,5,size=(rows, 1)), columns=['data'])
# pd.datetime was removed in pandas 2.0; a date string works in all versions
datelist = pd.date_range('2018-01-01', periods=rows).tolist()
df['dates'] = datelist
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)
df = df.cumsum()
df.plot()

Plot 1: [image: the full 104-day series as a single line]

To replicate your setup, I split the dataframe into two frames, one with 90 observations (price) and one with 14 observations (predictions). This way, you'll have two different datasets, but the associated index will be continuous, which I assume is your actual situation.

Snippet 2:

df_90 = df[:90].copy(deep = True)
df_14 = df[-14:].copy(deep = True)
df_90.columns = ['price']
df_14.columns = ['predictions']

df_90.plot()
df_14.plot()

Plot 2: [images: two separate charts, one for price and one for predictions]

Now you can merge them so that you get a dataframe with two columns, price and predictions. You'll end up with some missing data, but that is exactly what gives you two lines with different colors when you plot it: each column is drawn as its own line, and the missing values are simply not drawn.

Snippet 3:

df_all = pd.merge(df_90, df_14, how = 'outer', left_index=True, right_index=True)
df_all.plot()

Plot 3: [image: one chart with price and predictions as two differently colored lines]

I hope the suggested solution matches your real situation. Let me know if the details about the index will be an issue, and I'll take a look at that as well.
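
To see why the outer merge yields two separate lines, here is a tiny sketch with five made-up rows (three observed days, two predicted days). The values and the variable names price, preds and merged are placeholders for illustration only.

```python
import pandas as pd

# Tiny stand-in frames: 3 observed days followed by 2 predicted days
dates = pd.date_range("2018-01-01", periods=5)
price = pd.DataFrame({"price": [10.0, 11.0, 12.0]}, index=dates[:3])
preds = pd.DataFrame({"predictions": [12.5, 13.0]}, index=dates[3:])

# The outer merge keeps every date from both frames and fills gaps with NaN
merged = pd.merge(price, preds, how="outer",
                  left_index=True, right_index=True)

# 'price' is NaN on the predicted days and 'predictions' is NaN on the
# observed days, so each column is drawn only over its own date range
print(merged)
```

Calling merged.plot() on this frame draws one line per column, which is what gives the two colors.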


Here's the complete code for an easy copy-paste:

import pandas as pd
import numpy as np

np.random.seed(12)
rows = 104
df = pd.DataFrame(np.random.randint(-4,5,size=(rows, 1)), columns=['data'])
# pd.datetime was removed in pandas 2.0; a date string works in all versions
datelist = pd.date_range('2018-01-01', periods=rows).tolist()
df['dates'] = datelist
df = df.set_index(['dates'])
df.index = pd.to_datetime(df.index)
df = df.cumsum()

df.plot()

df_90 = df[:90].copy(deep = True)
df_14 = df[-14:].copy(deep = True)
df_90.columns = ['price']
df_14.columns = ['predictions']

df_90.plot()
df_14.plot()

df_all = pd.merge(df_90, df_14, how = 'outer', left_index=True, right_index=True)
df_all.plot()
