Why mere Machine Learning cannot predict Bitcoin price

Data science, Learning

Why mere Machine Learning cannot predict Bitcoin price – Source Erogol.com

Lately, I have been studying time series to push beyond the limits of my experience. I decided to apply what I learned to cryptocurrency price prediction, with a hunch of getting rich. Kidding? Or not :). As I saw more of the intricacies of the problem, I went deeper and found a new challenge in it. Now I am in the process of building something new, using everything from traditional machine learning to the latest reinforcement learning achievements.

So, story aside, I would like to see whether an AI bot trading without manual help is possible or just an alluring dream. Lately, I have read a lot about the topic, from traditional financial technical analysis to the latest ML solutions. What I see on the ML front is that many people claim to use lazy ML with success and sell deceitful dreams. What I call lazy ML is downloading data, training a model, and done. We are rich!! In my experience, they reach false conclusions induced by false interpretations. The bad side of this is that many other people (aka beginner me) try to replicate their results and waste a lot of time. Here, I would like to show one particular mistake in those works, with accompanying code to help us understand the problem better.

Briefly, this work illustrates a simple supervised setting where a model predicts the next Bitcoin move given the current state. Here is the full Notebook; for a more advanced set of experiments, check out the repo. Hope you like it.

Before we start, let’s lay down two main assumptions generally deemed true in the market literature.

  • All information describing the market is hidden under the price values.
  • We go Semi-Markovian, meaning each prediction only depends on the present state.

Now, what we do here is very simple. Given the state as the High, Low, Open and Close price values of the present step, we would like to predict the price direction at the next step, categorized as Up, Down or Same.

DATA_PATH = "../data/bitcoin-historical-data/coinbaseUSD_1-min_data_2014-12-01_to_2017-10-20.csv.csv"
df = read_data(DATA_PATH, 3)

df_feats = compute_features(df)
df_feats.dropna(inplace=True)
df_feats

First, we read the Bitcoin price history downloaded from here into a Pandas dataframe, convert each row to a difference from the previous time step, and drop the rows with missing values. That is, each row is the difference between time t and time (t-1) for each column.
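The helpers `read_data` and `compute_features` live in the repo and are not shown here. As a rough sketch of the differencing step only, assuming a dataframe with `timestamp`, `open`, `high`, `low` and `close` columns (these names and the function `diff_features` are illustrative assumptions, not the actual helper), it could look like this:

```python
import pandas as pd

# Toy OHLC rows; the column names are assumptions, not the author's schema.
df_toy = pd.DataFrame({
    "timestamp": [60, 120, 180, 240],
    "open":  [100.0, 101.0, 101.0, 99.5],
    "high":  [101.5, 101.0, 102.0, 100.0],
    "low":   [99.0, 100.5, 100.0, 98.0],
    "close": [101.0, 101.0, 99.5, 99.0],
})

def diff_features(frame, price_cols=("open", "high", "low", "close")):
    """Hypothetical stand-in for compute_features: change of each price column vs. the previous row."""
    out = frame[["timestamp"]].copy()
    for col in price_cols:
        out[col + "_diff"] = frame[col].diff()   # value at t minus value at t-1
    # The first row has no previous step, so its differences are NaN and get dropped.
    return out.dropna().reset_index(drop=True)

feats = diff_features(df_toy)
print(feats)
```

The first row disappears because it has nothing to be compared against, which is why a `dropna` follows the feature computation.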

import time
import datetime
import numpy as np

# Validation split
train_split_point = time.mktime(datetime.datetime.strptime('2016-1-1 00:00:00', "%Y-%m-%d %H:%M:%S").timetuple())
split_point = time.mktime(datetime.datetime.strptime('2017-7-1 00:00:00', "%Y-%m-%d %H:%M:%S").timetuple())

df_train = df_feats[np.logical_and(df_feats['timestamp'] < split_point, df_feats['timestamp'] > train_split_point)]
df_test = df_feats[df_feats['timestamp'] > split_point]

print(df_train.shape)
print(df_test.shape)

We split the data into train and test sets, taking 2017-7-1 as the split point: the training set covers the market data from 2016-1-1 up to 2017-7-1, and everything after 2017-7-1 is the test set. That gives us 262541 steps for training and 53319 steps for testing.

y_train, label_names = compute_labels(df_train)
y_test, _ = compute_labels(df_test)

check_labels(y_test.argmax(axis=1), df_test['close'].values)
check_labels(y_train.argmax(axis=1), df_train['close'].values)

assert y_train.shape[0] == df_train.shape[0]
assert y_test.shape[0] == df_test.shape[0]

X_train = df_train.iloc[:, -4:].values
X_test = df_test.iloc[:, -4:].values

assert y_train.shape[0] == X_train.shape[0]
assert y_test.shape[0] == X_test.shape[0]

Compute the label at each time step (t) as Up, Down or Same. If the label is Up, the price increases at time (t+1).
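`compute_labels` is also a repo helper that is not shown here. A minimal sketch of the idea, assuming labels come from the sign of the next close-price change and are one-hot encoded in the order UP, DN, FLAT (the function name `next_step_labels` and this class order are assumptions for illustration):

```python
import numpy as np

LABEL_NAMES = ["UP", "DN", "FLAT"]  # assumed order, for illustration only

def next_step_labels(close):
    """Hypothetical stand-in for compute_labels: one-hot direction of close[t+1] - close[t]."""
    close = np.asarray(close, dtype=float)
    change = close[1:] - close[:-1]           # move from step t to step t+1
    idx = np.full(change.shape, 2)            # default class: FLAT
    idx[change > 0] = 0                       # UP
    idx[change < 0] = 1                       # DN
    return np.eye(len(LABEL_NAMES))[idx]      # one row of one-hot labels per step

y_toy = next_step_labels([100.0, 101.0, 101.0, 99.5])
print(y_toy.argmax(axis=1))
```

Note that this sketch returns one label fewer than the number of prices, because the last step has no successor. The repo helper evidently handles that differently, since the assertions above expect one label per row.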

Let’s define the magic box with Keras. This is a basic 4-layer fully connected network. You can play around with the architecture as you like for your own run.

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(32, activation = 'tanh', input_dim = 4))
model.add(Dropout(0.2))
model.add(Dense(32, activation = 'tanh'))
model.add(Dropout(0.1))
model.add(Dense(32, activation = 'tanh'))
model.add(Dropout(0.1))
# output size must match the number of label classes, y_train.shape[1]
model.add(Dense(3, activation = 'softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])

Train the model and enjoy the progress bar 🙂

batch_size = 512 # samples per gradient update
epochs = 1000

model.fit(X_train, y_train, validation_data=(X_test, y_test), batch_size=batch_size, epochs=epochs)
Epoch 27/1000
262541/262541 [==============================] - 3s - loss: 0.8915 - acc: 0.5028 - val_loss: 0.9344 - val_acc: 0.4627
Epoch 28/1000
177152/262541 [====================..........] - ETA: 1s - loss: 0.8910 - acc: 0.5055

This is where I stop the training. You should see similar values.

# if we always predict UP

              precision    recall  f1-score   support

          UP     0.4694    1.0000    0.6389     25027
          DN     0.0000    0.0000    0.0000     22390
        FLAT     0.0000    0.0000    0.0000      5902

 avg / total     0.2203    0.4694    0.2999     53319

Before we look at the model performance, we first measure some baselines. Considering the Bitcoin craze, if we always predict UP we already get ~0.47 accuracy (the weighted-average recall above), though the weighted precision is only ~0.22.
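To make the baseline arithmetic explicit: accuracy is the fraction of correct predictions, which the report shows as the weighted-average recall. A small sketch that recomputes it from the class counts in the support column above (the variable names are mine, and the random baseline is simulated here rather than taken from the notebook):

```python
import numpy as np

# Test-set class counts, copied from the "support" column of the report above.
support = [25027, 22390, 5902]
total = sum(support)

# Always predicting the first class is right exactly when the true label is that class.
always_up_acc = support[0] / total

# Uniform random guessing is right about one time in three, whatever the class mix.
rng = np.random.default_rng(0)
y_true = np.repeat([0, 1, 2], support)
y_rand = rng.integers(0, 3, size=total)
random_acc = float((y_true == y_rand).mean())

print(f"always first class: {always_up_acc:.4f}")
print(f"uniform random:     {random_acc:.4f}")
```

Any model has to beat the better of these two numbers before its accuracy means anything.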

# random prediction

              precision    recall  f1-score   support

          UP     0.4732    0.3385    0.3947     25027
          DN     0.4262    0.3354    0.3754     22390
        FLAT     0.1112    0.3353    0.1670      5902

 avg / total     0.4134    0.3368    0.3614     53319

Random prediction obtains ~0.34 accuracy, with a weighted precision of ~0.41. Now let’s measure the model performance and see if we get something better.

              precision    recall  f1-score   support

          UP     0.4999    0.6983    0.5826     25027
          DN     0.4540    0.2735    0.3413     22390
        FLAT     0.3840    0.3168    0.3472      5902

 avg / total     0.4678    0.4777    0.4552     53319

We obtain ~0.48 accuracy (~0.47 weighted precision), which is better than random and suggests our model is keen to learn something. Most people stop here and believe that things are going to work out when they stream real data. No!! We are not done yet.

Let’s plot the predictions and see what actually goes wrong. What we see here is the color coding of our prediction at each time step: green is Up, blue is Same and red is Down.

[Plot of the color-coded predictions. It is very large, so please open it in another window.]

The broken thing here, if we look carefully, is that the model only predicts what happened at the previous time step. If the price stayed the same, it predicts Blue. If the price went up before, it predicts Green, and so on. With a model like this, it is normal to measure good accuracy, since it is natural to expect an Up move if the price went Up previously. It is a good catch for a Kaggler but not for a trader.
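You do not need the plot to catch this. One way to check it numerically, sketched here on toy labels, is to measure how often the prediction at step t simply equals the move realised at step t-1, and how accurate the rule "repeat the last move" is on its own. The function names are mine, not from the notebook:

```python
import numpy as np

def agreement_with_previous_move(pred_idx, true_idx):
    """Fraction of steps where the prediction at t equals the realised label at t-1."""
    pred_idx = np.asarray(pred_idx)
    true_idx = np.asarray(true_idx)
    return float((pred_idx[1:] == true_idx[:-1]).mean())

def persistence_accuracy(true_idx):
    """Accuracy of the trivial rule 'repeat the last realised label'."""
    true_idx = np.asarray(true_idx)
    return float((true_idx[1:] == true_idx[:-1]).mean())

# Toy class indices and a fake "model" that just echoes the previous label.
true_idx = np.array([0, 0, 1, 1, 2, 0, 0, 1])
pred_idx = np.concatenate([[0], true_idx[:-1]])

echo_rate = agreement_with_previous_move(pred_idx, true_idx)
naive_acc = persistence_accuracy(true_idx)
print(echo_rate, naive_acc)
```

On the real run you would pass `model.predict(X_test).argmax(axis=1)` and `y_test.argmax(axis=1)`. If the agreement is close to 1 and the model accuracy is close to the persistence accuracy, the network has only learned to copy the last move.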

So I can say the trained model is not generalizing knowledge that would help us; it is memorizing basic rules, which makes it useless in real life.

I would like to keep it brief. What I aim to pin down here is not that ML is useless for this problem. ML is definitely helpful with more advanced constructs. Just don’t expect to download data, train a model and be rich :).

If you are really interested in using ML in trading, I suggest you start from the rudiments. Initially, use ML to create helper signals. It might merely take traditional financial indicators and signal certain complex conditions. Do not rely on ML from the start; use it as a sidekick.

Note that I try to keep things simple here, but you might like to include many other features, such as financial indicators computed with the great library TA-Lib. You might also use other basic ML models. Or you can pose it as a regression problem and predict the real price changes. I assure you the result will be the same. Some of these experiments are in the repo as well.

What about RNNs? I should point out that an RNN (or LSTM, GRU) has a far worse memorization problem. If you train an RNN to regress the relative price change, what it does is predict a small variation around the previous time step’s price. Again, this gives satisfying model performance as far as Mean Squared Error is concerned, but it has no real use. Although this is a solution proposed by many blog posts, I once again assure you that RNNs do not work either.
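To see why a low Mean Squared Error proves nothing here, consider a sketch on a synthetic random walk (an assumption for illustration, not Bitcoin data). The forecast "next price = current price" needs no learning at all, yet it scores a very small error compared to the scale of the series:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic random-walk "price": an assumption for illustration, not Bitcoin data.
steps = rng.normal(loc=0.0, scale=1.0, size=5000)
price = 1000.0 + np.cumsum(steps)

actual = price[1:]
naive_forecast = price[:-1]                        # "next price = current price"
mean_forecast = np.full_like(actual, price.mean())  # a constant reference forecast

mse_naive = float(np.mean((actual - naive_forecast) ** 2))
mse_mean = float(np.mean((actual - mean_forecast) ** 2))

# The naive forecast never predicts a move: its implied change is always zero.
predicted_change = naive_forecast - price[:-1]

print(f"MSE naive last-value: {mse_naive:.2f}")
print(f"MSE constant mean:    {mse_mean:.2f}")
```

A model that sits slightly around the previous price inherits this small error, so before trusting any regression result, compare its error against this naive forecast.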

Last remarks: I believe ML has a huge playground in the nascent crypto market, for two main reasons. First, since many people are new to trading, they tightly follow well-studied buy/sell patterns without any cannier selection. That means there is a pattern, and it can be learned by ML. Second, the crypto market is a wild game. Things are so volatile; prices can go up 100% or crash overnight. That is a great opportunity for good traders, but it is not possible to keep an eye on the whole crazy, volatile market. So this is a great reason to use AI to help us and expand the horizon we can anticipate.

Please let me know what you think. Also feel free to ping me if you have something new or if you like the notion of AI-based trading. I have personally started using ML to do what I propose above. We could enjoy it together. Best!!

 


The post Why mere Machine Learning cannot predict Bitcoin price appeared first on A Blog From Human-engineer-being.