A code-heavy, step-by-step guide to predicting cryptocurrency prices (with video)

2021-01-10 Big Data Digest (大數據文摘)

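YouTube star Siraj Raval's video series is back! Today's topic is cryptocurrency price prediction. There is plenty of code, plus a video that walks through every step, so there is little excuse not to follow along.

Video walkthrough: 22 minutes, with Chinese subtitles.

Predicting cryptocurrency prices is actually quite simple: with Python and Keras, plus a recurrent neural network (a bidirectional LSTM, to be precise), it takes only 9 steps. Bitcoin and Ethereum prices are both fair game.

The 9 steps are:

1. Data processing
2. Building the model
3. Training the model
4. Testing the model
5. Analyzing price change
6. Analyzing percent price change
7. Comparing predictions with actual data
8. Computing model evaluation metrics
9. Putting it all together: visualization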

Data processing

Import the libraries we need: Keras, the metrics module from scikit-learn, numpy, pandas, and matplotlib.

    ## Keras for deep learning
    from keras.layers import Dense, Activation, Dropout, LSTM, Bidirectional
    from keras.models import Sequential

    ## Scikit-learn for mapping metrics
    from sklearn.metrics import mean_squared_error

    ## for logging
    import time

    ## matrix math
    import numpy as np
    import math

    ## plotting
    import matplotlib.pyplot as plt

    ## data processing
    import pandas as pd

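First, the data needs to be normalized. There is a large chart summarizing the principles of data processing; to view it in high resolution, reply 「加密貨幣」 ("cryptocurrency") in the chat box of the Big Data Digest WeChat official account.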

    def load_data(filename, sequence_length):
        """
        Loads the bitcoin data

        Arguments:
        filename -- A string that represents where the .csv file can be located
        sequence_length -- An integer of how many days should be looked at in a row

        Returns:
        X_train -- A tensor of shape (2400, 49, 35) that will be input into the model to train it
        Y_train -- A tensor of shape (2400,) that will be input into the model to train it
        X_test -- A tensor of shape (267, 49, 35) that will be used to test the model's proficiency
        Y_test -- A tensor of shape (267,) that will be used to check the model's predictions
        Y_daybefore -- A tensor of shape (267,) that represents the price of bitcoin the day before each Y_test value
        unnormalized_bases -- A tensor of shape (267, 1) that will be used to get the true prices from the normalized ones
        window_size -- An integer that represents how many days of X values the model can look at at once
        """
        # Read the data file
        raw_data = pd.read_csv(filename, dtype=float).values

        # Change all zeros to the number before the zero occurs.
        # Start at row 1: row 0 has no previous row (index -1 would wrap to the last row).
        for x in range(1, raw_data.shape[0]):
            for y in range(0, raw_data.shape[1]):
                if raw_data[x][y] == 0:
                    raw_data[x][y] = raw_data[x - 1][y]

        # Convert the file to a list
        data = raw_data.tolist()

        # Convert the data to a 3D array (a x b x c)
        # Where a is the number of days, b is the window size, and c is the number of features in the data file
        result = []
        for index in range(len(data) - sequence_length):
            result.append(data[index: index + sequence_length])

        # Normalizing data by going through each window
        # Every value in the window is divided by the first value in the window, and then 1 is subtracted
        d0 = np.array(result)
        dr = np.zeros_like(d0)
        dr[:, 1:, :] = d0[:, 1:, :] / d0[:, 0:1, :] - 1

        # Splitting data set into training (first 90% of data points) and testing data (last 10% of data points)
        split_line = int(round(0.9 * dr.shape[0]))

        # Keeping the unnormalized prices for Y_test (feature column 20 is the price target used throughout)
        # Useful when graphing bitcoin price over time later
        # The original hard-coded start = 2400, which equals split_line for the documented shapes
        unnormalized_bases = d0[split_line:, 0:1, 20]

        training_data = dr[:split_line, :]

        # Shuffle the data
        np.random.shuffle(training_data)

        # Training data
        X_train = training_data[:, :-1]
        Y_train = training_data[:, -1]
        Y_train = Y_train[:, 20]

        # Testing data (index -1 replaces the hard-coded 49, the last day of a 50-day window)
        X_test = dr[split_line:, :-1]
        Y_test = dr[split_line:, -1, :]
        Y_test = Y_test[:, 20]

        # Get the day before Y_test's price (index -2 replaces the hard-coded 48)
        Y_daybefore = dr[split_line:, -2, :]
        Y_daybefore = Y_daybefore[:, 20]

        # Get window size; the last value of each sequence is reserved as the y value
        window_size = sequence_length - 1

        return X_train, Y_train, X_test, Y_test, Y_daybefore, unnormalized_bases, window_size
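To make the window normalization in load_data concrete, here is a minimal sketch on toy data. The helper name normalize_windows and the toy prices are assumptions made for illustration; only the formula (divide by the first value of the window, then subtract 1) comes from the code above.

```python
import numpy as np

def normalize_windows(windows):
    """Divide every value in each window by that window's first value, then subtract 1."""
    windows = np.asarray(windows, dtype=float)
    normalized = np.zeros_like(windows)
    # Position 0 of every window stays 0, because x0 / x0 - 1 == 0
    normalized[:, 1:, :] = windows[:, 1:, :] / windows[:, 0:1, :] - 1
    return normalized

# Toy data (hypothetical): 5 days, 1 feature, sequence_length = 3 -> 2 windows
prices = [[100.0], [110.0], [121.0], [96.8], [100.0]]
sequence_length = 3
windows = [prices[i:i + sequence_length] for i in range(len(prices) - sequence_length)]

normalized = normalize_windows(windows)
print(normalized[:, :, 0])
```

Each window is expressed as a fractional change relative to its own first day, which keeps inputs on a comparable scale even when the absolute price level drifts over the years.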

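Building the model

We use a 3-layer RNN with a dropout rate of 20%.

A bidirectional RNN is based on the idea that the output at time t may depend not only on earlier elements of the sequence but also on later ones. For example, to predict a missing word in a sequence, you need to look at the context on both the left and the right. A bidirectional RNN is two RNNs stacked together, and the output is computed from the hidden states of both.

As an example, the missing word "gym" in the sentence below can only be worked out from its context (Digest editor: "everyday"?):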

I go to the ( ) everyday to get fit.

    def initialize_model(window_size, dropout_value, activation_function, loss_function, optimizer):
        """
        Initializes and creates the model to be used

        Arguments:
        window_size -- An integer that represents how many days of X_values the model can look at at once
        dropout_value -- A decimal representing how much dropout should be incorporated at each level, in this case 0.2
        activation_function -- A string to define the activation_function, in this case it is linear
        loss_function -- A string to define the loss function to be used, in this case it is mean squared error
        optimizer -- A string to define the optimizer to be used, in this case it is adam

        Returns:
        model -- A 3 layer RNN with 100*dropout_value dropout in each layer that uses activation_function as its activation function, loss_function as its loss function, and optimizer as its optimizer
        """
        # Create a Sequential model using Keras
        model = Sequential()

        # First recurrent layer with dropout
        # Note: X_train is read from the enclosing (global) scope, so load_data must have been called first
        model.add(Bidirectional(LSTM(window_size, return_sequences=True), input_shape=(window_size, X_train.shape[-1])))
        model.add(Dropout(dropout_value))

        # Second recurrent layer with dropout
        model.add(Bidirectional(LSTM(window_size * 2, return_sequences=True)))
        model.add(Dropout(dropout_value))

        # Third recurrent layer
        model.add(Bidirectional(LSTM(window_size, return_sequences=False)))

        # Output layer (returns the predicted value)
        model.add(Dense(units=1))

        # Set activation function
        model.add(Activation(activation_function))

        # Set loss function and optimizer
        model.compile(loss=loss_function, optimizer=optimizer)

        return model
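The bidirectional idea can be sketched without Keras. The example below is a simplified illustration, not the model above: it uses a plain tanh RNN cell instead of an LSTM, random weights, and hypothetical function names. It runs one RNN left to right and another right to left, then concatenates their hidden states at each time step (concatenation is also the default merge mode of the Keras Bidirectional wrapper).

```python
import numpy as np

def simple_rnn(inputs, w_in, w_rec):
    """Run a plain tanh RNN over inputs of shape (timesteps, features)."""
    hidden = np.zeros(w_rec.shape[0])
    states = []
    for x_t in inputs:
        hidden = np.tanh(w_in @ x_t + w_rec @ hidden)
        states.append(hidden)
    return np.stack(states)

def bidirectional_rnn(inputs, fwd_weights, bwd_weights):
    """Concatenate the states of a forward pass and a time-reversed pass."""
    forward = simple_rnn(inputs, *fwd_weights)
    # Process the sequence backwards, then flip the states back into time order
    backward = simple_rnn(inputs[::-1], *bwd_weights)[::-1]
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(0)
timesteps, features, units = 5, 3, 4
inputs = rng.normal(size=(timesteps, features))
fwd = (rng.normal(size=(units, features)), rng.normal(size=(units, units)))
bwd = (rng.normal(size=(units, features)), rng.normal(size=(units, units)))

outputs = bidirectional_rnn(inputs, fwd, bwd)
print(outputs.shape)  # (timesteps, 2 * units)

# Changing only the LAST input leaves the forward state at t=0 untouched,
# but changes the backward state at t=0, which has "seen" the future.
changed = inputs.copy()
changed[-1] += 1.0
outputs_changed = bidirectional_rnn(changed, fwd, bwd)
```

The final comparison shows why the backward half matters: at the first time step, only the backward states carry information about later elements of the sequence.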

Training the model

Here we use a batch size of 1024 and train for 100 epochs. The goal is to minimize the mean squared error (MSE).

    def fit_model(model, X_train, Y_train, batch_num, num_epoch, val_split):
        """
        Fits the model to the training data

        Arguments:
        model -- The previously initialized 3 layer Recurrent Neural Network
        X_train -- A tensor of shape (2400, 49, 35) that represents the x values of the training data
        Y_train -- A tensor of shape (2400,) that represents the y values of the training data
        batch_num -- An integer representing the batch size to be used, in this case 1024
        num_epoch -- An integer defining the number of epochs to be run, in this case 100
        val_split -- A decimal representing the proportion of training data to be used as validation data

        Returns:
        model -- The 3 layer Recurrent Neural Network that has been fitted to the training data
        training_time -- An integer representing the amount of time (in seconds) that the model was training
        """
        # Record the time the model starts training
        start = time.time()

        # Train the model on X_train and Y_train
        # (epochs replaces the older nb_epoch argument name used in the original code)
        model.fit(X_train, Y_train, batch_size=batch_num, epochs=num_epoch, validation_split=val_split)

        # Get the time it took to train the model (in seconds)
        training_time = int(math.floor(time.time() - start))

        return model, training_time
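For reference, the mean squared error that training minimizes is just the average of the squared differences between targets and predictions. The sketch below computes it by hand; the function name and the sample values are made up for illustration.

```python
import numpy as np

def mean_squared_error_manual(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical normalized targets and predictions
y_true = [0.10, -0.05, 0.20]
y_pred = [0.12, -0.01, 0.10]

mse = mean_squared_error_manual(y_true, y_pred)
print(mse)
```

Because the errors are squared, one large miss (0.10 here) dominates several small ones, so the optimizer is pushed hardest to reduce the biggest deviations.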

Testing the model

    def test_model(model, X_test, Y_test, unnormalized_bases):
        """
        Test the model on the testing data

        Arguments:
        model -- The previously fitted 3 layer Recurrent Neural Network
        X_test -- A tensor of shape (267, 49, 35) that represents the x values of the testing data
        Y_test -- A tensor of shape (267,) that represents the y values of the testing data
        unnormalized_bases -- A tensor of shape (267, 1) that can be used to get unnormalized data points

        Returns:
        y_predict -- A tensor of shape (267, 1) that represents the normalized values that the model predicts based on X_test
        real_y_test -- A tensor of shape (267,) that represents the actual prices of bitcoin throughout the testing period
        real_y_predict -- A tensor of shape (267, 1) that represents the model's predicted prices of bitcoin
        fig -- A figure plotting the predicted prices of bitcoin versus the real prices of bitcoin
        """
        # Test the model on X_test
        y_predict = model.predict(X_test)

        # Create empty arrays to store unnormalized values
        real_y_test = np.zeros_like(Y_test)
        real_y_predict = np.zeros_like(y_predict)

        # Flatten the bases so that each element is a scalar
        bases = np.reshape(unnormalized_bases, -1)

        # Fill the arrays with the real value and the predicted value by reversing the normalization process
        for i in range(Y_test.shape[0]):
            y = Y_test[i]
            predict = y_predict[i]
            real_y_test[i] = (y + 1) * bases[i]
            real_y_predict[i] = (predict + 1) * bases[i]

        # Plot of the predicted prices versus the real prices
        fig = plt.figure(figsize=(10, 5))
        ax = fig.add_subplot(111)
        ax.set_title("Bitcoin Price Over Time")
        plt.plot(real_y_predict, color='green', label='Predicted Price')
        plt.plot(real_y_test, color='red', label='Real Price')
        ax.set_ylabel("Price (USD)")
        ax.set_xlabel("Time (Days)")
        ax.legend()

        return y_predict, real_y_test, real_y_predict, fig
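The de-normalization step in test_model simply inverts the window normalization: a normalized value n with window base b maps back to the price (n + 1) * b. A small vectorized sketch, with a hypothetical helper name and toy numbers:

```python
import numpy as np

def denormalize(normalized, bases):
    """Invert 'value / base - 1' by computing (normalized + 1) * base."""
    normalized = np.asarray(normalized, dtype=float).reshape(-1)
    bases = np.asarray(bases, dtype=float).reshape(-1)
    return (normalized + 1.0) * bases

# Hypothetical window bases (first price of each window) and normalized targets
bases = [100.0, 110.0]
normalized_targets = [0.21, -0.12]

prices = denormalize(normalized_targets, bases)
print(prices)
```

This is why load_data has to return unnormalized_bases: without the first price of each window, normalized predictions cannot be converted back to dollar values.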

Analyzing price change

    def price_change(Y_daybefore, Y_test, y_predict):
        """
        Calculate the percent change between each value and the day before

        Arguments:
        Y_daybefore -- A tensor of shape (267,) that represents the prices of each day before each price in Y_test
        Y_test -- A tensor of shape (267,) that represents the normalized y values of the testing data
        y_predict -- A tensor of shape (267, 1) that represents the normalized y values of the model's predictions

        Returns:
        Y_daybefore -- A tensor of shape (267, 1) that represents the prices of each day before each price in Y_test
        Y_test -- A tensor of shape (267, 1) that represents the normalized y values of the testing data
        delta_predict -- A tensor of shape (267, 1) that represents the difference between predicted and day before values
        delta_real -- A tensor of shape (267, 1) that represents the difference between real and day before values
        fig -- A plot representing percent change in bitcoin price per day
        """
        # Reshaping Y_daybefore and Y_test
        Y_daybefore = np.reshape(Y_daybefore, (-1, 1))
        Y_test = np.reshape(Y_test, (-1, 1))

        # The difference between each predicted value and the value from the day before
        delta_predict = (y_predict - Y_daybefore) / (1 + Y_daybefore)

        # The difference between each true value and the value from the day before
        delta_real = (Y_test - Y_daybefore) / (1 + Y_daybefore)

        # Plotting the predicted percent change versus the real percent change
        fig = plt.figure(figsize=(10, 6))
        ax = fig.add_subplot(111)
        ax.set_title("Percent Change in Bitcoin Price Per Day")
        plt.plot(delta_predict, color='green', label='Predicted Percent Change')
        plt.plot(delta_real, color='red', label='Real Percent Change')
        plt.ylabel("Percent Change")
        plt.xlabel("Time (Days)")
        ax.legend()
        plt.show()

        return Y_daybefore, Y_test, delta_predict, delta_real, fig
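It may not be obvious why price_change divides by (1 + Y_daybefore). Since a normalized value is price / base - 1, the quantity 1 + d equals day_before_price / base, so (p - d) / (1 + d) reduces to the ordinary day-over-day change in the raw prices. The numbers below are invented purely to check that identity.

```python
import math

# Hypothetical raw prices and window base
base = 200.0
day_before_price = 210.0
today_price = 220.5

# Normalized values, as produced by 'value / base - 1'
d = day_before_price / base - 1
p = today_price / base - 1

# Formula used in price_change, applied to normalized values
delta_normalized = (p - d) / (1 + d)

# Ordinary fractional change computed on the raw prices
delta_raw = (today_price - day_before_price) / day_before_price

print(delta_normalized, delta_raw)
```

Both expressions give the same fractional change, so the function can work entirely on normalized values without needing the window bases.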

Analyzing percent price change

    def binary_price(delta_predict, delta_real):
        """
        Converts percent change to a binary 1 or 0, where 1 is an increase and 0 is a decrease/no change

        Arguments:
        delta_predict -- A tensor of shape (267, 1) that represents the predicted percent change in price
        delta_real -- A tensor of shape (267, 1) that represents the real percent change in price

        Returns:
        delta_predict_1_0 -- A tensor of shape (267, 1) that represents the binary version of delta_predict
        delta_real_1_0 -- A tensor of shape (267, 1) that represents the binary version of delta_real
        """
        # Empty arrays where a 1 represents an increase in price and a 0 represents a decrease in price
        delta_predict_1_0 = np.empty(delta_predict.shape)
        delta_real_1_0 = np.empty(delta_real.shape)

        # If the change in price is greater than zero, store it as a 1
        # Otherwise (a decrease or no change), store it as a 0
        for i in range(delta_predict.shape[0]):
            if delta_predict[i][0] > 0:
                delta_predict_1_0[i][0] = 1
            else:
                delta_predict_1_0[i][0] = 0

        for i in range(delta_real.shape[0]):
            if delta_real[i][0] > 0:
                delta_real_1_0[i][0] = 1
            else:
                delta_real_1_0[i][0] = 0

        return delta_predict_1_0, delta_real_1_0
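The two loops in binary_price can also be written as a single NumPy comparison. This is an equivalent sketch, not the original author's code; the function name and sample values are assumptions.

```python
import numpy as np

def binary_direction(delta):
    """Return 1.0 where the change is strictly positive, else 0.0 (same shape as the input)."""
    return (np.asarray(delta) > 0).astype(float)

# Hypothetical percent changes, shaped (n, 1) like delta_predict / delta_real
delta = np.array([[0.03], [-0.01], [0.0], [0.002]])

labels = binary_direction(delta)
print(labels.ravel())
```

As in the loop version, a change of exactly zero is labeled 0, so "no change" is grouped with decreases.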

Comparing predictions with actual data

    def find_positives_negatives(delta_predict_1_0, delta_real_1_0):
        """
        Finding the number of false positives, false negatives, true positives, true negatives

        Arguments:
        delta_predict_1_0 -- A tensor of shape (267, 1) that represents the binary version of delta_predict
        delta_real_1_0 -- A tensor of shape (267, 1) that represents the binary version of delta_real

        Returns:
        true_pos -- An integer that represents the number of true positives achieved by the model
        false_pos -- An integer that represents the number of false positives achieved by the model
        true_neg -- An integer that represents the number of true negatives achieved by the model
        false_neg -- An integer that represents the number of false negatives achieved by the model
        """
        # Finding the number of false positives/negatives and true positives/negatives
        true_pos = 0
        false_pos = 0
        true_neg = 0
        false_neg = 0

        for i in range(delta_real_1_0.shape[0]):
            real = delta_real_1_0[i][0]
            predicted = delta_predict_1_0[i][0]
            if real == 1:
                if predicted == 1:
                    true_pos += 1
                else:
                    false_neg += 1
            elif real == 0:
                if predicted == 0:
                    true_neg += 1
                else:
                    false_pos += 1

        return true_pos, false_pos, true_neg, false_neg
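The same four counts can be obtained with boolean masks. The sketch below is an alternative formulation under the assumption that both arrays contain only 0s and 1s; the helper name and toy labels are hypothetical. It returns the counts in the same order as find_positives_negatives.

```python
import numpy as np

def confusion_counts(predicted, real):
    """Count (true_pos, false_pos, true_neg, false_neg) for 0/1 label arrays."""
    predicted = np.asarray(predicted).ravel() == 1
    real = np.asarray(real).ravel() == 1
    true_pos = int(np.sum(predicted & real))
    false_pos = int(np.sum(predicted & ~real))
    true_neg = int(np.sum(~predicted & ~real))
    false_neg = int(np.sum(~predicted & real))
    return true_pos, false_pos, true_neg, false_neg

# Hypothetical binary labels, shaped (n, 1)
predicted = np.array([[1], [1], [0], [0], [1]])
real = np.array([[1], [0], [0], [1], [1]])

counts = confusion_counts(predicted, real)
print(counts)
```

Here a "positive" means a predicted or actual price increase, so a false positive is a day the model expected a rise that did not happen.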

Computing model evaluation metrics

    def calculate_statistics(true_pos, false_pos, true_neg, false_neg, y_predict, Y_test):
        """
        Calculate various statistics to assess performance

        Arguments:
        true_pos -- An integer that represents the number of true positives achieved by the model
        false_pos -- An integer that represents the number of false positives achieved by the model
        true_neg -- An integer that represents the number of true negatives achieved by the model
        false_neg -- An integer that represents the number of false negatives achieved by the model
        Y_test -- A tensor of shape (267, 1) that represents the normalized y values of the testing data
        y_predict -- A tensor of shape (267, 1) that represents the normalized y values of the model's predictions

        Returns:
        precision -- How often the model gets a true positive compared to how often it returns a positive
        recall -- How often the model gets a true positive compared to how often it should have gotten a positive
        F1 -- The harmonic mean of recall and precision
        Mean Squared Error -- The average of the squares of the differences between predicted and real values
        """
        precision = float(true_pos) / (true_pos + false_pos)
        recall = float(true_pos) / (true_pos + false_neg)
        F1 = float(2 * precision * recall) / (precision + recall)

        # Get Mean Squared Error (scikit-learn expects the true values first; MSE itself is symmetric)
        MSE = mean_squared_error(Y_test.flatten(), y_predict.flatten())

        return precision, recall, F1, MSE

Putting it all together: visualization

At last we can look at the results!

First, predicted price versus actual price:

    y_predict, real_y_test, real_y_predict, fig1 = test_model(model, X_test, Y_test, unnormalized_bases)

    # Show the plot
    plt.show()

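Next, predicted percent change versus actual percent change. Note that the predictions fluctuate more than the actual values here, which is one area where the model could be improved: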

    Y_daybefore, Y_test, delta_predict, delta_real, fig2 = price_change(Y_daybefore, Y_test, y_predict)

    # Show the plot (price_change already calls plt.show() itself)
    plt.show()

The final model performance looks like this:

    Precision: 0.62
    Recall: 0.553571428571
    F1 score: 0.584905660377
    Mean Squared Error: 0.0430756924477
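As a sanity check on how these metrics relate, the sketch below recomputes precision, recall, and F1 from raw counts. The counts TP = 31, FP = 19, FN = 25 are an assumption: they are the smallest integers that reproduce the reported figures by arithmetic, and are not stated anywhere in the source. The helper also guards against division by zero, which the original calculate_statistics does not.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall and F1 from counts, returning 0.0 when a denominator is zero."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# ASSUMED counts, inferred from the reported metrics; not given in the source
precision, recall, f1 = precision_recall_f1(true_pos=31, false_pos=19, false_neg=25)
print(precision, recall, f1)
```

Read this way, a precision of 0.62 means that roughly 6 out of 10 predicted "up" days were actually up, while a recall of about 0.55 means the model caught a little over half of the days on which the price really rose.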

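So, after reading all this, are you itching to try it yourself?

Author | Siraj Raval (translated and produced by Big Data Digest with permission)

Translation | 糖竹子, 狗小白, 鄧子稷

Timeline | 韓振峰, Barbara, 菜菜Tom

Producer | 龍牧雪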
