deephub translation team: Calab
FLORIZEL:Should she kneel be?In shall not weep received; unleased meAnd unrespective greeting than dwell in, thee,look'd on me, son in heavenly properly.
Who wrote this, Shakespeare or a machine learning model?
The answer is the latter! The passage above was produced by a recurrent neural network trained with TensorFlow for 30 epochs and seeded with the string "FLORIZEL:". In this article, I'll explain and walk through the code to train a neural network to write Shakespearean plays, or anything else you'd like it to write!
Imports and Data
First, import some basic libraries:
import tensorflow as tf
import numpy as np
import os
import time
TensorFlow has Shakespeare's works built in. If you're working in an online environment like Kaggle, make sure you're connected to the internet.
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
The data needs to be decoded with UTF-8.
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')
# length of text is the number of characters in it
print ('Length of text: {} characters'.format(len(text)))
[Output]:
Length of text: 1115394 characters
That's plenty of data to work with!
Let's look at the first 250 characters:
print(text[:250])
Vectorization
First, let's see how many distinct characters the file contains:
vocab = sorted(set(text))
print ('{} unique characters'.format(len(vocab)))
65 unique characters
Before training, the strings need to be mapped to a numerical representation. Below, two lookup tables are created: one that maps characters to numbers and one that maps numbers back to characters.
char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)
text_as_int = np.array([char2idx[c] for c in text])
Let's take a look at the vocabulary dictionary:
print('{')
for char,_ in zip(char2idx, range(20)):
    print(' {:4s}: {:3d},'.format(repr(char), char2idx[char]))
print(' ...\n}')
[Output]:
{
'\n': 0,
' ' : 1,
'!' : 2,
'$' : 3,
'&' : 4,
"'" : 5,
',' : 6,
'-' : 7,
'.' : 8,
'3' : 9,
':' : 10,
...
}
Every distinct character now has its own number.
Let's see how the vectorizer handles the first two words of the text, 'First Citizen':
print ('{} ---- characters mapped to int ---- > {}'.format(repr(text[:13]), text_as_int[:13]))
These words are converted into a vector of numbers, which can easily be converted back to text using the integer-to-character dictionary.
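As a quick illustration (not part of the original walkthrough), the mapping can be reversed with the idx2char array:
print(''.join(idx2char[text_as_int[:13]]))  # should print 'First Citizen'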
Creating the Training Data
Given a sequence of characters, the model should ideally find the most likely next character. The text will be split into sequences, each input sequence containing seq_length characters from the text. The target for any input sequence is that same sequence, shifted one character to the right.
For example, given the input "Hell", the target would be "ello", together forming the word "Hello".
First, we can use TensorFlow's .from_tensor_slices function to convert the text vector into a stream of character indices.
# The maximum length sentence we want for a single input in characters
seq_length = 100
examples_per_epoch = len(text)//(seq_length+1)
# Create training examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
for i in char_dataset.take(5):
    print(idx2char[i.numpy()])
F
i
r
s
t
The batch method lets us group these individual characters into sequences of a fixed size, forming chunks of the text.
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)
for item in sequences.take(5):
    print(repr(''.join(idx2char[item.numpy()])))
'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
'are all resolved rather to die than to famish?\n\nAll:\nResolved. resolved.\n\nFirst Citizen:\nFirst, you k'
"now Caius Marcius is chief enemy to the people.\n\nAll:\nWe know't, we know't.\n\nFirst Citizen:\nLet us ki"
"ll him, and we'll have corn at our own price.\nIs't a verdict?\n\nAll:\nNo more talking on't; let it be d"
'one: away, away!\n\nSecond Citizen:\nOne word, good citizens.\n\nFirst Citizen:\nWe are accounted poor citi'
For each sequence, we duplicate and shift it with the map method to form an input and a target.
def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text

dataset = sequences.map(split_input_target)
The dataset now contains the input and target pairs we want.
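The output below is presumably produced by taking one example from the dataset; a minimal sketch of the missing lines (which also define input_example and target_example, used again further down) would be:
for input_example, target_example in dataset.take(1):
    print('Input data: ', repr(''.join(idx2char[input_example.numpy()])))
    print('Target data:', repr(''.join(idx2char[target_example.numpy()])))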
Input data: 'First Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou'
Target data: 'irst Citizen:\nBefore we proceed any further, hear me speak.\n\nAll:\nSpeak, speak.\n\nFirst Citizen:\nYou '
Each index of these vectors is processed as a single time step: for the input at time step 0, the model receives the numerical index of "F" and tries to predict "i" as the next character. At the next time step it does the same thing, but the RNN considers the context of the previous steps in addition to the current input character.
for i, (input_idx, target_idx) in enumerate(zip(input_example[:5], target_example[:5])):
    print("Step {:4d}".format(i))
    print(" input: {} ({:s})".format(input_idx, repr(idx2char[input_idx])))
    print(" expected output: {} ({:s})".format(target_idx, repr(idx2char[target_idx])))
Step 0
input: 18 ('F')
expected output: 47 ('i')
Step 1
input: 47 ('i')
expected output: 56 ('r')
Step 2
input: 56 ('r')
expected output: 57 ('s')
Step 3
input: 57 ('s')
expected output: 58 ('t')
Step 4
input: 58 ('t')
expected output: 1 (' ')
TensorFlow's tf.data can be used to split the text into more manageable sequences. But first, the data needs to be shuffled and packed into batches.
# Batch size
BATCH_SIZE = 64
# Buffer size to shuffle the dataset
BUFFER_SIZE = 10000
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
dataset
<BatchDataset shapes: ((64, 100), (64, 100)), types: (tf.int64, tf.int64)>
Building the Model
Finally, we can build the model. Let's first define some important variables:
# Length of the vocabulary in chars
vocab_size = len(vocab)
# The embedding dimension
embedding_dim = 256
# Number of RNN units
rnn_units = 1024
The model will have an embedding (input) layer that maps each character's numerical index to a vector of embedding_dim dimensions. It will have a GRU layer (which could be replaced with an LSTM layer) of size units = rnn_units. Finally, the output layer is a standard fully connected layer with vocab_size outputs.
The function below helps us create the model quickly and cleanly.
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units,
                            return_sequences=True,
                            stateful=True,
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model
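As noted above, the GRU can be swapped for an LSTM. A minimal sketch of that variant, using the standard Keras LSTM layer (the name build_lstm_model is just for illustration, not from the original):
def build_lstm_model(vocab_size, embedding_dim, rnn_units, batch_size):
    # Identical to build_model, except the recurrent layer is an LSTM instead of a GRU
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.LSTM(rnn_units,
                             return_sequences=True,
                             stateful=True,
                             recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model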
Assemble the model architecture by calling build_model:
model = build_model(
    vocab_size = len(vocab),
    embedding_dim=embedding_dim,
    rnn_units=rnn_units,
    batch_size=BATCH_SIZE)
Let's summarize the model to see how many parameters it has.
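The summary call itself is omitted in the original; the output below is presumably produced by:
model.summary()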
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding (Embedding) (64, None, 256) 16640
gru (GRU) (64, None, 1024) 3938304
dense (Dense) (64, None, 65) 66625
Total params: 4,021,569
Trainable params: 4,021,569
Non-trainable params: 0
Four million parameters! We'll want to train it for a while.
Compiling the Model
The problem can now be treated as a classification problem: given the previous RNN state and the input at this time step, predict the class representing the next character. We therefore attach a sparse categorical cross-entropy loss function and the Adam optimizer.
def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

# Run one batch through the untrained model so the example loss below can be computed
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)

example_batch_loss = loss(target_example_batch, example_batch_predictions)
print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
print("scalar_loss: ", example_batch_loss.numpy().mean())

model.compile(optimizer='adam', loss=loss)
Prediction shape: (64, 100, 65) # (batch_size, sequence_length, vocab_size)
scalar_loss: 4.1746616
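As a sanity check (an observation added here, not in the original): the loss of an untrained model should be close to the natural log of the vocabulary size, since its predictions start out roughly uniform over the 65 characters.
print(np.log(65))  # ~4.174, which matches the scalar_loss printed above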
Configuring Checkpoints
Model training, especially on a large dataset like Shakespeare's plays, takes a long time. Ideally, we don't want to retrain the model every time we make a prediction. The tf.keras.callbacks.ModelCheckpoint callback saves the weights at checkpoints during training to files, from which they can later be loaded into a fresh model. This is also handy if training is interrupted for any reason.
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")
checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)
Finally, run the training:
EPOCHS=30
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])
This should take about six hours. For less impressive but faster results, the number of epochs can be reduced to 10 (anything under 5 will produce complete garbage).
Generating Text
Restore the weights from the latest checkpoint:
tf.train.latest_checkpoint(checkpoint_dir)
With these weights we can rebuild the model, this time with a batch size of 1:
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))
The steps for generating text are:
1. Choose a seed string, initialize the RNN state, and set the number of characters to generate.
2. Get the prediction distribution for the next character using the start string and the RNN state.
3. Use a categorical distribution to compute the index of the predicted character, and use it as the next input to the model.
4. The RNN state returned by the model is fed back into the model.
5. Repeat steps 2 to 4 until the text is generated.

def generate_text(model, start_string):
    # Evaluation step (generating text using the learned model)

    # Number of characters to generate
    num_generate = 1000

    # Converting our start string to numbers (vectorizing)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)

    # Empty string to store our results
    text_generated = []

    # Low temperatures result in more predictable text.
    # Higher temperatures result in more surprising text.
    # Experiment to find the best setting.
    temperature = 1.0

    # Here batch size == 1
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)

        # using a categorical distribution to predict the character returned by the model
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

        # We pass the predicted character as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)

        text_generated.append(idx2char[predicted_id])

    return (start_string + ''.join(text_generated))
Finally, given a start string, we can generate some interesting text.
Now, enjoy two plays written by the RNN: one trained for 10 epochs and one trained for 30.
Here is the 10-epoch model's output:
print(generate_text(model, start_string=u"ROMEO: "))
ROMEO: how I, away too put That you shall have thieffort, are but love.
JULIET: Go, fight, sir: we say 'Ay,' and alack to stand and not to go to; And washt us him to-domm. Ay, my ows young; a man hear from his monsher to thee.
KING RICHARD III: Come, cease. O broteld the costime’s deforment! Thou wilt was quite.
PAULINA: I would you say the hour! Ah, hole for your company: But, good my lord; we have a king, of peace?
BALTHASAR: Cadul and washee could he ha! To curit her I may wench.
GLOUCESTER: Had you here shall such a pierce to temper; Or might his noble offery owe and speed Which seemest thy trims in a weaky amidude By this to the dother, dods citizens.
Third Citizen:Madam sweet give reward, rebeire them With news gone! Pluck yielding: ’tis sign out things Within risess in strifes all ten times, To dish his finmers for briefily.
JULIET:Gentlemen, God eveI come approbouting his wife as it, — triumphrous night change you gods, thou goest:To which will dispersed and France.
Wow! After only 10 epochs, the grasp of language is impressive. The spelling is questionable, but there is clear dramatic conflict. The writing can certainly be improved; hopefully the 30-epoch model does better.
Here is the 30-epoch model's output.
Enjoy a work created entirely by the RNN, character by character!
BRUTUS:Could you be atherveshed him, our two,But much a tale lendly fear;For which we in thy shade of Naples.Here’s no increase False to’t, offorit is the war of white give again.This is the queen, whose vanoar’s head is worthly.But cere it be a witch, some comfort.What, nurse, I say!Go Hamell.
FLORIZEL:Should she kneel be?In shall not weep received; unleased meAnd unrespective greeting than dwell in, thee,look'd on me, son in heavenly properly,That ever you are my father is but straing;Unless you would repossess him, hath always louded up,You provokest. Good faith, o'erlar I can repart the heavens like deeds dillsFor temper as soon as another maiden here, and he is bann'd upon which springs;O'er most upon your voysus, I have no thunder; and my good villain!Alest each other's sleepings.A fool; if this business prating dutyDoes these traitors other sorrow.
LUCENTIO:Tell me, they’s honourably.
Shepherd:I know, my lord, to London, and you my moved join under him,Great Apollo's stan to make a book,Both yet my father away towards Covent. Tut, And thou still'd by the earthmen lord r sensible your mother?
Servant:Go, vill! We muster yet, for you'll not: you are took good mad within your company in rage, I would you fight it so, his eye for every days,To swear the beam of such a detects,To Clarence dead to call upon you all I thank your grace, my father and my father, and yourself prevailsMy father, hath a sword for hither;Nor when thy heart is grown grave done.
QUEEN MARGARET: Thou art a lodging very good and give thanksWith him.But There is now in hand:Therefore it be possish'd with Romeo dead.
MENENIUS:Ha! little very welcome to my daughter's sword,Which haply my prayer's legs, such as he does.I am banks, sir, I'll make you say 'nough; for hither so better now to be so, sent it: it is stranger.
Wow! Interestingly, the model has even learned to rhyme in some cases (especially in Florizel's lines). Imagine what the RNN could write after 50 or even 100 epochs!
So, I guess AI will put writers out of work?
Not exactly. But I can imagine a future where AI publishes plenty of articles engineered to go viral. Here's a challenge: collect top articles on a topic, say from Human Parts or a similar publication, then train the AI to write popular pieces. Publish the RNN's output verbatim and see how it does! A caveat: I wouldn't recommend training an RNN on more specialized publications like Towards Data Science or Better Programming, because that requires technical knowledge the RNN cannot learn in a reasonable amount of time. More philosophical and non-technical writing, however, is within the RNN's current abilities.
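For reference, a minimal sketch of how you might point the same pipeline at your own corpus; my_articles.txt is a hypothetical file you would assemble yourself, and everything downstream of the text variable stays the same:
# Hypothetical corpus of articles you have collected; swap in your own file
text = open('my_articles.txt', 'rb').read().decode(encoding='utf-8')
# From here, the vocabulary, dataset, and model-building steps above apply unchanged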
As text generation becomes more advanced, it has the potential to write better than humans, because it will have an eye for which content goes viral, which phrasing makes readers feel good, and so on. It's striking that one day machines may beat humans at the thing we do best: writing. Granted, they won't truly understand what they are writing, but they will master the way humans communicate.
I suppose if you can't beat them, join them!