TensorFlow (8): Converting an h5 File to a pb File and Deploying the Model with tensorflow/serving

2021-02-20 NLP奇幻之旅

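In the article "NLP (34): Sequence Labeling with keras-bert", we used Keras and keras-bert for model training, evaluation, and prediction. We trained on the People's Daily entity dataset and saved the model as example.h5; h5 is one of the file formats Keras uses to save models.
In the article "Keras Basics (7): Building a Model Prediction Service with Flask + keras-bert", we also showed how to serve predictions over HTTP using Flask and the example.h5 file.
This article shows how to convert the h5 file to a pb file and deploy the model with tensorflow/serving.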

Converting the h5 file to a pb file

The GitHub project keras_to_tensorflow (https://github.com/amir-abdi/keras_to_tensorflow) describes how to convert an ordinary Keras model into a TensorFlow model. Building on it, this article uses a slightly modified conversion script (change_keras_h5_file_to_pb_models.py):

# -*- coding: utf-8 -*-
import os
import tensorflow as tf
from tensorflow.python.framework import graph_util
from tensorflow.python.framework import graph_io
from pathlib import Path
from absl import app
from absl import flags
from absl import logging
import keras
from keras import backend as K
from keras.models import model_from_json

from keras_bert import get_custom_objects
from keras_contrib.layers import CRF
from keras_contrib.losses import crf_loss
from keras_contrib.metrics import crf_accuracy

custom_objects = get_custom_objects()
for key, value in {'CRF': CRF, 'crf_loss': crf_loss, 'crf_accuracy': crf_accuracy}.items():
    custom_objects[key] = value

K.set_learning_phase(0)
FLAGS = flags.FLAGS

flags.DEFINE_string('input_model', "../example_ner.h5", 'Path to the input model.')
flags.DEFINE_string('input_model_json', None, 'Path to the input model '
                                              'architecture in json format.')
flags.DEFINE_string('output_model', "./example_ner.pb", 'Path where the converted model will '
                                          'be stored.')
flags.DEFINE_boolean('save_graph_def', False,
                     'Whether to save the graphdef.pbtxt file which contains '
                     'the graph definition in ASCII format.')
flags.DEFINE_string('output_nodes_prefix', None,
                    'If set, the output nodes will be renamed to '
                    '`output_nodes_prefix`+i, where `i` will numerate the '
                    'number of output nodes of the network.')
flags.DEFINE_boolean('quantize', False,
                     'If set, the resultant TensorFlow graph weights will be '
                     'converted from float into eight-bit equivalents. See '
                     'documentation here: '
                     'https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms')
flags.DEFINE_boolean('channels_first', False,
                     'Whether channels are the first dimension of a tensor. '
                     'The default is TensorFlow behaviour where channels are '
                     'the last dimension.')
flags.DEFINE_boolean('output_meta_ckpt', False,
                     'If set to True, exports the model as .meta, .index, and '
                     '.data files, with a checkpoint file. These can be later '
                     'loaded in TensorFlow to continue training.')

flags.mark_flag_as_required('input_model')
flags.mark_flag_as_required('output_model')

def load_model(input_model_path, input_json_path):
    if not Path(input_model_path).exists():
        raise FileNotFoundError(
            'Model file `{}` does not exist.'.format(input_model_path))
    try:
        # The line below was modified: to load a plain Keras model again, remove custom_objects
        model = keras.models.load_model(input_model_path, custom_objects=custom_objects)
        return model
    except FileNotFoundError as err:
        logging.error('Input model file (%s) does not exist.', FLAGS.input_model)
        raise err
    except ValueError as wrong_file_err:
        if input_json_path:
            if not Path(input_json_path).exists():
                raise FileNotFoundError(
                    'Model description json file `{}` does not exist.'.format(
                        input_json_path))
            try:
                model = model_from_json(open(str(input_json_path)).read())
                model.load_weights(input_model_path)
                return model
            except Exception as err:
                logging.error("Couldn't load model from json.")
                raise err
        else:
            logging.error(
                'Input file specified only holds the weights, and not '
                'the model definition. Save the model using '
                'model.save(filename.h5) which will contain the network '
                'architecture as well as its weights. If the model is '
                'saved using model.save_weights(filename), the flag '
                'input_model_json should also be set to the '
                'architecture which is exported separately in a '
                'json format. Check the keras documentation for more details '
                '(https://keras.io/getting-started/faq/)')
            raise wrong_file_err

def main(args):
    logging.info("begin====================================================")
    # If output_model path is relative and in cwd, make it absolute from root
    output_model = FLAGS.output_model
    if str(Path(output_model).parent) == '.':
        output_model = str((Path.cwd() / output_model))

    output_fld = Path(output_model).parent
    output_model_name = Path(output_model).name
    output_model_stem = Path(output_model).stem
    output_model_pbtxt_name = output_model_stem + '.pbtxt'

    # Create output directory if it does not exist
    # print (Path(output_model).parent)
    if not os.path.exists(str(Path(output_model).parent)):
        Path(output_model).parent.mkdir(parents=True)

    if FLAGS.channels_first:
        K.set_image_data_format('channels_first')
    else:
        K.set_image_data_format('channels_last')

    model = load_model(FLAGS.input_model, FLAGS.input_model_json)

    input_node_names = [node.op.name for node in model.inputs]
    logging.info('Input nodes names are: %s', str(input_node_names))

    # TODO(amirabdi): Support networks with multiple inputs
    orig_output_node_names = [node.op.name for node in model.outputs]
    if FLAGS.output_nodes_prefix:  # number the output nodes
        num_output = len(orig_output_node_names)
        pred = [None] * num_output
        converted_output_node_names = [None] * num_output

        # Create dummy tf nodes to rename output
        for i in range(num_output):
            converted_output_node_names[i] = '{}{}'.format(
                FLAGS.output_nodes_prefix, i)
            pred[i] = tf.identity(model.outputs[i],
                                  name=converted_output_node_names[i])
    else:
        converted_output_node_names = orig_output_node_names
    logging.info('Converted output node names are: %s',
                 str(converted_output_node_names))

    sess = K.get_session()
    if FLAGS.output_meta_ckpt:  # allow the converted model to be trained further
        saver = tf.train.Saver()
        saver.save(sess, str(output_fld / output_model_stem))

    if FLAGS.save_graph_def:  # store the graph definition in ASCII form
        tf.train.write_graph(sess.graph.as_graph_def(), str(output_fld),
                             output_model_pbtxt_name, as_text=True)
        logging.info('Saved the graph definition in ascii format at %s',
                     str(Path(output_fld) / output_model_pbtxt_name))

    if FLAGS.quantize:  # convert the weights from float to eight-bit
        from tensorflow.tools.graph_transforms import TransformGraph
        transforms = ["quantize_weights", "quantize_nodes"]
        transformed_graph_def = TransformGraph(sess.graph.as_graph_def(), [],
                                               converted_output_node_names,
                                               transforms)
        constant_graph = graph_util.convert_variables_to_constants(
            sess,
            transformed_graph_def,
            converted_output_node_names)
    else:  # store the weights as float
        constant_graph = graph_util.convert_variables_to_constants(
            sess,
            sess.graph.as_graph_def(),
            converted_output_node_names)

    graph_io.write_graph(constant_graph, str(output_fld), output_model_name,
                         as_text=False)
    logging.info('Saved the frozen graph at %s',
                 str(Path(output_fld) / output_model_name))

if __name__ == "__main__":
    app.run(main)

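In this script, input_model (the input model) defaults to ../example_ner.h5 and output_model (the output model) defaults to ./example_ner.pb, and the script loads example_ner.h5, the model trained with keras-bert. Running it produces example_ner.pb in the current directory. Unfortunately, this one script is not enough to deploy the model on tensorflow/serving.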
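
To see why the default output lands in the current directory, the path handling at the top of main can be reproduced in isolation. This is a minimal sketch using only pathlib; the helper name resolve_output_paths is hypothetical and is not part of the script above.

```python
from pathlib import Path


def resolve_output_paths(output_model):
    """Mirror the path handling in main(); the helper name is hypothetical."""
    # A bare or './'-prefixed file name has parent '.', so anchor it at cwd.
    if str(Path(output_model).parent) == '.':
        output_model = str(Path.cwd() / output_model)
    path = Path(output_model)
    return {
        'folder': path.parent,               # where the frozen graph is written
        'pb_name': path.name,                # e.g. example_ner.pb
        'pbtxt_name': path.stem + '.pbtxt',  # used only with --save_graph_def
    }


paths = resolve_output_paths('./example_ner.pb')
print(paths['folder'], paths['pb_name'], paths['pbtxt_name'])
```

A path with an explicit folder, such as out/model.pb, is left relative; only names whose parent is the current directory are made absolute.
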
The next step is to convert the generated pb file into the file format that tensorflow/serving supports. The conversion script (get_tf_serving_file.py) is:

# -*- coding: utf-8 -*-
import os
import tensorflow as tf
from tensorflow.python.saved_model import signature_constants
from tensorflow.python.saved_model import tag_constants
from keras_bert import get_custom_objects
from keras_contrib.layers import CRF
from keras_contrib.losses import crf_loss
from keras_contrib.metrics import crf_accuracy
from keras.models import load_model

custom_objects = get_custom_objects()
for key, value in {'CRF': CRF, 'crf_loss': crf_loss, 'crf_accuracy': crf_accuracy}.items():
    custom_objects[key] = value

export_dir = '../example_ner/1'
graph_pb = './example_ner.pb'
model = load_model('../example_ner.h5', custom_objects=custom_objects)

builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
with tf.gfile.GFile(graph_pb, "rb") as f:
    graph_def = tf.GraphDef()
    # Parse while the file is still open
    graph_def.ParseFromString(f.read())

sigs = {}
with tf.Session(graph=tf.Graph()) as sess:
    tf.import_graph_def(graph_def, name="")

    g = tf.get_default_graph()

    sigs[signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY] = \
        tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={"input_1": g.get_operation_by_name('input_1').outputs[0], "input_2": g.get_operation_by_name('input_2').outputs[0]},
            outputs={"output": g.get_operation_by_name('crf_1/one_hot').outputs[0]}
        )

    builder.add_meta_graph_and_variables(sess,
                                        [tag_constants.SERVING],
                                        signature_def_map = sigs)

    builder.save()

Running the script creates the example_ner/1 folder in the parent directory, with the following structure:

example_ner
└── 1
    ├── saved_model.pb
    └── variables

2 directories, 1 file

At this point we have the model in the file format that tensorflow/serving supports for deployment.
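
tensorflow/serving looks for numeric version subdirectories under the model's base path, each containing a saved_model.pb. As a hedged sketch, the helper below (find_servable_versions is a hypothetical name, not part of TensorFlow) checks that layout with the standard library only. It is exercised here against a temporary directory that imitates the structure above, not against the real export.

```python
import tempfile
from pathlib import Path


def find_servable_versions(model_dir):
    """Return version numbers whose folder contains saved_model.pb.

    Hypothetical helper: it only inspects the directory layout and does
    not parse the SavedModel itself.
    """
    versions = []
    for child in Path(model_dir).iterdir():
        # Version folders are expected to have purely numeric names.
        if child.is_dir() and child.name.isdigit():
            if (child / 'saved_model.pb').is_file():
                versions.append(int(child.name))
    return sorted(versions)


# Build a throwaway copy of the layout shown above and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    version_dir = Path(tmp) / 'example_ner' / '1'
    (version_dir / 'variables').mkdir(parents=True)
    (version_dir / 'saved_model.pb').write_bytes(b'')
    found = find_servable_versions(Path(tmp) / 'example_ner')
    print(found)  # [1]
```

Running the same function on ../example_ner before starting the container is a cheap way to catch a wrong export path.
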

Deploying the model with tensorflow/serving

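The earlier articles "tensorflow (5): Converting ckpt to a pb File and Using tensorflow/serving for Model Deployment and Prediction", "tensorflow (6): Model Deployment and Prediction with tensorflow/serving", and "tensorflow (7): Deploying a BERT Model with tensorflow/serving" already explain how to use tensorflow/serving in some detail, so that material is not repeated here.
The command to deploy the example_ner model with tensorflow/serving is: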

docker run -t --rm -p 8561:8501 -v "$path/example_ner:/models/example_ner" -e MODEL_NAME=example_ner tensorflow/serving:1.14.0
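
The -p 8561:8501 mapping exposes the container's REST port 8501 on host port 8561, so clients address port 8561. The sketch below shows how the REST endpoints for this model are composed; the helper name serving_urls and the localhost host are assumptions for illustration.

```python
def serving_urls(host, port, model_name):
    """Compose tensorflow/serving REST URLs (helper name is hypothetical)."""
    base = 'http://{}:{}/v1/models/{}'.format(host, port, model_name)
    return {
        'status': base,                  # GET: model version status
        'metadata': base + '/metadata',  # GET: signature definitions
        'predict': base + ':predict',    # POST: run the serving signature
    }


# 'localhost' is an assumed host; 8561 is the host port from the docker command.
urls = serving_urls('localhost', 8561, 'example_ner')
for name, url in urls.items():
    print(name, url)
```

A GET request to the status URL is a quick way to confirm that the model has loaded before sending predictions.
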

The client script that calls the model is:

# -*- coding: utf-8 -*-
import json
import requests
import numpy as np
from pprint import pprint
from keras_bert import Tokenizer

# Read the label2id dictionary
with open("../example_label2id.json", "r", encoding="utf-8") as h:
    label_id_dict = json.loads(h.read())

id_label_dict = {v: k for k, v in label_id_dict.items()}

# Load the BERT vocabulary
dict_path = '../chinese_L-12_H-768_A-12/vocab.txt'
token_dict = {}
with open(dict_path, 'r', encoding='utf-8') as reader:
    for line in reader:
        token = line.strip()
        token_dict[token] = len(token_dict)

class OurTokenizer(Tokenizer):
    def _tokenize(self, text):
        R = []
        for c in text:
            if c in self._token_dict:
                R.append(c)
            else:
                R.append('[UNK]')
        return R

# Convert a BIO tag sequence to JSON format
def bio_to_json(string, tags):
    item = {"string": string, "entities": []}
    entity_name = ""
    entity_start = 0
    iCount = 0
    entity_tag = ""

    for c_idx in range(min(len(string), len(tags))):
        c, tag = string[c_idx], tags[c_idx]
        if c_idx < len(tags)-1:
            tag_next = tags[c_idx+1]
        else:
            tag_next = ''

        if tag[0] == 'B':
            entity_tag = tag[2:]
            entity_name = c
            entity_start = iCount
            if tag_next[2:] != entity_tag:
                item["entities"].append({"word": c, "start": iCount, "end": iCount + 1, "type": tag[2:]})
        elif tag[0] == "I":
            # An I- tag that does not continue the previous entity type is treated as O
            if tag[2:] != tags[c_idx-1][2:]:
                tags[c_idx] = 'O'
            else:
                entity_name = entity_name + c
                if tag_next[2:] != entity_tag:
                    item["entities"].append({"word": entity_name, "start": entity_start, "end": iCount + 1, "type": entity_tag})
                    entity_name = ''
        iCount += 1
    return item

tokenizer = OurTokenizer(token_dict)

# Test the HTTP service with a sample sentence
sentence = "井上雄彥的《灌籃高手》是一部作品，也是他自己的修行之路。"
token_ids, segment_ids = tokenizer.encode(sentence, max_len=128)
tensor = {"instances": [{"input_1": token_ids, "input_2": segment_ids}]}

url = "http://192.168.1.193:8561/v1/models/example_ner:predict"
req = requests.post(url, json=tensor)
if req.status_code == 200:
    t = np.asarray(req.json()['predictions'][0]).argmax(axis=1)
    tags = [id_label_dict[_] for _ in t]
    pprint(bio_to_json(sentence, tags[1:-1]))

The output is:

{'entities': [{'end': 4, 'start': 0, 'type': 'PER', 'word': '井上雄彥'}],
'string': '井上雄彥的《灌籃高手》是一部作品，也是他自己的修行之路。'}
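
The decoding step of the client can be exercised without a running server. The sketch below assumes that the server returns, for each instance, one score vector per token (as the argmax(axis=1) call in the script implies). The three-label mapping and the response are made up for illustration and do not come from the actual model; the sequence is also left unpadded for brevity.

```python
def decode_tags(prediction, id_label):
    """Pick the highest-scoring label per token (pure-Python argmax)."""
    tags = []
    for scores in prediction:
        best = max(range(len(scores)), key=lambda i: scores[i])
        tags.append(id_label[best])
    return tags


# Hypothetical label map and response, for illustration only.
id_label = {0: 'O', 1: 'B-PER', 2: 'I-PER'}
fake_response = {'predictions': [[
    [1, 0, 0],  # [CLS]
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 0],
    [1, 0, 0],  # [SEP]
]]}

tags = decode_tags(fake_response['predictions'][0], id_label)
print(tags[1:-1])  # drop the first and last positions, as the client does
```

The resulting tag list is what bio_to_json receives together with the original sentence.
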

Summary

All scripts demonstrated in this article have been uploaded to https://github.com/percent4/keras_bert_sequence_labeling/tree/master/h5_2_tensorflow_serving .
Written on January 16, 2021, in Pudong, Shanghai
