Automatic CAPTCHA Recognition with TensorFlow (Part 2)

2021-12-25 阿里雲應急響應 (Alibaba Cloud Emergency Response)

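0x000 Preface

The previous article, "Automatic CAPTCHA Recognition with TensorFlow (Part 1)", gave a brief walkthrough of recognizing CAPTCHAs with TensorFlow, along with the code for it.
This article modifies that code so that it can crack the CAPTCHAs used by mainstream CMSs.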

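0x001 Cracking steps

First, recall the steps involved in recognizing CAPTCHAs with TensorFlow.

The last three steps are handled almost entirely by TensorFlow,
so our main work lies in the first two. The process therefore comes down to the following steps: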

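Find the CAPTCHA module in the open-source system

Modify and test the CAPTCHA module

Adapt the sampling code to the CAPTCHA module

Adjust the parameters of the recognition model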

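0x002 Finding the CAPTCHA module in the open-source system

Start by picking the CMS you want to crack (whether it is open source does not matter; what matters is that you have the source code).
Here I use XXXCMS (=。= the name is redacted, use your imagination).
Try logging in as the administrator first. Sure enough, there is a CAPTCHA.

Open an editor and locate the class that generates the CAPTCHA, checkcode.class.php: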

<?php
/**
 * Generate a CAPTCHA
 * @author chenzhouyu
 * Usage:
 * $checkcode = new checkcode();
 * $checkcode->doimage();
 * // fetch the code
 * $_SESSION['code']=$checkcode->get_code();
 */
class checkcode {
    // CAPTCHA width
    public $width=130;
    // CAPTCHA height
    public $height=50;
    // path to the font file
    private $font;
    // font color
    public $font_color;
    // characters the random code is drawn from
    public $charset = 'abcdefghkmnprstuvwyzABCDEFGHKLMNPRSTUVWYZ23456789';
    // background color
    public $background = '#EDF7FF';
    // number of characters in the code
    public $code_len = 4;
    // font size
    public $font_size = 20;
    // the code itself
    private $code;
    // image resource
    private $img;
    // x position where the text starts
    private $x_start;

    function __construct() {
        // One of two fonts is picked at random
        $rand = rand(0,1);
        if($rand==0) {
            $this->font = PC_PATH.'libs'.DIRECTORY_SEPARATOR.'data'.DIRECTORY_SEPARATOR.'font'.DIRECTORY_SEPARATOR.'elephant.ttf';
        } else {
            $this->font = PC_PATH.'libs'.DIRECTORY_SEPARATOR.'data'.DIRECTORY_SEPARATOR.'font'.DIRECTORY_SEPARATOR.'Vineta.ttf';
        }
    }

    /**
     * Generate a random code.
     */
    protected function creat_code() {
        $code = '';
        $charset_len = strlen($this->charset)-1;
        for ($i=0; $i<$this->code_len; $i++) {
            // Note: rand() starts at 1 here, so index 0 ('a') is never picked
            $code .= $this->charset[rand(1, $charset_len)];
        }
        $this->code = $code;
    }

    /**
     * Get the code (lowercased)
     */
    public function get_code() {
        return strtolower($this->code);
    }

    /**
     * Generate the image
     */
    public function doimage() {
        $code = $this->creat_code();
        $this->img = imagecreatetruecolor($this->width, $this->height);
        if (!$this->font_color) {
            $this->font_color = imagecolorallocate($this->img, rand(0,156), rand(0,156), rand(0,156));
        } else {
            $this->font_color = imagecolorallocate($this->img, hexdec(substr($this->font_color, 1,2)), hexdec(substr($this->font_color, 3,2)), hexdec(substr($this->font_color, 5,2)));
        }
        // set the background color
        $background = imagecolorallocate($this->img,hexdec(substr($this->background, 1,2)),hexdec(substr($this->background, 3,2)),hexdec(substr($this->background, 5,2)));
        // draw a filled rectangle as the background
        imagefilledrectangle($this->img,0, $this->height, $this->width, 0, $background);
        $this->creat_font();
        $this->creat_line();
        $this->output();
    }

    /**
     * Draw the text
     */
    private function creat_font() {
        $x = $this->width/$this->code_len;
        for ($i=0; $i<$this->code_len; $i++) {
            // each character gets a random rotation (-30..30 degrees) and a small random x offset
            imagettftext($this->img, $this->font_size, rand(-30,30), $x*$i+rand(0,5), $this->height/1.4, $this->font_color, $this->font, $this->code[$i]);
            if($i==0)$this->x_start=$x*$i+5;
        }
    }

    /**
     * Draw the interference lines (two arcs)
     */
    private function creat_line() {
        imagesetthickness($this->img, 3);
        $xpos = ($this->font_size * 2) + rand(-5, 5);
        $width = $this->width / 2.66 + rand(3, 10);
        $height = $this->font_size * 2.14;
        if ( rand(0,100) % 2 == 0 ) {
            $start = rand(0,66);
            $ypos = $this->height / 2 - rand(10, 30);
            $xpos += rand(5, 15);
        } else {
            $start = rand(180, 246);
            $ypos = $this->height / 2 + rand(10, 30);
        }
        $end = $start + rand(75, 110);
        imagearc($this->img, $xpos, $ypos, $width, $height, $start, $end, $this->font_color);
        if ( rand(1,75) % 2 == 0 ) {
            $start = rand(45, 111);
            $ypos = $this->height / 2 - rand(10, 30);
            $xpos += rand(5, 15);
        } else {
            $start = rand(200, 250);
            $ypos = $this->height / 2 + rand(10, 30);
        }
        $end = $start + rand(75, 100);
        imagearc($this->img, $this->width * .75, $ypos, $width, $height, $start, $end, $this->font_color);
    }

    /**
     * Output the image
     */
    private function output() {
        header("content-type:image/png\r\n");
        imagepng($this->img);
        imagedestroy($this->img);
    }
}

That largely completes the preparation. Next comes modifying and testing the CAPTCHA module.

0x003 Modifying and testing the CAPTCHA module

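The system generates its CAPTCHAs randomly, so we cannot control them.
We need to rework the code above into something of the form
create_img.php?code=XXXX
so that the Python code from last time can generate the parameter at random
and thereby control which CAPTCHA is rendered, which is how we produce labeled samples.
Note that this system uses two different fonts to render its CAPTCHAs.
To make recognition easier, we drop one of them here.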
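On the Python side, driving the generator through the code parameter might look like the minimal sketch below. The character set and code length are copied from the checkcode class, and the base URL is the local address used in this article; the helper names random_code and sample_url are made up for illustration.

```python
import random
from urllib.parse import urlencode

# Character set and length copied from the checkcode class.
CHARSET = 'abcdefghkmnprstuvwyzABCDEFGHKLMNPRSTUVWYZ23456789'
CODE_LEN = 4
# Local address used in this article; adjust to wherever create_img.php is served.
BASE_URL = 'http://127.0.0.1:8080/xxxcms/create_img.php'


def random_code(rng=random):
    """Pick CODE_LEN distinct characters from CHARSET (sampling without replacement)."""
    return ''.join(rng.sample(CHARSET, CODE_LEN))


def sample_url(code):
    """Build the request URL whose image should show exactly `code`."""
    return BASE_URL + '?' + urlencode({'code': code})


if __name__ == '__main__':
    code = random_code()
    print(code, sample_url(code))
```

random.sample never repeats a character within one code, whereas the PHP class draws each character independently. If repeated characters should also appear in the training samples, random.choices(CHARSET, k=CODE_LEN) is one option.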

Save the modified version as create_img.php:

<?php
class checkcode
{
    // CAPTCHA width
    public $width = 130;
    // CAPTCHA height
    public $height = 50;
    // path to the font file
    private $font;
    // font color
    public $font_color;
    // characters the random code is drawn from
    public $charset = 'abcdefghkmnprstuvwyzABCDEFGHKLMNPRSTUVWYZ23456789';
    // background color
    public $background = '#EDF7FF';
    // number of characters in the code
    public $code_len = 4;
    // font size
    public $font_size = 20;
    // the code itself
    private $code;
    // image resource
    private $img;
    // x position where the text starts
    private $x_start;

    function __construct()
    {
        // Only one font is kept, to make recognition easier
        $this->font = './font/elephant.ttf';
    }

    /**
     * Take the code from the query string instead of generating it randomly.
     */
    protected function creat_code()
    {
        $this->code = isset($_GET['code']) ? (string) $_GET['code'] : '';
    }

    /**
     * Get the code (lowercased)
     */
    public function get_code()
    {
        return strtolower($this->code);
    }

    /**
     * Generate the image
     */
    public function doimage()
    {
        $code = $this->creat_code();
        $this->img = imagecreatetruecolor($this->width, $this->height);
        if (!$this->font_color) {
            $this->font_color = imagecolorallocate($this->img, rand(0, 156), rand(0, 156), rand(0, 156));
        } else {
            $this->font_color = imagecolorallocate($this->img, hexdec(substr($this->font_color, 1, 2)), hexdec(substr($this->font_color, 3, 2)), hexdec(substr($this->font_color, 5, 2)));
        }
        // set the background color
        $background = imagecolorallocate($this->img, hexdec(substr($this->background, 1, 2)), hexdec(substr($this->background, 3, 2)), hexdec(substr($this->background, 5, 2)));
        // draw a filled rectangle as the background
        imagefilledrectangle($this->img, 0, $this->height, $this->width, 0, $background);
        $this->creat_font();
        $this->creat_line();
        $this->output();
    }

    /**
     * Draw the text
     */
    private function creat_font()
    {
        $x = $this->width / $this->code_len;
        for ($i = 0; $i < $this->code_len; $i++) {
            imagettftext($this->img, $this->font_size, rand(-30, 30), $x * $i + rand(0, 5), $this->height / 1.4, $this->font_color, $this->font, $this->code[$i]);
            if ($i == 0) $this->x_start = $x * $i + 5;
        }
    }

    /**
     * Draw the interference lines (two arcs)
     */
    private function creat_line()
    {
        imagesetthickness($this->img, 3);
        $xpos = ($this->font_size * 2) + rand(-5, 5);
        $width = $this->width / 2.66 + rand(3, 10);
        $height = $this->font_size * 2.14;
        if (rand(0, 100) % 2 == 0) {
            $start = rand(0, 66);
            $ypos = $this->height / 2 - rand(10, 30);
            $xpos += rand(5, 15);
        } else {
            $start = rand(180, 246);
            $ypos = $this->height / 2 + rand(10, 30);
        }
        $end = $start + rand(75, 110);
        imagearc($this->img, $xpos, $ypos, $width, $height, $start, $end, $this->font_color);
        if (rand(1, 75) % 2 == 0) {
            $start = rand(45, 111);
            $ypos = $this->height / 2 - rand(10, 30);
            $xpos += rand(5, 15);
        } else {
            $start = rand(200, 250);
            $ypos = $this->height / 2 + rand(10, 30);
        }
        $end = $start + rand(75, 100);
        imagearc($this->img, $this->width * .75, $ypos, $width, $height, $start, $end, $this->font_color);
    }

    /**
     * Output the image
     */
    private function output()
    {
        // No trailing "\r\n": current PHP versions reject header values that contain newlines
        header('Content-Type: image/png');
        imagepng($this->img);
        imagedestroy($this->img);
    }
}

$checkcode = new checkcode();
$checkcode->doimage();

Next, write test.py to test it:

import requests as req
from PIL import Image
from io import BytesIO
import numpy as np

# Ask the modified generator for an image of a known code
response = req.get('http://127.0.0.1:8080/xxxcms/create_img.php?code=1234')
image = Image.open(BytesIO(response.content))
gray = image.convert('L')  # convert to grayscale
gray = gray.point(lambda x: 0 if x < 128 else 255, '1')  # binarize to remove noise
gray.show()
img = np.array(gray.getdata())  # convert to an array
print(img)

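Run python test.py

If the array is printed to the console and a black-and-white image opens,
the CAPTCHA side is ready.

0x004 Adapting the sampling code to the CAPTCHA module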
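To make the thresholding step in test.py concrete, here is a dependency-free sketch of the same rule (values below 128 become 0, everything else becomes 255) applied to a small, made-up grayscale patch. The helper names binarize and to_rows and the patch values are illustrative only.

```python
def binarize(pixels, threshold=128):
    """Map each grayscale value (0-255) to 0 or 255, using the same rule as the point() call in test.py."""
    return [0 if p < threshold else 255 for p in pixels]


def to_rows(flat, width):
    """Split a flat pixel list into rows of `width` pixels."""
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# A made-up 4x2 grayscale patch: dark strokes on a light background.
patch = [237, 40, 200, 130,
         90, 247, 127, 128]

rows = to_rows(binarize(patch), width=4)
for row in rows:
    print(row)
```

The threshold of 128 comes from test.py. Whether it separates strokes from background cleanly depends on the font color the generator picks, so it may need tuning for other CAPTCHAs.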

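A few parameters matter most:

The CAPTCHA character set (the "generation factor", $charset)

The CAPTCHA width and height

The number of characters in the CAPTCHA

In the class above, these parameters have the following values, in order: $charset = 'abcdefghkmnprstuvwyzABCDEFGHKLMNPRSTUVWYZ23456789', $width = 130 and $height = 50, $code_len = 4.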
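As a quick sanity check on these parameters, the sketch below derives the sizes the model has to deal with. The values are copied from the checkcode class. It assumes the number of classes per character position equals the length of the character set, which is what the one-hot indexing in gen_captcha suggests.

```python
# Values copied from the checkcode class.
CHARSET = 'abcdefghkmnprstuvwyzABCDEFGHKLMNPRSTUVWYZ23456789'
WIDTH, HEIGHT = 130, 50
CODE_LEN = 4

# Assumption: one class per distinct character in the set.
classes = len(CHARSET)
# Every character should appear only once, otherwise labels would be ambiguous.
has_duplicates = len(set(CHARSET)) != len(CHARSET)

input_size = WIDTH * HEIGHT      # grayscale pixels per image
label_size = CODE_LEN * classes  # length of the flattened one-hot label

print('classes per position:', classes)
print('pixels per image:', input_size)
print('label vector length:', label_size)
print('duplicate characters:', has_duplicates)
```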

Copy generate_captcha.py to xxxcms_generate_captcha.py.

Add the imports
from io import BytesIO and import requests as req

Two places need to change.

The first is the generation parameters at the top:

width=130,  # CAPTCHA image width
height=50,  # CAPTCHA image height
char_num=4,  # number of characters in the CAPTCHA
characters='abcdefghkmnprstuvwyzABCDEFGHKLMNPRSTUVWYZ23456789'):

The second is in the gen_captcha method: replace the way the image is obtained with the approach used in test.py:

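img = np.zeros((self.height, self.width), dtype=np.uint8)
image = ImageCaptcha(width=self.width, height=self.height)  # left over from part 1, no longer used
while True:
    # Allocate fresh arrays for every batch: Y is flattened to 2-D before it is
    # yielded, so reusing it would break the 3-D indexing on the next batch and
    # carry over labels from the previous one.
    X = np.zeros([batch_size, self.height, self.width, 1])
    Y = np.zeros([batch_size, self.char_num, self.classes])
    for i in range(batch_size):
        # Random code; random.sample never repeats a character within one code
        captcha_str = ''.join(random.sample(self.characters, self.char_num))
        # Fetch the image for this code from the local create_img.php
        imgurl = 'http://127.0.0.1:8080/xxxcms/create_img.php?code=' + captcha_str
        response = req.get(imgurl)
        img = Image.open(BytesIO(response.content)).convert('L')
        img = np.array(img.getdata())
        # Scale pixel values to the range [0, 1]
        X[i] = np.reshape(img, [self.height, self.width, 1]) / 255.0
        # One-hot encode each character at its position
        for j, ch in enumerate(captcha_str):
            Y[i, j, self.characters.find(ch)] = 1
    Y = np.reshape(Y, (batch_size, self.char_num * self.classes))
    yield X, Y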
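The label layout that gen_captcha produces can be illustrated without NumPy. The sketch below encodes a code string into the flattened one-hot vector (position j, class k maps to index j * classes + k, matching a row-major reshape) and decodes it back by taking the largest entry in each block. The function names encode and decode are hypothetical and not part of the original scripts.

```python
CHARSET = 'abcdefghkmnprstuvwyzABCDEFGHKLMNPRSTUVWYZ23456789'
CODE_LEN = 4
CLASSES = len(CHARSET)


def encode(code):
    """Turn a code string into a flat one-hot vector of length CODE_LEN * CLASSES."""
    vec = [0] * (CODE_LEN * CLASSES)
    for j, ch in enumerate(code):
        vec[j * CLASSES + CHARSET.index(ch)] = 1
    return vec


def decode(vec):
    """Recover the string by taking the highest-scoring class in each position's block."""
    chars = []
    for j in range(CODE_LEN):
        block = vec[j * CLASSES:(j + 1) * CLASSES]
        chars.append(CHARSET[block.index(max(block))])
    return ''.join(chars)


label = encode('aB29')
print(sum(label), decode(label))
```

The same decode step would apply to a model's output scores, since it only looks for the largest value per block. Because get_code() in the PHP class lowercases the stored code, a decoded prediction would presumably be compared case-insensitively by the CMS; that is an inference from the class, not something verified here.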

Open train_captcha.py
and change import generate_captcha to
import xxxcms_generate_captcha as generate_captcha

Then run python train_captcha.py again.

The rest of the process is the same as in the first article.

0x005 A few takeaways
