Python - Plotting with matplotlib

2020-12-11 算法星球

1. Plotting basics

import matplotlib.pyplot as plt

import numpy as np

import tensorflow as tf

import pandas as pd

'''

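The pyplot sublibrary of Matplotlib draws two-dimensional charts quickly:

figure: creates the canvas
plot: draws the graph
show: displays the drawing

figure(num, figsize, dpi, facecolor, edgecolor, frameon)

num: figure number or name; an integer or a string
figsize: width and height of the figure, in inches
dpi: resolution of the figure; defaults to rcParams["figure.dpi"] (100 since Matplotlib 2.0, 80 in older releases)
facecolor: background color
edgecolor: border color
frameon: whether to draw the figure frame

Colors and their one-letter abbreviations:

blue b
black k
green g
white w
red r
cyan c
yellow y
magenta m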

'''

# plt.figure(figsize=(3, 3), facecolor='red')

# plt.plot()

# plt.show()
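The figure() parameters listed above can be tried out with a minimal sketch like the following. The figure name "demo" and the plotted values are illustrative assumptions, and the Agg backend is chosen only so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

# Create a 3 x 3 inch canvas at 100 dpi with a red background.
fig = plt.figure(num="demo", figsize=(3, 3), dpi=100, facecolor="red")

# figsize is given in inches, so the size in pixels is figsize * dpi.
width_px, height_px = fig.get_size_inches() * fig.dpi
print(width_px, height_px)

# Draw a simple line on the current axes of this figure.
plt.plot([0, 1, 2], [0, 1, 4])
```

In an interactive session, plt.show() would follow the plot() call to display the window.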

'''

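A figure is divided into subplots with subplot(rows, columns, index).

The subplot index starts at 1.

Each subplot() call creates only one subplot, so four subplots need four calls.

When all three arguments of subplot() are less than 10, the commas between them can be dropped and the arguments written as one three-digit number, for example subplot(222).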

'''

# fig = plt.figure()

# plt.subplot(2, 2, 1)

# plt.subplot(222)

# plt.subplot(223)

# plt.show()
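As a rough illustration of the subplot() rules above, the sketch below creates a 2 x 2 grid with one call per cell and then addresses the last cell again with the three-digit form. The Agg backend is an assumption made so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

fig = plt.figure()

# subplot(rows, columns, index): one call per subplot, index starts at 1.
axes = [plt.subplot(2, 2, k) for k in range(1, 5)]

# The three-digit form addresses the same cell as subplot(2, 2, 4),
# so the existing axes is reused instead of a fifth one being created.
same_axes = plt.subplot(224)

print(len(fig.axes))
```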

'''

Setting a Chinese font

plt.rcParams["font.sans-serif"] = "SimHei"

Font names: SimSun (宋體), KaiTi (楷體), SimHei (黑體), FangSong (仿宋), Microsoft YaHei (微軟雅黑),

LiSu (隸書), Microsoft JhengHei (微軟正黑體), YouYuan (幼圓)

Restore the standard default configuration: plt.rcdefaults()

'''
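The effect of setting a font and then restoring the defaults can be checked with a small sketch such as this one. It only inspects rcParams; whether SimHei is actually installed on the machine is not checked here.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

# Ask Matplotlib to prefer SimHei for sans-serif text.
plt.rcParams["font.sans-serif"] = "SimHei"
changed = list(plt.rcParams["font.sans-serif"])

# Go back to Matplotlib's built-in default configuration.
plt.rcdefaults()
restored = list(plt.rcParams["font.sans-serif"])

print(changed)
print(restored)
```

For a temporary change, wrapping the plotting code in `with plt.rc_context({...}):` is an alternative to calling rcdefaults() afterwards.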

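Adding titles

Figure-level title: suptitle(text)

Subplot title: title(text)

Main parameters of suptitle() and their defaults:

x: x coordinate of the title; default 0.5
y: y coordinate of the title; default 0.98
color: title color; default black
backgroundcolor: title background color
fontsize: title font size; default 12 (also accepts xx-small, x-small, small, medium, large, x-large, xx-large)
fontweight: font weight; default normal (light, normal, medium, semibold, bold, heavy, black)
fontstyle: font style (normal, italic, oblique)
horizontalalignment: horizontal alignment of the title; default center (left, right, center)
verticalalignment: vertical alignment of the title; default top (center, top, bottom, baseline)

Main parameters of title():

loc: title position; left, right (or center, the default)
rotation: rotation angle of the title text
color: title color; default black
fontsize: title font size
fontweight: font weight; default normal
fontstyle: font style
horizontalalignment: horizontal alignment of the title; default center
verticalalignment: vertical alignment of the title; default top
fontdict: dictionary of text settings

The tight_layout() function:

It inspects the axis labels, tick labels and subplot titles, then adjusts the subplots so that they fill the drawing area and no longer overlap.

tight_layout(rect=[left, bottom, right, top]) confines the subplots to the given rectangle, expressed in normalized figure coordinates (0 to 1).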

'''

# If the font is missing, install it: https://blog.csdn.net/qq_39817865/article/details/101363401

'''

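# The titles in this block stay in Chinese on purpose: they check that SimHei renders CJK text.
# 子標題 = subplot title, 全局標題 = figure title.
plt.rcParams["font.family"] = "SimHei"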

fig = plt.figure(facecolor='lightgrey')

plt.subplot(221)

plt.title('子標題1')

plt.subplot(222)

plt.title('子標題2', loc='left', color='r')

plt.subplot(223)

fontdicTest = {'fontsize':12, 'color':'g', 'rotation':30}

plt.title('子標題3', fontdict=fontdicTest)

plt.subplot(224)

plt.title('子標題4', color='w', backgroundcolor='b')

plt.suptitle('全局標題', fontsize=25, color='y')

plt.tight_layout(rect=[0, 0, 1, 0.9])

plt.show()

'''

When tight_layout() has no effect, save the figure with savefig() instead of calling show():

'''

# plt.savefig('fig.png', bbox_inches='tight') # replaces plt.show()
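A minimal sketch of the savefig() route is shown below. It writes the image to an in-memory buffer, which is an assumption made to avoid touching the file system; passing a file name such as 'fig.png' works the same way.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(4, 3))
for k in range(1, 5):
    plt.subplot(2, 2, k)
    plt.title(f"Subplot {k}")
plt.suptitle("Figure title", fontsize=16)

# Keep the subplots in the lower 90% of the figure so the suptitle has room.
plt.tight_layout(rect=[0, 0, 1, 0.9])

# bbox_inches='tight' crops the saved image to the drawn content.
buffer = io.BytesIO()
plt.savefig(buffer, format="png", bbox_inches="tight")
png_bytes = buffer.getvalue()
print(len(png_bytes) > 0)
```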

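2. Scatter plots

scatter(x, y, s, color, marker, label)

Parameters:

x: x coordinates of the data points; required
y: y coordinates of the data points; required
s: size of the data points; default 36
color: color of the data points
marker: style of the data points; default 'o' (circle)
label: legend text

Line styles (used by plot(), not by the marker parameter):

-  solid line         -- dashed line
-. dash-dot line      :  dotted line

Values of the marker parameter (data point styles):

.  point              ,  pixel
1  tri-down           v  triangle pointing down
2  tri-up             ^  triangle pointing up
3  tri-left           <  triangle pointing left
4  tri-right          >  triangle pointing right
s  square             D  diamond
p  pentagon           d  thin diamond
o  circle             *  star
|  vertical line      +  plus sign
h  hexagon 1          _  horizontal line
x  x mark             H  hexagon 2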

'''
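To see what each marker code in the table looks like, one option is to draw them side by side, as in this sketch. The layout (one row, each code printed above its marker) is an illustrative choice, and the membership check relies on the `MarkerStyle.markers` dictionary that Matplotlib exposes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
from matplotlib.markers import MarkerStyle

markers = [".", ",", "1", "v", "2", "^", "3", "<", "4", ">", "s",
           "D", "p", "d", "o", "*", "|", "+", "h", "_", "x", "H"]

# Every code in the table should be a marker known to Matplotlib.
unknown = [m for m in markers if m not in MarkerStyle.markers]

fig, ax = plt.subplots(figsize=(10, 2))
for i, m in enumerate(markers):
    ax.scatter(i, 0, s=80, marker=m, color="blue")  # one point per marker
    ax.text(i, 0.02, repr(m), ha="center")          # print the code above it
ax.set_ylim(-0.05, 0.05)
ax.set_yticks([])

print(unknown)
```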

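Adding text: the text() function

text(x, y, s, fontsize, color)

x: x coordinate of the text; required
y: y coordinate of the text; required
s: the text to display; required
fontsize: text size; defaults to rcParams["font.size"] (10 since Matplotlib 2.0, 12 in older releases)
color: text color; default black

Axis settings:

plt.rcParams["axes.unicode_minus"] = False   draws minus signs with an ASCII hyphen, which Chinese fonts such as SimHei can render

xlabel(s, fontsize, color)   sets the x-axis label

ylabel(s, fontsize, color)   sets the y-axis label

xlim(xmin, xmax)   sets the range of the x axis

ylim(ymin, ymax)   sets the range of the y axis

tick_params(labelsize)   sets the font size of the tick labels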

'''

Scatter plot of the standard normal distribution

'''

plt.rcParams['font.sans-serif'] = "SimHei"  # only needed when the labels contain Chinese text
plt.rcParams["axes.unicode_minus"] = False  # draw minus signs with an ASCII hyphen

n = 1024
x = np.random.normal(0, 1, n)
y = np.random.normal(0, 1, n)

plt.scatter(x, y, color="blue", marker='*')  # draw the scatter plot
plt.title("Standard normal distribution", fontsize=20)  # set the title
plt.text(2.5, 2.5, "Mean: 0\nStd: 1")  # add a text annotation
plt.xlim(-4, 4)
plt.ylim(-4, 4)  # set the axis ranges
plt.xlabel('x coordinate', fontsize=14)
plt.ylabel('y coordinate', fontsize=14)  # set the axis labels
plt.show()

'''

Scatter plot of the standard normal distribution and the uniform distribution

'''

Adding a legend

scatter(x, y, s, color, marker, label): the label argument sets the legend text

legend(loc, fontsize): shows the legend; the loc argument sets its position

Values of loc:

0 best            6 center left
1 upper right     7 center right
2 upper left      8 lower center
3 lower left      9 upper center
4 lower right     10 center
5 right

'''
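The loc values above can be passed either as the number or as the matching string. The sketch below uses the string form on synthetic data; the sample sizes, the seed and the English labels are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

fig, ax = plt.subplots()
ax.scatter(rng.normal(0, 1, 100), rng.normal(0, 1, 100),
           color="blue", marker="*", label="normal")
ax.scatter(rng.uniform(-4, 4, 100), rng.uniform(-4, 4, 100),
           color="y", marker="o", label="uniform")

# loc="upper left" is the string form of loc=2 in the table.
legend = ax.legend(loc="upper left", fontsize=12)
labels = [t.get_text() for t in legend.get_texts()]
print(labels)
```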

plt.rcParams['font.sans-serif'] = "SimHei"  # only needed when the labels contain Chinese text
plt.rcParams["axes.unicode_minus"] = False  # draw minus signs with an ASCII hyphen

n = 1024
x1 = np.random.normal(0, 1, n)
y1 = np.random.normal(0, 1, n)
x2 = np.random.uniform(-4, 4, (1, n))
y2 = np.random.uniform(-4, 4, (1, n))

plt.scatter(x1, y1, color="blue", marker='*', label='Normal distribution')  # draw the scatter plots
plt.scatter(x2, y2, color='y', marker='o', label='Uniform distribution')
plt.legend()  # show the labels given above

plt.title("Standard normal distribution", fontsize=20)  # set the title
plt.text(2.5, 2.5, "Mean: 0\nStd: 1")  # add a text annotation
plt.xlim(-4, 4)
plt.ylim(-4, 4)  # set the axis ranges
plt.xlabel('x coordinate', fontsize=14)
plt.ylabel('y coordinate', fontsize=14)  # set the axis labels
plt.show()

'''

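3. Line charts and bar charts

Drawing a line chart:

plot(x, y, color, marker, label, linewidth, markersize)

Parameters:

x: x coordinates of the data points; defaults to 0, 1, 2, 3, ...
y: y coordinates of the data points; required
color: color of the line and data points
marker: style of the data points, for example 'o' (circle); no marker is drawn by default
label: legend text
linewidth: width of the line
markersize: size of the data points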

'''

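plt.rcParams['font.sans-serif'] = "SimHei"  # only needed when the labels contain Chinese text

n = 24
y1 = np.random.randint(27, 37, n)  # 24 hourly temperature readings
y2 = np.random.randint(40, 60, n)  # 24 hourly humidity readings

plt.plot(y1, label='Temperature')  # x defaults to 0, 1, ..., 23
plt.plot(y2, label='Humidity')
plt.xlim(0, 23)
plt.ylim(20, 70)
plt.xlabel('Hour', fontsize=12)
plt.ylabel('Temperature / humidity', fontsize=12)
plt.title('Temperature and humidity over 24 hours', fontsize=20)
plt.legend()
plt.show()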
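The marker, linewidth and markersize parameters from the list above are not used in the example, so here is a small sketch that sets them and also shows the default x values. The data are random and the seed is an arbitrary assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
hours = np.arange(24)
temperature = rng.integers(27, 37, 24)

fig, ax = plt.subplots()

# Explicit x values, circle markers, a thicker line and larger points.
(styled,) = ax.plot(hours, temperature, color="r", marker="o",
                    linewidth=2, markersize=5, label="Temperature")

# Without x, the points are placed at 0, 1, 2, ... automatically.
(default_x,) = ax.plot(temperature + 10, label="Temperature + 10")

ax.legend()
print(list(default_x.get_xdata()) == list(hours))
```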

'''

Drawing a bar chart:

bar(x, height, width, facecolor, edgecolor, label)

The first parameter was named left before Matplotlib 2.0; it is now x.

'''

plt.rcParams['font.sans-serif'] = "SimHei"  # only needed when the labels contain Chinese text
plt.rcParams["axes.unicode_minus"] = False  # draw minus signs with an ASCII hyphen

y1 = [32, 25, 16, 30, 24, 45, 40, 33, 28, 17, 24, 20]
y2 = [-23, -35, -26, -35, -45, -43, -35, -32, -23, -17, -22, -28]

# Positive values are drawn above the axis and negative values below it.
plt.bar(range(len(y1)), y1, width=0.8, facecolor='green', edgecolor='white', label='Statistic 1')
plt.bar(range(len(y2)), y2, width=0.8, facecolor='red', edgecolor='white', label='Statistic 2')

plt.title('Bar chart', fontsize=20)
plt.legend()
plt.show()
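When both series are positive, stacking them at the same x positions hides one behind the other. A common alternative, sketched below, shifts each series by half a bar width so the bars sit side by side. The second data series here is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

y1 = [32, 25, 16, 30, 24, 45, 40, 33, 28, 17, 24, 20]
y2 = [23, 35, 26, 35, 45, 43, 35, 32, 23, 17, 22, 28]  # illustrative values

x = np.arange(len(y1))
width = 0.4  # two bars of width 0.4 share each x position

fig, ax = plt.subplots()
bars1 = ax.bar(x - width / 2, y1, width=width, facecolor="green",
               edgecolor="white", label="Statistic 1")
bars2 = ax.bar(x + width / 2, y2, width=width, facecolor="red",
               edgecolor="white", label="Statistic 2")
ax.set_xticks(x)
ax.legend()

print(len(bars1), len(bars2))
```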

'''

Official Matplotlib site

http://matplotlib.org

https://matplotlib.org/genindex.html

Gallery page: https://matplotlib.org/gallery.html

'''

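4. Visualizing the Boston housing dataset

Keras is a high-level neural network and deep learning library. It bundles several commonly used public datasets, which can be loaded and accessed through the keras.datasets module.

Datasets bundled with Keras:

1 boston_housing: Boston house prices
2 CIFAR10: images in 10 classes
3 CIFAR100: images in 100 classes
4 MNIST: images of handwritten digits
5 Fashion-MNIST: images in 10 fashion categories
6 IMDB: movie reviews
7 reuters: Reuters newswire articles

Attributes of the Boston housing data (number, name, meaning, sample value):

1 CRIM: per-capita crime rate of the town; 0.00632
2 ZN: proportion of residential land zoned for lots over 25,000 square feet; 18.0
3 INDUS: proportion of non-retail business land in the town; 2.31
4 CHAS: whether the tract borders the Charles River (1: yes, 0: no); 0
5 NOX: nitric oxide concentration; 0.538
6 RM: average number of rooms per dwelling; 6.575
7 AGE: proportion of owner-occupied units built before 1940; 65.2
8 DIS: weighted distance to five Boston employment centers; 4.0900
9 RAD: index of accessibility to highways; 1
10 TAX: full-value property tax rate per 10,000 US dollars; 296
11 PTRATIO: pupil-teacher ratio of the town; 15.3
12 B: an index reflecting the proportion of Black residents in the town, B = 1000(Bk - 0.63)^2, where Bk is that proportion; the closer Bk is to 0.63, the smaller B; 396.90
13 LSTAT: proportion of low-income population; 7.68
14 MEDV: median value of owner-occupied homes, in units of 1,000 US dollars; 24.0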

'''

# Installing TensorFlow: https://zhuanlan.zhihu.com/p/61472293

'''

Loading the dataset

load_data(): the test_split parameter sets the fraction of the data used as the test set

'''

boston_housing = tf.keras.datasets.boston_housing

(train_x, train_y), (test_x, test_y) = boston_housing.load_data()

print("train set: ", len(train_x)) # train set: 404

print("test set: ", len(test_x)) # test set: 102

'''

# Load all of the data as the training set

'''

boston_housing = tf.keras.datasets.boston_housing

(train_x, train_y), (test_x, test_y) = boston_housing.load_data(test_split=0)

print("train set: ", len(train_x)) # train set: 506

print("test set: ", len(test_x)) # test set: 0

print(type(train_x)) # <class 'numpy.ndarray'>

print(type(train_y)) # <class 'numpy.ndarray'>

print("ndim of train_x:", train_x.ndim) # ndim of train_x: 2

print("shape of train_x:", train_x.shape) # shape of train_x: (506, 13)

print("ndim of train_y:", train_y.ndim) # ndim of train_y: 1

print("shape of train_y:", train_y.shape) # shape of train_y: (506,)
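The 404/102 and 506/0 sizes above are consistent with a split that shuffles the rows and then cuts at int(n * (1 - test_split)). The NumPy sketch below reproduces that arithmetic on placeholder arrays. The helper name split_like_load_data, the default test_split of 0.2 and the seed of 113 are assumptions for illustration, not a statement of how Keras is implemented.

```python
import numpy as np


def split_like_load_data(x, y, test_split=0.2, seed=113):
    """Hypothetical helper: shuffle the rows, then slice off the test part."""
    rng = np.random.RandomState(seed)
    indices = np.arange(len(x))
    rng.shuffle(indices)
    x, y = x[indices], y[indices]
    cut = int(len(x) * (1 - test_split))  # number of training rows
    return (x[:cut], y[:cut]), (x[cut:], y[cut:])


# Placeholder arrays with the same shape as the Boston data (506 rows, 13 attributes).
x = np.zeros((506, 13))
y = np.zeros(506)

(train_x, train_y), (test_x, test_y) = split_like_load_data(x, y)
print(len(train_x), len(test_x))

(all_x, all_y), (none_x, none_y) = split_like_load_data(x, y, test_split=0)
print(len(all_x), len(none_x))
```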

'''

Relationship between the average number of rooms and the house price:

'''

plt.rcParams['font.sans-serif'] = "SimHei"  # only needed when the labels contain Chinese text
plt.rcParams["axes.unicode_minus"] = False  # draw minus signs with an ASCII hyphen

boston_housing = tf.keras.datasets.boston_housing
(train_x, train_y), (test_x, test_y) = boston_housing.load_data()

titles = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']

# Number of rooms (column 5, RM) against the house price
plt.figure(figsize=(5, 5))
plt.scatter(train_x[:, 5], train_y)
plt.xlabel('Rooms')
plt.ylabel('Price')
plt.title('Number of rooms vs. house price')
plt.show()

# One scatter plot per attribute on a 4 x 4 grid
plt.figure(figsize=(12, 12))
for i in range(13):
    plt.subplot(4, 4, (i+1))
    plt.scatter(train_x[:, i], train_y)
    plt.xlabel(titles[i])
    plt.ylabel('Price')
    plt.title(str(i+1) + "." + titles[i])

plt.tight_layout()
plt.suptitle("Each attribute vs. house price", x=0.5, y=1.02, fontsize=20)

# The suptitle sits above the figure (y=1.02); bbox_inches='tight' keeps it in the saved image.
plt.savefig('house.png', bbox_inches='tight')
# plt.show()
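A 4 x 4 grid has 16 cells but there are only 13 attributes. One way to handle the spare cells, sketched here with plt.subplots(), is to hide them. Random numbers stand in for train_x and train_y so that the sketch runs without downloading the dataset; that substitution is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((50, 13))  # synthetic stand-in for train_x
prices = rng.random(50)          # synthetic stand-in for train_y

titles = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
          'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']

fig, axes = plt.subplots(4, 4, figsize=(12, 12))
for i, ax in enumerate(axes.flat):
    if i < len(titles):
        ax.scatter(features[:, i], prices, s=10)
        ax.set_title(f"{i + 1}.{titles[i]}")
    else:
        ax.set_visible(False)  # hide the three unused cells

fig.suptitle("Each attribute vs. price", fontsize=20)
fig.tight_layout(rect=[0, 0, 1, 0.95])  # leave room for the suptitle

visible = sum(ax.get_visible() for ax in axes.flat)
print(visible)
```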

'''

5. The Iris dataset

'''

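The get_file() function: downloading a dataset

tf.keras.utils.get_file(fname, origin, cache_dir)

Parameters:

fname: name of the file after download
origin: URL of the file
cache_dir: location where the downloaded file is stored

Return value: the absolute path of the downloaded file on the local disk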

'''

# url = 'http://download.tensorflow.org/data/iris_training.csv'

# train_path = tf.keras.utils.get_file('iris_training.csv', url)

# print(train_path) # /User/diana/data/iris_training.csv

'''

The split() function cuts a string at the given separator and returns a list.

'''

Getting the file name:

'''

# fname_list = url.split("/")

# print(fname_list) # ['http:', '', 'download.tensorflow.org', 'data', 'iris_training.csv']

# fname = fname_list[-1]

# print(fname) # file name: 'iris_training.csv'

# So the call above can be written as: train_path = tf.keras.utils.get_file(fname_list[-1], url)
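The file-name extraction described above runs as follows. The second variant, built on the standard library's urlparse() and os.path.basename(), is an alternative added here for comparison and is not part of the original notes.

```python
import os
from urllib.parse import urlparse

url = "http://download.tensorflow.org/data/iris_training.csv"

# split() cuts the URL at every "/" and returns the pieces as a list.
fname_list = url.split("/")
fname = fname_list[-1]  # the last piece is the file name

# Alternative: take the base name of the URL's path component.
same_name = os.path.basename(urlparse(url).path)

print(fname_list)
print(fname, same_name)
```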

'''

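Reading the dataset file

pd.read_csv(filepath_or_buffer, header, names)

Parameters:

filepath_or_buffer: path of the file to read

header: the row number to use as column titles, counted from 0; header=0 uses the first line as column titles (the default); header=None means the file has no column titles

names: custom column titles, which replace the titles selected by header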

'''

# df_iris = pd.read_csv("iris.csv")

# print(type(df_iris)) # <class 'pandas.core.frame.DataFrame'>

# DataFrame is a commonly used pandas data type: a two-dimensional table

# head() prints the first 5 rows

# df_iris = pd.read_csv("iris.csv", header=None)

# print(df_iris.head())

# 0 1 2 3 4 5

# 0 NaN Sepal.Length Sepal.Width Petal.Length Petal.Width Species

# 1 1.0 5.1 3.5 1.4 0.2 setosa

# 2 2.0 4.9 3 1.4 0.2 setosa

# 3 3.0 4.7 3.2 1.3 0.2 setosa

# 4 4.0 4.6 3.1 1.5 0.2 setosa

# df_iris = pd.read_csv('iris.csv', header=0)

# print(df_iris.head())

# Unnamed: 0 Sepal.Length Sepal.Width Petal.Length Petal.Width Species

# 0 1 5.1 3.5 1.4 0.2 setosa

# 1 2 4.9 3.0 1.4 0.2 setosa

# 2 3 4.7 3.2 1.3 0.2 setosa

# 3 4 4.6 3.1 1.5 0.2 setosa

# 4 5 5.0 3.6 1.4 0.2 setosa

# df_iris = pd.read_csv('iris.csv', header=2) # header=2: the line at index 2 (the third line of the file) becomes the column titles; the lines before it are dropped

# print(df_iris.head())

# 2 4.9 3 1.4 0.2 setosa

# 0 3 4.7 3.2 1.3 0.2 setosa

# 1 4 4.6 3.1 1.5 0.2 setosa

# 2 5 5.0 3.6 1.4 0.2 setosa

# 3 6 5.4 3.9 1.7 0.4 setosa

# 4 7 4.6 3.4 1.4 0.3 setosa
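The three header settings can be compared without an iris.csv file on disk by reading from an in-memory string, as in this sketch. The two-row sample and the lowercase replacement names are illustrative assumptions.

```python
import io

import pandas as pd

csv_text = (
    "Sepal.Length,Sepal.Width,Species\n"
    "5.1,3.5,setosa\n"
    "4.9,3.0,setosa\n"
)

# Default (header=0): the first line supplies the column titles.
df_default = pd.read_csv(io.StringIO(csv_text))

# header=None: every line is data and the columns are numbered 0, 1, 2.
df_no_header = pd.read_csv(io.StringIO(csv_text), header=None)

# header=0 with names: the first line is consumed and its titles are replaced.
df_renamed = pd.read_csv(
    io.StringIO(csv_text),
    header=0,
    names=["sepal_length", "sepal_width", "species"],
)

print(df_default.shape, df_no_header.shape, df_renamed.shape)
```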

'''

column_names = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']

df_iris = pd.read_csv('iris.csv', header=0, names=column_names)

print(df_iris.head())

'''

#    Sepal length  Sepal width  Petal length  Petal width  Species

# 1 5.1 3.5 1.4 0.2 setosa

# 2 4.9 3.0 1.4 0.2 setosa

# 3 4.7 3.2 1.3 0.2 setosa

# 4 4.6 3.1 1.5 0.2 setosa

# 5 5.0 3.6 1.4 0.2 setosa

'''

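head() returns the first 5 rows by default; head(n) returns the first n rows

tail() returns the last 5 rows by default; tail(n) returns the last n rows

Indexing and slicing: df_iris[10:16] returns the rows at positions 10 to 15

Showing summary statistics:

The describe() method shows summary statistics of two-dimensional data: count, mean, standard deviation, minimum, first quartile, median, third quartile and maximum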

'''

# print(df_iris.describe())

#        Sepal length  Sepal width  Petal length  Petal width

# count 150.000000 150.000000 150.000000 150.000000

# mean 5.843333 3.057333 3.758000 1.199333

# std 0.828066 0.435866 1.765298 0.762238

# min 4.300000 2.000000 1.000000 0.100000

# 25% 5.100000 2.800000 1.600000 0.300000

# 50% 5.800000 3.000000 4.350000 1.300000

# 75% 6.400000 3.300000 5.100000 1.800000

# max 7.900000 4.400000 6.900000 2.500000

'''

Common DataFrame attributes: ndim, size, shape

ndim: number of dimensions of the table

shape: shape of the table

size: total number of elements in the table

'''

# print(df_iris.ndim, df_iris.shape, df_iris.size) # 2 (150, 5) 750
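The row-selection methods, describe() and the three attributes can be tried on a small hand-made table, as sketched below. The seven rows and the column names are illustrative and do not come from iris.csv.

```python
import pandas as pd

df = pd.DataFrame({
    "length": [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6],
    "width": [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4],
})

first_five = df.head()   # first 5 rows by default
last_two = df.tail(2)    # last 2 rows
middle = df[2:5]         # rows at positions 2, 3 and 4
stats = df.describe()    # count, mean, std, min, quartiles, max

print(stats.index.tolist())
print(df.ndim, df.shape, df.size)
```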

'''

Converting a DataFrame to a NumPy array:

np.array()

DataFrame.values

DataFrame.to_numpy() (replaces DataFrame.as_matrix(), which was removed in pandas 1.0)

'''

iris1 = np.array(df_iris)

print(type(iris1)) # <class 'numpy.ndarray'>

iris2 = df_iris.values

iris3 = df_iris.to_numpy()  # as_matrix() in pandas releases before 1.0

print(type(iris2)) # <class 'numpy.ndarray'>

print(type(iris3)) # <class 'numpy.ndarray'>

'''

# Accessing array elements

# Show the species of every iris sample

# print(iris1[:, 4])

'''

Visualizing the Iris dataset

'''

plt.scatter(iris1[:, 2], iris1[:, 3]) # petal length and petal width

plt.show()

'''

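Color mapping: when the parameter c is a list or an array of numbers, the color of each point follows the value of the matching element, and the colors themselves are taken from the colormap named by cmap.

plt.scatter(x, y, c, cmap)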

'''

x = np.arange(8)

y = np.arange(8)

colors = [0, 1, 2, 1, 2, 0, 1, 2] # the color value of each point

plt.scatter(x, y, c=colors, marker='*', cmap='brg')

plt.show()

'''

iris = df_iris.values

plt.scatter(iris[:, 2], iris[:, 3], c = iris[:, 4], marker='*', cmap='brg') # c must be numeric: column 4 has to hold class codes such as 0/1/2, not species names

plt.show()

'''

column_names = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']

df_iris = pd.read_csv('iris.csv', header=0, names=column_names)

iris = df_iris.values

'''

plt.scatter(iris[:, 2], iris[:, 3], c = iris[:, 4], marker='*', cmap='brg') # c must be numeric class codes

plt.title('Iris dataset')

plt.xlabel(column_names[2])

plt.ylabel(column_names[3])

plt.show()
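The c parameter expects numbers, so a species column that holds names such as 'setosa' has to be converted to integer codes first. One possible approach, sketched here, uses pd.factorize(). The five sample rows are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "petal_length": [1.4, 1.3, 4.7, 6.0, 4.5],
    "petal_width": [0.2, 0.2, 1.4, 2.5, 1.5],
    "species": ["setosa", "setosa", "versicolor", "virginica", "versicolor"],
})

# factorize() numbers the distinct names in order of first appearance.
codes, names = pd.factorize(df["species"])

fig, ax = plt.subplots()
ax.scatter(df["petal_length"], df["petal_width"], c=codes, marker="*", cmap="brg")

print(list(codes), list(names))
```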

'''

fig = plt.figure('iris data', figsize=(16, 16))
fig.suptitle('Iris data', fontsize=20)

# 4 x 4 grid: cell (i, j) plots attribute i against attribute j
for i in range(4):
    for j in range(4):
        plt.subplot(4, 4, (4 * i + j + 1))
        if i == j:
            # diagonal cell: show the attribute name instead of a plot
            plt.text(0.4, 0.5, column_names[i], fontsize=16)
        else:
            plt.scatter(iris[:, i], iris[:, j], c=iris[:, 4], cmap='brg')
            plt.xlabel(column_names[i])
            plt.ylabel(column_names[j])

plt.show()

A tricky quiz question:

import numpy as np

import matplotlib.pyplot as plt

x = np.arange(8)

y = np.arange(8)

dot_color = [2, 1, 0, 0, 1, 2, 2, 0]

plt.scatter(x,y, c=dot_color,cmap = 'brg')

plt.show()

# The data point at x = 3 is blue
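The answer can be checked by asking the colormap directly. In the sketch below, each value is scaled to the 0 to 1 range in the same way scatter() does by default (from the minimum to the maximum of c), and the strongest RGB channel is reported. Naming a color by its dominant channel is a simplification chosen for this check.

```python
import matplotlib.pyplot as plt
import numpy as np

dot_color = [2, 1, 0, 0, 1, 2, 2, 0]
cmap = plt.get_cmap("brg")  # runs from blue through red to green

vmin, vmax = min(dot_color), max(dot_color)
channel_names = ["red", "green", "blue"]

dominant = []
for value in dot_color:
    scaled = (value - vmin) / (vmax - vmin)  # 0 -> 0.0, 1 -> 0.5, 2 -> 1.0
    rgba = cmap(scaled)                      # (red, green, blue, alpha)
    dominant.append(channel_names[int(np.argmax(rgba[:3]))])

print(dominant[3])  # color of the point at x = 3
```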

Fixing the import error that appears after installing TensorFlow on Python 3: https://blog.csdn.net/weixin_42357472/article/details/103448062
