小說網絡結構圖(graph)的比較分析:神鵰、天龍、三國、紅樓

2021-01-18 愚者暗於成事智者見於未萌

該圖整體的集聚係數也升為 0.272，表明次要人物之間的互動佔全書篇幅的比例有所上升。畢竟，真實世界是非常去中心化的，三國的故事即使經過文學改造，仍然保留了部分現實的特點。

網絡圖中則體現得更為明顯：前面三張圖中明顯的星型模式消失了。雖然仍有比較次要的外圍人物，但中間再也不是一個人、一個軸心，而是賈寶玉、賈母、王熙鳳、王夫人、薛寶釵、林黛玉構成的六邊形這個結構，成為了中心。在這個六邊形內外，可以看到密密麻麻的二元關係。一般文學意義上主角和配角的區分，在《紅樓夢》中大大地淡化了。

但筆者跑這個程序，倒不是為了文學鑑賞，而是像開頭提到的上篇日誌中那樣，是為了優化對國際關係的研究和表述。我們對國際經濟和國際政治的理解，能不能像《紅樓夢》那樣，在腦海中天然呈現一個高度集聚的網絡的樣子，時時想到多方之間的相互影響，而不是動不動就歸結為兩國之間的博弈？複雜網絡範式的傳播，我覺得仍然任重而道遠。

---title: "Relation Network"subtitle: ''author: ""date: "`r Sys.Date()`"output:  html_document:    df_print: paged    fig_caption: yes    theme: united    highlight: haddock    number_sections: yes    toc: yes    toc_depth: 3    toc_float:      collapsed: true      smooth_scroll: FALSE  rticles::ctex:    df_print: default    fig_caption: yes    number_sections: yes    toc: yes    toc_depth: '2'  word_document:    fig_caption: yes    toc: yes    toc_depth: 3    reference_docx:   pdf_document:    toc: yes    toc_depth: '2'documentclass: ctexartclassoption: hyperref,---
```{r setup, include = FALSE}knitr::opts_chunk$set(  fig.width = 7, fig.asp = 0.618,  out.width = "90%", fig.align = "center",  fig.path = 'Figures/', fig.show = "hold",  warn = 1, warning = FALSE, message = FALSE, echo = TRUE,   comment = '', collapse = F,   cache = T, cache.comments = F, autodep = TRUE  )library('pacman')p_load(tidyverse, lubridate, data.table,        ggthemes, showtext, gridExtra, igraph, ggraph, tidygraph,        lmtest, plm, orcutt, stats, forecast, zoo,        rvest, httr, xml2,        sqldf, DT,        jiebaR, wordcloud2, webshot, htmlwidgets, tidytext        )options(sqldf.driver = "SQLite") pdf.options(family = "GB1")font_add('YaHei','simhei.ttf')windowsFonts(H = windowsFont("Microsoft YaHei"))showtext_auto(enable = TRUE)mytheme <- theme_economist_white() +  theme(text = element_text(family = 'YaHei'),        plot.title = element_text(face = 'bold', size = 14),         plot.subtitle = element_text(size = 12),        plot.caption = element_text(hjust = 0, size = 10, margin = margin(2,0,0,0,'pt')),        plot.margin = margin(12,10,12,0,'pt'),        legend.position = 'top',        legend.justification = 'left',        legend.margin = margin(4,0,0,0,'pt'),        legend.key.size = unit(1,'lines'),        legend.title = element_text(size = 12),        legend.text = element_text(size = 10, margin = margin(0,0,0,0,'pt')),        axis.text = element_text(size = 10, margin = margin(2,0,2,0,'pt')),        axis.ticks.length = unit(-4,'pt')        )theme_bar <- theme_economist_white() +  theme(text = element_text(family = 'YaHei'),         plot.title = element_text(face = 'bold', size = 14),         plot.subtitle = element_text(size = 12),        plot.caption = element_text(hjust = 0, size = 10,                                    margin = margin(0,0,0,0,'pt')),        plot.margin = margin(12,0,12,10,'pt'),        legend.position = 'top',        legend.justification = 'left',        legend.margin = margin(4,0,0,0,'pt'),        legend.key.size = unit(0.7,'lines'),        legend.title = element_blank(),        legend.text = element_text(size = 10, margin = margin(0,8,0,4,'pt')),        axis.text = element_text(size = 8),        axis.ticks.length = unit(0,'pt')         )mytheme_graph <- theme_void() +   theme(text = element_text(family = 'YaHei'),         plot.title = element_text(face = 'bold', size = 14),         plot.subtitle = element_text(size = 10),        plot.caption = element_text(hjust = 0, size = 6,                                    margin = margin(8,0,0,0,'pt')),        plot.margin = margin(12,0,12,10,'pt'),        legend.position = 'right',        legend.justification = 'left',        legend.margin = margin(4,0,0,0,'pt'),        legend.key.size = unit(0.7,'lines'),        legend.title = element_text(size = 7,                                    margin = margin(0,8,0,4,'pt')),        legend.text = element_text(size = 6,                                    margin = margin(0,8,0,4,'pt'))        )```


```{r}total <- readLines('天龍八部.txt', encoding = "UTF-8") %>%   tibble(text = .) %>%   filter(text != '') %>%   mutate(paragraph = row_number())```
```{r}roles <- readLines('天龍八部人物名單.txt', encoding = "ANSI") roles1 <- paste0('(', gsub(' ', ')|(', roles), ')') roles_name <- roles %>% str_sub(1,3) %>% str_trim()name_match <- tibble(name_unique = roles_name, name_all = roles1)replace_name <- function(string){  temp <- string  for (i in 1:length(roles_name)) {    temp <- str_replace_all(temp, roles1[i], roles_name[i])  }  return(temp)}result <- map_chr(total %>% pull(text), replace_name) replace_over <- total %>% mutate(text = result) ```



```{r, fig.showtext = TRUE}count_name <- function(name){  replace_over %>% mutate(count = str_count(text, name)) %>%     summarise(n = sum(count)) %>% pull(n) %>% return()}count_n <- function(name){  replace_over %>% mutate(count = str_detect(text, name)) %>%     summarise(n = sum(count)) %>% pull(n) %>% return()}count_words <- function(name){  replace_over %>%     mutate(count = str_detect(text, name)) %>%     filter(count == TRUE) %>% pull(text) %>% str_c(collapse = '\n') %>%     str_length() %>% return()}role_freq <- tibble(name = roles_name,                     count = map(roles_name, count_name) %>% unlist(),                     n_para = map(roles_name, count_n) %>% unlist(),                    n_words = map(roles_name, count_words) %>% unlist())p1 <- role_freq %>% mutate(name = reorder(name, n_words)) %>%   ggplot(aes(x = name, y = n_words)) +   geom_bar(stat = 'identity', position = 'dodge',            width = 0.7, fill = '#016392') +  scale_y_continuous(position = "right") +   labs(title = '人物出現段落的總字數：',        subtitle = '',        x = '', y = '') +  theme_barp1ggsave(file = "./Figures/frequency.pdf", plot = p1,        width = 8, height = 5)```
```{r, fig.showtext = TRUE}intimate <- function(string1, string2){  temp <- replace_over %>%    mutate(count1 = str_detect(text, string1),           count2 = str_detect(text, string2)) %>%    filter(count1 == T & count2 == T)  ifelse(length(temp$text) == 0, 0, temp %>% pull(text) %>% str_c(collapse = '\n') %>% str_length()) %>% return()}                   n <- length(roles_name)intimate_matrix <- diag(rep(0, n))colnames(intimate_matrix) <- roles_namerownames(intimate_matrix) <- roles_namefor (i in 1:n) {  for (j in 1:n) {    intimate_matrix[i,j] <- intimate(roles_name[i], roles_name[j])  }  intimate_matrix[i, i] <- 0}most_intimate <- max(intimate_matrix)library(pheatmap)pheatmap(intimate_matrix, cluster_cols = F, cluster_rows = F)intimate('段譽','王語嫣')intimate('虛竹','童姥')```
```{r, fig.showtext = TRUE}relations <- tibble(from = 'A', to = 'B', intimate = 0)for (i in 1:(n - 1)) {  for (j in (i + 1):n) {    temp <- tibble(from = roles_name[i], to = roles_name[j],            intimate = intimate_matrix[i,j])    relations <- bind_rows(relations, temp)  }}graph <- graph_from_data_frame(relations %>% filter(intimate > most_intimate/10),                                vertices = role_freq,                               directed = F)cfg <- cluster_fast_greedy(graph)cfggraph_tg <- as_tbl_graph(graph) %>%   mutate(deg = centrality_betweenness(normalized = T)) %>%   mutate(group=group_infomap()) title_tg <- str_c('人物關係網絡圖，集聚係數為 ',                   transitivity(graph_tg) %>% round(5))title_tgp3 <- ggraph(graph_tg, layout = 'kk') +     geom_edge_fan(aes(edge_width = intimate), color="lightblue",                 end_cap = circle(0.05, 'inches'),                 show.legend=T) +   geom_node_point(aes(size = deg, fill = factor(group)), shape = 21) +   geom_node_text(aes(label = name), size = 2.5,                  vjust = 1, hjust = 1) +  scale_color_discrete() +  scale_edge_width(range = c(0.1,2)) +   guides(fill=F) +  labs(title = title_tg,       subtitle = '',       size = '標準化的中介中心度',       edge_width = '共同出現的段落字數',       caption = '') +  mytheme_graphp3ggsave(file = "./Figures/Network2.pdf", plot = p3,        width = 8, height = 5)       transitivity(graph_tg) degree(graph_tg,normalized = T) closeness(graph_tg, normalized = T) betweenness(graph_tg, normalized = T) ```

小說網絡結構圖(graph)的比較分析:神鵰、天龍、三國、紅樓

相關焦點

圖神經網絡 Graph Neural Network (GNN)

金庸《天龍八部》與《水滸傳》結構相比較

少不看水滸,老不看三國,男不看西遊,女不看紅樓有沒有道理呢

網絡分析概述之網絡基礎簡介

圖論Graph theory

從圖(Graph)到圖卷積(Graph Convolution):漫談圖神經網絡 (二)

百度安全開源大規模圖資料庫HugeGraph

圖網絡未來幾何?12位學者寄望2021大爆發

金庸的小說《天龍八部》是啥意思?何謂「天龍八部」?

初識分布式圖資料庫 Nebula Graph 2.0 Query Engine

作品回顧:金庸小說在內地:從「毒草」到名著

GAT 圖注意力網絡 Graph Attention Network

使用Keras構建具有自定義結構和層次圖卷積神經網絡

為什麼深度模型在 CV 領域好使,但用在圖網絡上就這麼難呢?

Bioinf | 生物醫學網絡中的圖嵌入：方法、應用與評價

知深行遠:關於圖神經網絡層數加深的探索

北大、微軟亞洲研究院:高效的大規模圖神經網絡計算

圖神經網絡的表達能力,究竟有多強大?

通俗易懂:圖卷積神經網絡入門詳解

一文學會網絡分析——Co-occurrence網絡圖在R中的實現

小說網絡結構圖(graph)的比較分析:神鵰、天龍、三國、紅樓

相關焦點

圖神經網絡 Graph Neural Network (GNN)

金庸《天龍八部》與《水滸傳》結構相比較

少不看水滸,老不看三國,男不看西遊,女不看紅樓 有沒有道理呢

網絡分析概述之網絡基礎簡介

圖論Graph theory

從圖(Graph)到圖卷積(Graph Convolution):漫談圖神經網絡 (二)

百度安全開源大規模圖資料庫HugeGraph

圖網絡未來幾何?12位學者寄望2021大爆發

金庸的小說《天龍八部》是啥意思?何謂「天龍八部」?

初識分布式圖資料庫 Nebula Graph 2.0 Query Engine

作品回顧:金庸小說在內地:從「毒草」到名著

GAT 圖注意力網絡 Graph Attention Network

使用Keras構建具有自定義結構和層次圖卷積神經網絡

為什麼深度模型在 CV 領域好使,但用在圖網絡上就這麼難呢?

Bioinf | 生物醫學網絡中的圖嵌入：方法、應用與評價

知深行遠:關於圖神經網絡層數加深的探索

北大、微軟亞洲研究院:高效的大規模圖神經網絡計算

圖神經網絡的表達能力,究竟有多強大?

通俗易懂:圖卷積神經網絡入門詳解

一文學會網絡分析——Co-occurrence網絡圖在R中的實現

少不看水滸,老不看三國,男不看西遊,女不看紅樓有沒有道理呢