This section shows how to join and filter geospatial data. We start by loading the packages and the data.
sf: simple features, a standard way to encode spatial vector data
tidyverse: makes it easy to install and load the core tidyverse packages in a single command
janitor: simple functions for examining and cleaning dirty data
tmap/tmaptools: tmap is an actively maintained open-source R library for drawing thematic maps
library(pacman)
p_load(sf, tidyverse, janitor, tmap, tmaptools)
tmap_mode("plot")

# this will take a few minutes
EW <- st_read("https://opendata.arcgis.com/datasets/8edafbe3276d4b56aec60991cbddda50_2.geojson")
LondonData <- read_csv("https://files.datapress.com/london/dataset/ward-profiles-and-atlas/2015-09-24T14:21:24/ward-profiles-excel-version.csv",
                       locale = locale(encoding = "latin1"),
                       na = "n/a")

LondonData before cleaning
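LondonData after cleaning, that is, after tidying the column names with janitor's clean_names function (LondonData <- clean_names(LondonData)): spaces become underscores and uppercase letters become lowercase.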
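To make the effect of clean_names concrete, here is a minimal sketch on a toy table. The data frame `toy` and its values are hypothetical and only mimic the kind of column names found in LondonData.

```r
library(janitor)

# Hypothetical toy table with untidy column names (not the real LondonData)
toy <- data.frame(
  "Ward name" = c("A", "B"),
  "New code"  = c("E09000001", "E09000002"),
  check.names = FALSE  # keep the spaces and capitals in the names
)

# clean_names() converts the column names to snake_case by default
cleaned <- clean_names(toy)
names(cleaned)
```

Here "New code" becomes new_code, which is the column name the join later in this section relies on.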
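Filter for entries in lad15cd that begin with "E09"
# keep only the rows whose lad15cd starts with "E09"
LondonMap <- EW %>%
  filter(str_detect(lad15cd, "^E09"))
LondonMap

Visualise LondonMap with qtm()
qtm(LondonMap, fill = "lad15nm")

Note that we have not cleaned the column names of the EW table here. Next, we combine the geographic table EW with the attribute table LondonData by means of a join. The full sequence of steps is as follows: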
BoroughDataMap <- EW %>%
  clean_names() %>%
  filter(str_detect(lad15cd, "^E09")) %>%
  merge(., LondonData, by.x = "lad15cd", by.y = "new_code", no.dups = TRUE) %>%  # inner join
  distinct(., lad15cd, .keep_all = TRUE)  # drop duplicate entries

qtm(BoroughDataMap, fill = "rate_of_job_seekers_allowance_jsa_claimants_2015")
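The pipeline above chains three steps: a prefix filter, an inner join, and de-duplication. The sketch below replays them on two small, made-up data frames so that the row counts at each stage are easy to follow. The tables `geo` and `attrs`, the area names, and the `value` column are hypothetical stand-ins for EW and LondonData, and there is no geometry column.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for the EW boundary table (no geometry column here)
geo <- data.frame(
  lad15cd = c("E09000001", "E09000002", "E06000001"),
  lad15nm = c("Area A", "Area B", "Area C")
)

# Hypothetical stand-in for LondonData; one code appears twice,
# and one code has no match in geo
attrs <- data.frame(
  new_code = c("E09000001", "E09000001", "E09000002", "E05000026"),
  value    = c(1.5, 2.0, 3.1, 4.2)
)

joined <- geo %>%
  filter(str_detect(lad15cd, "^E09")) %>%                # keep codes starting with "E09"
  merge(., attrs, by.x = "lad15cd", by.y = "new_code")   # all = FALSE by default, i.e. an inner join

# keep the first row for each lad15cd, retaining all other columns
deduped <- distinct(joined, lad15cd, .keep_all = TRUE)

nrow(joined)   # rows after the join, duplicates included
nrow(deduped)  # one row per code
```

Because merge defaults to all = FALSE, codes present in only one table are dropped, and distinct with .keep_all = TRUE then keeps a single row per code. Which of the duplicated rows survives depends on row order, so if that matters it is safer to sort or aggregate before de-duplicating.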