Measurement and Statistics Terminology: Effect Size (Standardized and Unstandardized Effect Sizes)

2021-02-19 學界我說

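In statistics, an effect size is a number that quantifies the strength of a phenomenon. Statistics used as effect sizes include the correlation between two variables, the regression coefficients in a regression model, the difference in means between treatments, and so on. For any of these measures, a larger absolute value indicates a stronger effect, that is, a more pronounced phenomenon. The concept of effect size is complementary to hypothesis testing. Effect sizes often play an important role in estimating statistical power, determining the required sample size, and conducting meta-analyses.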

When should you use a standardized effect size, and when an unstandardized one?

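If the units of a variable are meaningful, such as duration, height, or income, it is advisable to report unstandardized effect sizes. The result can then be read as follows: a change of one hour, one centimeter, 1 yuan, or 1,000 yuan in the predictor brings a change of c in the dependent variable, where c' is the change caused directly and ab is the effect that is mediated.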

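If the units of a variable have no practical meaning, as with quality of life, marital satisfaction, or well-being, it is advisable to report standardized effect sizes, so that the sizes of the effects of different independent variables can be compared.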
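As a rough illustration of this advice, the sketch below computes an unstandardized and a standardized slope for a simple regression with one predictor. The function names and the small data set are hypothetical, invented only for this example. The standardized slope is obtained by rescaling the unstandardized slope by the ratio of the standard deviations.

```python
from statistics import mean, stdev


def slope(x, y):
    """Unstandardized least-squares slope of y on x (units of y per unit of x)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardized_slope(x, y):
    """Slope expressed in standard deviations of y per standard deviation of x."""
    return slope(x, y) * stdev(x) / stdev(y)


# Hypothetical data: hours of study and a test score.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]
minutes = [h * 60 for h in hours]  # same predictor, different unit

print(slope(hours, score))                # about 4.1 points per hour
print(slope(minutes, score))              # about 4.1 / 60 points per minute
print(standardized_slope(hours, score))   # unit-free
print(standardized_slope(minutes, score)) # unchanged by the change of unit
```

The unstandardized slope changes when the predictor is re-expressed in minutes, which is fine when the unit is meaningful, while the standardized slope stays the same, which is what makes it comparable across predictors measured on arbitrary scales.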

The term effect size can refer to a standardized measure of effect (such as r, Cohen's d, or the odds ratio), or to an unstandardized measure (e.g., the difference between group means or the unstandardized regression coefficients). Standardized effect size measures are typically used when:

the metrics of variables being studied do not have intrinsic meaning (e.g., a score on a personality test on an arbitrary scale),

results from multiple studies are being combined,

some or all of the studies use different scales, or

it is desired to convey the size of an effect relative to the variability in the population.

In meta-analyses, standardized effect sizes are used as a common measure that can be calculated for different studies and then combined into an overall summary.
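The difference between the two kinds of measure can be sketched for a two-group comparison. The example below assumes the common pooled-standard-deviation form of Cohen's d; the function names and the data are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference(a, b):
    """Unstandardized effect size: difference between group means, in raw units."""
    return mean(a) - mean(b)


def cohens_d(a, b):
    """Standardized effect size: mean difference divided by the pooled SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical scores for a treatment group and a control group.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

# The same scores on a scale that is ten times larger.
treatment_x10 = [v * 10 for v in treatment]
control_x10 = [v * 10 for v in control]

print(mean_difference(treatment, control))          # raw difference in means
print(mean_difference(treatment_x10, control_x10))  # ten times larger
print(cohens_d(treatment, control))                 # unit-free
print(cohens_d(treatment_x10, control_x10))         # same value on the new scale
```

Because d does not depend on the scale of measurement, studies that used different scales can be placed on a common metric before they are combined.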

About 50 to 100 different measures of effect size are known.

Correlation Family: Effect Sizes Based On "Variance Explained"

These effect sizes estimate the amount of the variance within an experiment that is "explained" or "accounted for" by the experiment's model.

Pearson r or Correlation Coefficient

Pearson's correlation, often denoted r and introduced by Karl Pearson, is widely used as an effect size when paired quantitative data are available, for instance when studying the relationship between birth weight and longevity. The correlation coefficient can also be used when the data are binary. Pearson's r ranges from −1 to 1, with −1 indicating a perfect negative linear relation, 1 indicating a perfect positive linear relation, and 0 indicating no linear relation between the two variables. Cohen gives the following guidelines for the social sciences: r = 0.10 is a small effect, r = 0.30 a medium effect, and r = 0.50 a large effect.

A related effect size is r², the coefficient of determination (also referred to as R² or "r-squared"), calculated as the square of the Pearson correlation r. In the case of paired data, this is a measure of the proportion of variance shared by the two variables, and it varies from 0 to 1. For example, with an r of 0.21 the coefficient of determination is 0.0441, meaning that 4.4% of the variance of either variable is shared with the other variable. Because r² is never negative, it does not convey the direction of the correlation between the two variables.
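A minimal sketch of these two quantities, computed by hand from the usual sums of squares and cross-products, is given below. The function names and the data are hypothetical; the last lines reproduce the arithmetic of the r = 0.21 example.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two paired samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coefficient_of_determination(x, y):
    """Square of Pearson's r: the proportion of variance shared by x and y."""
    return pearson_r(x, y) ** 2


# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
y_flipped = [-v for v in y]  # reverses the direction of the relation

print(pearson_r(x, y), coefficient_of_determination(x, y))
print(pearson_r(x, y_flipped), coefficient_of_determination(x, y_flipped))

# The example from the text: r = 0.21 gives r squared = 0.0441.
print(round(0.21 ** 2, 4))
```

Flipping the sign of one variable changes the sign of r but leaves the squared value unchanged, which is why the coefficient of determination says nothing about direction.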

Eta-squared (η²) describes the ratio of variance explained in the dependent variable by a predictor while controlling for other predictors, making it analogous to r². It is calculated as η² = SS_treatment / SS_total. Eta-squared is a biased estimator of the variance explained by the model in the population, because it estimates only the effect size in the sample. It shares with r² the weakness that each additional variable automatically increases its value. As a result it tends to overestimate the population effect size, although the bias grows smaller as the sample grows larger.

A less biased estimator of the variance explained in the population is omega-squared, ω² = (SS_treatment − df_treatment × MS_error) / (SS_total + MS_error). This form of the formula is limited to between-subjects analysis with equal sample sizes in all cells. Since it is less biased (although not unbiased), ω² is preferable to η²; however, it can be more inconvenient to calculate for complex analyses. A generalized form of the estimator has been published for between-subjects and within-subjects analysis, repeated measures, mixed designs, and randomized block design experiments. In addition, methods to calculate partial ω² for individual factors and combined factors in designs with up to three independent variables have been published.
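The two estimators can be compared on a small one-way, between-subjects design with equal cell sizes. The sketch below assumes the usual one-way formulas, η² = SS_treatment / SS_total and ω² = (SS_treatment − df_treatment × MS_error) / (SS_total + MS_error); the function names and the data are hypothetical.

```python
from statistics import mean


def anova_components(groups):
    """Return SS_treatment, SS_total, df_treatment and MS_error for a one-way design."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    df_treatment = len(groups) - 1
    df_error = len(values) - len(groups)
    ms_error = (ss_total - ss_treatment) / df_error
    return ss_treatment, ss_total, df_treatment, ms_error


def eta_squared(groups):
    """Proportion of sample variance explained by group membership."""
    ss_treatment, ss_total, _, _ = anova_components(groups)
    return ss_treatment / ss_total


def omega_squared(groups):
    """Less biased estimate of the population variance explained."""
    ss_treatment, ss_total, df_treatment, ms_error = anova_components(groups)
    return (ss_treatment - df_treatment * ms_error) / (ss_total + ms_error)


# Hypothetical data: three groups of three observations each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]

print(eta_squared(groups))
print(omega_squared(groups))  # smaller than eta-squared for the same data
```

For these data ω² comes out below η², which matches the description of η² as the more optimistic of the two.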
