
Variance



In probability theory and statistics, the variance of a random variable is a measure of its statistical dispersion, indicating how far its values typically lie from the expected value. The variance of a real-valued random variable is its second central moment, and it also happens to be its second cumulant. The standard deviation of a random variable is the non-negative square root of its variance.


Definition

Let X be a random variable with distribution F, and let \mu = \operatorname{E}(X) be its expected value (mean). The variance of the random variable X, or equivalently of the distribution F, is: \operatorname{var}(X) = \operatorname{E}( ( X - \mu ) ^ 2 ).

That is, the variance is the expected value of the squared deviation of X from its own mean. In plain language, it is the average of the squared distance of each data point from the mean; it is thus the mean squared deviation. The variance of a random variable X is typically written \operatorname{var}(X), \sigma_X^2, or simply \sigma^2.

Note that the above definition can be used for both discrete and continuous random variables.
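As a small worked illustration of the definition in the discrete case, the following sketch computes the variance of a fair six-sided die directly as \operatorname{E}((X - \mu)^2). The helper name `variance` and the dictionary representation of the distribution are assumptions made for this example only.

```python
from fractions import Fraction

def variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
die_variance = variance(die)
print(die_variance)  # 35/12
```

Exact fractions are used here so that the result is not affected by floating-point rounding.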

Many distributions, such as the Cauchy distribution, do not have a variance because the relevant integral diverges. In particular, if a distribution does not have an expected value, it does not have a variance either. The converse is not true: there are distributions for which the expected value exists, but the variance does not.

Properties

If the variance is defined, we can conclude that it is never negative because the squares are positive or zero. The unit of variance is the square of the unit of observation. For example, the variance of a set of heights measured in centimeters will be given in square centimeters. This fact is inconvenient and has motivated many statisticians to instead use the square root of the variance, known as the standard deviation, as a summary of dispersion.

It follows easily from the definition that the variance does not change when the variable is shifted. That is, if the variable is "displaced" by an amount b by taking X + b, the variance of the resulting random variable is unchanged. By contrast, if the variable is multiplied by a scaling factor a, the variance is multiplied by a^2. More formally, if a and b are real constants and X is a random variable whose variance is defined, then

\operatorname{var}(aX+b)=a^2\operatorname{var}(X).
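The scaling rule can be checked on a concrete case. The sketch below uses a made-up three-point distribution and arbitrary constants a and b; the helper names `variance` and `affine` are hypothetical and exist only for this illustration.

```python
from fractions import Fraction

def variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

def affine(pmf, a, b):
    """Distribution of a*X + b; assumes a != 0 so distinct values stay distinct."""
    return {a * x + b: p for x, p in pmf.items()}

# Hypothetical distribution chosen only for illustration.
X = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}
a, b = -3, 10

lhs = variance(affine(X, a, b))   # var(aX + b)
rhs = a ** 2 * variance(X)        # a^2 var(X)
print(lhs == rhs)  # True
```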

Another formula for the variance that follows in a straightforward manner from the linearity of expected values and the above definition is:

\operatorname{var}(X)= \operatorname{E}(X^2 - 2\,X\,\operatorname{E}(X) + (\operatorname{E}(X))^2 ) = \operatorname{E}(X^2) - 2(\operatorname{E}(X))^2 + (\operatorname{E}(X))^2 = \operatorname{E}(X^2) - (\operatorname{E}(X))^2.

This is often used to calculate the variance in practice.
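The two expressions can be compared numerically. The sketch below treats each data point as equally likely and evaluates both forms; the function names and data are assumptions for this example. In floating-point arithmetic the form \operatorname{E}(X^2) - (\operatorname{E}(X))^2 subtracts two large, nearly equal numbers when the mean is large relative to the spread, so it can lose precision even though the two forms are algebraically identical.

```python
def variance_two_pass(data):
    """Mean squared deviation from the mean: E((X - mu)^2)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n

def variance_shortcut(data):
    """E(X^2) - (E(X))^2."""
    n = len(data)
    mean = sum(data) / n
    mean_of_squares = sum(x * x for x in data) / n
    return mean_of_squares - mean ** 2

small = [4.0, 7.0, 13.0, 16.0]
shifted = [x + 1e9 for x in small]  # same spread, much larger mean

print(variance_two_pass(small), variance_shortcut(small))  # 22.5 22.5
# For the shifted data the first value is still 22.5; the second is
# subject to rounding error and may differ noticeably.
print(variance_two_pass(shifted), variance_shortcut(shifted))
```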

One reason for using the variance in preference to other measures of dispersion is that the variance of the sum (or the difference) of independent random variables is the sum of their variances. A weaker condition than independence, called uncorrelatedness, also suffices. In general,

\operatorname{var}(aX+bY) =a^2 \operatorname{var}(X) + b^2 \operatorname{var}(Y) + 2ab\, \operatorname{cov}(X, Y).

Here \operatorname{cov} is the covariance, which is zero for independent random variables (if it exists).
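The general formula can be verified on a small joint distribution in which X and Y are correlated. Everything in the sketch below (the joint distribution, the constants, and the helper names `expect`, `var`, and `cov`) is a hypothetical setup for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y): {(x, y): probability}.
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (2, 3): Fraction(1, 4),
}

def expect(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

def var(g):
    """Variance of the random variable g(X, Y)."""
    m = expect(g)
    return expect(lambda x, y: (g(x, y) - m) ** 2)

def cov(g, h):
    """Covariance of g(X, Y) and h(X, Y)."""
    mg, mh = expect(g), expect(h)
    return expect(lambda x, y: (g(x, y) - mg) * (h(x, y) - mh))

X = lambda x, y: x
Y = lambda x, y: y
a, b = 2, -5

lhs = var(lambda x, y: a * x + b * y)
rhs = a**2 * var(X) + b**2 * var(Y) + 2 * a * b * cov(X, Y)
print(lhs == rhs)  # True
```

Because the covariance here is non-zero, dropping the last term would give a different value, which is why the simple additivity rule needs independence or at least uncorrelatedness.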

Approximating the variance of a function

The delta method uses a Taylor expansion of the function about the mean to approximate the variance of a function of one or more random variables. For example, the approximate variance of a function of one variable is given by

\operatorname{var}\left[f(X)\right]\approx \left(f'(\operatorname{E}\left[X\right])\right)^2\operatorname{var}\left[X\right]

provided that f(\cdot) is twice differentiable and that the mean and variance of X are finite.
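To get a feel for the quality of this approximation, the sketch below compares it with the exact variance of f(X) for f = log and a made-up distribution concentrated near 10. The distribution and helper names are assumptions for this example; the approximation is expected to be good here only because X varies little around its mean.

```python
import math

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

# Hypothetical distribution concentrated near 10, so a local linear
# approximation of f around E[X] is reasonable.
X = {9.9: 0.25, 10.0: 0.5, 10.1: 0.25}

f = math.log
f_prime = lambda x: 1.0 / x  # derivative of log

# Exact variance of f(X), from the distribution of the transformed values.
fX = {f(x): p for x, p in X.items()}
exact = variance(fX)

# Delta-method approximation: (f'(E[X]))^2 * var[X].
approx = f_prime(mean(X)) ** 2 * variance(X)

print(exact, approx)  # both close to 5e-05
```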

Population variance and sample variance

In general, the population variance of a finite population is given by

\sigma^2 = \sum_{i=1}^N  \left(x_i - \overline{x} \right)^ 2 \, \Pr(x_i),

where \overline{x} is the population mean. This is merely a special case of the general definition of variance introduced above, but restricted to finite populations.

In many practical situations, the true variance of a population is not known a priori and must be estimated. When dealing with large finite populations, it is almost never possible to find the exact value of the population variance, due to time, cost, and other resource constraints. When dealing with infinite populations, this is generally impossible.

A common method of estimating the variance of large (finite or infinite) populations is sampling. We start with a finite sample of values taken from the overall population. Suppose that our sample is the sequence (y_1,\dots,y_N). There are two distinct things we can do with this sample: first, we can treat it as a finite population and describe its variance; second, we can estimate the underlying population variance from this sample.

The variance of the sample (y_1,\dots,y_N), viewed as a finite population, is

\sigma^2 = \frac{1}{N} \sum_{i=1}^N  \left(y_i - \overline{y} \right)^ 2,

where \overline{y} is the sample mean. This is sometimes known as the sample variance; however, that term is ambiguous. Some electronic calculators can calculate \sigma^2 at the press of a button, in which case that button is usually labelled "\sigma^2".

When using the sample (y_1,\dots,y_N) to estimate the variance of the underlying larger population the sample was drawn from, it may be tempting to equate the population variance with \sigma^2. However, \sigma^2 is a biased estimator of the population variance. The following is an unbiased estimator:

s^2 = \frac{1}{N-1} \sum_{i=1}^N  \left(y_i - \overline{y} \right)^ 2,

where \overline{y} is the sample mean. Note that the term N − 1 in the denominator above contrasts with the equation for \sigma^2, which has N in the denominator. Note also that s^2 is generally not identical to the true population variance; it is merely an estimate, though perhaps a very good one if N is large. Because s^2 is a variance estimate and is based on a finite sample, it too is sometimes referred to as the sample variance.
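The two quantities differ only in the denominator, as the following sketch shows on a made-up sample. The function names are hypothetical; the last line uses `statistics.pvariance` and `statistics.variance` from the Python standard library, which divide by N and N − 1 respectively.

```python
import statistics

def variance_as_population(sample):
    """sigma^2: divide the sum of squared deviations by N."""
    n = len(sample)
    ybar = sum(sample) / n
    return sum((y - ybar) ** 2 for y in sample) / n

def variance_unbiased(sample):
    """s^2: divide the sum of squared deviations by N - 1 (needs N >= 2)."""
    n = len(sample)
    ybar = sum(sample) / n
    return sum((y - ybar) ** 2 for y in sample) / (n - 1)

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

print(variance_as_population(sample))  # 4.0
print(variance_unbiased(sample))       # 32/7, slightly larger

# The standard library exposes the same two quantities.
print(statistics.pvariance(sample), statistics.variance(sample))
```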

One common source of confusion is that the term sample variance may refer either to the unbiased estimator s^2 of the population variance, or to the variance \sigma^2 of the sample viewed as a finite population. Both can be used to estimate the true population variance, but only s^2 is unbiased. Intuitively, computing the variance by dividing by N instead of N − 1 underestimates the population variance on average. This is because we are using the sample mean \overline{y} as an estimate of the unknown population mean \mu, and the raw counts of repeated elements in the sample instead of the unknown true probabilities.
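The bias can be seen exactly on a tiny example by enumerating every possible sample rather than simulating. The sketch below assumes a made-up population of three equally likely values and samples of size 2 drawn with replacement; all names are hypothetical.

```python
from fractions import Fraction
from itertools import product

population = [0, 3, 6]  # hypothetical population, equally likely values
n = 2                   # sample size

def divide_by(sample, denominator):
    """Sum of squared deviations from the sample mean, over `denominator`."""
    ybar = Fraction(sum(sample), len(sample))
    return sum((y - ybar) ** 2 for y in sample) / denominator

mu = Fraction(sum(population), len(population))
true_variance = sum((x - mu) ** 2 for x in population) / len(population)

# Every ordered sample of size n drawn with replacement is equally likely.
samples = list(product(population, repeat=n))
mean_biased = sum(divide_by(s, n) for s in samples) / len(samples)
mean_unbiased = sum(divide_by(s, n - 1) for s in samples) / len(samples)

print(true_variance, mean_biased, mean_unbiased)  # 6 3 6
```

Averaged over all samples, dividing by N gives (N − 1)/N times the true variance (here half of it), while dividing by N − 1 recovers the true variance.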

In practice, for large N, the distinction is often a minor one. In the course of statistical measurements, sample sizes so small as to warrant the use of the unbiased variance virtually never occur. In this context Press et al.[1] commented that if the difference between n and n−1 ever matters to you, then you are probably up to no good anyway, for example trying to substantiate a questionable hypothesis with marginal data.

An unbiased estimator of the variance

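Here we show why s^2 is an unbiased estimator of the population variance. An estimator \hat{\theta} of a parameter \theta is called unbiased if \operatorname{E}\{ \hat{\theta}\} = \theta. So to show that s^2 is unbiased, it suffices to show that \operatorname{E}\{ s^2\} = \sigma^2. Assume that the x_i are drawn independently from a population with mean \mu and variance \sigma^2.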

\operatorname{E} \{ s^2 \}  = \operatorname{E} \left\{ \frac{1}{n-1} \sum_{i=1}^n  \left( x_i - \overline{x} \right) ^ 2 \right\}
= \frac{1}{n-1} \sum_{i=1}^n  \operatorname{E} \left\{ \left( x_i - \overline{x} \right) ^ 2 \right\}
= \frac{1}{n-1} \sum_{i=1}^n  \operatorname{E} \left\{ \left( (x_i - \mu) - (\overline{x} - \mu) \right) ^ 2 \right\}
= \frac{1}{n-1} \sum_{i=1}^n  \left\{ \operatorname{E} \left\{ (x_i - \mu)^2 \right\}   - 2 \operatorname{E} \left\{ (x_i - \mu) (\overline{x} - \mu) \right\}   + \operatorname{E} \left\{ (\overline{x} - \mu)  ^ 2 \right\} \right\}
= \frac{1}{n-1} \sum_{i=1}^n \left\{ \sigma^2  - 2 \left( \frac{1}{n} \sum_{j=1}^n \operatorname{E} \left\{ (x_i - \mu) (x_j - \mu) \right\} \right)  + \frac{1}{n^2} \sum_{j=1}^n \sum_{k=1}^n \operatorname{E} \left\{ (x_j - \mu) (x_k - \mu) \right\} \right\}
= \frac{1}{n-1} \sum_{i=1}^n  \left\{ \sigma^2  - \frac{2 \sigma^2}{n}  + \frac{\sigma^2}{n} \right\}
= \frac{1}{n-1} \sum_{i=1}^n \frac{(n-1)\sigma^2}{n}
= \frac{(n-1)\sigma^2}{n-1} = \sigma^2

See also algorithms for calculating variance.

Another proof

E\left[ \sum_{i=1}^n {(X_i-\overline{X})^2}\right] =E\left[ \sum_{i=1}^n {X_i^2}\right] - nE[ \overline{X}^2]
=nE[X_i^2] - \frac{1}{n} E\left[\left(\sum_{i=1}^n X_i\right)^2\right]
=n(\operatorname{var}[X_i] + (E[X_i])^2) - \frac{1}{n} E\left[\left(\sum_{i=1}^n X_i\right)^2\right]
=n\sigma^2 + \frac{1}{n}(nE[X_i])^2 - \frac{1}{n}E\left[\left(\sum_{i=1}^n X_i\right)^2\right]
=n\sigma^2 - \frac{1}{n}\left( E\left[\left(\sum_{i=1}^n X_i\right)^2\right] - \left(E\left[\sum_{i=1}^n X_i\right]\right)^2\right)
=n\sigma^2 - \frac{1}{n}\left(\operatorname{var}\left[\sum_{i=1}^n X_i\right]\right) =n\sigma^2 - \frac{1}{n}(n\sigma^2) =(n-1)\sigma^2.

Generalizations

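If X is a vector taking values in R^n, each of whose components is a one-dimensional random variable, we call X a random vector. The variance of a random vector is a natural generalization of the variance of a one-dimensional random variable, and is defined as \operatorname{E}[(X - \mu)(X - \mu)^T], where \mu = \operatorname{E}(X) and X^T is the transpose of X. This variance is a positive semi-definite square matrix, commonly called the covariance matrix.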
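As an illustration, the sketch below builds the covariance matrix entry by entry for a made-up random vector in R^2 that takes four values with equal probability. The data and variable names are assumptions for this example, and plain Python lists stand in for a matrix type.

```python
from fractions import Fraction

# Hypothetical random vector X in R^2, each listed value equally likely.
outcomes = [(0, 0), (0, 1), (1, 1), (2, 3)]
n = len(outcomes)
dim = len(outcomes[0])

mu = [Fraction(sum(v[i] for v in outcomes), n) for i in range(dim)]

# Entry (i, j) is E[(X_i - mu_i)(X_j - mu_j)], i.e. the (i, j) entry of
# E[(X - mu)(X - mu)^T].
cov_matrix = [
    [sum((v[i] - mu[i]) * (v[j] - mu[j]) for v in outcomes) / n
     for j in range(dim)]
    for i in range(dim)
]

for row in cov_matrix:
    print([str(entry) for entry in row])
```

The diagonal entries are the variances of the individual components, and the matrix is symmetric because entry (i, j) and entry (j, i) are the same expectation.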

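If X is a complex-valued random variable, its variance is defined as \operatorname{E}[(X - \mu)(X - \mu)^*], where X^* is the complex conjugate of X. Under this definition the variance is a real number.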
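A quick check of the complex case, using Python's built-in complex numbers and a made-up variable that takes four values with equal probability (the data are an assumption for this example):

```python
# Hypothetical complex random variable, each value equally likely.
values = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]
n = len(values)

mu = sum(values) / n
# (z - mu) * conj(z - mu) equals |z - mu|^2, so every term is real
# and non-negative.
variance = sum((z - mu) * (z - mu).conjugate() for z in values) / n

print(variance.real, variance.imag)  # 2.0 0.0
```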

History

The term variance was first introduced by Ronald Fisher in his paper The Correlation Between Relatives on the Supposition of Mendelian Inheritance.

Moment of inertia

The variance of a probability distribution is analogous to the moment of inertia in classical mechanics of a corresponding linear mass distribution, with respect to rotation about its center of mass. It is because of this analogy that such things as the variance are called moments of probability distributions.
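The analogy can be made concrete for point masses on a line: normalizing the masses to sum to 1 gives a probability distribution whose mean is the center of mass, and the moment of inertia about the center of mass equals the total mass times the variance. The masses and positions below are made up for illustration.

```python
from fractions import Fraction

# Hypothetical point masses on a line: {position: mass}.
masses = {0: 2, 1: 1, 4: 1}
total_mass = sum(masses.values())

center_of_mass = Fraction(sum(x * m for x, m in masses.items()), total_mass)
moment_of_inertia = sum(
    m * (x - center_of_mass) ** 2 for x, m in masses.items()
)

# Normalizing the masses gives a probability distribution on the same points.
pmf = {x: Fraction(m, total_mass) for x, m in masses.items()}
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(mean == center_of_mass)                    # True
print(moment_of_inertia == total_mass * variance)  # True
```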


References

  1. Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. (1986). Numerical Recipes: The Art of Scientific Computing. Cambridge: Cambridge University Press.
