Descriptive statistics. Summary statistics that quantitatively describe or summarize features of a collection of information.

Static Public Member Functions
static	range (array $numbers)
	Range - the difference between the largest and smallest values. It is the size of the smallest interval that contains all the data.

static	midrange (array $numbers)
	Midrange - the mean of the largest and smallest values. It is the midpoint of the range; as such, it is a measure of central tendency.

static	variance (array $numbers, int $ν)
	Variance.

static	populationVariance (array $numbers)
	Population variance - Use when all possible observations of the system are present.

static	sampleVariance (array $numbers)
	Unbiased sample variance. Use when only a subset of all possible observations of the system is present.

static	weightedSampleVariance (array $numbers, array $weights, bool $biased=false)
	Weighted sample variance.

static	standardDeviation (array $numbers, bool $SD＋=false)
	Standard deviation. A measure that is used to quantify the amount of variation or dispersion of a set of data values.

static	sd (array $numbers, bool $SD＋=false)
	sd - Standard deviation - convenience method.

static	meanAbsoluteDeviation (array $numbers)
	MAD - mean absolute deviation.

static	medianAbsoluteDeviation (array $numbers)
	MAD - median absolute deviation.

static	quartiles (array $numbers, string $method='exclusive')
	Quartiles. Three points that divide the data set into four equal groups, each group comprising a quarter of the data.

static	quartilesExclusive (array $numbers)
	Quartiles - Exclusive method. Three points that divide the data set into four equal groups, each group comprising a quarter of the data.

static	quartilesInclusive (array $numbers)
	Quartiles - Inclusive method (R method). Three points that divide the data set into four equal groups, each group comprising a quarter of the data.

static	interquartileRange (array $numbers, string $method='exclusive')
	IQR - Interquartile range (midspread, middle fifty). A measure of statistical dispersion.

static	iqr (array $numbers, string $method='exclusive')
	IQR - Interquartile range (midspread, middle fifty). Convenience wrapper function for interquartileRange.

static	percentile (array $numbers, float $P)
	Compute the P-th percentile of a list of numbers.

static	midhinge (array $numbers)
	Midhinge. The average of the first and third quartiles; it is thus a measure of location.

static	coefficientOfVariation (array $numbers)
	Coefficient of variation (cᵥ), also known as relative standard deviation (RSD).

static	describe (array $numbers, bool $population=false)
	Get a report of all the descriptive statistics over a list of numbers. Includes mean, median, mode, range, midrange, variance, standard deviation, quartiles, etc.

static	fiveNumberSummary (array $numbers)
	Five number summary. A descriptive statistic that provides information about a set of observations.

Public Attributes
const	POPULATION = true

const	SAMPLE = false

Detailed Description

Descriptive statistics. Summary statistics that quantitatively describe or summarize features of a collection of information.

https://en.wikipedia.org/wiki/Descriptive_statistics

Member Function Documentation

◆ coefficientOfVariation()

static MathPHP\Statistics\Descriptive::coefficientOfVariation ( array $numbers )

static

Coefficient of variation (cᵥ), also known as relative standard deviation (RSD).

A standardized measure of dispersion of a probability distribution or frequency distribution, defined as the ratio of the standard deviation to the mean. It is often expressed as a percentage. https://en.wikipedia.org/wiki/Coefficient_of_variation

\[ c_v = \frac{\sigma}{\mu} \]
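As a rough illustration of the ratio above, here is a standalone Python sketch. It is not MathPHP's PHP implementation; the function name is hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the description does not say which standard deviation is used.

```python
import math


def coefficient_of_variation(numbers):
    """Ratio of the standard deviation to the mean (c_v = sigma / mu)."""
    n = len(numbers)
    mean = sum(numbers) / n
    # Assumption: sample variance (divide by n - 1); use n for a full population.
    sample_variance = sum((x - mean) ** 2 for x in numbers) / (n - 1)
    return math.sqrt(sample_variance) / mean


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32
cv = coefficient_of_variation(data)
print(f"c_v = {cv:.4f} ({cv:.1%})")
```

Because the mean is the divisor, the ratio is undefined when the mean is zero.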

Parameters

array $numbers

Returns: number

Exceptions

Exception

◆ describe()

static MathPHP\Statistics\Descriptive::describe	(	array	$numbers,
		bool	$population = `false`
	)

static

Get a report of all the descriptive statistics over a list of numbers. Includes mean, median, mode, range, midrange, variance, standard deviation, quartiles, etc.

Parameters

array	$numbers
bool	$population	: true means all possible observations of the system are present; false means a sample is used.

Returns: array [ n, mean, median, mode, range, midrange, variance, sd, CV, mean_mad, median_mad, quartiles, skewness, kurtosis, sem, ci_95, ci_99 ]

Exceptions

Exception

◆ fiveNumberSummary()

static MathPHP\Statistics\Descriptive::fiveNumberSummary ( array $numbers )

static

Five number summary. A descriptive statistic that provides information about a set of observations.

It consists of the five most important sample percentiles:
1) the sample minimum (smallest observation)
2) the lower quartile or first quartile
3) the median (middle value)
4) the upper quartile or third quartile
5) the sample maximum (largest observation)

https://en.wikipedia.org/wiki/Five-number_summary

Parameters

array $numbers

Returns: array [min, Q1, median, Q3, max]

◆ interquartileRange()

static MathPHP\Statistics\Descriptive::interquartileRange	(	array	$numbers,
		string	$method = `'exclusive'`
	)

static

IQR - Interquartile range (midspread, middle fifty). A measure of statistical dispersion.

Difference between the upper and lower quartiles. https://en.wikipedia.org/wiki/Interquartile_range

IQR = Q₃ - Q₁

Parameters

array	$numbers
string	$method	What quartile method to use (optional - default: exclusive)

Returns: number

◆ iqr()

static MathPHP\Statistics\Descriptive::iqr	(	array	$numbers,
		string	$method = `'exclusive'`
	)

static

IQR - Interquartile range (midspread, middle fifty). Convenience wrapper function for interquartileRange.

Parameters

array	$numbers
string	$method	What quartile method to use (optional - default: exclusive)

Returns: number

◆ meanAbsoluteDeviation()

static MathPHP\Statistics\Descriptive::meanAbsoluteDeviation ( array $numbers )

static

MAD - mean absolute deviation.

The average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. (https://en.wikipedia.org/wiki/Average_absolute_deviation)

\[ \mathrm{MAD} = \frac{\sum |x_i - \bar{x}|}{N} \]

x̄ is the mean
N is the number of numbers in the population set
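A minimal Python sketch of the formula above, for illustration only (it is not the library's PHP code, and the function name is hypothetical):

```python
def mean_absolute_deviation(numbers):
    """Average absolute distance of each value from the mean."""
    n = len(numbers)
    mean = sum(numbers) / n
    return sum(abs(x - mean) for x in numbers) / n


data = [2, 2, 3, 4, 14]  # mean is 5; absolute deviations are 3, 3, 2, 1, 9
print(mean_absolute_deviation(data))  # 18 / 5
```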

Parameters

array $numbers

Returns: number|null

◆ medianAbsoluteDeviation()

static MathPHP\Statistics\Descriptive::medianAbsoluteDeviation ( array $numbers )

static

MAD - median absolute deviation.

The median of the absolute deviations from the data's median. It is a summary statistic of statistical dispersion or variability, and a robust measure of the variability of a univariate sample of quantitative data. (https://en.wikipedia.org/wiki/Median_absolute_deviation)

MAD = median(|xᵢ - x̃|)

x̃ is the median
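The formula above can be sketched in a few lines of standalone Python (an illustration under the stated definition, not the library's PHP code; the function name is hypothetical). The example data contains one large value to show why this statistic is described as robust.

```python
from statistics import median


def median_absolute_deviation(numbers):
    """Median of the absolute deviations from the data's median."""
    center = median(numbers)
    return median(abs(x - center) for x in numbers)


data = [1, 1, 2, 2, 4, 6, 9]  # median is 2; deviations are 1, 1, 0, 0, 2, 4, 7
print(median_absolute_deviation(data))
```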

Parameters

array $numbers

Returns: number|null

◆ midhinge()

static MathPHP\Statistics\Descriptive::midhinge ( array $numbers )

static

Midhinge. The average of the first and third quartiles; it is thus a measure of location.

Equivalently, it is the 25% trimmed mid-range or 25% midsummary; it is an L-estimator. https://en.wikipedia.org/wiki/Midhinge

Midhinge = (Q₁ + Q₃) / 2

Parameters

array $numbers

Returns: float|null

◆ midrange()

static MathPHP\Statistics\Descriptive::midrange ( array $numbers )

static

Midrange - the mean of the largest and smallest values. It is the midpoint of the range; as such, it is a measure of central tendency.

(https://en.wikipedia.org/wiki/Mid-range)

\[ M = \frac{\max x + \min x}{2} \]
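For illustration, the midrange formula reduces to a one-line computation. This is a standalone Python sketch with a hypothetical function name, not the library's PHP code.

```python
def midrange(numbers):
    """Mean of the largest and smallest values."""
    return (max(numbers) + min(numbers)) / 2


print(midrange([1, 3, 9, 4]))  # (9 + 1) / 2
```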

Parameters

array $numbers

Returns: number|null

◆ percentile()

static MathPHP\Statistics\Descriptive::percentile	(	array	$numbers,
		float	$P
	)

static

Compute the P-th percentile of a list of numbers.

Uses the linear interpolation between closest ranks method (second variant, C = 1) to find the P-th percentile (0 <= P <= 100) of a list of N ordered values (sorted from least to greatest). NumPy and Excel use a similar method. https://en.wikipedia.org/wiki/Percentile

\[ x = \frac{P}{100}(N - 1) + 1 \]

P = percentile
N = number of elements in list

\[ \nu(x) = \nu_{\lfloor x \rfloor} + (x \bmod 1)\left(\nu_{\lfloor x \rfloor + 1} - \nu_{\lfloor x \rfloor}\right) \]

⌊x⌋ = integer part of x
x mod 1 = fractional part of x
ν_⌊x⌋ = number in position ⌊x⌋ in the sorted list of numbers
ν_⌊x⌋₊₁ = number in position ⌊x⌋ + 1 in the sorted list of numbers
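The two formulas above can be traced with a short standalone Python sketch. It is an illustration of the interpolation rule as described here, not the library's PHP code; the function name and the choice of `ValueError` are assumptions. Positions are 1-based, as in the formulas.

```python
import math


def percentile(numbers, p):
    """P-th percentile by linear interpolation between closest ranks (C = 1)."""
    if not numbers or not 0 <= p <= 100:
        raise ValueError("need a non-empty list and 0 <= P <= 100")
    ordered = sorted(numbers)
    n = len(ordered)
    x = p / 100 * (n - 1) + 1   # fractional rank, 1-based
    lower = math.floor(x)       # integer part of x
    fraction = x - lower        # fractional part of x
    if lower >= n:              # P = 100: no neighbour above the last value
        return ordered[-1]
    v_lower = ordered[lower - 1]  # value at position floor(x)
    v_upper = ordered[lower]      # value at position floor(x) + 1
    return v_lower + fraction * (v_upper - v_lower)


data = [15, 20, 35, 40, 50]
# P = 40 gives x = 2.6, so the result lies 60% of the way from 20 to 35.
print(round(percentile(data, 40), 6))
```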

Parameters

array	$numbers
float	$P	percentile to calculate

Returns: float - the value corresponding to the P-th percentile

Exceptions

Exception

◆ populationVariance()

static MathPHP\Statistics\Descriptive::populationVariance ( array $numbers )

static

Population variance - Use when all possible observations of the system are present.

If applied to a subset of the data (a sample), it gives a biased estimate of the variance.

\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \]

μ is the population mean
N is the number of numbers in the population set

Parameters

array $numbers

Returns: float|null

Exceptions

Exception

◆ quartiles()

static MathPHP\Statistics\Descriptive::quartiles	(	array	$numbers,
		string	$method = `'exclusive'`
	)

static

Quartiles. Three points that divide the data set into four equal groups, each group comprising a quarter of the data.

https://en.wikipedia.org/wiki/Quartile

There are multiple methods for computing quartiles:

Inclusive
Exclusive

Parameters

array	$numbers
string	$method	What quartile method to use (optional - default: exclusive)

Returns: array [ 0%, Q1, Q2, Q3, 100%, IQR ]

◆ quartilesExclusive()

static MathPHP\Statistics\Descriptive::quartilesExclusive ( array $numbers )

static

Quartiles - Exclusive method. Three points that divide the data set into four equal groups, each group comprising a quarter of the data.

https://en.wikipedia.org/wiki/Quartile

0% is the smallest number
Q1 (25%) is the first quartile (lower quartile, 25th percentile)
Q2 (50%) is the second quartile (median, 50th percentile)
Q3 (75%) is the third quartile (upper quartile, 75th percentile)
100% is the largest number
interquartile_range is the difference between the upper and lower quartiles (IQR = Q₃ - Q₁)

Method used

Use the median to divide the ordered data set into two halves.
- If there is an odd number of data points in the original ordered data set, do not include the median (the central value in the ordered list) in either half.
- If there is an even number of data points in the original ordered data set, split this data set exactly in half.
The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.

This rule is employed by the TI-83 calculator boxplot and "1-Var Stats" functions. It is the most basic method and is commonly taught in math textbooks.
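The splitting rule above can be sketched as follows. This is a standalone Python illustration of the exclusive method as described, not the library's PHP code; the function name is hypothetical and the sketch assumes at least two values. The dictionary keys mirror the documented return array.

```python
from statistics import median


def quartiles_exclusive(numbers):
    """Quartiles by the exclusive method (median left out of both halves)."""
    ordered = sorted(numbers)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Odd n: skip the central value; even n: split exactly in half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q1, q2, q3 = median(lower), median(ordered), median(upper)
    return {"0%": ordered[0], "Q1": q1, "Q2": q2, "Q3": q3,
            "100%": ordered[-1], "IQR": q3 - q1}


data = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # 11 values, median 40
exclusive = quartiles_exclusive(data)
print(exclusive)
```

With 11 values, the lower half is the first five values and the upper half is the last five, so the quartiles are themselves data points.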

Parameters

array $numbers

Returns: array [ 0%, Q1, Q2, Q3, 100%, IQR ]

◆ quartilesInclusive()

static MathPHP\Statistics\Descriptive::quartilesInclusive ( array $numbers )

static

Quartiles - Inclusive method (R method). Three points that divide the data set into four equal groups, each group comprising a quarter of the data.

https://en.wikipedia.org/wiki/Quartile

0% is the smallest number
Q1 (25%) is the first quartile (lower quartile, 25th percentile)
Q2 (50%) is the second quartile (median, 50th percentile)
Q3 (75%) is the third quartile (upper quartile, 75th percentile)
100% is the largest number
interquartile_range is the difference between the upper and lower quartiles (IQR = Q₃ - Q₁)

Method used

Use the median to divide the ordered data set into two halves.
- If there is an odd number of data points in the original ordered data set, include the median (the central value in the ordered list) in both halves.
- If there is an even number of data points in the original ordered data set, split this data set exactly in half.
The lower quartile value is the median of the lower half of the data. The upper quartile value is the median of the upper half of the data.

The values found by this method are also known as "Tukey's hinges". In R, fivenum() returns these values; note that R's quantile() uses a different interpolation rule (type 7) by default.
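A standalone Python sketch of the inclusive rule described above (an illustration, not the library's PHP code; the function name is hypothetical). The only difference from the exclusive method is that, for an odd number of values, the median belongs to both halves.

```python
from statistics import median


def quartiles_inclusive(numbers):
    """Quartiles by the inclusive method (median kept in both halves)."""
    ordered = sorted(numbers)
    n = len(ordered)
    half = n // 2
    # Odd n: the central value joins both halves; even n: split in half.
    lower = ordered[:half + 1] if n % 2 else ordered[:half]
    upper = ordered[half:]
    q1, q2, q3 = median(lower), median(ordered), median(upper)
    return {"0%": ordered[0], "Q1": q1, "Q2": q2, "Q3": q3,
            "100%": ordered[-1], "IQR": q3 - q1}


data = [6, 7, 15, 36, 39, 40, 41, 42, 43, 47, 49]  # 11 values, median 40
inclusive = quartiles_inclusive(data)
print(inclusive)
```

For this data each half has six values, so Q1 and Q3 fall between data points (25.5 and 42.5).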

Parameters

array $numbers

Returns: array [ 0%, Q1, Q2, Q3, 100%, IQR ]

◆ range()

static MathPHP\Statistics\Descriptive::range ( array $numbers )

static

Range - the difference between the largest and smallest values. It is the size of the smallest interval that contains all the data.

It provides an indication of statistical dispersion. (https://en.wikipedia.org/wiki/Range_(statistics))

R = max x - min x

Parameters

array $numbers

Returns: number|null

◆ sampleVariance()

static MathPHP\Statistics\Descriptive::sampleVariance ( array $numbers )

static

Unbiased sample variance. Use when only a subset of all possible observations of the system is present.

\[ S^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]

x̄ is the sample mean
n is the number of numbers in the sample set
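As a hedged illustration of the n - 1 divisor, the following standalone Python sketch computes the sample variance by hand and cross-checks it against `statistics.variance` from Python's standard library, which uses the same divisor. The function name is hypothetical and this is not the library's PHP code.

```python
import math
import statistics


def sample_variance(numbers):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(numbers)
    mean = sum(numbers) / n
    return sum((x - mean) ** 2 for x in numbers) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32
s2 = sample_variance(data)
# The standard library's sample variance should agree.
print(math.isclose(s2, statistics.variance(data)))
```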

Parameters

array $numbers

Returns: float|null

Exceptions

Exception

◆ sd()

static MathPHP\Statistics\Descriptive::sd	(	array	$numbers,
		bool	$SD＋ = `false`
	)

static

sd - Standard deviation - convenience method

Parameters

array	$numbers
bool	$SD＋	: true returns SD+ (uses population variance); false returns SD (uses sample variance). Default: false.

Returns: float|null

Exceptions

Exception

◆ standardDeviation()

static MathPHP\Statistics\Descriptive::standardDeviation	(	array	$numbers,
		bool	$SD＋ = `false`
	)

static

Standard deviation. A measure that is used to quantify the amount of variation or dispersion of a set of data values.

A low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set. A high standard deviation indicates that the data points are spread out over a wider range of values. (https://en.wikipedia.org/wiki/Standard_deviation)

SD = √⟮S²⟯ = √⟮sample variance⟯
SD+ = √⟮σ²⟯ = √⟮population variance⟯
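To make the effect of the flag concrete, here is a standalone Python sketch that follows the parameter description for this method: a true flag selects the population variance, and the default selects the sample variance. The function and argument names are hypothetical, and this is not the library's PHP code.

```python
import math


def standard_deviation(numbers, sd_plus=False):
    """Square root of the population (sd_plus=True) or sample variance."""
    n = len(numbers)
    mean = sum(numbers) / n
    divisor = n if sd_plus else n - 1
    return math.sqrt(sum((x - mean) ** 2 for x in numbers) / divisor)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32
print(standard_deviation(data, sd_plus=True))  # sqrt(32 / 8)
print(standard_deviation(data))                # sqrt(32 / 7), slightly larger
```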

Parameters

array	$numbers
bool	$SD＋	: true returns SD+ (uses population variance); false returns SD (uses sample variance). Default: false.

Returns: float|null

Exceptions

Exception

◆ variance()

static MathPHP\Statistics\Descriptive::variance	(	array	$numbers,
		int	$ν
	)

static

Variance.

Variance measures how far a set of numbers is spread out. A variance of zero indicates that all the values are identical. Variance is always non-negative: a small variance indicates that the data points tend to be very close to the mean (expected value) and hence to each other. A high variance indicates that the data points are very spread out around the mean and from each other. (https://en.wikipedia.org/wiki/Variance)

\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{\nu} \]

Generalized method that allows setting the degrees of freedom.
For population variance, set d.f. (ν) to n.
For sample variance, set d.f. (ν) to n - 1.
Or use the populationVariance or sampleVariance convenience methods.

μ is the mean
ν is the degrees of freedom, which is usually the number of numbers in the population set, or n - 1 for a sample set.
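The generalized form can be sketched as below, with the degrees of freedom passed in explicitly. This is a standalone Python illustration of the formula, not the library's PHP code; the function name, the argument name `dof`, and the `ValueError` check are assumptions.

```python
def variance(numbers, dof):
    """Sum of squared deviations from the mean, divided by dof (nu)."""
    if dof <= 0:
        raise ValueError("degrees of freedom must be positive")
    mean = sum(numbers) / len(numbers)
    return sum((x - mean) ** 2 for x in numbers) / dof


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared deviations sum to 32
n = len(data)
print(variance(data, n))      # population variance: 32 / 8
print(variance(data, n - 1))  # sample variance: 32 / 7
```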

Parameters

array	$numbers
int	$ν	degrees of freedom

Returns: float|null

Exceptions

Exception

◆ weightedSampleVariance()

static MathPHP\Statistics\Descriptive::weightedSampleVariance	(	array	$numbers,
		array	$weights,
		bool	$biased = `false`
	)

static

Weighted sample variance.

Biased case

\[ \sigma_w^2 = \frac{\sum w_i (x_i - \mu_w)^2}{\sum w_i} \]

Unbiased estimator for frequency weights

\[ \sigma_w^2 = \frac{\sum w_i (x_i - \mu_w)^2}{\sum w_i - 1} \]

μw is the weighted mean

https://en.wikipedia.org/wiki/Weighted_arithmetic_mean#Weighted_sample_variance
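A standalone Python sketch of both cases above, for illustration only (not the library's PHP code; the function name and the length check are assumptions). The example treats the weights as frequencies: values 2, 4, 5, 7, 9 with weights 1, 3, 2, 1, 1 stand for the expanded list 2, 4, 4, 4, 5, 5, 7, 9.

```python
def weighted_sample_variance(numbers, weights, biased=False):
    """Weighted variance; the unbiased form assumes frequency weights."""
    if len(numbers) != len(weights):
        raise ValueError("numbers and weights must have the same length")
    total_weight = sum(weights)
    weighted_mean = sum(w * x for x, w in zip(numbers, weights)) / total_weight
    weighted_ss = sum(w * (x - weighted_mean) ** 2
                      for x, w in zip(numbers, weights))
    divisor = total_weight if biased else total_weight - 1
    return weighted_ss / divisor


values = [2, 4, 5, 7, 9]
freqs = [1, 3, 2, 1, 1]  # total weight 8, weighted mean 5
print(weighted_sample_variance(values, freqs, biased=True))  # 32 / 8
print(weighted_sample_variance(values, freqs))               # 32 / 7
```

With frequency weights, the two results match the population and sample variance of the expanded list.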

Parameters

array	$numbers
array	$weights
bool	$biased

Returns: number

Exceptions

Exception

The documentation for this class was generated from the following file:

src/Statistics/Descriptive.php
