# Descriptive statistics

15 min read · Jul 9, 2021


A large number of methods compute descriptive statistics and related operations on a DataFrame. Most of these are aggregations such as sum() and mean(), but some of them, such as cumsum() and cumprod(), produce an object of the same size. Generally speaking, these methods take an axis argument, just like ndarray.{sum, std, …}, and the axis can be specified by name or by integer:

• DataFrame − “index” (axis=0, default), “columns” (axis=1)
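As a minimal sketch of the two ways to name an axis, the snippet below sums a small frame along each one. The `scores` frame and its values are hypothetical and used only for this illustration.

```python
import pandas as pd

# Hypothetical two-column frame, used only for this illustration.
scores = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 or "index": collapse the rows, leaving one value per column.
per_column = scores.sum(axis="index")

# axis=1 or "columns": collapse the columns, leaving one value per row.
per_row = scores.sum(axis="columns")

print(per_column.tolist())  # [6, 60]
print(per_row.tolist())     # [11, 22, 33]
```

The integer and the name are interchangeable, so `scores.sum(axis=0)` gives the same result as `scores.sum(axis="index")`.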

Let us create a DataFrame and use this object throughout this chapter for all the operations.

# Example


```python
import pandas as pd
import numpy as np

# Create a dictionary of Series
d = {'Name': pd.Series(['Tom', 'James', 'Ricky', 'Vin', 'Steve', 'Smith', 'Jack',
                        'Lee', 'David', 'Gasper', 'Betina', 'Andres']),
     'Age': pd.Series([25, 26, 25, 23, 30, 29, 23, 34, 40, 30, 51, 46]),
     'Rating': pd.Series([4.23, 3.24, 3.98, 2.56, 3.20, 4.6, 3.8, 3.78, 2.98, 4.80, 4.10, 3.65])}

# Create a DataFrame
df = pd.DataFrame(d)
print(df)
```

Its output is as follows −

```
    Age  Name    Rating
0   25   Tom     4.23
1   26   James   3.24
2   25   Ricky   3.98
3   23   Vin     2.56
4   30   Steve   3.20
5   29   Smith   4.60
6   23   Jack    3.80
7   34   Lee     3.78
8   40   David   2.98
9   30   Gasper  4.80
10  51   Betina  4.10
11  46   Andres  3.65
```

# sum()

Returns the sum of the values for the requested axis. By default, axis is index (axis=0).


```python
# df is the DataFrame created in the first example
print(df.sum())
```

Its output is as follows −

```
Age                                                    382
Name     TomJamesRickyVinSteveSmithJackLeeDavidGasperBe...
Rating                                               44.92
dtype: object
```

Each column is summed individually (strings are concatenated).

# axis=1

Passing axis=1 sums across the columns of each row instead. Strings and numbers in the same row cannot be added together, so numeric_only=True restricts the sum to the numeric columns.

```python
# df is the DataFrame created in the first example
print(df.sum(axis=1, numeric_only=True))
```

Its output is as follows −

```
0    29.23
1    29.24
2    28.98
3    25.56
4    33.20
5    33.60
6    26.80
7    37.78
8    42.98
9    34.80
10   55.10
11   49.65
dtype: float64
```

# mean()

Returns the average value of each numeric column.

```python
# df is the DataFrame created in the first example
# numeric_only=True skips the string column 'Name'
print(df.mean(numeric_only=True))
```

Its output is as follows −

```
Age       31.833333
Rating     3.743333
dtype: float64
```

# std()

Returns the Bessel-corrected (sample) standard deviation of the numerical columns.

```python
# df is the DataFrame created in the first example
# numeric_only=True skips the string column 'Name'
print(df.std(numeric_only=True))
```

Its output is as follows −

```
Age       9.232682
Rating    0.661628
dtype: float64
```
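To make the Bessel correction concrete, here is a small sketch that computes the standard deviation of the Age values both ways. It assumes only the documented `ddof` parameter of `Series.std` and `np.std`; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

ages = pd.Series([25, 26, 25, 23, 30, 29, 23, 34, 40, 30, 51, 46])

sample_std = ages.std()                # pandas default: ddof=1, divide by n - 1
population_std = ages.std(ddof=0)      # divide by n instead
numpy_std = np.std(ages.to_numpy())    # NumPy default: ddof=0

print(round(sample_std, 6))            # 9.232682, matching the output above
print(population_std == numpy_std or np.isclose(population_std, numpy_std))
```

The sample value is always slightly larger than the population value, which matters when comparing pandas results with NumPy results on the same data.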

# Functions & Description

Let us now look at the functions under Descriptive Statistics in Python Pandas. The following table lists the important functions −

| Sr.No. | Function | Description |
|---|---|---|
| 1 | count() | Number of non-null observations |
| 2 | sum() | Sum of values |
| 3 | mean() | Mean of values |
| 4 | median() | Median of values |
| 5 | mode() | Mode of values |
| 6 | std() | Standard deviation of the values |
| 7 | min() | Minimum value |
| 8 | max() | Maximum value |
| 9 | abs() | Absolute value |
| 10 | prod() | Product of values |
| 11 | cumsum() | Cumulative sum |
| 12 | cumprod() | Cumulative product |

Note − Because a DataFrame is a heterogeneous data structure, generic operations do not work with every function.

• Functions like sum() and cumsum() work with both numeric and character (string) data without any error. In practice, character aggregations are rarely used, but these functions do not throw an exception.
• Functions like abs() and cumprod() throw an exception when the DataFrame contains character or string data, because such operations cannot be performed.
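The difference can be sketched with a tiny mixed frame. The `mixed` frame below is hypothetical, the string column is given an explicit object dtype, and the exact exception type may vary between pandas versions, so treat this as an illustration rather than a guarantee.

```python
import pandas as pd

# Hypothetical frame with one string column and one numeric column.
mixed = pd.DataFrame({
    "Name": pd.Series(["Tom", "James"], dtype=object),
    "Age": pd.Series([25, 26]),
})

# sum() works on both columns: strings are concatenated, numbers are added.
totals = mixed.sum()

# abs() has no meaning for strings, so the call is expected to fail.
try:
    mixed.abs()
    abs_failed = False
except TypeError:
    abs_failed = True

# Restricting the frame to numeric columns first avoids the error.
numeric_abs = mixed.select_dtypes(include="number").abs()
print(totals["Name"], totals["Age"], abs_failed)
```

Selecting the numeric columns with `select_dtypes` before calling abs() or cumprod() is one way to keep such operations working on a heterogeneous frame.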

# Summarizing Data

The describe() function computes a summary of statistics pertaining to the DataFrame columns.

```python
# df is the DataFrame created in the first example
print(df.describe())
```

Its output is as follows −

```
             Age     Rating
count  12.000000  12.000000
mean   31.833333   3.743333
std     9.232682   0.661628
min    23.000000   2.560000
25%    25.000000   3.230000
50%    29.500000   3.790000
75%    35.500000   4.132500
max    51.000000   4.800000
```

This function gives the count, mean, standard deviation, minimum, maximum and quartiles, from which the IQR follows. By default it excludes the character columns and summarizes only the numeric columns. ‘include’ is the argument used to specify which columns should be considered for the summary. It takes a list of values:

• object − Summarizes String columns
• number − Summarizes Numeric columns
• all − Summarizes all columns together (Should not pass it as a list value)

Now, use the following statement in the program and check the output −

```python
# df is the DataFrame created in the first example
print(df.describe(include=['object']))
```

Its output is as follows −

```
         Name
count      12
unique     12
top     Ricky
freq        1
```

Now, use the following statement and check the output −

```python
# df is the DataFrame created in the first example
print(df.describe(include='all'))
```

Its output is as follows −

```
              Age   Name     Rating
count   12.000000     12  12.000000
unique        NaN     12        NaN
top           NaN  Ricky        NaN
freq          NaN      1        NaN
mean    31.833333    NaN   3.743333
std      9.232682    NaN   0.661628
min     23.000000    NaN   2.560000
25%     25.000000    NaN   3.230000
50%     29.500000    NaN   3.790000
75%     35.500000    NaN   4.132500
max     51.000000    NaN   4.800000
```

# numpy.random() in Python

random is a module in the NumPy library. It contains functions for generating random numbers: simple random data generation methods, permutation functions, distribution functions, and random generator functions.

The functions in the random module are as follows:

# Simple random data

The simple random data functions are as follows:

1) np.random.rand(d0, d1, …, dn)

This function of the random module generates random values in a given shape, drawn uniformly from [0, 1).

Example:

```python
import numpy as np

a = np.random.rand(5, 2)
a
```

Output:

```
array([[0.74710182, 0.13306399],
       [0.01463718, 0.47618842],
       [0.98980426, 0.48390004],
       [0.58661785, 0.62895758],
       [0.38432729, 0.90384119]])
```

2) np.random.randn(d0, d1, …, dn)

This function of the random module returns a sample (or samples) from the “standard normal” distribution.

Example:

```python
import numpy as np

a = np.random.randn(2, 2)
a
b = np.random.randn()
b
```

Output:

```
array([[ 1.43327469, -0.02019121],
       [ 1.54626422,  1.05831067]])
-0.3080190768904835
```
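Because randn always samples the standard normal distribution, other means and standard deviations are obtained by scaling and shifting. The sketch below assumes illustrative values for `mu` and `sigma` and fixes a seed only so that the run is repeatable.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the sketch is repeatable

mu, sigma = 3.0, 2.5  # illustrative target mean and standard deviation

# Scale by sigma and shift by mu to sample from N(mu, sigma**2).
samples = sigma * np.random.randn(100_000) + mu

print(samples.mean(), samples.std())
```

With this many samples the printed mean and standard deviation should land close to 3.0 and 2.5.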

3) np.random.randint(low[, high, size, dtype])

This function of the random module generates random integers from low (inclusive) to high (exclusive). When high is omitted, as in the example below, the integers are drawn from [0, low).

Example:

```python
import numpy as np

a = np.random.randint(3, size=10)
a
```

Output:

`array([1, 1, 1, 2, 0, 0, 0, 0, 0, 0])`
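The half-open interval is easy to get wrong, so here is a short sketch that simulates die rolls. The seed and the variable name `rolls` are illustrative choices.

```python
import numpy as np

np.random.seed(1)  # fixed seed so the sketch is repeatable

# Die faces 1..6: low=1 is included, high=7 is excluded.
rolls = np.random.randint(1, 7, size=1000)

print(rolls.min(), rolls.max())
```

Passing 6 as the upper bound instead of 7 would silently produce a die that never shows a six.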

4) np.random.random_integers(low[, high, size])

This function of the random module generates random integers of type np.int_ between low and high, inclusive. It has been deprecated since NumPy 1.11 in favor of randint.

Example:

```python
import numpy as np

a = np.random.random_integers(3)
a
b = type(np.random.random_integers(3))
b
c = np.random.random_integers(5, size=(3, 2))
c
```

Output:

```
2
<type 'numpy.int32'>
array([[1, 1],
       [2, 5],
       [1, 3]])
```
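Since random_integers is deprecated, the same inclusive range can be sketched with randint by adding one to the upper bound. The names `low` and `high` are illustrative.

```python
import numpy as np

low, high = 1, 5

# randint excludes its upper bound, so pass high + 1 to include `high`.
c = np.random.randint(low, high + 1, size=(3, 2))

print(c)
```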

5) np.random.random_sample([size])

This function of the random module generates random floats in the half-open interval [0.0, 1.0).

Example:

```python
import numpy as np

a = np.random.random_sample()
a
b = type(np.random.random_sample())
b
c = np.random.random_sample((5,))
c
```

Output:

```
0.09250360565571492
<type 'float'>
array([0.34665418, 0.47027209, 0.75944969, 0.37991244, 0.14159746])
```

6) np.random.random([size])

This function of the random module generates random floats in the half-open interval [0.0, 1.0). It is an alias for random_sample.

Example:

```python
import numpy as np

a = np.random.random()
a
b = type(np.random.random())
b
c = np.random.random((5,))
c
```

Output:

```
0.008786953974334155
<type 'float'>
array([0.05530122, 0.59133394, 0.17258794, 0.6912388 , 0.33412534])
```

7) np.random.ranf([size])

This function of the random module generates random floats in the half-open interval [0.0, 1.0). It is an alias for random_sample.

Example:

```python
import numpy as np

a = np.random.ranf()
a
b = type(np.random.ranf())
b
c = np.random.ranf((5,))
c
```

Output:

```
0.2907792098474542
<type 'float'>
array([0.34084881, 0.07268237, 0.38161256, 0.46494681, 0.88071377])
```

8) np.random.sample([size])

This function of the random module generates random floats in the half-open interval [0.0, 1.0). It is an alias for random_sample.

Example:

```python
import numpy as np

a = np.random.sample()
a
b = type(np.random.sample())
b
c = np.random.sample((5,))
c
```

Output:

```
0.012298209913766511
<type 'float'>
array([0.71878544, 0.11486169, 0.38189074, 0.14303308, 0.07217287])
```
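All four functions above sample from [0.0, 1.0). To sample from a different interval [a, b), the result can be scaled and shifted, as in this sketch. The bounds `a` and `b` are illustrative values.

```python
import numpy as np

a, b = -5.0, 0.0  # illustrative interval bounds

# Stretch [0, 1) to width (b - a), then shift it to start at a.
samples = (b - a) * np.random.random_sample((3, 2)) + a

print(samples)
```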

9) np.random.choice(a[, size, replace, p])

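This function of the random module generates a random sample from a given 1-D array. If a is an integer, as in the example below, the sample is drawn from np.arange(a).

Example:

```python
import numpy as np

a = np.random.choice(5, 3)
a
# p gives the probability of picking each of the 5 values
b = np.random.choice(5, 3, p=[0.2, 0.1, 0.4, 0.2, 0.1])
b
```

Output:

```
array([0, 3, 4])
array([2, 2, 2], dtype=int64)
```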
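By default choice samples with replacement, which is why the second output above can repeat a value. A small sketch of sampling without replacement follows; the `names` array is an illustrative input.

```python
import numpy as np

names = np.array(['Tom', 'James', 'Ricky', 'Vin', 'Steve'])

# replace=False guarantees that no element is picked twice.
picked = np.random.choice(names, size=3, replace=False)

print(picked)
```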

10) np.random.bytes(length)

This function of random module is used to generate random bytes.

Example:

```python
import numpy as np

a = np.random.bytes(7)
a
```

Output:

`'nQ\x08\x83\xf9\xde\x8a'`

# Permutations

The permutation functions are as follows:

1) np.random.shuffle()

This function is used for modifying a sequence in-place by shuffling its contents.

Example:

```python
import numpy as np

a = np.arange(12)
a
np.random.shuffle(a)
a
```

Output:

```
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
array([10,  3,  2,  4,  5,  8,  0,  9,  1, 11,  7,  6])
```

2) np.random.permutation()

This function permutes a sequence randomly or returns a permuted range.

Example:

```python
import numpy as np

a = np.random.permutation(12)
a
```

Output:

`array([ 8,  7,  3, 11,  6,  0,  9, 10,  2,  5,  4,  1])`
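The practical difference between the two functions is whether the input is modified. The sketch below uses illustrative variable names to show that permutation returns a new array while shuffle works in place and returns None.

```python
import numpy as np

original = np.arange(6)

# permutation returns a shuffled copy and leaves its input alone.
shuffled_copy = np.random.permutation(original)
untouched = np.array_equal(original, np.arange(6))

# shuffle reorders the array in place and returns None.
result = np.random.shuffle(original)

print(untouched, result)
```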

# Distributions

The distribution functions are as follows:

1) beta(a, b[, size])

This function is used to draw samples from a Beta distribution.

Example:

```python
import numpy as np

# Draw 10 samples from a Beta distribution with a=2 and b=3
s1 = np.random.beta(2, 3, 10)
s1
```

2) binomial(n, p[, size])

This function is used to draw samples from a binomial distribution.

Example:

```python
import numpy as np

n, p = 10, .6
s1 = np.random.binomial(n, p, 10)
s1
```

Output:

`array([6, 7, 7, 9, 3, 7, 8, 6, 6, 4])`
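As a quick sanity check, the average of many binomial draws should approach n * p. The sketch below reuses n = 10 and p = 0.6 from the example; the seed and sample count are illustrative.

```python
import numpy as np

np.random.seed(42)  # fixed seed so the sketch is repeatable

n, p = 10, 0.6
draws = np.random.binomial(n, p, 100_000)

# The sample mean should be close to the theoretical mean n * p = 6.
print(draws.mean(), n * p)
```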

3) chisquare(df[, size])

This function is used to draw samples from a chi-square distribution with df degrees of freedom.

Example:

```python
import numpy as np

# Draw 4 samples from a chi-square distribution with 2 degrees of freedom
np.random.chisquare(2, 4)
```

The call returns an array of four non-negative floats; the values differ on every run.

4) dirichlet(alpha[, size])

This function is used to draw samples from the Dirichlet distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

s1 = np.random.dirichlet((10, 5, 3), 20).transpose()
plt.barh(range(20), s1[0])
plt.barh(range(20), s1[1], left=s1[0], color='g')
plt.barh(range(20), s1[2], left=s1[0] + s1[1], color='r')
plt.title("Lengths of Strings")
plt.show()
```

5) exponential([scale, size])

This function is used to draw samples from an exponential distribution.

Example:

```python
import numpy as np

# Draw 5 samples from an exponential distribution with scale (mean) 10
amounts = np.random.exponential(10, 5)
amounts
```

6) f(dfnum, dfden[, size])

This function is used to draw samples from an F distribution.

Example:

```python
import numpy as np

dfno = 1.
dfden = 48.
s1 = np.random.f(dfno, dfden, 10)
np.sort(s1)
```

Output:

```
array([0.00264041, 0.04725478, 0.07140803, 0.19526217, 0.23979   ,
       0.24023478, 0.63141254, 0.95316446, 1.40281789, 1.68327507])
```

7) gamma(shape[, scale, size])

This function is used to draw samples from a Gamma distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt
import scipy.special as spss

shape, scale = 2., 2.
s1 = np.random.gamma(shape, scale, 1000)
count, bins, ignored = plt.hist(s1, 50, density=True)
# Gamma probability density evaluated at the bin edges
a = bins**(shape - 1) * (np.exp(-bins / scale) /
                         (spss.gamma(shape) * scale**shape))
plt.plot(bins, a, linewidth=2, color='r')
plt.show()
```

8) geometric(p[, size])

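This function is used to draw samples from a geometric distribution.

Example:

```python
import numpy as np

a = np.random.geometric(p=0.35, size=10000)
# Fraction of trials that succeeded on the first attempt
(a == 1).sum() / 10000.
```

Output:

The result varies from run to run but stays close to p = 0.35.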

9) gumbel([loc, scale, size])

This function is used to draw samples from a Gumbel distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

loc, scale = 0, 0.2
s1 = np.random.gumbel(loc, scale, 1000)
count, bins, ignored = plt.hist(s1, 30, density=True)
# Gumbel probability density evaluated at the bin edges
plt.plot(bins, (1 / scale) * np.exp(-(bins - loc) / scale)
         * np.exp(-np.exp(-(bins - loc) / scale)), linewidth=2, color='r')
plt.show()
```

10) hypergeometric(ngood, nbad, nsample[, size])

This function is used to draw samples from a Hypergeometric distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

good, bad, samp = 100, 2, 10
s1 = np.random.hypergeometric(good, bad, samp, 1000)
plt.hist(s1)
plt.show()
```

Output:

```
(array([ 13.,   0.,   0.,   0.,   0., 163.,   0.,   0.,   0., 824.]),
 array([ 8. ,  8.2,  8.4,  8.6,  8.8,  9. ,  9.2,  9.4,  9.6,  9.8, 10. ]),
 <a list of 10 Patch objects>)
```

11) laplace([loc, scale, size])

This function is used to draw samples from the Laplace or double exponential distribution with specified location and scale.

Example:

```python
import numpy as np

location, scale = 0., 2.
s = np.random.laplace(location, scale, 10)
s
```

Output:

```
array([-2.77127948, -1.46401453, -0.03723516, -1.61223942,  2.29590691,
        1.74297722,  1.49438411,  0.30325513, -0.15948891, -4.99669747])
```

12) logistic([loc, scale, size])

This function is used to draw samples from a logistic distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

location, scale = 10, 1
s1 = np.random.logistic(location, scale, 10000)
count, bins, ignored = plt.hist(s1, bins=50)
count
bins
ignored
plt.show()
```

Output:

```
array([1.000e+00, 1.000e+00, 1.000e+00, 0.000e+00, 1.000e+00, 1.000e+00,
       1.000e+00, 5.000e+00, 7.000e+00, 1.100e+01, 1.800e+01, 3.500e+01,
       5.300e+01, 6.700e+01, 1.150e+02, 1.780e+02, 2.300e+02, 3.680e+02,
       4.910e+02, 6.400e+02, 8.250e+02, 9.100e+02, 9.750e+02, 1.039e+03,
       9.280e+02, 8.040e+02, 6.530e+02, 5.240e+02, 3.380e+02, 2.470e+02,
       1.650e+02, 1.150e+02, 8.500e+01, 6.400e+01, 3.300e+01, 1.600e+01,
       2.400e+01, 1.400e+01, 4.000e+00, 5.000e+00, 2.000e+00, 2.000e+00,
       1.000e+00, 1.000e+00, 0.000e+00, 1.000e+00, 0.000e+00, 0.000e+00,
       0.000e+00, 1.000e+00])
array([ 0.50643911,  0.91891814,  1.33139717,  1.7438762 ,  2.15635523,
        2.56883427,  2.9813133 ,  3.39379233,  3.80627136,  4.2187504 ,
        4.63122943,  5.04370846,  5.45618749,  5.86866652,  6.28114556,
        6.69362459,  7.10610362,  7.51858265,  7.93106169,  8.34354072,
        8.75601975,  9.16849878,  9.58097781,  9.99345685, 10.40593588,
       10.81841491, 11.23089394, 11.64337298, 12.05585201, 12.46833104,
       12.88081007, 13.2932891 , 13.70576814, 14.11824717, 14.5307262 ,
       14.94320523, 15.35568427, 15.7681633 , 16.18064233, 16.59312136,
       17.00560039, 17.41807943, 17.83055846, 18.24303749, 18.65551652,
       19.06799556, 19.48047459, 19.89295362, 20.30543265, 20.71791168,
       21.13039072])
<a list of 50 Patch objects>
```

13) lognormal([mean, sigma, size])

This function is used to draw samples from a log-normal distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

mu, sigma = 2., 1.
s1 = np.random.lognormal(mu, sigma, 1000)
count, bins, ignored = plt.hist(s1, 100, density=True, align='mid')
a = np.linspace(min(bins), max(bins), 10000)
# Log-normal probability density evaluated on the grid a
pdf = (np.exp(-(np.log(a) - mu)**2 / (2 * sigma**2))
       / (a * sigma * np.sqrt(2 * np.pi)))
plt.plot(a, pdf, linewidth=2, color='r')
plt.axis('tight')
plt.show()
```

14) logseries(p[, size])

This function is used to draw samples from a logarithmic series distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

x = .6
s1 = np.random.logseries(x, 10000)
count, bins, ignored = plt.hist(s1)

def logseries(k, p):
    # Probability mass function of the logarithmic series distribution
    return -p**k / (k * np.log(1 - p))

plt.plot(bins, logseries(bins, x) * count.max() / logseries(bins, x).max(), 'r')
plt.show()
```

15) multinomial(n, pvals[, size])

This function is used to draw samples from a multinomial distribution.

Example:

```python
import numpy as np

# Throw a fair six-sided die 20 times and count how often each face appears
np.random.multinomial(20, [1/6.]*6, size=1)
```

Output:

`array([[4, 2, 5, 5, 3, 1]])`
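Each row of the result is one experiment, and its counts always add up to n. The sketch below repeats the 20-throw experiment three times; the names `throws` and `row_totals` are illustrative.

```python
import numpy as np

# Three independent experiments of 20 throws of a fair six-sided die.
throws = np.random.multinomial(20, [1/6.]*6, size=3)

# Every row distributes exactly 20 throws over the six faces.
row_totals = throws.sum(axis=1)

print(throws.shape, row_totals.tolist())  # (3, 6) [20, 20, 20]
```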

16) multivariate_normal(mean, cov[, size, …])

This function is used to draw samples from a multivariate normal distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

mean = (1, 2)
covariance = [[1, 0], [0, 100]]
a, b = np.random.multivariate_normal(mean, covariance, 5000).T
plt.plot(a, b, 'x')
plt.axis('equal')
plt.show()
```

17) negative_binomial(n, p[, size])

This function is used to draw samples from a negative binomial distribution.

Example:

```python
import numpy as np

# Number of failures before the first success, with success probability 0.1
s1 = np.random.negative_binomial(1, 0.1, 100000)
for i in range(1, 11):
    probability = sum(s1 < i) / 100000.
    print(i, "wells drilled, probability of one success =", probability)
```

Output:

The printed estimates vary from run to run. Each one approaches 1 − 0.9^i, which is about 0.10 for 1 well, 0.19 for 2 wells, and 0.65 for 10 wells.

18) noncentral_chisquare(df, nonc[, size])

This function is used to draw samples from a noncentral chi-square distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

# density=True replaces the normed argument removed from newer matplotlib versions
val = plt.hist(np.random.noncentral_chisquare(3, 25, 100000), bins=200, density=True)
plt.show()
```

19) normal([loc, scale, size])

This function is used to draw samples from a normal distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

mu, sigma = 0, 0.2  # mean and standard deviation
s1 = np.random.normal(mu, sigma, 1000)
abs(mu - np.mean(s1)) < 0.01
abs(sigma - np.std(s1, ddof=1)) < 0.01
count, bins, ignored = plt.hist(s1, 30, density=True)
# Normal probability density evaluated at the bin edges
plt.plot(bins, 1 / (sigma * np.sqrt(2 * np.pi))
         * np.exp(-(bins - mu)**2 / (2 * sigma**2)), linewidth=2, color='r')
plt.show()
```

20) pareto(a[, size])

This function is used to draw samples from a Lomax or Pareto II distribution with specified shape.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

b, m1 = 3., 2.  # shape and mode
s1 = (np.random.pareto(b, 1000) + 1) * m1
count, bins, _ = plt.hist(s1, 100, density=True)
# Pareto probability density evaluated at the bin edges
fit = b * m1**b / bins**(b + 1)
plt.plot(bins, max(count) * fit / max(fit), linewidth=2, color='r')
plt.show()
```

21) power(a[, size])

This function is used to draw samples in [0, 1] from a power distribution with positive exponent a-1.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

x = 5.  # shape
samples = 1000
s1 = np.random.power(x, samples)
count, bins, ignored = plt.hist(s1, bins=30)
a = np.linspace(0, 1, 100)
b = x * a**(x - 1.)
density_b = samples * np.diff(bins)[0] * b
plt.plot(a, density_b)
plt.show()
```

22) rayleigh([scale, size])

This function is used to draw samples from a Rayleigh distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

val = plt.hist(np.random.rayleigh(3, 100000), bins=200, density=True)
meanval = 1
modeval = np.sqrt(2 / np.pi) * meanval
s1 = np.random.rayleigh(modeval, 1000000)
# Percentage of samples larger than 3
100. * sum(s1 > 3) / 1000000.
```

Output:

`0.087300000000000003`

23) standard_cauchy([size])

This function is used to draw samples from a standard Cauchy distribution with mode=0.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

s1 = np.random.standard_cauchy(1000000)
s1 = s1[(s1 > -25) & (s1 < 25)]  # truncate distribution so it plots well
plt.hist(s1, bins=100)
plt.show()
```

24) standard_exponential([size])

This function is used to draw samples from a standard exponential distribution.

Example:

```python
import numpy as np

n = np.random.standard_exponential((2, 7000))
n
```

Output:

```
array([[0.53857931, 0.181262  , 0.20478701, ..., 3.66232881, 1.83882709,
        1.77963295],
       [0.65163973, 1.40001955, 0.7525986 , ..., 0.76516523, 0.8400617 ,
        0.88551011]])
```

25) standard_gamma(shape[, size])

This function is used to draw samples from a standard Gamma distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt
import scipy.special as sps

shape, scale = 2., 1.
s1 = np.random.standard_gamma(shape, 1000000)
count1, bins1, ignored1 = plt.hist(s1, 50, density=True)
# Gamma probability density with scale 1 evaluated at the bin edges
y = bins1**(shape - 1) * (np.exp(-bins1 / scale) / (sps.gamma(shape) * scale**shape))
plt.plot(bins1, y, linewidth=2, color='r')
plt.show()
```

26) standard_normal([size])

This function is used to draw samples from a standard Normal distribution.

Example:

```python
import numpy as np

s1 = np.random.standard_normal(8000)
s1
q = np.random.standard_normal(size=(3, 4, 2))
q
```

Output:

```
array([-3.14907597,  0.95366265, -1.20100026, ...,  3.47180222,
        0.9608679 ,  0.0774319 ])
array([[[ 1.55635461, -1.29541713],
        [-1.50534663, -0.02829194],
        [ 1.03949348, -0.26128132],
        [ 1.51921798,  0.82136178]],
       [[-0.4011052 , -0.52458858],
        [-1.31803814,  0.37415379],
        [-0.67077365,  0.97447018],
        [-0.20212115,  0.67840888]],
       [[ 1.86183474,  0.19946562],
        [-0.07376021,  0.84599701],
        [-0.84341386,  0.32081667],
        [-3.32016062, -1.19029818]]])
```

27) standard_t(df[, size])

This function is used to draw samples from a standard Student’s t distribution with df degrees of freedom.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

intake = np.array([5260., 5470, 5640, 6180, 6390, 6515, 6805, 7515, 8230, 8770])
s1 = np.random.standard_t(10, size=100000)
np.mean(intake)
intake.std(ddof=1)
# t statistic of the sample mean against the reference value 7725
t = (np.mean(intake) - 7725) / (intake.std(ddof=1) / np.sqrt(len(intake)))
h = plt.hist(s1, bins=100, density=True)
np.sum(s1 < t) / float(len(s1))
plt.show()
```

Output:

```
6677.5
1174.1101831694598
0.00864
```

28) triangular(left, mode, right[, size])

This function is used to draw samples from a triangular distribution over the interval [left, right].

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

h = plt.hist(np.random.triangular(-4, 0, 8, 1000000), bins=300, density=True)
plt.show()
```

29) uniform([low, high, size])

This function is used to draw samples from a uniform distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

s1 = np.random.uniform(-1, 0, 1000)
np.all(s1 >= -1)
np.all(s1 < 0)
count, bins, ignored = plt.hist(s1, 15, density=True)
plt.plot(bins, np.ones_like(bins), linewidth=2, color='r')
plt.show()
```

30) vonmises(m1, m2[, size])

This function is used to draw samples from a von Mises distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import i0

m1, m2 = 0.0, 4.0
s1 = np.random.vonmises(m1, m2, 1000)
plt.hist(s1, 50, density=True)
x = np.linspace(-np.pi, np.pi, num=51)
# von Mises probability density evaluated on the grid x
y = np.exp(m2 * np.cos(x - m1)) / (2 * np.pi * i0(m2))
plt.plot(x, y, linewidth=2, color='r')
plt.show()
```

31) wald(mean, scale[, size])

This function is used to draw samples from a Wald, or inverse Gaussian, distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

h = plt.hist(np.random.wald(3, 3, 100000), bins=250, density=True)
plt.show()
```

32) weibull(a[, size])

This function is used to draw samples from a Weibull distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt

a = 5.  # shape
s = np.random.weibull(a, 1000)

def weib(x, n, a):
    # Weibull probability density with scale n and shape a
    return (a / n) * (x / n)**(a - 1) * np.exp(-(x / n)**a)

count, bins, ignored = plt.hist(np.random.weibull(5., 1000))
x = np.arange(1, 100.) / 50.
scale = count.max() / weib(x, 1., 5.).max()
plt.plot(x, weib(x, 1., 5.) * scale)
plt.show()
```

33) zipf(a[, size])

This function is used to draw samples from a Zipf distribution.

Example:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import special

x = 2.0
s = np.random.zipf(x, 1000)
count, bins, ignored = plt.hist(s[s < 50], 50, density=True)
a = np.arange(1., 50.)
# Zipf probability mass function, normalized by the Riemann zeta function
b = a**(-x) / special.zeta(x)
plt.plot(a, b / max(b), linewidth=2, color='r')
plt.show()
```