Unit 2: Descriptive Statistics
2.1 Introduction

One of the important objectives of statistical analysis is to describe the characteristics of a frequency distribution by determining various numerical measures. To analyze and interpret the main characteristics of a frequency distribution, we need numerical measures of central value, dispersion, skewness, kurtosis, correlation, etc. Averages are representative values of the frequency distribution; they convey the gist of a huge, unwieldy mass of numerical data.

After the data have been classified and tabulated, the next step is to analyze them. Tabular, diagrammatic and graphical approaches give only a visual illustration of the data and cannot describe quantitative data in detail. Therefore, one of the most important objectives of statistical analysis is to determine numerical measures that describe the inherent characteristics of a frequency distribution. The first of such measures is the "average". Averages condense a huge, unwieldy set of numerical data into single numerical values that are representative of the entire distribution.

2.2 Measures of Central Tendency

Averages are typical values in the sense that the other items (values) of the distribution concentrate around them. They lie in the central part of the frequency distribution and give us an idea of where the values concentrate. So, they are also referred to as "measures of central tendency".

Definition: A single value that can represent the whole set of statistical data is known as a central value, and a measure that yields such a value is known as a measure of central tendency. It lies in the central part of the data. For example, "Ram is an average student" means that his marks lie near the center of the marks of the whole class.

2.3 Various Measures of Central Tendency

The measures of central tendency commonly used in practice are as follows:

1. Mean
   a. Arithmetic Mean (A.M.)
      (i) Simple
      (ii) Weighted
   b. Geometric Mean (G.M.)
   c. Harmonic Mean (H.M.)
2. Median
3. Mode

2.4 Arithmetic Mean

The arithmetic mean (A.M.) is the most popular and widely used measure of central tendency. It is also called simply "the mean" or "the average". It is often considered an ideal measure of central tendency (the best-known or "golden" measure) because it satisfies almost all the requisites of an ideal measure of central tendency given by Prof. Yule. Its popularity is due to the simplicity of its calculation and its other advantages. It is
used to calculate the average value of quantitative data when the distribution does not contain very large or very small items. It is also used to obtain the average value of a distribution that has closed-ended class intervals and no extreme items.

Definition: The arithmetic mean of a given set of observations is their sum divided by the number of observations. The sample mean (a sample statistic) is denoted by $\bar{X}$ (read as "X bar"), and the population mean (a population parameter) is denoted by $\mu$ (read as "mu"). The arithmetic mean (A.M.) is called an ideal measure (or best measure or golden measure) of central tendency since it is based on all the observations.

Uses of Arithmetic Mean

The arithmetic mean (A.M.) is a more suitable average than the others when we are dealing with quantitative measures such as average bonus, average income, average sales, average profit, average production, average height, average expenditure, average revenue, etc. It gives simple quantitative information in the form of a numerical average.

Note: The A.M. cannot be calculated for a distribution with open-end classes such as "below 10" or "above 20". Similarly, it is not representative of highly skewed data (do not use the A.M. if the data contain very small or very large figures).

2.4.1 Calculation of Arithmetic Mean

a) Individual Series

An individual series is ungrouped data in which the value of each item is listed singly. For such ungrouped data, the arithmetic mean is calculated as follows:

i) Direct Method

Let $X_1, X_2, X_3, \dots, X_n$ be the n values of a variable X. Then the arithmetic mean is computed by the following formula:

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{\sum X}{n}$$

where
$\sum X$ = the sum of the observations
n = the number of observations.

ii) Short-cut Method (or Assumed Mean Method or Change of Origin Method or Deviation Method)

If the number of observations is very large and the values of the observations are also large (i.e. the figures have many digits), calculation of the mean (A.M.)
by the direct method is tedious and time-consuming. In this case, we take the deviations of the items from an arbitrary number when computing the A.M. This method is known as the assumed mean method, short-cut method or deviation method. The formula for calculating the A.M. (mean) by this method is

$$\bar{X} = A + \frac{\sum d}{n}$$

where
A = assumed mean (an arbitrary value)
d = X - A = deviation of an item from the assumed mean (the origin is changed to A)
n = number of observations.
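A short derivation, using only the definition d = X - A given above, suggests why the short-cut formula returns the same value as the direct method whatever value is chosen for A:

```latex
\begin{aligned}
\sum d &= \sum (X - A) = \sum X - nA \\
A + \frac{\sum d}{n} &= A + \frac{\sum X}{n} - A = \frac{\sum X}{n} = \bar{X}
\end{aligned}
```

The assumed mean cancels out, so its choice affects only the size of the numbers handled during the calculation, not the result.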
Note: There is no hard and fast rule for the selection of A, but it is better to take a value between the highest and lowest values.

iii) Step Deviation Method (or Change of Origin and Scale Method or Coding Method)

When the observations are large, they are sometimes converted into smaller values by a change of origin and scale. For this, the deviations of the observations from the assumed mean A are divided by a constant h; this is called the step deviation method. The formula for calculating the A.M. by the step deviation method is

$$\bar{X} = A + \frac{\sum d'}{n} \times h$$

where
$d' = \frac{X - A}{h}$
h = common factor (change of scale, dividing by a factor)
n = number of observations.

Note: There is no hard and fast rule for the selection of A and h, but it is better to take A between the highest and lowest values and to take h as a common factor of the values.

Example 2.1
The following are the daily incomes of five persons in a certain locality.

Persons            A     B     C     D     E
Income (in Rs.)    400   350   450   500   300

Calculate the average income.

Solution:
Computation of the average income

Persons    Income (in Rs.) (X)
A          400
B          350
C          450
D          500
E          300
Total      ∑X = 2000

$$\bar{X} = \frac{\sum X}{n} = \frac{2000}{5} = 400$$

Hence, the average income is Rs. 400.

Important Note:
Example (individual series): 4, 4, 4, 8, 8, 8, 12, 16, 20

Direct Method:

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{\sum X}{n} = \frac{84}{9} = 9.33$$
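The direct method can be checked with a few lines of Python. This is a minimal sketch rather than part of the original notes: the function name `arithmetic_mean` and the variable names are illustrative assumptions, while the data are the incomes of Example 2.1 and the nine-item series given above.

```python
def arithmetic_mean(values):
    """Direct method: sum of the observations divided by their number."""
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


# Example 2.1: daily incomes (in Rs.) of persons A to E.
incomes = [400, 350, 450, 500, 300]

# The nine-item individual series from the note above.
series = [4, 4, 4, 8, 8, 8, 12, 16, 20]

print(arithmetic_mean(incomes))           # 400.0
print(round(arithmetic_mean(series), 2))  # 9.33
```

The second result is 84/9 = 9.333..., which rounds to 9.33 at two decimal places.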
Short-cut / Deviation / Origin Changed Method:
Put d = X - A, where A is the assumed mean. Let A = 12; then d = X - 12.

Items (X)    d = X - A = X - 12
4            -8
4            -8
4            -8
8            -4
8            -4
8            -4
12           0
16           4
20           8
Total        ∑d = -24

$$\bar{X} = A + \frac{\sum d}{n} = 12 + \frac{-24}{9} = 9.33$$

Step Deviation / Origin and Scale Changed Method:
Put $d' = \frac{X - A}{h}$. Let A = 12 and h = 4; then $d' = \frac{X - 12}{4}$.

Items (X)    d' = (X - 12) / 4
4            -2
4            -2
4            -2
8            -1
8            -1
8            -1
12           0
16           1
20           2
Total        ∑d' = -6

$$\bar{X} = A + \frac{\sum d'}{n} \times h = 12 + \frac{-6}{9} \times 4 = 9.33$$
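The short-cut and step deviation calculations above can be reproduced in Python to confirm that both agree with the direct method. This is a sketch under stated assumptions: the function names `mean_shortcut` and `mean_step_deviation` and their parameter names are hypothetical, and the choices A = 12 and h = 4 are taken from the worked example.

```python
def mean_shortcut(values, assumed_mean):
    """Short-cut method: A + (sum of d) / n, with d = X - A."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)


def mean_step_deviation(values, assumed_mean, step):
    """Step deviation method: A + (sum of d') / n * h, with d' = (X - A) / h."""
    if step == 0:
        raise ValueError("step (h) must be non-zero")
    coded = [(x - assumed_mean) / step for x in values]
    return assumed_mean + sum(coded) / len(values) * step


series = [4, 4, 4, 8, 8, 8, 12, 16, 20]

direct = sum(series) / len(series)
shortcut = mean_shortcut(series, assumed_mean=12)
step_dev = mean_step_deviation(series, assumed_mean=12, step=4)

print(round(direct, 2), round(shortcut, 2), round(step_dev, 2))  # 9.33 9.33 9.33
```

Calling `mean_shortcut(series, assumed_mean=100)` returns the same mean up to floating-point rounding, which illustrates that A and h only simplify the hand arithmetic.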