Wednesday 31 July 2013

Mean, Median, Mode

“While the individual man is an insolvable
puzzle, in the aggregate he becomes a
mathematical certainty. You can, for
example, never foretell what any one man
will do, but you can say with precision
what an average number will be up to.”
Arthur Conan Doyle,
The Sign of Four
Sherlock Holmes spoke these words to his colleague Dr. Watson as the two were unravelling a mystery. The detective was implying that if a single member is drawn at random from a population, we cannot predict exactly what that member will look like. However, there are some “average” features of the entire population that an individual is likely to possess. The degree of certainty with which we would expect to observe such average features in any individual depends on our knowledge of the variation among individuals in the population. Sherlock Holmes has led us to two of the most important statistical concepts: average and variation.








Statistics surround us every day: we are constantly bombarded with them in the form of polls, tests, ratings and so on. Understanding those statistics is important, but unfortunately most people have never been taught what statistics really mean, how they are computed, or how to tell the difference between statistics used properly and statistics misused to deceive.
The most basic concept in statistics is the idea of an average. An average is a single number that represents a typical value. There are three different numbers which can represent an average value, and it is important to know which one is being used and whether or not it is appropriate. The three values are the mean, the median, and the mode.

MEAN
The mean is what most people are taught as the average in middle school math. Given a set of values, the mean is what you get by adding up all of the values, and dividing that sum by the number of values.
The mean is a very useful number – it summarizes the properties of the group. It’s important to understand that the mean does not represent an individual – in fact, there may be no individual whose value matches the mean.

MEAN = sum of all data values / number of data values
or, more formally,

DEFINITION: Mean
The mean is the sum of a set of values, divided by the number of values in
the set. The notation for the mean of a set of values is a horizontal bar over
the variable used to represent the set. The formula for the mean of a data
set {x1, x2, . . . , xn} is
x̄ = (x1 + x2 + . . . + xn) / n

 How to calculate the mean?

QUESTION
What is the mean of the data set {10; 20; 30; 40; 50}?

SOLUTION
Step 1 : Calculate the sum of the data
10 + 20 + 30 + 40 + 50 = 150
Step 2 : Divide by the number of values in the data set to get the mean
Since there are 5 values in the data set, the mean is
Mean = 150 / 5 = 30
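
The two steps of this solution can be sketched in Python. This is only an illustration: the function name `mean` and the decision to raise an error for an empty data set are choices made here, not part of the worked example.

```python
def mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        # Dividing by zero values is undefined, so reject an empty data set.
        raise ValueError("mean of an empty data set is undefined")
    total = sum(values)          # Step 1: calculate the sum of the data
    return total / len(values)   # Step 2: divide by the number of values


data = [10, 20, 30, 40, 50]
print(mean(data))  # 30.0
```
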
MEDIAN
DEFINITION: Median
The median of a data set is the value in the central position, when the data
set has been arranged from the lowest to the highest value.
Note that at least half of the values in the data set are less than or equal to the median, and at least half are greater than or equal to it.
To calculate the median of a quantitative data set, first sort the data from the smallest to the largest value and then find the value in the middle. If the data set has an odd number of values, the median will be equal to one of the values in the data set. If it has an even number of values, the median will lie halfway between the two middle values.
Example 4: Median for an odd number of values
QUESTION
What is the median of {10; 14; 86; 2; 68; 99; 1}?
SOLUTION
Step 1 : Sort the values.
The values in the data set, arranged from the smallest to the largest,
are 1; 2; 10; 14; 68; 86; 99.
Step 2 : Find the number in the middle
There are 7 values in the data set. Since there are an odd number of
values, the median will be equal to the value in the middle, namely,
in the 4th position. Therefore the median of the data set is 14.
Example 5: Median for an even number of values
QUESTION
What is the median of {11; 10; 14; 86; 2; 68; 99; 1}?
SOLUTION
Step 1 : Sort the values
The values in the data set, arranged from the smallest to the largest,
are
1; 2; 10; 11; 14; 68; 86; 99
Step 2 : Find the number in the middle
There are 8 values in the data set. Since there are an even number
of values, the median will be halfway between the two values in the
middle, namely, between the 4th and 5th positions. The value in the
4th position is 11 and the value in the 5th position is 14. The median
lies halfway between these two values and is therefore
Median = (11 + 14) / 2 = 12.5
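
The sort-then-pick-the-middle procedure from the two median examples could be written as follows. This is a minimal sketch; the function name `median` is chosen here for illustration.

```python
def median(values):
    """Return the middle value of the sorted data.

    For an even number of values, return the number halfway
    between the two middle values.
    """
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)      # Step 1: sort the values
    n = len(ordered)
    middle = n // 2               # Step 2: find the middle position
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([10, 14, 86, 2, 68, 99, 1]))      # 14
print(median([11, 10, 14, 86, 2, 68, 99, 1]))  # 12.5
```
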
MODE
DEFINITION: Mode
The mode of a data set is the value that occurs most often in the set. The
mode can also be described as the most frequent or most common value in
the data set.
To calculate the mode, we simply count the number of times that each value appears in the data set and then find the value that appears most often.
A data set can have more than one mode if there is more than one value with the highest count. For example, both 2 and 3 are modes in the data set {1; 2; 2; 3; 3}. If all points in a data set occur with equal frequency, it is equally accurate to describe the data set as having many modes or no mode.
Example 6: Finding the mode
QUESTION
Find the mode of the data set {2; 2; 3; 4; 4; 4; 6; 6; 7; 8; 8; 10; 10}.
SOLUTION
Step 1 : Count the number of times that each value appears in the data set
Value   Count
2       2
3       1
4       3
6       2
7       1
8       2
10      2

Step 2 : Find the value that appears most often
From the table above we can see that 4 is the only value that appears
3 times, and all the other values appear less often. Therefore the
mode of the data set is 4.
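
The counting procedure can be sketched in Python with `collections.Counter` from the standard library. The function name `modes` is an illustrative choice; it returns a list so that data sets with more than one mode, such as {1; 2; 2; 3; 3}, are handled as well.

```python
from collections import Counter


def modes(values):
    """Return every value that appears most often, in sorted order."""
    counts = Counter(values)          # Step 1: count each value
    if not counts:
        return []
    highest = max(counts.values())    # Step 2: find the highest count
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 2, 3, 4, 4, 4, 6, 6, 7, 8, 8, 10, 10]))  # [4]
print(modes([1, 2, 2, 3, 3]))                             # [2, 3]
```
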
One problem with using the mode as a measure of central tendency is that we usually cannot compute the mode of a continuous data set. Since continuous values can lie anywhere on the real line, any particular value will almost never repeat. This means that the frequency of each value in the data set will be 1 and that there will be no mode. We will look at one way of addressing this problem in the section on grouping data.

Example: Comparison of measures of central tendency in a real-life situation
QUESTION
There are regulations in South Africa related to bread production to protect consumers. By law, if a loaf of bread is not labelled, it must weigh 800 g, with a leeway of 5 per cent under or 10 per cent over. Vishnu is interested in how a well-known national retailer measures up to this standard. He visited his local branch of the supplier and recorded the masses of 10 different loaves of bread on each day of one week. The results, in grams, are given below:
Monday     Tuesday    Wednesday  Thursday   Friday     Saturday   Sunday
802,4      787,8      815,7      807,4      801,5      786,6      799,0
796,8      798,9      809,7      798,7      818,3      789,1      806,0
802,5      793,6      785,4      809,3      787,7      801,5      799,4
819,6      812,6      809,1      791,1      805,3      817,8      801,0
801,2      795,9      795,2      820,4      806,6      819,5      796,7
789,0      796,3      787,9      799,8      789,5      802,1      802,2
789,0      797,7      776,7      790,7      803,2      801,2      807,3
808,8      780,4      812,6      801,8      784,7      792,2      809,8
802,4      790,8      792,4      789,2      815,6      799,4      791,2
796,2      817,6      799,1      826,0      807,9      806,7      780,2

1. Is this data set qualitative or quantitative? Explain your answer.
2. Determine the mean, median and mode of the mass of a loaf of bread for
each day of the week. Give your answer correct to 1 decimal place.
3. Based on the data, do you think that this supplier is providing bread within
the South African regulations?
SOLUTION
Step 1 : Qualitative or quantitative?
Since each mass can be represented by a number, the data set is
quantitative. Furthermore, since a mass can be any real number, the
data are continuous.

Step 2 : Calculate the mean
In each column (for each day of the week), we add up the measurements
and divide by the number of measurements, 10. For Monday,
the sum of the measured values is 8007.9 and so the mean for Monday
is
Mean = 8007.9 / 10 = 800.79 ≈ 800.8 g
In the same way, we can compute the mean for each day of the week.
See the table below for the results.
Step 3 : Calculate the median
In each column we sort the numbers from lowest to highest and find the value in the middle. Since there are an even number of measurements (10), the median is halfway between the two numbers in the middle. For Monday, the sorted list of numbers is
789,0; 789,0; 796,2; 796,8; 801,2;
802,4; 802,4; 802,5; 808,8; 819,6
The two numbers in the middle are 801,2 and 802,4 and so the median is
Median = (801,2 + 802,4) / 2 = 801,8 g
In the same way, we can compute the median for each day of the week:
Day         Mean (g)   Median (g)
Monday      800.8      801.8
Tuesday     797.2      796.1
Wednesday   798.4      797.2
Thursday    803.4      800.8
Friday      802.0      804.3
Saturday    801.6      801.4
Sunday      799.3      800.2
From the above calculations we can see that the means and medians
are close to one another, but not quite equal. In the next worked
example we will see that the mean and median are not always close
to each other.
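
As a cross-check of Steps 2 and 3, the per-day means and medians can be recomputed from the measurements in the question. The sketch below uses `mean` and `median` from Python's standard `statistics` module. Two assumptions are made here: the masses are typed with decimal points instead of decimal commas, and values are rounded to 1 decimal place with halves rounded up (`Decimal` is used so that the rounding is exact). The names `masses`, `one_decimal` and `summary` are illustrative.

```python
from decimal import Decimal, ROUND_HALF_UP
from statistics import mean, median

# Masses in grams, one row per day, copied from the table in the question
# (decimal commas written as decimal points).
masses = {
    "Monday":    "802.4 796.8 802.5 819.6 801.2 789.0 789.0 808.8 802.4 796.2",
    "Tuesday":   "787.8 798.9 793.6 812.6 795.9 796.3 797.7 780.4 790.8 817.6",
    "Wednesday": "815.7 809.7 785.4 809.1 795.2 787.9 776.7 812.6 792.4 799.1",
    "Thursday":  "807.4 798.7 809.3 791.1 820.4 799.8 790.7 801.8 789.2 826.0",
    "Friday":    "801.5 818.3 787.7 805.3 806.6 789.5 803.2 784.7 815.6 807.9",
    "Saturday":  "786.6 789.1 801.5 817.8 819.5 802.1 801.2 792.2 799.4 806.7",
    "Sunday":    "799.0 806.0 799.4 801.0 796.7 802.2 807.3 809.8 791.2 780.2",
}


def one_decimal(value):
    """Round a Decimal to 1 decimal place, rounding halves up (an assumption)."""
    return value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


summary = {}
for day, row in masses.items():
    values = [Decimal(x) for x in row.split()]
    summary[day] = (one_decimal(mean(values)), one_decimal(median(values)))

for day, (day_mean, day_median) in summary.items():
    print(f"{day:<10} mean = {day_mean} g   median = {day_median} g")
```
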
Step 4 : Determine the mode
Since the data are continuous we cannot compute the mode. In the
next section we will see how we can group data in order to make it
possible to compute an approximation for the mode.
Step 5 : Conclusion: Is the supplier reliable?
From the question, the mass of a loaf of bread must lie between
800 g minus 5%, which is 760 g, and 800 g plus 10%, which is 880 g.
Since every one of the measurements made by Vishnu lies within this
range, and since the means and medians are all close to 800 g, we can
conclude that the supplier is reliable.
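
The range check in this step can be sketched as below. The function names `allowed_range` and `all_within` are hypothetical helpers introduced for illustration. For brevity, only the smallest and largest measurements from the table (776,7 g and 826,0 g) are checked: if both extremes lie within the limits, so does every other measurement.

```python
def allowed_range(nominal_g=800, under_pct=5, over_pct=10):
    """Return the (lower, upper) mass limits in grams for an unlabelled loaf."""
    lower = nominal_g * (100 - under_pct) / 100
    upper = nominal_g * (100 + over_pct) / 100
    return lower, upper


def all_within(masses_g, lower, upper):
    """Return True if every mass lies between lower and upper, inclusive."""
    return all(lower <= m <= upper for m in masses_g)


lower, upper = allowed_range()
print(lower, upper)  # 760.0 880.0

# Smallest and largest measurements in the table, in grams.
extremes = [776.7, 826.0]
print(all_within(extremes, lower, upper))  # True
```
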

