Section 1.8 Visual Statistical Measures - Graphical Representation of Data

# Convert an input string into a list of separate entries
def g(s):
    S = str(s).replace(',',' ').replace('(',' ').replace(')',' ').split()
    return S

@interact
def _(freq = input_box("1,1,1,1,2,2,2,3,3,3,3,1,5",
                       label="Enter data separated by commas",
                       width=60)):
    freq = g(freq)
    freq = [int(k) for k in freq]
    m = min(freq)
    M = max(freq)
    bn = M-m+1   # one unit-width bin for each integer from m to M
    histogram(freq, range=[m-1/2,M+1/2], bins=bn, align="mid",
              linewidth=2, edgecolor="blue", color="yellow").show()
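The bin choice in the cell above (one unit-width bin centered on each integer from the minimum to the maximum) can be checked without plotting. The following is a minimal sketch in plain Python, not part of the Sage cell; the function name `integer_bin_counts` is illustrative.

```python
from collections import Counter

def integer_bin_counts(values):
    """Return {k: frequency of k} for every integer k from min to max.

    Integers that never occur get a count of 0, matching an empty
    histogram bar for that bin.
    """
    counts = Counter(values)
    lo, hi = min(values), max(values)
    return {k: counts.get(k, 0) for k in range(lo, hi + 1)}

# Default data from the input box above
sample = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 5]
table = integer_bin_counts(sample)
for value, count in table.items():
    print(value, count)
```

Each printed count should equal the height of the corresponding bar, and the number of rows equals `bn = M-m+1`.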
# Convert an input string into a list of separate entries
def g(s):
    S = str(s).replace(',',' ').replace('(',' ').replace(')',' ').split()
    return S

@interact
def _(freq = input_box("1,1,1,1,2,2,2,3,3,3,3,1,5",label="Data separated by commas")):
    freq = g(freq)
    freq = [int(k) for k in freq]
    top = len(freq)   # the cumulative histogram ends at the number of data values
    m = min(freq)
    M = max(freq)
    bn = M-m+1
    histogram(freq, range=[m-1/2,M+1/2], cumulative=True, bins=bn,
              align="mid", linewidth=2, edgecolor="blue", color="yellow").show(ymax=top)
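The bar heights of the cumulative histogram are running totals of the individual frequencies, which is why the vertical axis is capped at `top = len(freq)`. A minimal sketch of that computation in plain Python follows; the function name `cumulative_counts` is illustrative and not part of the Sage cell.

```python
from collections import Counter
from itertools import accumulate

def cumulative_counts(values):
    """Return {k: number of data values <= k} for each integer k from min to max."""
    counts = Counter(values)
    keys = range(min(values), max(values) + 1)
    running = accumulate(counts.get(k, 0) for k in keys)
    return dict(zip(keys, running))

# Default data from the input box above
sample = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 5]
cumulative = cumulative_counts(sample)
for value, total in cumulative.items():
    print(value, total)
```

The last running total always equals the number of data values, so the final bar reaches the top of the plot.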
Example 1.8.2. Simple Stem and Leaf Plot.
Consider the data points 25, 3, 17, 12, 22, 34, 12, 11, 16, 42, 9, 12, 17. In this case we will consider the stems to be the tens digits and the leaves to be the ones digits. This gives
Stems | Leaves |
0 | 3 9 |
1 | 7 2 2 1 6 2 7 |
2 | 5 2 |
3 | 4 |
4 | 2 |
Sorting the leaves within each stem gives the ordered version:
Stems | Leaves |
0 | 3 9 |
1 | 1 2 2 2 6 7 7 |
2 | 2 5 |
3 | 4 |
4 | 2 |
Notice that in each case you can recover the original data values by recombining each stem with a corresponding leaf. Indeed, for these 13 data values the median is the 7th value in the sorted list, which is the value in stem 1 (the tens) with leaf 6, that is, 16.
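The construction in this example can be sketched in Python. This is a minimal illustration, assuming nonnegative integer data with the tens digit as the stem and the ones digit as the leaf; the function name `stem_and_leaf` is illustrative. Stems that have no leaves are simply omitted, which does not matter for this data set.

```python
from collections import defaultdict

def stem_and_leaf(data, ordered=True):
    """Return {stem: [leaves]} using tens digits as stems and ones digits as leaves."""
    stems = defaultdict(list)
    for x in data:
        stem, leaf = divmod(x, 10)   # e.g. 25 -> stem 2, leaf 5
        stems[stem].append(leaf)
    return {s: (sorted(stems[s]) if ordered else stems[s]) for s in sorted(stems)}

data = [25, 3, 17, 12, 22, 34, 12, 11, 16, 42, 9, 12, 17]
plot = stem_and_leaf(data)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")

# Recombining each stem with its leaves recovers the sorted data.
recovered = [10 * stem + leaf for stem, leaves in plot.items() for leaf in leaves]

# With an odd number of values, the median is the middle entry (the 7th of 13 here).
median = recovered[len(recovered) // 2]
print("median:", median)
```

Calling `stem_and_leaf(data, ordered=False)` keeps the leaves in the order the data were given, which corresponds to the first (unsorted) plot.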
Example 1.8.5. Stem and Leaf Plot for State Populations.
Using the state population data above, consider organizing the data with a "two-pass sort," in which you first roughly break the data into groups based on ranges determined by the leading digit(s). In this case, let's use groups corresponding to populations of 0-4 million, 5-9 million, 10-14 million, 15-19 million, 20-24 million, 25-29 million, 30-34 million, and 35-39 million. We can represent these classes using the stems 0L, 0H, 1L, 1H, 2L, 2H, 3L, and 3H, where L and H indicate that the ones digit lies in {0, 1, 2, 3, 4} or in {5, 6, 7, 8, 9}, respectively. Once the data are grouped in this way, we can write the remaining portion of each number horizontally as leaves (in this case with one decimal place for all values). This gives a stem-and-leaf plot. If we additionally sort the leaves within each stem, we get an ordered stem-and-leaf plot. For the state population data, the ordered stem-and-leaf plot is given by
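The two-pass grouping described above can be sketched in Python. This is a minimal illustration, assuming the populations are given in millions with one decimal place. The list `populations` is copied from the R vector that appears later in this section, on the assumption that it holds the same state population data; the function name `split_stem_plot` is illustrative.

```python
from collections import defaultdict

def split_stem_plot(values):
    """Return {stem: [sorted leaves]} with stems such as '0L', '0H', '1L', ...

    The stem is the tens digit followed by L (ones digit 0-4) or H (ones digit 5-9).
    The leaf is the rest of the number: the ones digit with one decimal place.
    """
    groups = defaultdict(list)
    for v in values:
        tenths = round(v * 10)               # integer tenths avoid float comparison issues
        tens, rest = divmod(tenths, 100)     # rest = ones digit and decimal, in tenths
        groups[(tens, rest >= 50)].append(rest)
    plot = {}
    for tens, high in sorted(groups):        # first pass order: 0L, 0H, 1L, 1H, ...
        stem = f"{tens}{'H' if high else 'L'}"
        # second pass: sort the leaves within each stem
        plot[stem] = [f"{leaf / 10:.1f}" for leaf in sorted(groups[(tens, high)])]
    return plot

# Assumed to be the state populations in millions (copied from the R cell in this section)
populations = [0.6, 0.6, 0.6, 0.7, 0.7, 0.8, 0.9, 1, 1.1, 1.3, 1.3,
               1.4, 1.6, 1.9, 1.9, 2.1, 2.8, 2.9, 2.9, 3, 3, 3.1,
               3.6, 3.9, 3.9, 4.4, 4.6, 4.8, 4.8, 5.3, 5.4, 5.7,
               5.9, 6, 6.5, 6.6, 6.6, 6.7, 7, 8.3, 8.9, 9.8, 9.9,
               10, 11.6, 12.8, 12.9, 19.6, 19.7, 26.4, 38.3]

for stem, leaves in split_stem_plot(populations).items():
    print(f"{stem} | {' '.join(leaves)}")
```

Stems with no data (2L and 3L for these values) are omitted by this sketch; a fuller version might print them with an empty row so that gaps in the distribution stay visible.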

# State populations in millions
data <- c(0.6,0.6,0.6,0.7,0.7,0.8,0.9,1,1.1,1.3,1.3,
          1.4,1.6,1.9,1.9,2.1,2.8,2.9,2.9,3,3,3.1,
          3.6,3.9,3.9,4.4,4.6,4.8,4.8,5.3,5.4,5.7,
          5.9,6,6.5,6.6,6.6,6.7,7,8.3,8.9,9.8,9.9,
          10,11.6,12.8,12.9,19.6,19.7,26.4,38.3)
paste("Interquartile Range =", IQR(data))
paste("Box and Whisker Diagram (Box Plot):")
boxplot(data, horizontal=TRUE)
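For comparison, the quartiles behind the interquartile range can be computed with Python's standard library. This is a minimal sketch, assuming Python 3.8 or later and assuming that `method="inclusive"` in `statistics.quantiles` uses the same linear interpolation as R's default quantile rule; the variable names are illustrative.

```python
import statistics

# Same values as the R vector above
data = [0.6, 0.6, 0.6, 0.7, 0.7, 0.8, 0.9, 1, 1.1, 1.3, 1.3,
        1.4, 1.6, 1.9, 1.9, 2.1, 2.8, 2.9, 2.9, 3, 3, 3.1,
        3.6, 3.9, 3.9, 4.4, 4.6, 4.8, 4.8, 5.3, 5.4, 5.7,
        5.9, 6, 6.5, 6.6, 6.6, 6.7, 7, 8.3, 8.9, 9.8, 9.9,
        10, 11.6, 12.8, 12.9, 19.6, 19.7, 26.4, 38.3]

# Three cut points splitting the sorted data into four equal parts
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1   # width of the box in the box plot

print("Q1 =", q1)
print("Median =", q2)
print("Q3 =", q3)
print("Interquartile Range =", iqr)
```

The box in the box plot spans from Q1 to Q3, with the median marked inside it. Other quartile conventions can give slightly different values on other data sets.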