• Statistics Notes

    Click on book icon with peacock to the left for assignment lists. 

    Chapter 5 -- Random Variables

    5-6 answers

    5.7 answers


    Chapter 4 -- Probability and Randomness

     A#29 Stat 4.1 Key , A#30 Stat 4.2 Solutions , A#31 Stat 4.3 solution key

    Stat 4.4 solution key A#32 , Stat 4.5 solution key A#33

    Ch 4 Review solutions A#34

     Chapter 4 Test Wed 12/11 in HUB 10-12


    Chapter 3 Test given in SIS due to school closure last week.

    Copies will be in test folder Tuesday, Nov 5. 

     Please finish assignments #19-28 and submit to Kate in SIS

    or Sue in office by this Friday.

    You have until November 15 to complete the test in SIS.

    (It's NOT open book or notes this time,

    but you get to decide when you're ready to take it.)

    (It will NOT be in the R2 grade, but the Chapter 3 assignments will be.)

     Lessons 3.1-3.3 Answers

    Lessons 3.4-3.6 Answers

    Lesson 3.7 Key , Lesson 3.8 Key , Lesson 3.9 Key

    Lesson 3 Review Solutions


    Monday's Test will be open book/open notes.

    Bring in your book and notebook.

    Assignment #16 answers 

    Assignment #17 answers

    Assignment #18 answers

    Assignment #19 Review KEY




     Sandwich fat g - Calories Plot

    Enter the following into a graphing calculator or spreadsheet (Google Sheets on Chromebook)

    Quarter Pounder 19 410
    Big Mac 31 580
    cheeseburger 16 343
    Wendy's 35 570
    Whopper 39 640
    Carl's Jr West 43 660

    Make a scatterplot.

     Sandwich Regression Plot


    To find regression equation on calculator:

    Press STAT, highlight "CALC", press 4: LinReg(ax+b).

    Enter L1, L2 to specify the data.

    Press Y=.  Then press VARS, choose 5: Statistics,

    arrow cursor over to "EQ", choose 1: RegEQ.

    This should enter the best-fit equation into Y1.

    Press ZOOM and choose 9: ZoomStat, or press GRAPH.


    The equation should be y = 10.4x + 225. (y = mx + b)

    Statisticians write it as y = 225 + 10.4x.  (y = a + bx)

    To give it more context, the units are substituted for the variables to make a predictive model:

    CALORIES = 225 + 10.4(fat g)


    As more sandwich data is added, the slopes and intercepts change slightly.

    The slope is the correlation coefficient r times Sy/Sx, where Sy is the standard deviation of the response variable and Sx is the standard deviation of the explanatory variable.

    b = r * Sy / Sx .

    b = 0.97 (128.4 / 10.9) = 11.4

    The y-intercept is calculated by substituting the means for x and y, respectively.

    a = (mean y) - b * (mean x)

    a = 534 - 11.4 (30.5) = 186.3

     So this model would be y = 186.3 + 11.4 x.
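    The slope and intercept formulas above can be sketched in Python for the six sandwiches in the first list. This is only an illustration: the helper names (mean, sample_sd, correlation) are made up for this sketch, and because r is carried at full precision instead of being rounded to two decimals, the printed slope and intercept may differ a little from the hand-rounded figures above.

```python
from math import sqrt

# (fat g, calories) for the six sandwiches in the first list
fat = [19, 31, 16, 35, 39, 43]
cal = [410, 580, 343, 570, 640, 660]

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Standard deviation using the n-1 divisor."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """r = sum of products of standardized values, divided by n-1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = sample_sd(xs), sample_sd(ys)
    products = (((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)

r = correlation(fat, cal)
b = r * sample_sd(cal) / sample_sd(fat)  # slope: b = r * Sy / Sx
a = mean(cal) - b * mean(fat)            # intercept: a = (mean y) - b * (mean x)

print(f"mean x = {mean(fat):.1f}, mean y = {mean(cal):.1f}")
print(f"Sx = {sample_sd(fat):.1f}, Sy = {sample_sd(cal):.1f}, r = {r:.3f}")
print(f"y = {a:.1f} + {b:.2f} x")
```

    Because a is built from the two means, the fitted line always passes through the point (mean x, mean y).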


     Add the following sandwich data:


    hamburger 10 266
    double cheeseburger 11 440
    Bacon Double 22 400
    Impossible 14 240
    Beyond Meat 20 270
    Baconator 62 970
    Pretzel King 60 920

    The BK Pretzel Bacon King debuted September 19, 2019.



    fat calories equation

    This line is slightly different because there are extra data points in it.

    They are the Impossible Burger, Beyond Meat, Baconator and the BK Pretzel Burger.

    Can you spot them?

     We can use the predictive model EQ (Calories = 107 + 13.8*(fat g)) to predict calorie values for the McDonald's McRib sandwich.

    (Coincidentally rereleased this week.)

    McRib CAL = 107 + 13.8(22 fat g) = 107 + 303.6 = 410.6

    The actual calories are 480.  This difference is called a residual.  It is 69.4.
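    The prediction and residual arithmetic can be checked with a short Python sketch. The function names predict_calories and residual are made up for this illustration; the default intercept and slope are taken from the model above.

```python
def predict_calories(fat_g, intercept=107.0, slope=13.8):
    """Predicted calories from CALORIES = intercept + slope * (fat g)."""
    return intercept + slope * fat_g

def residual(actual, predicted):
    """Residual = actual value minus predicted value."""
    return actual - predicted

mcrib_predicted = predict_calories(22)           # McRib has 22 g of fat
mcrib_residual = residual(480, mcrib_predicted)  # actual calories are 480

print(f"predicted = {mcrib_predicted:.1f}, residual = {mcrib_residual:.1f}")
# prints: predicted = 410.6, residual = 69.4
```

    A positive residual means the sandwich has more calories than the model predicts; a negative residual means it has fewer.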

    Double Down 32 540
    McRib 22 480

    Adding these sandwiches to the data alters the regression equation slightly.

    The newly predicted value for the McRib would be 423.6, so its residual drops to 480 - 423.6 = 56.4.
    How did these change the correlation coefficient r?

    Sandwich Regression

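    What is R^2?  It is the square of the correlation coefficient r.  It gives the fraction of the variation in the response variable that is explained by the regression line.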

     For the KFC Double Down, the predicted value is 

    CAL = 142 +12.8(32 fat g) = 551.6.  Its residual is 540 - 551.6 = -11.6 calories.

    If we plot the residuals (actual value - predicted value) against the explanatory variable,

    there should be no correlation.  (Note that R^2 for the residual plot is about 0.)

    Residuals Plot

    That is because the regression equation has already accounted for the linear relationship between the two variables.  Only the leftover scatter remains.
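    The residual idea can be tried out in Python with all fifteen sandwiches listed in these notes. This is a minimal sketch, not the calculator's method: the least-squares slope is computed directly from sums of deviations, and the correlation helper is a made-up name. With these fifteen points the fit should come out close to the 142 + 12.8x model used above.

```python
from math import sqrt

# (fat g, calories) for all fifteen sandwiches listed in these notes
data = [
    (19, 410), (31, 580), (16, 343), (35, 570), (39, 640), (43, 660),
    (10, 266), (11, 440), (22, 400), (14, 240), (20, 270), (62, 970),
    (60, 920), (32, 540), (22, 480),
]
xs = [x for x, _ in data]
ys = [y for _, y in data]
n = len(data)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = actual value - predicted value
residuals = [y - (intercept + slope * x) for x, y in data]

def correlation(us, vs):
    """Correlation coefficient r between two lists of equal length."""
    mu, mv = sum(us) / len(us), sum(vs) / len(vs)
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    return suv / sqrt(suu * svv)

r = correlation(xs, ys)
print(f"fit: y = {intercept:.1f} + {slope:.2f} x, r^2 = {r * r:.3f}")
print(f"sum of residuals = {sum(residuals):.6f}")
print(f"correlation of residuals with fat g = {correlation(xs, residuals):.6f}")
```

    Both the sum of the residuals and their correlation with fat g should print as zero, apart from rounding error.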



    Regression using the calculator

    Regression using calc example problem

    CO2 Levels regression worksheet


    Sections 2.4-2.5 Solution KEY

    Correlation worksheet

    Correlations Solution KEY

    Football correlations wins and points

    Baseball correlations Win-Loss and Runs


    SAT 2018 Correlations

    Rich students get better SAT scores article published week of this lesson!




    Will UC schools drop their SAT scores requirement?


    posted two days after Monday's lesson, two before Friday's class.



    Sections 2.1-2.3 KEY



    Standard Deviation

    7 day forecast


    Date            Temperature  Average  Deviation  Squared deviation  Sx       Standard deviation  SD Above/Below
    Thu, 9/12/2019  93           79.3     13.7       188.08             6.6      7.1                 1.92
    Fri, 9/13/2019  84           79.3     4.7        22.22              6.6      7.1                 0.66
    Sat, 9/14/2019  75           79.3     -4.3       18.37              6.6      7.1                 -0.60
    Sun, 9/15/2019  72           79.3     -7.3       53.08              6.6      7.1                 -1.02
    Mon, 9/16/2019  75           79.3     -4.3       18.37              6.6      7.1                 -0.60
    Tue, 9/17/2019  77           79.3     -2.3       5.22               6.6      7.1                 -0.32
    Wed, 9/18/2019  79           79.3     -0.3       0.08               6.6      7.1                 -0.04
    Sum =                                 0.0        305.43             sum / n  sum / (n-1)

    How to find Standard Deviation (STDEV)

    #1.  Find the average (arithmetic mean.)

    #2  Subtract the mean from each data value to get a deviation.

    #3  Square each deviation.

    #4  Sum up all squared deviations.  

    #5  Divide by n-1.*  This result is called the variance.

          *(If you have all the data, a population, you can divide by n to get Sx; this gives 6.6 in the table above.

          If you have a sample of a data set, divide by n-1; this gives 7.1 in the table above.)

    #6  Take the square root of the variance to restore original units (instead of units squared.)

    This is called the standard deviation.  For roughly bell-shaped data, about 2/3 of the data will lie within one standard deviation of the mean.  See below.
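    The six steps above can be followed line by line in Python, using the seven forecast temperatures from the table. This is a sketch for checking hand work; the variable names are made up for the illustration.

```python
from math import sqrt

temps = [93, 84, 75, 72, 75, 77, 79]  # forecast temperatures from the table

# 1. Find the average (arithmetic mean)
mean = sum(temps) / len(temps)

# 2. Subtract the mean from each data value to get a deviation
deviations = [t - mean for t in temps]

# 3. Square each deviation
squared = [d ** 2 for d in deviations]

# 4. Sum up all squared deviations
total = sum(squared)

# 5. Divide by n-1 to get the variance (sample version)
variance = total / (len(temps) - 1)

# 6. Take the square root to restore the original units
sd = sqrt(variance)

# Dividing by n instead treats the seven days as a whole population
population_sd = sqrt(total / len(temps))

print(f"mean = {mean:.1f}, sum of squared deviations = {total:.2f}")
print(f"variance = {variance:.2f}, standard deviation = {sd:.1f}")
print(f"dividing by n instead gives {population_sd:.1f}")
```

    The printed mean (79.3), sum of squared deviations (305.43), and the two standard deviations (7.1 and 6.6) should agree with the table.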


    7 day forecast


    Five of the seven data points, 5/7 = 71%, are within one standard deviation above or below the mean.
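    The "SD Above/Below" column and the count of points within one standard deviation can be reproduced with Python's built-in statistics module. This is a minimal sketch; statistics.stdev uses the n-1 divisor, and the names z_scores and within_one_sd are made up here.

```python
import statistics

temps = [93, 84, 75, 72, 75, 77, 79]  # forecast temperatures from the table

mean = statistics.mean(temps)
sd = statistics.stdev(temps)  # sample standard deviation (n-1 divisor)

# How many standard deviations each day is above (+) or below (-) the mean
z_scores = [(t - mean) / sd for t in temps]

# Keep the days that fall within one standard deviation of the mean
within_one_sd = [t for t, z in zip(temps, z_scores) if abs(z) <= 1]
share = len(within_one_sd) / len(temps)

print([round(z, 2) for z in z_scores])
print(f"{len(within_one_sd)} of {len(temps)} = {share:.0%}")
```

    Only the hottest day (93) and the coolest day (72) fall outside one standard deviation.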




    Since Friday's class is canceled due to a minimum day, I will conduct an optional review session for both Friday and Monday stat classes at 12 for half an hour.


    Location is the HUB, where Monday's class meets.  If you can't make Friday at 12, I could also conduct one Thursday at 3.


    I will have a review guide of concepts, and can do any practice problems.


    Tests for each class on Chapter 1 Monday, Sept. 23rd, and Friday Sept. 27th.




    NOTES 1.7-1.9

    Review ANSWERS

    Stemplot and Histogram Examples (President Ages and Shoes)

    Practice Histogram warmups

    Sections 1.1 to 1.3 Power Point



    Describe the distribution (symmetric, uniform, skewed, clustered) of the soccer goals histogram:

    Premier League goals

    What would be the best measures of center (mode, median, mean)?


    Discuss the distribution of the runs scored in MLB's 2017 season.


    MLB 2017 Runs

    Would the mean or the median be the better measure of center?  What is the mode?

    NFL actual vs perceived frequency

    What is the actual mode?  What is the perceived mode?


    NFL scores


    Why does the above histogram have a different mode than the actual vs. perceived frequency histogram?


    NBA scores


    Describe the distribution of NBA scores.

    Are the mean, median, and mode about the same for this distribution?


    MLB runs


    Where is most of the data?  Which direction are baseball runs skewed?

    Will the mean or median be greater?  Which of the three measures of center is the least?



    Below is a histogram of National Hockey League goals.

    NHL Goals  

    Describe its distribution and measures of center.




Last Modified on February 7, 2020