• Statistics Notes

     


    Monday's Test will be open book/open notes.

    Bring in your book and notebook.

    Assignment #16 answers 

    Assignment #17 answers

    Assignment #18 answers

    Assignment #19 Review KEY

     


     

    Regression

     Sandwich fat g - Calories Plot

    Enter the following into a graphing calculator or spreadsheet (Google Sheets on Chromebook)


    SANDWICH          FAT (g)   CALORIES
    Quarter Pounder     19        410
    Big Mac             31        580
    cheeseburger        16        343
    Wendy's             35        570
    Whopper             39        640
    Carl's Jr West      43        660

    Make a scatterplot.

     Sandwich Regression Plot

     

    To find regression equation on calculator:

    Press STAT, highlight "CALC", press 4: LinReg(ax+b).

    Enter L1, L2 to specify the data.

    Press Y=.  Then press VARS, choose 5: Statistics,

    arrow cursor over to "EQ", choose 1: RegEQ.

    This should enter the best-fit equation into Y1.

    Press ZOOM and choose 9: ZoomStat, or press GRAPH.

     

    The equation should be y = 10.4x + 225. (y = mx + b)

    Statisticians write it as y = 225 + 10.4x.  (y = a + bx)

    To give it more context, the units are substituted for the variables to make a predictive model:

    CALORIES = 225 + 10.4(fat g)

     

    As more sandwich data is added, the slope and intercept change slightly.

    The slope is the correlation coefficient r times Sy/Sx, where Sy is the standard deviation of the response variable and Sx is the standard deviation of the explanatory variable.

    b = r * Sy / Sx .

    b = 0.97 (128.4 / 10.9) = 11.4

    The y-intercept is found by substituting the means of x and y into the equation, because the regression line always passes through the point (mean x, mean y).

    a = (mean y) - b * (mean x)

    a = 534 - 11.4 (30.5) = 186.3

     So this model would be y = 186.3 + 11.4 x.
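    The slope and intercept formulas above can be checked with a short Python sketch. It uses only the six sandwiches from the first table; the function name `correlation` and the variable names are illustrative, not from the notes. Because it keeps full precision instead of rounding r, Sy, and Sx first, its results (a slope near 11.6 and an intercept near 180) differ a little from the rounded hand calculation.

```python
from math import sqrt
from statistics import mean, stdev

# Fat (g) and calories for the six sandwiches in the first table.
fat = [19, 31, 16, 35, 39, 43]
calories = [410, 580, 343, 570, 640, 660]

def correlation(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = correlation(fat, calories)
sx = stdev(fat)        # sample standard deviation (divides by n-1)
sy = stdev(calories)

b = r * sy / sx                      # slope: b = r * Sy / Sx
a = mean(calories) - b * mean(fat)   # y-intercept: a = mean y - b * (mean x)

print(f"r = {r:.2f}, Sx = {sx:.1f}, Sy = {sy:.1f}")
print(f"CALORIES = {a:.1f} + {b:.1f}(fat g)")
```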

     

     Add the following sandwich data:

     

    hamburger             10        266
    double cheeseburger   11        440
    Bacon Double          22        400
    Impossible            14        240
    Beyond Meat           20        270
    Baconator             62        970
    Pretzel King          60        920
         

    The BK Pretzel Bacon King debuted September 19, 2019.

    https://thetakeout.com/review-burger-king-pretzel-bacon-king-bun-1838914498

     

    fat calories equation

    This line is slightly different because there are extra data points in it.

    They are the Impossible Burger, Beyond Meat, Baconator and the BK Pretzel Burger.

    Can you spot them?

    We can use the predictive model equation, CALORIES = 107 + 13.8(fat g), to predict the calories in the McDonald's McRib sandwich.

    (Coincidentally rereleased this week.)

    McRib CAL = 107 + 13.8(22 fat g) = 107 + 303.6 = 410.6

    The actual value is 480 calories.  The difference between the actual and predicted values (actual - predicted) is called a residual.  Here it is 480 - 410.6 = 69.4 calories.
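    The McRib prediction and its residual can be reproduced with a minimal Python sketch. The function names `predict_calories` and `residual` are made up for illustration; the coefficients 107 and 13.8 and the McRib values are the ones quoted above.

```python
def predict_calories(fat_g, intercept=107, slope=13.8):
    """Predicted calories from the model CALORIES = 107 + 13.8(fat g)."""
    return intercept + slope * fat_g

def residual(actual, predicted):
    """Residual = actual value - predicted value."""
    return actual - predicted

# McRib: 22 g of fat, 480 actual calories.
mcrib_predicted = predict_calories(22)
mcrib_residual = residual(480, mcrib_predicted)

print(f"predicted = {mcrib_predicted:.1f}, residual = {mcrib_residual:.1f}")
```

    A positive residual means the sandwich has more calories than the model predicts; a negative residual means it has fewer.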

    Double Down 32 540
    McRib 22 480

    Adding these sandwiches to the data alters the regression equation slightly.

    The newly predicted value for the McRib would be 423.6, so its residual drops to 480 - 423.6 = 56.4.
    How did these change the correlation coefficient r?

    Sandwich Regression

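    What is R^2?  It is the square of the correlation coefficient r.  It gives the fraction of the variation in the response variable that the regression line accounts for.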
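    R^2 can also be read as the share of the variation in calories that the line accounts for. The sketch below checks that both routes give the same number. To keep it short it uses only the six sandwiches from the first table, which is an assumption made for illustration, so its R^2 will not match the plot built from the larger data set. All variable names are illustrative.

```python
from statistics import mean

# Six starter sandwiches: fat (g) and calories.
fat = [19, 31, 16, 35, 39, 43]
calories = [410, 580, 343, 570, 640, 660]

mx, my = mean(fat), mean(calories)
sxy = sum((x - mx) * (y - my) for x, y in zip(fat, calories))
sxx = sum((x - mx) ** 2 for x in fat)
syy = sum((y - my) ** 2 for y in calories)   # total variation in calories

# Route 1: square the correlation coefficient.
r = sxy / (sxx * syy) ** 0.5
r_squared = r ** 2

# Route 2: 1 - (variation left in the residuals) / (total variation).
b = sxy / sxx            # least-squares slope
a = my - b * mx          # least-squares intercept
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(fat, calories))
explained_share = 1 - sse / syy

print(f"r^2 = {r_squared:.3f}, explained share = {explained_share:.3f}")
```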

     For the KFC Double Down, the predicted value is 

    CAL = 142 +12.8(32 fat g) = 551.6.  Its residual is 540 - 551.6 = -11.6 calories.

    If we plot the residuals (actual value - predicted value) against the explanatory variable, there should be no correlation.  (Note that R^2 for the residual plot is about 0.)

    Residuals Plot

    That is because the regression line has already accounted for the linear trend in the data.
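    The residual plot idea can be checked numerically. This sketch fits a least-squares line to the fifteen sandwiches listed in these notes (assuming that list is the full data set), computes each residual, and then measures the correlation between fat grams and the residuals. The helper names `least_squares` and `correlation` are illustrative.

```python
from math import sqrt
from statistics import mean

# (name, fat g, calories) for every sandwich listed in these notes.
sandwiches = [
    ("Quarter Pounder", 19, 410), ("Big Mac", 31, 580),
    ("cheeseburger", 16, 343), ("Wendy's", 35, 570),
    ("Whopper", 39, 640), ("Carl's Jr West", 43, 660),
    ("hamburger", 10, 266), ("double cheeseburger", 11, 440),
    ("Bacon Double", 22, 400), ("Impossible", 14, 240),
    ("Beyond Meat", 20, 270), ("Baconator", 62, 970),
    ("Pretzel King", 60, 920), ("Double Down", 32, 540),
    ("McRib", 22, 480),
]
fat = [s[1] for s in sandwiches]
calories = [s[2] for s in sandwiches]

def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + bx."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def correlation(xs, ys):
    """Correlation coefficient r between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sum((x - mx) ** 2 for x in xs)
                      * sum((y - my) ** 2 for y in ys))

a, b = least_squares(fat, calories)
residuals = [y - (a + b * x) for x, y in zip(fat, calories)]
r_resid = correlation(fat, residuals)

print(f"CAL = {a:.0f} + {b:.1f}(fat g)")
print(f"correlation of residuals with fat g = {r_resid:.6f}")
```

    The residual correlation comes out as zero up to rounding error, which is why the residual plot should look like a patternless cloud around the horizontal axis.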

     



     

    Regression using the calculator

    Regression using calc example problem

    CO2 Levels regression worksheet

     


    Sections 2.4-2.5 Solution KEY

    Correlation worksheet

    Correlations Solution KEY

    Football correlations wins and points

    Baseball correlations Win-Loss and Runs

     

    SAT 2018 Correlations

    Rich students get better SAT scores article published week of this lesson!

     

    cnbc.com/2019/10/03/rich-students-get-better-sat-scores-heres-why.html

     

    Will UC schools drop their SAT scores requirement?

    https://www.latimes.com/california/story/2019-10-02/uc-sat-test-optional

    posted two days after Monday's lesson, two before Friday's class.


     

    Sections 2.1-2.3 KEY

     


     

    Standard Deviation

    7 day forecast

     

    Date            Temperature  Average  Deviation  Squared deviation  Sx   Standard deviation  SD Above/Below
    Thu, 9/12/2019  93           79.3     13.7       188.08             6.6  7.1                 1.92
    Fri, 9/13/2019  84           79.3     4.7        22.22              6.6  7.1                 0.66
    Sat, 9/14/2019  75           79.3     -4.3       18.37              6.6  7.1                 -0.60
    Sun, 9/15/2019  72           79.3     -7.3       53.08              6.6  7.1                 -1.02
    Mon, 9/16/2019  75           79.3     -4.3       18.37              6.6  7.1                 -0.60
    Tue, 9/17/2019  77           79.3     -2.3       5.22               6.6  7.1                 -0.32
    Wed, 9/18/2019  79           79.3     -0.3       0.08               6.6  7.1                 -0.04
    Sum                                   0.0        305.43

    The Sx column is the square root of sum / n.  The standard deviation column is the square root of sum / (n-1), where sum / (n-1) is the variance.
       

    How to find Standard Deviation (STDEV)

    #1.  Find the average (arithmetic mean).

    #2.  Subtract the mean from each data value to get a deviation.

    #3.  Square each deviation.

    #4.  Sum up all the squared deviations.

    #5.  Divide by n-1.*  This result is called the variance.

          *(If you have all the data, a population, you can divide by n instead; the table above labels that result Sx.

          If you have a sample of a data set, divide by n-1.  Note that TI calculators label these the other way around: Sx is the n-1 version and σx is the n version.)

    #6.  Take the square root of the variance to restore the original units (instead of units squared).

    This is called the standard deviation.  For roughly bell-shaped data, about 68% (a little more than 2/3) of the values lie within one standard deviation of the mean.  See below.

     

    7 day forecast

     

    Five of the seven data points, 5/7 = 71%, are within one standard deviation above or below the mean.
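    The six standard deviation steps, and the count of values within one standard deviation of the mean, can be sketched in Python with the seven forecast temperatures from the table. The variable names are illustrative; the sketch divides by n-1, treating the week as a sample.

```python
from math import sqrt

# Forecast high temperatures, Thu 9/12/2019 through Wed 9/18/2019.
temps = [93, 84, 75, 72, 75, 77, 79]
n = len(temps)

avg = sum(temps) / n                      # 1. find the mean
deviations = [t - avg for t in temps]     # 2. subtract the mean from each value
squared = [d ** 2 for d in deviations]    # 3. square each deviation
total = sum(squared)                      # 4. sum the squared deviations
variance = total / (n - 1)                # 5. divide by n-1 to get the variance
sd = sqrt(variance)                       # 6. square root gives the standard deviation

# Values no more than one standard deviation above or below the mean.
within_one_sd = [t for t in temps if abs(t - avg) <= sd]
share = len(within_one_sd) / n

print(f"mean = {avg:.1f}, sum of squares = {total:.2f}, SD = {sd:.1f}")
print(f"{len(within_one_sd)} of {n} values ({share:.0%}) are within one SD")
```

    The two values left out are 93 (well above the mean) and 72 (just over one standard deviation below it).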

     


    STATISTICS REVIEW SESSION (Optional)

    09/15/2019 - 09/20/2019

    Since Friday's class is canceled due to a minimum day, I will conduct an optional review session for both Friday and Monday stat classes at 12 for half an hour.

     

    Location is the HUB, where Monday's class meets.  If you can't make Friday at 12, I could also conduct one Thursday at 3.

     

    I will have a review guide of concepts, and can do any practice problems.

     

    Chapter 1 tests, one for each class: Monday, Sept. 23rd, and Friday, Sept. 27th.

     


     

    1.7 ANSWER KEY , 1.8 ANSWER KEY , 1.9 ANSWER KEY

    NOTES 1.7-1.9

    Review ANSWERS

    Stemplot and Histogram Examples (President Ages and Shoes)

    Practice Histogram warmups

    Sections 1.1 to 1.3 Power Point

     


    Histograms

    Describe the distribution (symmetric, uniform, skewed, clustered) of the soccer goals histogram:

    Premier League goals

    What would be the best measures of center (mode, median, mean)?

     

    Discuss the distribution of the runs scored in MLB's 2017 season.

     

    MLB 2017 Runs

    Would the mean or the median be the better measure of center?  What is the mode?

    NFL actual vs perceived frequency

    What is the actual mode?  What is the perceived mode?

     

    NFL scores

     

    Why does the histogram above have a different mode than the actual vs. perceived frequency histogram?

     

    NBA scores

     

    Describe the distribution of NBA scores.

    Are the mean, median, and mode about the same here?

     

    MLB runs

     

    Where is most of the data?  Which direction are baseball runs skewed?

    Will the mean or median be greater?  Which of the three measures of center is the least?

     

     

    Below is a histogram of National Hockey League goals.

    NHL Goals  

    Describe its distribution and measures of center.

     


      

     

Last Modified on Saturday at 11:23 AM