## Hypothesis Testing Assignments

1. The Florida Department of Labor and Employment Security reported that the state mean annual wage was \$26,133. A hypothesis test of wages by county can be conducted to see whether the mean annual wage for a particular county differs from the state mean.
a. Formulate the hypotheses that can be used to determine whether the mean annual wage in Baker County differs from the state mean annual wage of \$26,133.
b. A sample of 550 people in Baker County showed a sample mean of \$25,457. Assume a population standard deviation of \$7,600. What is the p-value? Use a significance level of 5%. What is your conclusion?
2. Glow toothpaste maintains that its tubes have always contained a mean of 12 ounces. The production group believes that the mean weight has changed. The weights in ounces for a sample of 15 tubes of toothpaste had a mean of 12.09 ounces and a standard deviation of 0.20 ounces. Use an appropriate hypothesis test to determine whether the data show evidence of a change in the mean weight. Use a 90% confidence level.
3. Enumerate the 36 possible outcomes from rolling a pair of dice, and compute the probability of rolling each of the numbers from 2 to 12.
4. The Excel file contains mean temperatures for January and July and average annual precipitation for selected cities across the U.S. Construct 90% confidence intervals for the mean temperatures and precipitation.
5. Based on a sample of 100 people, a political candidate found that 59 would vote for her in a two-person race. What is the 95% confidence interval for her expected proportion of the vote?
6. The Excel file lists all 76 items McDonald's serves, classified as sandwiches, fries, chicken pieces, salads, breakfasts, and desserts/shakes. Each record contains the serving size, calories, fat, cholesterol, sodium, and carbohydrate content. For the entire month of September (days 1-30), you ate one of the sandwiches, picked randomly from the list, for dinner. Now you are sick of eating sandwiches, so you decide to eat a salad for the entire month of October (days 1-31), picked in the same way as in September. Your task is to analyze the data, summarize your experience, and compare the differences between September and October. You have in your possession some very powerful statistical tools:
a. Descriptive Statistics;
b. Sampling;
c. Confidence Interval;
d. Hypothesis Testing;
e. Graphical display.
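
The calculations behind problems 1(b), 2, 3, and 5 can be sketched in Python using only the standard library. This is a minimal illustration rather than a model answer: the function names are hypothetical, the proportion interval assumes the normal-approximation (Wald) formula, and the t-test function returns only the test statistic and degrees of freedom because the standard library has no t distribution.

```python
from fractions import Fraction
from itertools import product
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def two_sided_z_test(sample_mean, null_mean, sigma, n):
    """z statistic and two-sided p-value when the population sigma is known."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value


def one_sample_t_statistic(sample_mean, null_mean, sample_sd, n):
    """t statistic and degrees of freedom when sigma is estimated from the sample."""
    t = (sample_mean - null_mean) / (sample_sd / sqrt(n))
    return t, n - 1


def dice_sum_probabilities():
    """Enumerate all 36 outcomes of two fair dice and tally each sum."""
    counts = {}
    for first, second in product(range(1, 7), repeat=2):
        counts[first + second] = counts.get(first + second, 0) + 1
    return {total: Fraction(count, 36) for total, count in sorted(counts.items())}


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion (an assumed method)."""
    p_hat = successes / n
    z_crit = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


if __name__ == "__main__":
    z, p = two_sided_z_test(25457, 26133, 7600, 550)      # problem 1(b)
    print(f"z = {z:.3f}, two-sided p-value = {p:.4f}")
    t, df = one_sample_t_statistic(12.09, 12, 0.20, 15)   # problem 2
    print(f"t = {t:.3f} with {df} degrees of freedom")
    for total, prob in dice_sum_probabilities().items():  # problem 3
        print(f"P(sum = {total:2d}) = {prob}")
    low, high = proportion_interval(59, 100)              # problem 5
    print(f"95% CI for the proportion: ({low:.4f}, {high:.4f})")
```

For problem 2, the t statistic still has to be compared with a critical value from a t table (or converted to a p-value with statistical software) before drawing a conclusion.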

## BSB123 Data Analysis Assessment Item 2 Research Report (2017 S1)

The file: Birthweights.xlsx contains data on the following variables for a sample of 1000 births recorded in a large local hospital in 2015:

| Variable | Description |
|---|---|
| Birthweight | Birthweight in grams |
| Gestation | Length of pregnancy in days |
| Smoke | Whether the mother is a smoker or not |
| Pre-pregnancy weight | Mother’s pre-pregnancy weight in kilograms |
| Height | Mother’s height in centimetres |
| Status | Mother’s indigenous status |
| Age | Mother’s age in years |

Background
Management at the hospital is interested in being able to better manage room allocations and bookings in their maternity ward. They are keen to identify mothers at risk of having low birth weight babies who may require additional hospital resources during their stay in the hospital.

The hospital has collected data for a number of previous births at the hospital. The data contains information on the variables outlined in the table above. As a consultant, they have approached you and asked if you could analyse this dataset.

Part 1 - Analysis (80%)

1. Past records (2004) show that the average birthweight was 3500 grams. Test at the 5% significance level whether the average birthweight in 2015 has increased with the improvement in general nutrition. (Include all six steps for hypothesis testing.) (2 marks)
2. Perform a two-sample t-test for each of the following tasks. (Include all six steps for hypothesis testing in each.)
(a) Determine if there is evidence that, on average, the weight of a baby of a mother who smokes is less than that of a baby of a mother who does not. (α = 5%) (2 marks)
(b) Determine if being indigenous is a disadvantage in terms of birthweight. (α = 5%) (2 marks)
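
The test statistic for a two-sample comparison such as those in question 2 can be sketched as follows. This is an illustration under stated assumptions: it uses the unequal-variance (Welch) form of the t-test, which may differ from the version expected in the unit, the function name `welch_t` is hypothetical, and the sample values are made up and are not taken from Birthweights.xlsx.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample mean
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    # Made-up birthweights in grams, for illustration only
    smokers = [3100, 2950, 3300, 3050, 3200]
    non_smokers = [3400, 3550, 3250, 3600, 3450]
    t, df = welch_t(smokers, non_smokers)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

A negative t here would point in the direction of lower birthweight for the first group; the one-tailed p-value still has to be read from a t table or from Excel's output.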
The hospital management is particularly interested in whether you can develop a regression model to help them to predict the birthweight of a baby based on the variables in the data supplied. The model could then be used to predict birthweight to identify babies at risk in future.
3. By using the forward stepwise method, develop a multiple regression model to predict the birthweight.

Step 1: Gestation only
Step 2: Gestation and Smoke
Step 3: Gestation, Smoke and Pre-pregnancy Weight
Step 4: Gestation, Smoke, Pre-pregnancy Weight and Height
Step 5: Gestation, Smoke, Pre-pregnancy Weight, Height and Status
Step 6: Gestation, Smoke, Pre-pregnancy Weight, Height, Status and Age
(a) Interpret the regression coefficients of all six (6) independent variables in the model obtained in Step 6, and comment on the statistical significance of each. (3 marks)
(b) Use Excel to obtain the correlation matrix for the following variables: Gestation, Pre-pregnancy Weight, Height, Age and Birthweight. Do you think multi-collinearity is a problem in the regression model? Are the correlation coefficients consistent with the regression coefficients obtained in the model in Step 6? Discuss briefly. (3 marks)
(c) Focusing on Steps 3 and 4, discuss fully how the introduction of Height in Step 4 affects the regression coefficient of Pre-pregnancy Weight. (3 marks)
(d) Based on the results in (a) to (c), explain which independent variables should be included or excluded to formulate the final model. State the final model. (2 marks)
(e) Comment on the overall adequacy of the final model. (2 marks)
(f) Consider an indigenous mother who is a smoker, 20 years of age, and 160cm tall with a pre-pregnancy weight of 58kg and gestational age of 267 days. What is the expected weight of the child, using the final model you have developed in (d)? (2 marks)
4. Compute the difference in the average birthweight of babies of indigenous and non-indigenous mothers (called the birthweight difference, for simplicity). Discuss fully if there is any discrepancy between the regression coefficient of Status obtained in the regression model and the birthweight difference. (3 marks)
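
Question 3(b) asks for a correlation matrix from Excel; the same quantity can be sketched in Python to show what each cell contains. This is a minimal sketch: the helper names `pearson_r` and `correlation_matrix` are hypothetical, and the three short columns are made-up values, not the hospital data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}


if __name__ == "__main__":
    # Made-up values, for illustration only
    data = {
        "Gestation": [266, 280, 274, 259, 287],
        "Height": [158, 170, 165, 160, 172],
        "Birthweight": [3100, 3600, 3400, 2900, 3700],
    }
    matrix = correlation_matrix(data)
    for row_name, row in matrix.items():
        print(row_name.ljust(12), "  ".join(f"{value:6.3f}" for value in row.values()))
```

High correlations between pairs of independent variables (rather than with Birthweight) are the entries to inspect when judging multicollinearity.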

Part 2 – Report (20%)
You are required to submit a concise report (word limit: 400) presenting any important features or relationships in the data. The content of your report should be based on, but not restricted to, insights gleaned from your analyses conducted in Part 1. (6 marks)

Notes:

Part 1 - Analysis

• For presentation and ease of marking, it is advisable to include relevant Excel output in your answer to each question in this part instead of placing them in appendices.
• There is no word limit in Part 1.
Part 2 - Report
• The report is primarily based on the data provided. If, however, you wish to include, and refer to, additional information, you can use any referencing system as long as it is used consistently.
• You can include relevant charts and Excel objects in your report.
• Use 1.5 line spacing and a font size of 11.
• The word limit of 400 (with a tolerance of 10%) is exclusive of words in tables, appendices and reference list (if any).

Submission
• You should submit your response to both parts as a single pdf document saved in the format:
BSB123 Report_StudentName.pdf
• Due: 11:59 pm 28 May 2017 (Sunday) via Blackboard


## Option 4: Conduct an Oral History

For this assignment, you will have an opportunity to conduct an oral history with someone to learn about their immigration experience.
Find someone who you think would be a good storyteller. You may interview someone you know or are related to if you like, but it may be easier to come up with questions for (and ask questions of) a person you don’t know as well. You must select someone willing to have their interview transcribed and published, so be transparent about the assignment and your intentions.

## Background research:

You will need to know some basic facts about the country from which they immigrated, as well as the time period in which they came to the US, both to create a good list of questions and to provide a little context for their story, so do some research on dates and events. Use trusted academic sources (reference texts, academic books, academic journals), not Wikipedia, newspapers, magazines, or blogs. You will be citing your sources in a bibliography.

## Create a list of interview questions

After conducting your background research, create a list of about 15-20 questions. Aim for a few simple questions, with the majority being open-ended questions (i.e., ones that cannot be answered with a simple “yes” or “no”).

## Conduct and record the Interview

Aim for 20-30 minutes. With less, you may not have much to work with; with more, you may have too much. Too much is good; not enough is bad. If this is someone you have safe access to and can interview in person, bring a backup method for recording in case your first one fails. Otherwise, try to use something like Zoom or other software that gives you a comparable “face-to-face” experience. If possible, also take some handwritten notes during the interview, recording things that might not otherwise be apparent from the audio (such as facial expressions and body language); this type of visual information is also important.

## Transcribe the Interview:

Type up both your questions and interviewee’s answers word-for-word. It is tedious. It is time-consuming. It is also necessary!

## Write the “story” or narrative

Provide a short introduction, briefly explaining who was interviewed and any background information you think is necessary to understand the story (including details about the individual or historical facts). Then, using your interview transcription, tell this immigration story as completely as you can in the interviewee’s own words. Your goal is to craft a narrative that is mostly about the interviewee and their experience, so use a lot of direct quotes from the interview.

## Using Standard Normal Table to Find Probability

Qn23.
Provide an appropriate response. Use the Standard Normal Table to find the probability.
Assume that blood pressure readings are normally distributed with mu = 120, and sigma = 8. A blood pressure reading of 145 or more may require medical attention. What percent of people have a blood pressure reading greater than 145?

Possible answers: 11.09%, 0.09%, 99.91%, 6.06%
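
The table lookup in Qn23 amounts to converting 145 to a z-score and taking the upper-tail area of the standard normal curve. A minimal sketch, assuming Python's `statistics.NormalDist` in place of the printed table (the function name `percent_above` is hypothetical):

```python
from statistics import NormalDist


def percent_above(threshold, mu, sigma):
    """z-score of the threshold and the percent of the population above it."""
    z = (threshold - mu) / sigma
    upper_tail = 1 - NormalDist().cdf(z)  # area to the right of z
    return z, 100 * upper_tail


if __name__ == "__main__":
    z, percent = percent_above(145, mu=120, sigma=8)
    print(f"z = {z:.3f}, percent above 145 = {percent:.2f}%")
```

A printed table requires rounding z to two decimals first, so the table-based figure can differ slightly in the later digits from the value computed here.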

Qn24.
Seventy-five percent of adults want to live to age 100. You randomly select five adults and ask them whether they want to live to age 100. The random variable represents the number of adults who want to live to age 100. Complete parts (a) through (c) below.

b) Graph the binomial distribution using a histogram and describe its shape.

What is the shape of the histogram?
c) What values of the random variable X would you consider unusual?
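
The distribution in Qn24 is binomial with n = 5 and p = 0.75, so the histogram heights can be computed directly. The sketch below prints a rough text histogram and flags outcomes with probability below 0.05; that cutoff is a common textbook convention for "unusual" and is an assumption here, as are the function names.

```python
from math import comb


def binomial_pmf(n, p):
    """List of P(X = k) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


def unusual_values(pmf, cutoff=0.05):
    """Outcomes whose probability falls below the cutoff (assumed convention)."""
    return [k for k, prob in enumerate(pmf) if prob < cutoff]


if __name__ == "__main__":
    pmf = binomial_pmf(5, 0.75)
    for k, prob in enumerate(pmf):
        bar = "#" * round(prob * 50)  # crude text histogram, one '#' per 2%
        print(f"X = {k}: {prob:.4f} {bar}")
    print("Unusual values:", unusual_values(pmf))
```

Because most of the probability sits at X = 3, 4, and 5 with a thin tail toward 0, the bars show where the distribution is concentrated and in which direction it is skewed.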

## 2. Assignment M3

This week we will be learning about bias. Your PowerPoint notes have the essential material, and the textbook can help elaborate. This is the first week you will have a data analysis project due. The dataset you need is posted in the materials folder, so you should download it from BB and open it with SPSS. I have included screenshots in the PowerPoint notes, and your book goes step by step. Remember that there are multiple ways to get the same answer, so as long as you understand the purpose of the test or task, the means to the end may vary. The write-up example I have provided is already formatted and presented in the way I would like you to turn yours in. The example follows the 6th edition of APA; please use the 7th edition for your own write-up.

M3: Use the dataset provided to conduct tests of normality, including skew, kurtosis, and Levene’s test. Interpret the results and explain your findings. See the PowerPoint notes for help with these tests. Lastly, formulate solutions to any problems with the data. (Docs attached)

Using the Notebook.sav data, check the assumptions of normality and homogeneity of variance for the two films (ignore sex). Are the assumptions met?
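
The statistics named in M3 can be sketched outside SPSS to show what each one measures. This is an illustration under stated assumptions: the skewness and kurtosis use the bias-adjusted sample formulas, Levene's statistic is the mean-centred version, the function names are hypothetical, and the two score lists are made-up numbers rather than the Notebook.sav data. Only the test statistic W is computed; its p-value comes from an F distribution with (k - 1, N - k) degrees of freedom, which the standard library does not provide.

```python
from statistics import mean, stdev


def sample_skewness(values):
    """Bias-adjusted sample skewness."""
    n, m, s = len(values), mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)


def sample_excess_kurtosis(values):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n, m, s = len(values), mean(values), stdev(values)
    fourth = sum(((x - m) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return scale * fourth - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def levene_w(*groups):
    """Levene's W using absolute deviations from each group's mean."""
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    k = len(deviations)
    n_total = sum(len(d) for d in deviations)
    grand_mean = mean([z for d in deviations for z in d])
    between = sum(len(d) * (mean(d) - grand_mean) ** 2 for d in deviations)
    within = sum((z - mean(d)) ** 2 for d in deviations for z in d)
    return (n_total - k) / (k - 1) * between / within


if __name__ == "__main__":
    # Made-up scores for two films, for illustration only
    film_a = [30, 34, 38, 41, 37, 33, 36, 45]
    film_b = [12, 18, 25, 9, 21, 30, 15, 20]
    for name, scores in (("Film A", film_a), ("Film B", film_b)):
        print(name, f"skew = {sample_skewness(scores):.3f}",
              f"kurtosis = {sample_excess_kurtosis(scores):.3f}")
    print(f"Levene W = {levene_w(film_a, film_b):.3f}")
```

Skewness and kurtosis near zero are consistent with normality within a group, while a large W suggests the two groups do not share a common variance.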