## You may use whatever method you prefer to complete the assignment

You may use whatever method you prefer to complete the assignment (e.g., Excel, Jupyter Notebook, or Tableau).

1. The deliverable should be an email with findings from your work. Include screenshots of tables, charts, etc. to help tell the story. Additionally, attach any code you wrote.

## Questions

1. Forecast the YoY performance of three restaurant chains: EAT (period ending 12/23/2020), DRI (period ending 11/29/2020), and TXRH (period ending 12/29/2020) using the provided transaction data. The data to complete the request can be found in the “Case Study Data.xlsx” file.
2. Write the SQL queries you would have used to calculate the quarterly YoY % changes in each restaurant chain’s transaction data.
3. Use the Review Data to formulate an opinion on the quality of each restaurant chain.
4. Tell us whether you think any fundamental characteristics separate one of these chains from the others, and whether you see any reason to invest, or not to invest, in any of them. Feel free to research and consider each chain’s position within broader secular changes.
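
The quarterly YoY % change asked for in question 2 can be sketched as follows. This is a minimal illustration in Python rather than SQL, and it rests on assumptions that are not in the assignment: the rows are hypothetical `(chain, date, amount)` tuples, and quarters are calendar quarters, whereas the reporting periods listed in question 1 end on chain-specific dates.

```python
from collections import defaultdict
from datetime import date


def quarterly_yoy(rows):
    """Return {(chain, year, quarter): YoY % change} from (chain, date, amount) rows."""
    totals = defaultdict(float)
    for chain, day, amount in rows:
        quarter = (day.month - 1) // 3 + 1  # calendar quarter (assumption)
        totals[(chain, day.year, quarter)] += amount

    yoy = {}
    for (chain, year, quarter), total in totals.items():
        prior = totals.get((chain, year - 1, quarter))
        if prior:  # skip quarters with no (or a zero) prior-year base
            yoy[(chain, year, quarter)] = (total - prior) / prior * 100.0
    return yoy


# Hypothetical transactions, not taken from "Case Study Data.xlsx".
sample = [
    ("EAT", date(2019, 11, 5), 100.0),
    ("EAT", date(2019, 12, 9), 100.0),
    ("EAT", date(2020, 10, 20), 150.0),
    ("EAT", date(2020, 12, 1), 30.0),
]
result = quarterly_yoy(sample)
print(result)
```

The same grouping and self-lookup on the prior year is what a SQL version would express with a `GROUP BY` on chain, year, and quarter plus a self-join or window function.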

Assignment Data file

## Doing Tests of Proportions Quiz - task

Qn1.
Download the file deviceprefs.csv from the course materials. This file describes a study in which people with and without disabilities indicated their preferences for touchpads or trackballs as computer input devices. You will use R to analyze this file to answer the questions in this quiz. With this and every quiz in this course, you can find what you need by understanding and mimicking coursera.R, the R code file used in the lecture. This first question gives you credit for getting R, RStudio, and deviceprefs.csv ready to go. Are you ready to proceed?

Qn2. How many subjects’ preferences were recorded?
Note: For this and every other quiz in this course, when you miss a question, code will be revealed to help you. However, you usually cannot just copy the code verbatim. For example, the variable “df” is used throughout these code snippets to refer to the “data frame”, the term R uses for the variable that holds the .csv file that you read in. If you read your .csv file into a variable with a name other than “df”, then you will need to use your variable name, not “df”. Similarly, the variable “m” is used in these code snippets to hold a fitted statistical model. If you use a variable name other than “m” for your model, you will need to change “m” to your variable name.
Be sure that when you copy the code provided for missed questions, you understand that code by looking up the documentation for the R functions used. You can do that after loading a function’s library with the question mark operator. For example, if the function is “foo” then you would do:
?foo
As stated, this only works if the library that defines “foo” is loaded. You load libraries into memory with the library command:
library(foolib)
And this won’t work unless the “foolib” package is installed. You install packages using install.packages, which brings the package files onto your computer and should only have to be done once:
install.packages("foolib")
Does the data table indicate a one-sample proportion or a two-sample proportion?
As described, the data table shows input device preferences of certain people with and without disability. How many subjects have a disability?
Ignoring for a moment disability status, perform a one-sample chi-square test to see whether the proportion of subjects who preferred the trackball (or touchpad) differed significantly from chance. To the nearest hundredth (two digits), what is the chi-square statistic? Hint: Note that this question is not asking for the p-value!
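
As a rough illustration of what the one-sample chi-square test computes, here is a minimal Python sketch (the quiz itself expects R). The counts are hypothetical and are not taken from deviceprefs.csv; the function names are made up for this example.

```python
import math


def chisq_gof(observed, expected_probs=None):
    """Pearson goodness-of-fit statistic and degrees of freedom."""
    n = sum(observed)
    k = len(observed)
    if expected_probs is None:
        expected_probs = [1.0 / k] * k  # "chance": all categories equally likely
    stat = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, expected_probs))
    return stat, k - 1


def chisq_pvalue_df1(stat):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


counts = [30, 10]  # hypothetical: 30 prefer one device, 10 the other
stat, df = chisq_gof(counts)
p_value = chisq_pvalue_df1(stat)
print(round(stat, 2), df, round(p_value, 4))
```

The question asks for the statistic (`stat` here), not the p-value.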

For people without disabilities, perform a binomial test to see whether their preference for touchpads differed significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: Run a binomial test comparing the sum of rows of people without disabilities who prefer the touchpad against the number of all rows of people without disabilities. With two possible preferences, touchpad and trackball, the chance probability would be 1/2. Do not correct for multiple comparisons; consider this a single test on a subset of the data.

For people with disabilities, perform a binomial test to see whether their preference for touchpads differed significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: Run a binomial test comparing the sum of rows of people with disabilities who prefer the touchpad against the number of all rows of people with disabilities. With two possible preferences, touchpad and trackball, the chance probability would be 1/2. Do not correct for multiple comparisons; consider this a single test on a subset of the data.
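
The exact binomial test described in the two hints above can be sketched in Python as follows. This is an illustration under assumptions: the counts are hypothetical, and the two-sided p-value is defined as the total probability of all outcomes that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def binom_test_two_sided(successes, n, p=0.5):
    """Exact two-sided binomial p-value against chance probability p."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    # Small relative tolerance so ties with the observed probability are included.
    cutoff = pmf[successes] * (1 + 1e-7)
    return min(1.0, sum(q for q in pmf if q <= cutoff))


# Hypothetical subset: 8 of 10 people prefer the touchpad.
p_value = binom_test_two_sided(8, 10)
print(round(p_value, 4))
```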

Conduct a two-sample Chi-square test of proportions on preferences by disability status. To the nearest hundredth (two digits), what is the chi-square statistic?

Perform a two-sample G-test on preferences by disability status. To the nearest hundredth (two digits), what is the G statistic? Hint: Use the RVAideMemoire library and its G.test function.

Perform Fisher’s exact test on preferences by disability status. To the nearest ten-thousandth (four digits), what is the p-value?
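
The three two-sample tests above all work from the same 2×2 table of disability status by preferred device. The Python sketch below shows what each one computes; the table is hypothetical, the function names are invented for this example, and no continuity correction is applied, so the Pearson statistic may differ from output produced with a correction (R's `chisq.test` applies one to 2×2 tables by default).

```python
from math import comb, log


def expected_counts(table):
    """Expected cell counts under independence of rows and columns."""
    row = [sum(r) for r in table]
    col = [sum(c) for c in zip(*table)]
    n = sum(row)
    return [[row[i] * col[j] / n for j in range(len(col))] for i in range(len(row))]


def pearson_chisq(table):
    """Pearson chi-square statistic, without continuity correction."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))


def g_statistic(table):
    """Likelihood-ratio (G) statistic: 2 * sum(O * ln(O / E))."""
    exp = expected_counts(table)
    return 2.0 * sum(o * log(o / e)
                     for obs_row, exp_row in zip(table, exp)
                     for o, e in zip(obs_row, exp_row) if o > 0)


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    cutoff = prob(a) * (1 + 1e-7)
    return min(1.0, sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= cutoff))


# Hypothetical counts: rows = without / with disability, columns = touchpad / trackball.
table = [[3, 1], [1, 3]]
chi2 = pearson_chisq(table)
g = g_statistic(table)
p_fisher = fisher_exact_two_sided(table)
print(round(chi2, 2), round(g, 2), round(p_fisher, 4))
```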

## PSY 223 Final Project Guidelines and Rubric

The final project for this course is the creation of a statistical analysis report. The two research courses (PSY 223 and PSY 224) will demystify statistics and research methods in order to show that they are based on simple principles that apply to situations in the social sciences. In psychology, we need to distinguish what is “real” from what is “not real but looks real.” Is this patient really depressed? Does this form of group treatment of adolescents work better than a different form of treatment?

In this summative assessment, you will choose a scenario from a given set to be the basis for your statistical analysis report. Within the scenario, you will be given a data set based on two groups. You will apply the statistical analysis skills you have learned in this course to interpret the data and write up a report of the results. You will be evaluated not only on your computations but also on your explanation of the interpretation of the data.

The project is divided into three milestones and a final product. The milestones will be submitted at various points throughout the course to scaffold learning and to ensure quality final submissions. These milestones will be submitted in Modules Two, Four, and Five. The final project will be submitted in Module Seven.
In this assignment, you will demonstrate your mastery of the following course outcomes:

Analyze descriptive and inferential statistics for preparing statistically accurate psychological research
Utilize appropriate statistical techniques for computing descriptive statistics and generating graphs regarding statistical analyses of psychological research
Select appropriate statistical procedures for use in statistical analyses regarding psychological research
Interpret the results of statistical analyses of psychological research data for drawing informed conclusions regarding the implications of psychological research
Assess scenarios involving statistical procedures for ensuring alignment with the expectations of the APA Ethical Principles of Psychologists

Find the remaining section of the final paper instructions in the attachment.
1 psy223_final_project_guidelines_and_rubric.pdf

## Estimate the linear effect of dose

Background: This second part of the assignment requires some more theoretical work based on fitting a linear regression model to investigate the effect of three dosage levels on an outcome. Suppose a clinical investigator is interested in examining the effect of increasing doses of vitamin D supplement given to individuals who are vitamin D deficient. She performs a randomised trial in which she allocates volunteers (at random) to three groups, receiving 1000 IU (International Units), 2000 IU and 3000 IU of supplement per day for a period of three months, after which the serum levels of a key metabolite of vitamin D called 25(OH)D are measured in each participant. (N.B. This is a hypothetical scenario based on a real question that is current in epidemiology at the moment.)

Question 1
One possible analysis of the data described is to estimate the linear effect of dose, i.e., to assume a linear relationship between the expected outcome (labelled Y, as usual) and dose level, which for simplicity we will code as X = 1, 2, 3 for doses of 1000 IU, 2000 IU, and 3000 IU respectively. To estimate the average rate of change in Y with dose, we would fit the simple linear regression model with the standard assumptions for the error term:

$$Y_i = B_0 + B_1 x_i + e_i$$

The objective is to show (algebraically) that if the sample size allocation between group 1, group 2 and group 3 is 1:1:4 (i.e. $n_1 = n$, $n_2 = n$, $n_3 = 4n$), then the least squares estimate of $B_1$ is

$$\hat{B}_1 = \frac{4\bar{Y}_3 - 3\bar{Y}_1 - \bar{Y}_2}{7}$$
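
One way the algebra can go is sketched below, under the stated 1:1:4 allocation and the coding X = 1, 2, 3. The shorthand $S_{xx}$ and $S_{xy}$ is introduced here and does not appear in the assignment. The total sample size is $N = 6n$, and the second step uses the fact that $\sum_i (x_i - \bar{x}) = 0$.

$$
\begin{aligned}
\bar{x} &= \frac{n(1) + n(2) + 4n(3)}{6n} = \frac{5}{2}, \\
S_{xx} &= \sum_i (x_i - \bar{x})^2
        = n\left(-\tfrac{3}{2}\right)^2 + n\left(-\tfrac{1}{2}\right)^2 + 4n\left(\tfrac{1}{2}\right)^2
        = \frac{7n}{2}, \\
S_{xy} &= \sum_i (x_i - \bar{x})(Y_i - \bar{Y}) = \sum_i (x_i - \bar{x})\,Y_i
        = -\tfrac{3}{2}\,n\bar{Y}_1 - \tfrac{1}{2}\,n\bar{Y}_2 + \tfrac{1}{2}\,4n\bar{Y}_3
        = \frac{n}{2}\left(4\bar{Y}_3 - 3\bar{Y}_1 - \bar{Y}_2\right), \\
\hat{B}_1 &= \frac{S_{xy}}{S_{xx}} = \frac{4\bar{Y}_3 - 3\bar{Y}_1 - \bar{Y}_2}{7}.
\end{aligned}
$$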

Question 2
The dataset provided contains some simulated data that might have arisen from the study just described, with 15 participants in each of dose groups 1 and 2, and 60 participants in dose group 3. Fit the regression model discussed above and demonstrate that the result obtained for B1 in question 1 holds in this sample.
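
A numerical check of this kind can be sketched in Python. The data below are simulated inside the script with an arbitrary intercept, slope, and noise level; they are not the contents of dosevd_reg_KA.xlsx, and only the group sizes (15, 15, 60) follow the question.

```python
import random


def ols_slope(x, y):
    """Least squares slope of y on x: Sxy / Sxx."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


def group_mean_formula(x, y):
    """(4 * Ybar3 - 3 * Ybar1 - Ybar2) / 7 from the three group means."""
    means = {}
    for g in (1, 2, 3):
        vals = [yi for xi, yi in zip(x, y) if xi == g]
        means[g] = sum(vals) / len(vals)
    return (4 * means[3] - 3 * means[1] - means[2]) / 7


random.seed(0)
sizes = {1: 15, 2: 15, 3: 60}  # the 1:1:4 allocation from the question
x, y = [], []
for dose, size in sizes.items():
    for _ in range(size):
        x.append(dose)
        # Arbitrary simulated outcome: intercept 20, slope 10, noise sd 5.
        y.append(20 + 10 * dose + random.gauss(0, 5))

slope = ols_slope(x, y)
formula = group_mean_formula(x, y)
print(slope, formula)
```

The two printed values should agree up to floating-point rounding, whatever seed is used, because the identity is algebraic and does not depend on the simulated values.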

Dataset for use in this assignment:
dosevd_reg_KA.xlsx

## sample statistics exams - hospital nursing station the following information is available about a patient

Sample statistics exams from my Mathlab Homework Help

1. At a hospital nursing station the following information is available about a patient.
(a) Name: Jim Wood

(b) Age: 17

(c) Weight: 165 lb

(d) Height: 6’1”

(e) Blood type: A

(f) Temperature: 96.8 °F

(g) Condition: Fair

(h) Date of admission: January 21, 1998

(i) Response to treatment: Excellent

For each item (a) to (i), list the highest level of measurement: ratio, interval, ordinal, or nominal.

1. What technique for gathering data (sampling, experiment, simulation or census) do you think was used in each of the following studies?

(a) The manager of an automobile repair shop selects a random sample of service records and records the total amount of time each vehicle was in the facility.

(b) The same manager tests a computerized diagnostic machine by comparing its performance on a random sample of 20 vehicles with the evaluation of a professional mechanic for the same 20 vehicles.

(c) The same manager surveys every customer who has had a car serviced to determine the quality of the customer’s service and the customer’s level of satisfaction with the service.

(d) An automobile manufacturer uses a computer simulation program to test the aerodynamic properties of a proposed new automobile body design.

(e) A service manager uses computer software to simulate a new arrangement of automotive workstations to see if the arrangement will provide more efficient service.

1. Describe how you could use a random number table to simulate the experiment of tossing one die 275
times. The results of tossing a die once can be any of the digits 1, 2, 3, 4, 5, or 6.
2. What technique (observational study or experiment) for gathering data do you think was used in the
following study?

In a national forest, 87 deer were caught, tagged, and then released back into the wild. Two weeks later,
62 deer were caught and 43 were found to have tags. From this, it was possible to estimate how many
deer live in the forest.

1. Ticket sales for cultural attractions in a metropolitan area were as follows (in thousands of tickets) Opera: 10; Theater: 45; Symphony: 30; Ballet: 8; Other: 7.

(a) Make a circle graph for this data.

(b) Why is a circle graph a good choice for this data? What other type of graph would be appropriate?

1. Different types of cameras are available to record memories of family, vacations and special moments. A
survey of 1,000 cameras purchased last year showed that 250 were 35 mm, 260 were disk, 450 were instant and 40 were other types.

(a) Make a bar graph showing the camera types and the volume of sales.

(b) Make a Pareto chart of the same data.

1. The first year students on one floor of a dorm were polled to determine how often they phoned home during the first 8 weeks of the fall term. The results are given below.
2. 8 6 25 4 21 10 1 24 12 4 16
3. 2 12 28 14 17 12 1 16 18 18 3
4. 6 6 12 10 20 9 6 8 6 8 15

(a) Make a stem and leaf display of the data using 2 lines per stem.

(b) What can you say about the frequency of calls home for these first term students?

1. Statistical Abstract of the United States (117th edition) reported the value of computers and peripherals
produced in the United States. The data (in billions of dollars) is as follows: For 1990, 52.6; for 1991, 49.1;
for 1992, 54.7; for 1993, 57.9; for 1994, 65.6; for 1995, 81.0.

(a) Organize the data in a table.

(b) Make a time plot of the data.

1. A survey of students using a new automated telephone registration process identified the following
complaints about the new system. There were 350 complaints about the line being busy. Lack of advising
information produced 100 complaints, difficulty in entering course selection codes correctly produced 35 complaints. Difficulty in changing a previous course selection produced 120 complaints.

(a) Make a Pareto chart of this information.

(b) Based on the chart, what suggestions would you make to improve the system and cut down on the number of complaints?
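
The ordering behind a Pareto chart can be sketched in Python: categories are sorted by descending frequency and a cumulative percentage is tracked alongside. The counts come from the question; the short category labels are paraphrased for this example.

```python
complaints = {
    "Line busy": 350,
    "Lack of advising information": 100,
    "Entering course codes": 35,
    "Changing a course selection": 120,
}

total = sum(complaints.values())
# A Pareto chart lists categories from most to least frequent.
ordered = sorted(complaints.items(), key=lambda item: item[1], reverse=True)

pareto = []
running = 0
for label, count in ordered:
    running += count
    pareto.append((label, count, 100.0 * running / total))

for label, count, cumulative in pareto:
    print(f"{label:30s} {count:4d} {cumulative:6.1f}%")
```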

1. If you are creating a frequency polygon based on some data in a frequency table, what information from the frequency table would you use to plot the point representing each class of data?

A. the lower class limit and the class frequency
B. the lower class boundary and the class frequency
C. the class midpoint and class width
D. the class frequency and the upper class boundary
E. the class frequency and the class midpoint

1. Which of the five choices below describes a feature that is true about Pareto charts?

A. The bars in the graph are always displayed in descending order of height from left to right.
B. The bars in the graph can be vertical or horizontal.
C. The bars in the graph always touch.
D. The intervals on the horizontal axis represent equal units of time.
E. Each data value is broken into two parts.

1. Statistical Abstracts (117th edition) reports gasoline excise taxes, in cents per gallon, in the west (mountain region) as follows:

28 26 9 22 19 18 19 24

Find the mean, the median, and the mode of these taxes.
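
A quick way to check a hand calculation for this question is Python's standard `statistics` module, as in this short sketch:

```python
import statistics

# Gasoline excise taxes in cents per gallon, from the question.
taxes = [28, 26, 9, 22, 19, 18, 19, 24]

mean_tax = statistics.mean(taxes)      # arithmetic mean
median_tax = statistics.median(taxes)  # average of the two middle values (n is even)
mode_tax = statistics.mode(taxes)      # most frequent value

print(mean_tax, median_tax, mode_tax)
```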

1. In the process of tuna fishing, porpoises are sometimes accidentally caught and killed. A U. S. oceanographic institute wants to study the number of porpoises killed in this way. Records from eight commercial tuna fishing fleets gave the following information about the number of porpoises killed in a
three month period:
2. 6 18 9 0 15 3 10
(a) Find the range.
(b) Find the sample mean.
(c) Find the sample standard deviation.
3. According to data provided by the Statistical Abstract of the United States (117th edition), the number of daily newspapers in the five states in the Midwest has a mean of x̄ = 30.8 with standard deviation s = 38.3. For five states in the Pacific region, the mean is x̄ = 61.4 with standard deviation s = 19.07.

(a) Compute the coefficient of variation for each region.

(b) Which region has the greater variation in the number of newspapers?
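
The coefficient of variation expresses the standard deviation as a percentage of the mean. A minimal Python sketch using the figures quoted in the question (the function name is made up for this example):

```python
def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100.0


cv_midwest = coefficient_of_variation(30.8, 38.3)
cv_pacific = coefficient_of_variation(61.4, 19.07)
print(round(cv_midwest, 1), round(cv_pacific, 1))
```

The region with the larger percentage has the greater relative variation.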

1. From years of experience fishing for trout in the Yellowstone River you know that the mean length of trout you catch is 14.7 inches with standard deviation 1.5 inches.

(a) Use Chebyshev’s Theorem to find an interval for the lengths of trout which will contain the lengths of at least 75% of the fish you catch.

(b) Use Chebyshev’s Theorem to find an interval for the lengths of trout which will contain the lengths of at least 93.8% of the fish you catch.
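
Chebyshev's Theorem says that at least 1 − 1/k² of the data lie within k standard deviations of the mean, for any k > 1. The sketch below assumes the usual textbook choices k = 2 (at least 75%) and k = 4 (at least 93.75%, which rounds to 93.8%); the function name is invented for this example.

```python
def chebyshev_interval(mean, sd, k):
    """Interval mean +/- k*sd and the minimum share of data it must contain."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem requires k > 1")
    coverage = 1.0 - 1.0 / k**2
    return mean - k * sd, mean + k * sd, coverage


low_a, high_a, cover_a = chebyshev_interval(14.7, 1.5, 2)
low_b, high_b, cover_b = chebyshev_interval(14.7, 1.5, 4)
print(low_a, high_a, cover_a)
print(low_b, high_b, cover_b)
```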

1. A study was done showing the age distribution of people doing volunteer work for a random sample of 545 volunteers.

| Age | 14–17 | 18–24 | 25–44 | 45–64 | 65–80 |
|---|---|---|---|---|---|
| Frequency | 142 | 125 | 72 | 124 | 82 |

(a) Estimate the sample mean age of volunteers.

(b) Estimate the sample standard deviation.
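
For grouped data, the usual estimate treats every observation in a class as if it sat at the class midpoint. A minimal Python sketch of that approach, using the class limits and frequencies from the question:

```python
import math

# (lower limit, upper limit, frequency) for each age class in the question.
classes = [(14, 17, 142), (18, 24, 125), (25, 44, 72), (45, 64, 124), (65, 80, 82)]

n = sum(f for _, _, f in classes)
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

# Grouped mean: frequency-weighted average of the class midpoints.
mean_age = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Grouped sample variance: weighted squared deviations divided by n - 1.
variance = sum(f * (x - mean_age) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
sd_age = math.sqrt(variance)

print(n, round(mean_age, 2), round(sd_age, 2))
```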

1. In the French class at Eva College a standard weighting is given to the required activities in all sections. These weights are: Final exam: 40%; Midterm: 30%; Attendance: 10%; Language Lab: 20%. Each of the four activities is graded on a 100 point scale. George earned 93 points on the final, 82 points on the
midterm, 75 points on attendance and 80 points on language lab. Compute his overall average in his French
class.
2. In one personality assessment test, a group of questions relate to self-acceptance. A random sample of 15 scores on the self-acceptance portion are
5 20 22 27 30 17 12 15
3. 9 18 13 12 28 19

(a) Compute the five-number summary and the interquartile range.

(b) Make a box-and-whisker plot.
