Posts under category statistics homework help

Doing Oneway Repeated Measures ANOVAs

Qn1. Download the file websearch2.csv from the course materials. This file describes a study in which subjects were asked to find 100 distinct facts on the web using different search engines. The number of searches required and a subjective effort rating for each search engine were recorded. How many subjects took part in this experiment?

Qn2. To the nearest hundredth (two digits), what was the average number of searches required for the search engine that had the greatest average overall?

Qn3. Conduct an order effect test on Searches using a paired-samples t-test assuming equal variances. To the nearest ten-thousandth (four digits), what is the p-value from such a test? Hint: Use the reshape2 library and the dcast function to create a wide-format table with columns for each level of Order.

Qn4. Conduct a paired-samples t-test, assuming equal variances, on Searches by Engine. To the nearest hundredth (two digits), what is the absolute value of the t statistic for such a test? Hint: Use the reshape2 library and the dcast function to create a wide-format table with columns for each level of Engine.
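The wide-format step and the paired t statistic described in Qn3 and Qn4 can be sketched as follows. This is a minimal illustration in Python rather than the R functions the hints name, and the rows in `long_rows` are made-up values, not the contents of websearch2.csv.

```python
import math
import statistics

# Hypothetical long-format rows: (Subject, Engine, Searches).
long_rows = [
    (1, "A", 120), (1, "B", 131),
    (2, "A", 115), (2, "B", 118),
    (3, "A", 142), (3, "B", 150),
    (4, "A", 128), (4, "B", 125),
    (5, "A", 133), (5, "B", 141),
]

# Pivot to wide format: one entry per subject, one key per Engine level.
wide = {}
for subject, engine, searches in long_rows:
    wide.setdefault(subject, {})[engine] = searches

# A paired test works on the within-subject differences.
diffs = [row["A"] - row["B"] for row in wide.values()]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)            # sample SD, n - 1 denominator
t_stat = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1

print(f"t({df}) = {t_stat:.2f}, |t| = {abs(t_stat):.2f}")
```

The p-value needs the t distribution with `df` degrees of freedom, which the Python standard library does not provide, so this sketch stops at the statistic.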

Qn5. Conduct a nonparametric Wilcoxon signed-rank test on the Effort Likert-type ratings. Calculate an exact p-value. To the nearest ten-thousandth (four digits), what is the p-value from such a test? Hint: Use the coin library and its wilcoxsign_test function with distribution = “exact”.

Qn6. Download the file websearch3.csv from the course materials. This file describes a study just like the one from websearch2.csv, except that now three search engines were used instead of two. Once again, the number of searches required and a subjective effort rating for each search engine were recorded. How many subjects took part in this new experiment?

Qn7. To the nearest hundredth (two digits), what was the average number of searches required for the search engine that had the greatest average overall?

Qn8. Conduct a repeated measures ANOVA to determine if there was an order effect on Searches. First determine whether there is a violation of sphericity. To the nearest ten-thousandth (four digits), what is the value of Mauchly’s W criterion? Hint: Use the ez library and its ezANOVA function passing within=Order, among other things, to test for order effects.

Qn9. Interpret the result of Mauchly’s test of sphericity, and then interpret the appropriate repeated measures ANOVA result. To the nearest ten-thousandth (four digits), what is the p-value from the appropriate F-test?

Qn10. Conduct a repeated measures ANOVA on Searches by Engine. First determine whether there is a violation of sphericity. To the nearest ten-thousandth (four digits), what is the value of Mauchly’s W criterion? Hint: Use the ez library and its ezANOVA function passing within=Engine, among other things, to test for a significant main effect.

Qn11. Interpret the result of Mauchly’s test of sphericity, and then interpret the appropriate repeated measures ANOVA result. To the nearest ten-thousandth (four digits), what is the p-value from the appropriate F-test?
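For readers who want to see what the F-test in a one-way repeated measures ANOVA is built from, here is a minimal sketch in Python (not the ezANOVA call the hints refer to). The `scores` matrix is hypothetical, and the sketch does not compute Mauchly's W or any sphericity correction.

```python
# Hypothetical data: each row is one subject, each column one condition.
scores = [
    [10, 12, 15],
    [8, 11, 13],
    [12, 13, 18],
    [9, 10, 14],
]
n = len(scores)       # number of subjects
k = len(scores[0])    # number of within-subject conditions

grand_mean = sum(sum(row) for row in scores) / (n * k)
cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
subj_means = [sum(row) / k for row in scores]

# Partition the total sum of squares.
ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
ss_cond = n * sum((m - grand_mean) ** 2 for m in cond_means)
ss_subj = k * sum((m - grand_mean) ** 2 for m in subj_means)
ss_error = ss_total - ss_cond - ss_subj   # condition-by-subject residual

df_cond = k - 1
df_error = (k - 1) * (n - 1)
f_stat = (ss_cond / df_cond) / (ss_error / df_error)

print(f"F({df_cond}, {df_error}) = {f_stat:.2f}")
```

Removing the between-subject sum of squares from the error term is what distinguishes this from a between-subjects one-way ANOVA on the same numbers.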

Qn12. Strictly speaking, given the result of the repeated measures ANOVA examining Searches by Engine, are post hoc pairwise comparisons among levels of Engine warranted?

-Yes
-No

Qn13. Whatever your previous answer, proceed to do post hoc pairwise comparisons. Conduct manual pairwise comparisons of searches among levels of engine using paired-samples t-tests, assuming equal variances and using Holm’s sequential Bonferroni procedure to correct for multiple comparisons. To the nearest ten-thousandths (four digits), what is the smallest corrected p-value resulting from this set of tests? Hint: use the reshape2 library and dcast function to create wide-format table.
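Holm's sequential Bonferroni procedure, mentioned in Qn13, can be sketched as below. The function name `holm_adjust` and the three raw p-values are illustrative assumptions. In R the same adjustment is available through p.adjust with method = "holm".

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.030, 0.004, 0.020]   # hypothetical p-values from three pairwise tests
adj = holm_adjust(raw)
print([round(p, 4) for p in adj])
```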

Qn14. Conduct a nonparametric Friedman test on the Effort Likert-type ratings. Calculate an asymptotic p-value. To the nearest ten-thousandth (four digits), what is the chi-square statistic from such a test? Hint: Use the coin library and the friedman_test function.
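The Friedman chi-square statistic can be sketched by hand as follows. This is a simplified Python illustration on hypothetical ratings with no ties within a subject. It applies no tie correction, so on real Likert-type data with ties its result may differ from what a library routine reports.

```python
import math

# Hypothetical ratings: each row is one subject, each column one engine.
ratings = [
    [1, 3, 5],
    [2, 4, 6],
    [3, 2, 7],
    [1, 4, 5],
    [2, 3, 6],
]
n = len(ratings)       # subjects
k = len(ratings[0])    # conditions


def within_subject_ranks(row):
    """Rank one subject's values from 1 to k (assumes no ties)."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0] * len(row)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks


ranked = [within_subject_ranks(row) for row in ratings]
rank_sums = [sum(col) for col in zip(*ranked)]

chi_sq = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
df = k - 1

# Closed form for the chi-square upper tail, valid only when df == 2.
p_asymptotic = math.exp(-chi_sq / 2)

print(f"chi-square({df}) = {chi_sq:.2f}, p = {p_asymptotic:.4f}")
```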

Qn15. Strictly speaking, given the result of the Friedman test examining Effort by Engine, are post hoc pairwise comparisons among levels of Engine warranted?

-Yes
-No

Qn16. Whatever your previous answer, proceed to do post hoc pairwise comparisons. Conduct manual pairwise comparisons of Effort among levels of Engine using Wilcoxon signed-rank tests, using Holm’s sequential Bonferroni procedure to correct for multiple comparisons. To the nearest ten-thousandth (four digits), what is the smallest corrected p-value resulting from this set of tests? Hint: Use the reshape2 library and dcast function to create a wide-format table. Then use the wilcox.test function with paired=TRUE (and to avoid warnings, exact = FALSE).


MEDICARE OVERBILLING ANALYSIS

Your company is running a Medicare audit on Sleaze Hospital. Because Sleaze has a history of overbilling, the focus of your audit is on checking whether the billing amounts are correct. Assume that each invoice is for too high an amount with probability 0.06 and for too low an amount with probability 0.01 (so that the probability of a correct billing is 0.93). Also, assume that the outcome for any invoice is probabilistically independent of the outcomes for other invoices.

For this Assignment, reflect on the case presented. Think about what strategies you might use to calculate associated probabilities for Sleaze Hospital, and then address the series of questions for the completion of the Assignment.

THE ASSIGNMENT: (3–5 PAGES)

If you randomly sample 200 of Sleaze's invoices, what is the probability that you will find at least 15 invoices that overcharge the customer? What is the probability you won't find any that undercharge the customer?
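Under the stated independence assumption, the number of overcharging invoices in a sample of 200 is binomial with p = 0.06, and the number of undercharging invoices is binomial with p = 0.01. A minimal Python sketch of the two probabilities follows. The assignment itself asks for Excel's BINOMDIST, and the helper names here are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_least(k, n, p):
    """P(X >= k), computed as 1 - P(X <= k - 1)."""
    return 1.0 - sum(binom_pmf(i, n, p) for i in range(k))


n = 200
p_over = 0.06     # probability an invoice overcharges
p_under = 0.01    # probability an invoice undercharges

p_at_least_15_over = prob_at_least(15, n, p_over)
p_no_under = binom_pmf(0, n, p_under)   # same as (1 - p_under) ** n

print(f"P(at least 15 overcharges) = {p_at_least_15_over:.4f}")
print(f"P(no undercharges)         = {p_no_under:.4f}")
```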

Find an integer, k, such that the probability is at least 0.99 that you will find at least k invoices that overcharge the customer. (Hint: Use trial and error with the BINOMDIST function to find k.)
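The trial-and-error search for k can also be automated. The sketch below, in Python rather than Excel, looks for the largest k whose probability P(X >= k) is still at least 0.99. Any smaller k also satisfies the condition, so treating the largest one as the intended answer is an assumption.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))


n, p, target = 200, 0.06, 0.99

# Increase k while the next value still meets the target probability.
k = 0
while prob_at_least(k + 1, n, p) >= target:
    k += 1

print(f"largest k with P(X >= k) >= {target}: {k}")
print(f"P(X >= {k})     = {prob_at_least(k, n, p):.4f}")
print(f"P(X >= {k + 1}) = {prob_at_least(k + 1, n, p):.4f}")
```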

Suppose that when Sleaze overcharges Medicare, the distribution of the amount overcharged (expressed as a percentage of the correct billing amount) is normally distributed with mean 15% and standard deviation 4%.

What percentage of overbilled invoices are at least 10% more than the legal billing amount?

What percentage of all invoices are at least 10% more than the legal billing amount?

If your auditing company samples 200 randomly chosen invoices, what is the probability that it will find at least five where Medicare was overcharged by at least 10%?
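The three questions above chain together: a normal tail probability, then a product with the overbilling probability, then a binomial tail. A minimal Python sketch under the assumptions stated in the case (mean 15%, standard deviation 4%, overbilling probability 0.06, independent invoices) could look like this; the variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Overcharge size, as a percentage of the correct amount, given an overcharge.
overcharge_pct = NormalDist(mu=15, sigma=4)

# P(overcharge >= 10% | invoice is overbilled)
p_big_given_over = 1.0 - overcharge_pct.cdf(10)

# P(invoice is overbilled by at least 10%), across all invoices
p_over = 0.06
p_big = p_over * p_big_given_over

# P(at least 5 such invoices in a sample of 200)
n = 200
p_at_least_5 = 1.0 - sum(
    comb(n, i) * p_big ** i * (1 - p_big) ** (n - i) for i in range(5)
)

print(f"share of overbilled invoices at or above 10%: {p_big_given_over:.4f}")
print(f"share of all invoices at or above 10%:        {p_big:.4f}")
print(f"P(at least 5 in a sample of 200):             {p_at_least_5:.4f}")
```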

Submit your answers and embedded Excel analysis as a Microsoft Word management report.

LEARNING RESOURCES

Albright, S. C., & Winston, W. L. (2017). Business analytics: Data analysis and decision making (6th ed.). Stamford, CT: Cengage Learning.

Chapter 4, "Probability and Probability Distributions"
Chapter 5, "Normal, Binomial, Poisson, and Exponential Distributions"

Fulton, L. V., Mendez, F. A., Bastian, N. D., & Musal, R. M. (2012). Confusion between odds and probability, a pandemic? Journal of Statistics Education, 20(3), 1–20.
Note: Retrieved from the Walden Library databases.

Microsoft. (2016). Statistical functions (reference). Retrieved from https://support.office.com/en-us/article/Statistical-functions-reference-624dac86-a375-4435-bc25-76d659719ffd

DECISION MAKING UNDER UNCERTAINTY—BIOTECHNICAL ENGINEERING

As you have examined this week, healthcare administration leaders are expected to exercise decision making under conditions of uncertainty. Perhaps more so than in any other business, healthcare administration leaders face multiple challenges, since ineffective business practices might not only result in poor bottom-line performance but might also negatively impact patient safety. Understanding how to appropriately exercise decision making under conditions of uncertainty is a useful skill for effective healthcare administration practice.

For this Assignment, review the resources for this week, and reflect on how healthcare administration leaders must exercise decision making under conditions of uncertainty. Consider how you might engage in decision making under uncertainty, as you complete the Assignment and the Case Study 6.4 on pages 275-276 of your course text. Note: You will need to use Excel and the textbook add-in, "Precision Tree."

Alternatively, you may also download the PrecisionTree software as a Free Trial by accessing the following:

http://www.palisade.com/precisiontree/

You will need to fill out the information presented and will need to use a personal email address to use the Free Trial provided.

CASE 6.4 DEVELOPING A HELICOPTER COMPONENT FOR THE ARMY

The Ventron Engineering Company has just been awarded a $2 million development contract by the U.S. Army Aviation Systems Command to develop a blade spar for its Heavy Lift Helicopter program. The blade spar is a metal tube that runs the length of and provides strength to the helicopter blade. Due to the unusual length and size of the Heavy Lift Helicopter blade, Ventron is unable to produce a single-piece blade spar of the required dimensions using existing extrusion equipment and material. The engineering department has prepared two alternatives for developing the blade spar: (1) sectioning or (2) an improved extrusion process. Ventron must decide which process to use. (Backing out of the contract at any point is not an option.) The risk report has been prepared by the engineering department. The information from this report is explained next.

The sectioning option involves joining several shorter lengths of extruded metal into a blade spar of sufficient length. This work will require extensive testing and rework over a 12-month period at a total cost of $1.8 million. Although this process will definitely produce an adequate blade spar, it merely represents an extension of existing technology. To improve the extrusion process, on the other hand, it will be necessary to perform two steps: (1) improve the material used, at a cost of $300,000, and (2) modify the extrusion press, at a cost of $960,000. The first step will require six months of work, and if this first step is successful, the second step will require another six months of work. If both steps are successful, the blade spar will be available at that time, that is, a year from now. The engineers estimate that the probabilities of succeeding in steps 1 and 2 are 0.9 and 0.75, respectively. However, if either step is unsuccessful (which will be known only in six months for step 1 and in a year for step 2), Ventron will have no alternative but to switch to the sectioning process—and incur the sectioning cost on top of any costs already incurred.

Development of the blade spar must be completed within 18 months to avoid holding up the rest of the contract. If necessary, the sectioning work can be done on an accelerated basis in a six-month period, but the cost of sectioning will then increase from $1.8 million to $2.4 million. The director of engineering, Dr. Smith, wants to try developing the improved extrusion process. He reasons that this is not only cheaper (if successful) for the current project, but its expected side benefits for future projects could be sizable. Although these side benefits are difficult to gauge, Dr. Smith’s best guess is an additional $2 million. (These side benefits are obtained only if both steps of the modified extrusion process are completed successfully.)

a. Develop a decision tree to maximize Ventron’s EMV. This includes the revenue from this project, the side benefits (if applicable) from an improved extrusion process, and relevant costs. You don’t need to worry about the time value of money; that is, no discounting or net present values are required. Summarize your findings in words in the spreadsheet.

b. What value of side benefits would make Ventron indifferent between the two alternatives?

c. How much would Ventron be willing to pay, right now, for perfect information about both steps of the improved extrusion process? (This information would tell Ventron, right now, the ultimate success or failure outcomes of both steps.)
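Parts (a) and (b) can be sketched without PrecisionTree by rolling the tree back by hand. The Python below rests on one reading of the case that should be checked against your own tree: a step 1 failure (known at month 6) leaves time for regular sectioning at $1.8 million, while a step 2 failure (known at month 12) forces accelerated sectioning at $2.4 million. Function and constant names are illustrative.

```python
REVENUE = 2_000_000
SECTION_COST = 1_800_000        # regular 12-month sectioning
SECTION_COST_FAST = 2_400_000   # accelerated 6-month sectioning
MATERIAL_COST = 300_000         # step 1: improve the material
PRESS_COST = 960_000            # step 2: modify the extrusion press
P_STEP1 = 0.9
P_STEP2 = 0.75


def emv_sectioning():
    """Sectioning is certain to work, so its EMV is a single payoff."""
    return REVENUE - SECTION_COST


def emv_extrusion(side_benefit):
    """Roll back the extrusion branch for a given side-benefit value."""
    both_succeed = REVENUE + side_benefit - MATERIAL_COST - PRESS_COST
    # Assumption: only 6 months remain, so sectioning must be accelerated.
    step2_fails = REVENUE - MATERIAL_COST - PRESS_COST - SECTION_COST_FAST
    # Assumption: 12 months remain, so regular sectioning still fits.
    step1_fails = REVENUE - MATERIAL_COST - SECTION_COST
    after_step1 = P_STEP2 * both_succeed + (1 - P_STEP2) * step2_fails
    return P_STEP1 * after_step1 + (1 - P_STEP1) * step1_fails


emv_sec = emv_sectioning()
emv_ext = emv_extrusion(2_000_000)

# EMV of extrusion is linear in the side benefit with slope P_STEP1 * P_STEP2,
# so the indifference point solves emv_extrusion(b) == emv_sectioning().
break_even = (emv_sec - emv_extrusion(0)) / (P_STEP1 * P_STEP2)

print(f"EMV sectioning: {emv_sec:,.0f}")
print(f"EMV extrusion:  {emv_ext:,.0f}")
print(f"Break-even side benefit: {break_even:,.2f}")
```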

Doing Tests of Proportions Quiz

Qn1. Download the file deviceprefs.csv from the course materials. This file describes a study in which people with and without disabilities indicated their preferences for touchpads or trackballs as computer input devices. You will use R to analyze this file to answer the questions in this quiz. With this and every quiz in this course, you can find what you need by understanding and mimicking coursera.R, the R code file used in the lecture. This first question gives you credit for getting R, RStudio, and deviceprefs.csv ready to go. Are you ready to proceed?

Qn2. How many subjects’ preferences were recorded?
Note: For this and every other quiz in this course, when you miss a question, code will be revealed to help you. However, you cannot usually just copy the code verbatim. For example, the variable “df” is used throughout these code snippets to refer to the “data frame”, the term R uses for the variable that holds the .csv file that you read in. If you read in your .csv file into a variable with a name other than “df”, then you will need to use your variable name, not “df”. Similarly, the variable “m” is used in these code snippets to hold a fitted statistical model. If you use a variable name other than “m” for your model, you will need to change “m” to be your variable name.
Be sure that when you copy the code provided for missed questions, you understand that code by looking up the documentation for the R functions used. You can do that after loading a function’s library with the question mark operator. For example, if the function is “foo” then you would do:
?foo
As stated, this only works if the library that defines “foo” is loaded. You load libraries into memory with the library command:
library(foolib)
And this won’t work unless the “foolib” package is installed. You install packages using install.packages, which brings the package files onto your computer and should only have to be done once:
install.packages("foolib")
Does the data table indicate a one-sample proportion or a two-sample proportion?
As described, the data table shows input device preferences of certain people with and without disability. How many subjects have a disability?
Ignoring for a moment disability status, perform a one-sample chi-square test to see whether the proportion of subjects who preferred the trackball (or touchpad) differed significantly from chance. To the nearest hundredth (two digits), what is the chi-square statistic? Hint: Note that this question is not asking for the p-value!
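The one-sample chi-square statistic compares observed counts with the counts expected under chance. A minimal Python sketch follows, using made-up counts rather than the contents of deviceprefs.csv.

```python
from math import erfc, sqrt

# Hypothetical preference counts, ignoring disability status.
observed = {"touchpad": 12, "trackball": 18}

total = sum(observed.values())
expected = total / len(observed)   # equal split under chance

chi_sq = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1

# Upper-tail probability of a chi-square variable; this form holds only for df == 1.
p_value = erfc(sqrt(chi_sq / 2))

print(f"chi-square({df}) = {chi_sq:.2f}, p = {p_value:.4f}")
```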

For people without disabilities, perform a binomial test to see whether their preference for touchpads differed significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: Run a binomial test comparing the sum of rows of people without disabilities who prefer the touchpad against the number of all rows of people without disabilities. With two possible preferences, touchpad and trackball, the chance probability would be 1/2. Do not correct for multiple comparisons; consider this a single test on a subset of the data.

For people with disabilities, perform a binomial test to see whether their preference for touchpads differed significantly from chance. To the nearest ten-thousandth (four digits), what is the p-value? Hint: Run a binomial test comparing the sum of rows of people with disabilities who prefer the touchpad against the number of all rows of people with disabilities. With two possible preferences, touchpad and trackball, the chance probability would be 1/2. Do not correct for multiple comparisons; consider this a single test on a subset of the data.
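The two binomial tests above share one calculation: given a count of touchpad preferences out of a subgroup total, sum the probabilities of all outcomes that are no more likely than the observed one. A minimal Python sketch with hypothetical counts follows; the function name is illustrative.

```python
from math import comb


def binom_test_two_sided(successes, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed count under the null probability p.
    """
    pmf = [comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[successes]
    # A small relative tolerance guards against floating-point near-ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical subgroup: 3 of 12 people prefer the touchpad.
p_value = binom_test_two_sided(3, 12)
print(f"p = {p_value:.4f}")
```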

Conduct a two-sample Chi-square test of proportions on preferences by disability status. To the nearest hundredth (two digits), what is the chi-square statistic?

Perform a two-sample G-test on preferences by disability status. To the nearest hundredth (two digits), what is the G statistic? Hint: Use the RVAideMemoire library and its G.test function.
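The Pearson chi-square and G statistics for a two-way table are both built from the same expected counts. The Python sketch below uses a made-up 2x2 table and applies no continuity correction, whereas R's chisq.test applies Yates' correction to 2x2 tables by default, so the two need not agree.

```python
from math import log

# Hypothetical counts: rows are disability status, columns are device preference.
table = [
    [10, 20],
    [25, 5],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

cells = [
    (o, e)
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
]

chi_sq = sum((o - e) ** 2 / e for o, e in cells)
g_stat = 2 * sum(o * log(o / e) for o, e in cells if o > 0)
df = (len(table) - 1) * (len(table[0]) - 1)

print(f"Pearson chi-square({df}) = {chi_sq:.2f}")
print(f"G({df}) = {g_stat:.2f}")
```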
Perform Fisher’s exact test on preferences by disability status. To the nearest ten-thousandth (four digits), what is the p-value?
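For a 2x2 table, Fisher's exact test conditions on the margins and sums hypergeometric probabilities. A minimal Python sketch of the two-sided version follows, using a small made-up table; the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more likely than the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


p_value = fisher_exact_two_sided(3, 1, 1, 3)   # hypothetical counts
print(f"p = {p_value:.4f}")
```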

Estimate the linear effect of dose

Background: This second part of the assignment requires some more theoretical work based on fitting a linear regression model to investigate the effect of three dosage levels on an outcome. Suppose a clinical investigator is interested in examining the effect of increasing doses of vitamin D supplement given to individuals who are vitamin D deficient. She performs a randomised trial in which she allocates (at random) volunteers to three groups, 1000 IU (International Units), 2000 IU and 3000 IU of supplement per day for a period of three months, after which the serum levels of a key metabolite of vitamin D called 25(OH)D are measured in each participant. (N.B. This is a hypothetical scenario based on a real question that is current in epidemiology at the moment.)

Question 1
One possible analysis of the data described is to estimate the linear effect of dose, i.e. to assume a linear relationship of expected outcome (labelled Y, as usual) to dose level, which for simplicity we will represent as X = 1, 2, 3, representing doses 1000 IU, 2000 IU, and 3000 IU respectively. To estimate the average rate of change in Y with dose we would fit the simple linear regression model with the standard assumptions for the error term:

$$Y_i = B_0 + B_1 x_i + e_i$$

The objective is to show (algebraically) that if the sample size allocation between group 1, group 2 and group 3 is 1:1:4 (i.e. n1 = n, n2 = n, n3 = 4n), then the least squares estimate of B1 is

$$\hat{B}_1 = \frac{4\bar{Y}_3 - 3\bar{Y}_1 - \bar{Y}_2}{7}$$

where $\bar{Y}_j$ denotes the sample mean of Y in group j.
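One way to organise the algebra is sketched below, using the standard least squares slope formula and writing $\bar{Y}_j$ for the sample mean of Y in group j. The symbols $\bar{x}$, $S_{xx}$ and $S_{xy}$ are notation introduced here for the sketch and do not appear in the question.

```latex
\begin{align*}
\bar{x} &= \frac{n(1) + n(2) + 4n(3)}{6n} = \frac{5}{2} \\[4pt]
S_{xx} &= \sum_i (x_i - \bar{x})^2
        = n\left(1 - \tfrac{5}{2}\right)^2 + n\left(2 - \tfrac{5}{2}\right)^2
          + 4n\left(3 - \tfrac{5}{2}\right)^2
        = \frac{7n}{2} \\[4pt]
S_{xy} &= \sum_i (x_i - \bar{x})\,Y_i
        = n\left(-\tfrac{3}{2}\right)\bar{Y}_1 + n\left(-\tfrac{1}{2}\right)\bar{Y}_2
          + 4n\left(\tfrac{1}{2}\right)\bar{Y}_3
        = \frac{n}{2}\left(4\bar{Y}_3 - 3\bar{Y}_1 - \bar{Y}_2\right) \\[4pt]
\hat{B}_1 &= \frac{S_{xy}}{S_{xx}}
          = \frac{4\bar{Y}_3 - 3\bar{Y}_1 - \bar{Y}_2}{7}
\end{align*}
```

The second step uses the fact that $\sum_i (x_i - \bar{x})(Y_i - \bar{Y}) = \sum_i (x_i - \bar{x})Y_i$, because the deviations $x_i - \bar{x}$ sum to zero.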

Question 2
The dataset provided contains some simulated data that might have arisen from the study just described, with 15 participants in each of dose groups 1 and 2, and 60 participants in dose group 3. Fit the regression model discussed above and demonstrate that the result obtained for B1 in Question 1 is true in this sample.
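The kind of check Question 2 asks for can be sketched in Python on simulated values. The data below are generated in the code with an arbitrary seed and arbitrary coefficients. They are not the contents of dosevd_reg_KA.xlsx, so only the agreement between the two slopes carries over, not the numbers themselves.

```python
import random

random.seed(1)

# Group sizes in the 1:1:4 allocation described in the question.
group_sizes = {1: 15, 2: 15, 3: 60}

# Simulated outcomes; the intercept, slope and noise level are arbitrary.
x, y = [], []
for dose, size in group_sizes.items():
    for _ in range(size):
        x.append(dose)
        y.append(20 + 8 * dose + random.gauss(0, 5))

# Ordinary least squares slope: S_xy / S_xx.
n_total = len(x)
x_bar = sum(x) / n_total
y_bar = sum(y) / n_total
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope_ols = s_xy / s_xx

# Slope from the closed-form expression in Question 1.
group_mean = {
    dose: sum(yi for xi, yi in zip(x, y) if xi == dose) / size
    for dose, size in group_sizes.items()
}
slope_formula = (4 * group_mean[3] - 3 * group_mean[1] - group_mean[2]) / 7

print(f"OLS slope:     {slope_ols:.6f}")
print(f"Formula slope: {slope_formula:.6f}")
```

The two slopes agree up to floating-point rounding whatever values are simulated, because the identity depends only on the 1:1:4 allocation.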

Dataset for use in this assignment:
dosevd_reg_KA.xlsx