StataQuest^{TM} Practice Set Instructions and Answers
For use with Quick Notes Statistics, although instructions may be used with any statistics book. Special Note: These instructions were prepared for use with StataQuest 4.0. People using other versions should be prepared to make minor adjustments to these directions.
StataQuest^{TM} Generic Instructions may also be helpful to new users. 
Part I Descriptive Statistics 
Problem I message: StataQuest will do an array. A class width should be done by hand. Those using Quick's data files should load stata006 by choosing File and Open. Use the down arrow next to the Drives box to locate and click on a:. Double click on datastat in the Folders box. Files stored on the A drive will appear. Load the page 6 data by double clicking on stata006. Others must create a data file for CD Sales. You may want to review creating a data file on page 1 of Quick Start for StataQuest. I named my variable CDSales. When saving this data file, I named it stata006 for the software being used and the problem page number. All StataQuest users should choose Editor and Sort. See pages PS 6 and 7 for the answer.
Problem II message: StataQuest will do an ungrouped frequency distribution. Those using Quick's data files should load stata006 by choosing File and Open. Use the scroll bar next to the Drives box to locate and click on a:. Files stored on the A drive will appear. Load stata006 with a double click. All StataQuest users should choose Summaries, Tables and One-way. Highlight CDSales and choose OK. Absolute, relative and cumulative frequency distributions are presented. The StataQuest ungrouped answer differs from Quick's grouped answer. Some of these questions must be done by hand.
All StataQuest users should choose Graphs from the main menu. Choose One variable, Histogram, and Continuous. Highlight CDSales, set the range from 5 to 30, choose the bull's-eye next to Histogram, enter 5 next to bins, check overlay a normal curve, and choose Draw. StataQuest's graphs are similar to Quick's. Some of these graphs must be done by hand.
Saving and Printing graphs is easy. To save the current graph choose File and Save Graph. Use the scroll bar to set Save file as type to StataQuest Graph (*.gph). Type stata006 in the File name box, choose OK, and StataQuest will add .gph as an extension. The file stata006.gph will be stored on the A drive. This graph can be printed by choosing File and Print Graph. To use a saved graph choose Graphs and View saved graphs.
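The absolute, relative and cumulative columns of the one-way table described above can also be checked away from StataQuest. The short Python sketch below is only an illustration of the arithmetic, not StataQuest output. The function name frequency_table and the cd_sales values are made up for this example; they are not the page 6 CD Sales data.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, absolute, relative, cumulative relative)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value], counts[value] / n, running / n))
    return rows

# Hypothetical data, NOT the CD Sales data from page 6.
cd_sales = [5, 10, 10, 15, 15, 15, 20, 30]

for value, freq, rel, cum in frequency_table(cd_sales):
    print(f"{value:>5} {freq:>5} {rel:>8.3f} {cum:>8.3f}")
```

Each row lists a distinct value, how often it occurs, its share of all cases, and the running share up to and including that value.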
Chapters 3 and 4 Measuring Central Tendency and Dispersion of Ungrouped Data
Problems message: StataQuest will calculate many of these statistics. Those using Quick's data files should load stata006 by choosing File and Open. Use the scroll arrow next to the Drives box to locate and double click on a:. Load by double clicking on stata006. Others should load their page 6 data file. All StataQuest users should choose Summaries. Choose Means and SDs, highlight CDSales, and choose OK. Study the result. Choose Summaries again. Choose Median/Percentiles, highlight CDSales, and choose OK. Study the result. Quick Notes Statistics has explored or will explore many of these statistics. StataQuest will not calculate an individual percentile. Compare StataQuest answers with those in Quick on pages PS 12-13 and PS 18-19. Where answers differ, it is because of rounding.
Practice Set 3 answers: 1A) 17 4) Median = 50th percentile = 16 7A) 14 7B) 21 7E) 29 Do other question 7 problems by hand.
Practice Set 4 answers: 1C) 30.6 1D) 5.5 Do other problems by hand.
Saving and Printing Results is easy. Saving output stored in a Stata Results window requires creating a log file by choosing Log. Type stata006 in the File name box and choose OK. StataQuest will add the extension .log and stata006.log will be stored on the A drive. This file can be used by choosing Log, highlighting the desired log file, and choosing OK. Choose the default Append to existing file and choose OK. To print an open log file, choose File and Print Log file.
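For readers who want to check the mean, standard deviation and median by another route, here is a minimal Python sketch using the standard statistics module. The cd_sales values are invented for the example and are not the page 6 data. The sketch assumes the sample standard deviation (dividing by n - 1) is the one wanted.

```python
import statistics

# Hypothetical data, NOT the CD Sales data from page 6.
cd_sales = [5, 10, 10, 15, 15, 15, 20, 30]

mean = statistics.mean(cd_sales)
sd = statistics.stdev(cd_sales)        # sample SD, divides by n - 1
median = statistics.median(cd_sales)   # the 50th percentile

print(f"mean = {mean}, SD = {sd:.2f}, median = {median}")
```

If a population standard deviation is needed instead, statistics.pstdev divides by n.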
Chapters 5 and 6 Measuring Central Tendency and Dispersion of Grouped Data Chapters 5 and 6 problems are very similar to those of chapters 3 and 4. These StataQuest Practice Set Instructions and Answers will explore only the ungrouped practice sets of chapters 3 and 4. 
Part II Practice Sets on Probability, The Basis for Inferential Statistics

Do these problems by hand. 
Chapter 9 Discrete Probability Distributions
Problem I message: These problems should be done by hand.
Problem II, III, and IV message: The StataQuest calculator will be used to solve these problems. All StataQuest users should select Calculator, Statistical tables, and Binomial. Set No. of trials to 5, Prob. of success to .6, No. of successes to 3, and choose Run. The answer will appear in the Stata Results window. To determine the entire distribution repeat for successes of 0, 1, 2, 4, and 5.
Problem V message: The StataQuest calculator will be used to solve this problem. All StataQuest users should select Calculator, Statistical tables, and Poisson. Set lambda (the mean) to 1, the count to 0, and choose Run. To determine the entire distribution repeat for counts of 1 to 7.
Problem VI message: The StataQuest calculator will be used to solve this problem. All StataQuest users should select Calculator, Statistical tables, and Poisson. Set lambda (the mean) to 3 (1,500 x .002), the count to 0, and choose Run. Answer the next problem by setting the count to 2. See page PS 55 of Quick for the answer to the last problem.
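The binomial and Poisson settings above can be checked with the textbook probability formulas. The Python sketch below is an independent illustration, not a description of how the StataQuest calculator works internally. The function names are my own.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(exactly k successes in n trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(a count of exactly k when the mean is lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Settings taken from the instructions above.
print(f"Binomial n=5, p=.6, k=3: {binomial_pmf(3, 5, 0.6):.4f}")
print(f"Poisson lambda=1, k=0:   {poisson_pmf(0, 1):.4f}")
print(f"Poisson lambda=3, k=0:   {poisson_pmf(0, 3):.4f}")
print(f"Poisson lambda=3, k=2:   {poisson_pmf(2, 3):.4f}")

# The entire binomial distribution, one line per number of successes.
for k in range(6):
    print(k, f"{binomial_pmf(k, 5, 0.6):.4f}")
```

Looping over the counts replaces the repeated Run steps when the whole distribution is wanted.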
Chapter 10 Continuous Normal Probability Distributions
Problem I message: Calculate z by hand. Choose Calculator, Statistical Tables, and Normal. Insert z and choose Run. The probability to z and from z is given.
Problem II message: Determine the appropriate probability. Choose Calculator, Inverse Statistical Tables, and Normal. Insert the probability and choose Run. Use the resulting z to solve the problem.
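Both directions of the normal table lookup (z to probability, and probability back to z) can be sketched with Python's standard library. This is an illustration only. The helper names are mine, and the z of 1.5 is an arbitrary example value rather than an answer to any practice problem. Whether StataQuest reports the area below z, above z, or both is not something this sketch tries to reproduce.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_below(z):
    """Area under the standard normal curve to the left of z."""
    return std_normal.cdf(z)

def z_for_area(prob):
    """The z value with the given area to its left (inverse table)."""
    return std_normal.inv_cdf(prob)

z = 1.5  # arbitrary example value
print(f"area below z = {area_below(z):.4f}")
print(f"area above z = {1 - area_below(z):.4f}")
print(f"z with .975 below it = {z_for_area(0.975):.2f}")
```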
Chapter 11 Sampling and the Sampling Distribution of the Means
Problem I message: You can find a confidence interval for the mean of a sampling distribution by using the actual data or the StataQuest calculator. Those using Quick's data files should load stata068 by choosing File and Open. Use the scroll bar next to Drives to locate and click on a:. Double click on stata068. Others must create a data file for the variable part size. You may want to review creating a data file on page 1 of Quick Start for StataQuest located at http://businessbookmall.com. Create a data file using 36 rows and 1 column. I named my variable Weight. To name a variable, double click on Var1. When saving my data file, I named it stata068 for the software being used and the problem page number. All StataQuest users should select Summaries and Confidence intervals. Highlight Weight. Set the Confidence level to 99. Quick's answer differs because of rounding. Note: This problem can also be done with the StataQuest calculator. Choose Calculator and 1-sample normal test. Fill in the appropriate data using .065 for the population standard deviation, 30.025 for the sample mean, and 30.0 for the Hypothesized mean. Choose Run.
Problem II message: Redo problem I with a 95% confidence interval.
Problem III message: Answer this problem by hand.
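The interval in Problem I can be approximated from the summary figures quoted in the calculator note (sample mean 30.025, population standard deviation .065, 36 cases). The Python sketch below uses the large-sample z formula, mean plus or minus z times sigma over the square root of n. That choice is an assumption on my part: an interval computed from the raw data may use t and the sample standard deviation, so small differences are to be expected. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, level):
    """Two-sided z interval for a mean when sigma is treated as known."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Figures quoted in the instructions above.
low, high = mean_confidence_interval(30.025, 0.065, 36, 0.99)
print(f"99% interval: {low:.4f} to {high:.4f}")

# Problem II: the same data at the 95% level.
low95, high95 = mean_confidence_interval(30.025, 0.065, 36, 0.95)
print(f"95% interval: {low95:.4f} to {high95:.4f}")
```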
Chapter 12 Sampling Distributions Part II
Problem I message: You can find a range for the proportion of a sampling distribution by using the actual data or the StataQuest calculator. Those using Quick's data files should load stata072 by choosing File and Open. Use the scroll bar next to Drives to locate and click on a:. Double click on stata072. Others must create a data file for page 72 data. Enter a 1 for samples that passed and a 0 for those that failed. To name a variable, double click on Var1. I named this variable Inspect. All StataQuest users should select Summaries and Confidence intervals. Highlight Inspect, set the Confidence level to 95, and choose OK. Note: This problem can also be done with the StataQuest calculator. Choose Calculator and 1-sample test of the proportion. Fill in the required data using .9 for the Prob. of success. Choose Normal approximation and Run. Quick's answer differs because of rounding.
Problem II, III, and IV message: These problems dealing with determining the appropriate sample size should be done by hand.

Chapter 13
Problem I message: Data for this 1-sample test of a mean was first presented on page 68. StataQuest assumes a small sample and calls the calculated z for this test statistic a t value. Those using Quick's data files should load stata068. Others should load their page 68 data file. All StataQuest users should choose Statistics, Parametric tests, and 1-sample t test. Highlight Weight and set the Hypoth. mean to 30 mg. Choose OK. The t-value of 2.32 is not beyond 2.33 (it is less than 2.33) and H_{0} is accepted. Parts are not too heavy. Note: I use the term beyond to indicate further from H_{0}, the hypothesized population parameter.
Problem II message: If a 1-tail test is accepted, a 2-tail test must be accepted. Here, 2.32 < 2.58.
Problem III message: Redo problem II with the new level of significance. Reject H_{0} because 2.32 > 1.96. Parts are not too heavy. Note: These problems can be done with the StataQuest calculator. Choose Calculator and 1-sample normal test. Fill in the required data using .065 for the population standard deviation. Choose Run.
Chapter 14 Large Sample Hypothesis Testing Part II
Problem I message: This problem concerns the difference between 2 independent means. Those using Quick's data files should load stata090. Others should create a data file with 70 rows and 2 columns (variables). In the first data variable column entitled Days, enter the data for Supplier A and then the data for Supplier B. In the second group variable column entitled Supplier, enter a 0 next to Supplier A's data and a 1 next to Supplier B's data. All StataQuest users should choose Statistics, Parametric tests, and 2-sample t test. Check 1 data variable, 1 group variable under Options, and choose OK. Highlight Days for the Data variable and Supplier for the Group variable. Check unequal variances. The test statistic t (z) of -1.32 is not beyond -1.96 (it is greater than -1.96), the critical value of t (z), so accept H_{0}. Delivery times are the same. Note: This problem can also be done with the StataQuest calculator. Choose Calculator and 2-sample normal test. Assume the sample standard deviation equals the population standard deviation.
Problem II message: StataQuest calculates z and p simultaneously. Redo the chapter 13 test for p. Those using StataQuest's data files should load stata068. All StataQuest users should choose Statistics, Parametric tests, and 1-sample t test. Highlight Weight and set the Hypothesized mean to 30 mg. StataQuest gives a two-tail p-value using t of .0261. H_{0} is accepted because .0261/2 = .013 > .01. Parts are not too heavy. Quick Notes calculates p = .0103 using z and also accepts H_{0}.
Problem III message: Answer this problem by hand. See page PS 91 for the answer.
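Several answers in these instructions turn a two-tail p-value into a one-tail p-value by halving it and then comparing the result with the level of significance. The rule can be written out as the small Python sketch below. The function name is my own, and the halving is only appropriate when the sample result lies on the side of H_{0} that the alternative hypothesis points to, which the sketch assumes rather than checks.

```python
def one_tail_decision(two_tail_p, alpha):
    """Halve a two-tail p-value and compare it with alpha.

    Assumes the sample result falls in the direction of the
    alternative hypothesis; otherwise halving is not valid.
    """
    one_tail_p = two_tail_p / 2
    decision = "reject H0" if one_tail_p < alpha else "accept H0"
    return one_tail_p, decision

# The two-tail p-value of .0261 from the problem above, tested at .01.
p, decision = one_tail_decision(0.0261, 0.01)
print(f"one-tail p = {p:.4f}: {decision}")
```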
Chapter 15 Hypothesis Testing of Population Proportions
Problem I message: This is a 1-sample test of the proportion. Those using Quick's data files should load stata096. Others should create a data file using a 1 for those that passed and a zero for those that failed. I labeled my variable Inspect. All StataQuest users should choose Statistics, Parametric tests, and 1-sample test of proportion. Check Normal approximation, highlight passed, set Probability of success to .86, and choose OK. The p value of .4150 is for two tails. P for one tail is .2075. It is > .01 and H_{0} is accepted. The proportion of parts passing inspection has not increased.
Problem II message: This is a 2-sample proportions test similar to those explored in chapter 14. Those using Quick's data files should load stata096. Others should add to their page 96 data file with 200 rows and 2 columns (variables). In the first column for the data entitled Defects, enter the defects data for the day shift and then the night shift by entering a 0 for passed and a 1 for failed. In the group variable column entitled Shift, enter a 0 next to day's data and a 1 next to night's data. All StataQuest users should choose Statistics, Parametric tests, and 2-sample test of proportion. Check 1 data variable, 1 group variable under Options and choose OK. Highlight Defects for the Data variable and Shift for the Group variable. P of .03 is for two tails. Because .03 > .01, accept H_{0}. Shift defects are the same. Note: This problem can also be done with the StataQuest calculator.
Chapter 16 Small Sample Hypothesis Testing Using Student's t Test
Problem I message: This is an analysis of 2 independent means. It is similar to problems done earlier. Those using Quick's data files should load stata100. Others should create a data file containing both a data variable and a group variable using procedures discussed earlier. All StataQuest users should choose Statistics, Parametric tests, and 2-sample t test. Check 1 data variable, 1 group variable under Options, and choose OK. Highlight SickDays for the Data variable and Group for the Group variable. Choose Unequal variances. Choose OK. Reject H_{0} because p = .0005 < .01. Sick days are not the same. Note: This problem can also be done with the StataQuest calculator with a 2-sample t test.
Problem II message: Here, a paired difference test is used to determine if 2 variables are dependent. Those using Quick's data files should load stata100. Others should add 2 variables to their page 100 data file, one for the efficiency rating before training and the other for the efficiency rating after training. Be sure to match each employee with their before and after efficiency rating. All StataQuest users should choose Statistics, Parametric tests, and Paired t test. Highlight Before for Data variable #1 and After for Data variable #2. Choose OK. Reject H_{0} because p = .0028/2 = .0014 < .01. Training increased efficiency.
Chapter 17 Statistical Quality Control
Problem I message: This problem concerns the 30 milligram parts. Those using Quick's data files should load stata104. Others should create a data file with four variables. In the first column (variable) entitled Sample enter the numbers 1 to 12. Columns 2-4, which I labeled m1, m2, and m3, should contain the 3 cases for each sample. All StataQuest users should choose Graphs and Quality control. Choose X-bar chart and highlight m1, m2, and m3. Choose Draw. For a range chart, repeat the above process choosing Range (R) chart. See page PS 104 of Quick Notes Statistics for the answer.
Problem II message: This problem is similar to problem I. See page PS 104 for the answer. Those using Quick's data files should load stata105. Others should create a data file with three variables. In the first column (variable) entitled Sample, enter the numbers 1 to 11. Column 2, entitled Defects, should contain the number of defects for each sample. Column 3 should contain the sample size of 50. All StataQuest users should choose Graphs, Quality Control, and Fraction defective (P) chart. Using the scroll bar, load Defects into Variable for number defective, Sample into Identification variable, and Tested into Sample-size variable. Choose Draw. See page PS 105 for the answer.
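The center line and control limits of a fraction defective (P) chart like the one in Problem II can be worked out by hand with the usual three-sigma formula, p-bar plus or minus 3 times the square root of p-bar(1 - p-bar)/n. The Python sketch below shows that arithmetic. It is an assumption that StataQuest draws its limits the same way, the function name is mine, and the defect counts are invented for the example; only the 11 samples of size 50 come from the instructions.

```python
from math import sqrt

def p_chart_limits(defects, sample_size):
    """Return (lower limit, center line, upper limit) for a P chart."""
    p_bar = sum(defects) / (len(defects) * sample_size)
    sigma = sqrt(p_bar * (1 - p_bar) / sample_size)
    lower = max(0.0, p_bar - 3 * sigma)   # a proportion cannot go below 0
    upper = min(1.0, p_bar + 3 * sigma)   # or above 1
    return lower, p_bar, upper

# Hypothetical defect counts for 11 samples of 50, NOT the page 105 data.
defects = [2, 3, 1, 4, 2, 3, 5, 2, 1, 3, 2]

lower, center, upper = p_chart_limits(defects, 50)
print(f"LCL = {lower:.4f}, center = {center:.4f}, UCL = {upper:.4f}")
```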
Chapter 18 Analysis of Variance
Problem I message: This problem concerns the difference between 2 variances. Those using Quick's data files should load stata110. Others should create a data file with 30 rows and 2 columns (variables). In the first data variable column entitled Variance, enter the data for sample 1 and then the data for sample 2. The second column is used for the group variable. I named it Sample and entered a 1 next to sample one's data and a 2 next to sample two's data. All StataQuest users should choose Statistics, Parametric tests, and 2-sample test of variance. Check 1 data variable, 1 group variable under Options, and choose OK. Highlight Variance as the Data variable and Sample as the Group variable. Accept H_{0} because the test statistic of F = 1.17 < 3.82. Variance did not increase. Note: This problem can also be done with the StataQuest calculator.
Problem II message: This is a one-way analysis of variance (ANOVA). Those using Quick's data files should load stata111. Others should create a 2-variable data set for the page 110 data. Label the data variable Weight and the group variable Dept. All StataQuest users should select Statistics, ANOVA, and One-way. Check 1 data variable, 1 group variable under Options, and choose OK. Move Weight into the Data variable and Dept into Group variable. Choose OK. Reject H_{0} because p = .0021 < .05. These populations have different means.
Chapter 19 Two-Factor Analysis of Variance
Problem I message: This is a two-way analysis of variance (ANOVA). Those using Quick's data files should load stata116. Others should create a 3-variable file containing data on weight, department (dept), and time. When entering time, be sure to match each time with the appropriate department and weight. All StataQuest users should select Statistics, ANOVA, and Two-way. Move Weight into Dependent variable, Dept into Cat. var. #1, and Time into Cat. var. #2. Check Include interaction and choose OK. StataQuest provides the data needed to calculate F. See page PS 117 for the answer.
Problem II message: Do this problem by hand. See page PS 117 for the answer.
Chapter 20 Nonparametric Hypothesis Testing of Nominal Data
Problem I message: Do this nonparametric test using Chi-Square by hand. See page PS 122 for the answer.
Problem II message: This problem emphasizes the concept of statistical independence. Those using Quick's data files should load stata123. Others should create a 2-variable data file. The first variable entitled SResult contains 24 ones for making a sale to young people, 16 zeros for not making a sale to young people, 12 ones for making a sale to older people, and 8 zeros for not making a sale to older people. The second variable named Age should contain 40 zeros for younger people and 20 ones for older people. All StataQuest users should select Summaries, Tables, and Two-way. Copy SResult into Row variable, Age into Column variable, and choose OK. Chi square is zero and H_{0} is accepted. Making a sale and age are independent. Because the buying percentage of each part of the population (60%) exactly equals the buying percentage of the entire population (60%), chi square is zero.
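The zero chi square in Problem II can be confirmed from the counts given above (24 and 16 for younger people, 12 and 8 for older people). The Python sketch below computes the usual chi-square statistic for a two-way table, the sum of (observed - expected) squared over expected, where each expected count is row total times column total over the grand total. It is an illustration of the arithmetic only, and the function name is my own.

```python
def chi_square_independence(table):
    """Chi-square statistic for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    return chi2

# Rows: no sale (0), sale (1). Columns: younger (0), older (1).
sales_by_age = [[16, 8],
                [24, 12]]

print(f"chi square = {chi_square_independence(sales_by_age):.4f}")
```

Every expected count equals its observed count here because 60% of each age group bought, so every term in the sum is zero.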
Chapter 21 Nonparametric Hypothesis Testing of Ordinal Data Part I
Problem I message: Do this run test for randomness by hand. See page PS 128 for the answer.
Problem II message: This is a sign test for the difference between two medians. Those using Quick's data files should load stata128. Others should create a data file with 2 variables. The first should contain current median defects and the second should be last year's median of 5 defects. All StataQuest users should select Statistics, Nonparametric tests, and Sign test. Load Current into Data variable #1 and Previous into Data variable #2. Choose OK. The probability that n is greater than or equal to 5 is .1094. This is greater than .05, and H_{0} is accepted. Median defects have not increased.
Problem III message: This is a two-sample medians test. Those using Quick's data files should load stata129. Others should create a data file with 2 variables. The first should contain the sick days taken by graduates and then nongraduates. The second should contain eleven 0's for graduates and twelve 1's for nongraduates. All StataQuest users should select Statistics, Nonparametric tests, and Mann-Whitney. Load SickDays into the Data variable box, Group into the Group variable box, and choose OK. P of .0012 is less than .01 and H_{0} is rejected. There is a difference in median sick days taken by graduates and nongraduates.
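The sign test probability in Problem II is an upper-tail binomial probability with a success chance of .5. The Python sketch below shows the calculation. The number of non-tied pairs is not stated in these instructions, so the value of 6 used here is an assumption, chosen only because it gives the .1094 reported above. The function name is hypothetical.

```python
from math import comb

def sign_test_upper_tail(successes, n):
    """P(X >= successes) when X is Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(successes, n + 1)) / 2**n

# Assumption: 6 non-tied pairs, 5 or more of them positive.
p = sign_test_upper_tail(5, 6)
print(f"P(n >= 5) = {p:.4f}")
```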
Chapter 22 Nonparametric Hypothesis Testing of Ordinal Data Part II
Problem I message: This is a paired sample (difference) test of the median. Those using Quick's data files should load stata133. Others should create a data file with 2 variables. The first should contain employee efficiency ratings before training and the second should contain their rating after training. Be sure to match ratings with employees. All StataQuest users should choose Statistics, Nonparametric tests, and Sign test. Load Before into Data variable #1, After into Data variable #2, and choose OK. The probability that n is greater than or equal to 5 is .0625. Accept H_{0} because .0625 > .01. Efficiency did not increase. This sample is too small to ever reject H_{0}.
Problem II message: This is a Kruskal-Wallis test of several medians. Those using Quick's data files should load stata133. Others should add 2 variables to their page 133 data file: one for weights listed by departments and a second for departments 1-3. All StataQuest users should select Statistics, Nonparametric tests, and Kruskal-Wallis. Load Weight into the Data variable box and Dept into the Group variable box. Choose OK. H_{0} is rejected because p = .0056 < .01. These medians are not equal.

Chapter 23 Correlation
Problem I message: This is a correlation study. Those using Quick's data files should load stata148. Others should create a data file with 2 variables. The first should contain employee ages and the second employee sales. All StataQuest users should select Statistics, Correlation, and Pearson (regular). Load Sales and Age into the Data variable box. Choose OK. The coefficient of correlation is .9076. Do other problems by hand.
Chapter 24
Problem I message: This is a regression study. All StataQuest users should load stata148. Select Statistics and Simple regression. Load Sales into the Dependent variable box and Age into the Independent variable box. Choose OK. a = 55.94 and b = 1.07. Do other problems by hand.
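The intercept a, slope b and correlation coefficient r reported by a simple regression of Sales on Age follow from the standard least-squares formulas, which can be sketched in Python as below. This is an illustration of the arithmetic rather than StataQuest's own routine. The function name is mine and the age and sales figures are invented; they are not the page 148 data, so the printed values will not match the answers above.

```python
from math import sqrt

def simple_regression(x, y):
    """Return (a, b, r) for the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    r = sxy / sqrt(sxx * syy)  # Pearson coefficient of correlation
    return a, b, r

# Hypothetical data, NOT the page 148 data.
age = [25, 30, 35, 40, 45]
sales = [80, 88, 90, 101, 103]

a, b, r = simple_regression(age, sales)
print(f"a = {a:.2f}, b = {b:.2f}, r = {r:.4f}, r squared = {r * r:.4f}")
```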
StataQuest Quick Questions Answers

 
Chapter 1  See page QQ 3. 
Chapter 2  1) See page QQ 8. 2A) 6, 14, 17, 23, 26, 27, 31, 33, 34, 37, 38, 38, 44, 46, 48, 54
Other answers on page QQ 8. 
Chapter 3  1) See page QQ 14. 2A) 7 2B) 7.5 6A) 5.5 6B) 8.5 Other answers on page QQ 14. 
Chapter 4  2A) 5.14 2B) 2.27 See page QQ 20 for other answers. 
Chapter 5, 6  See page QQ 26 and 27 and QQ 32 and 33 for these answers. 
 
Chapter 7  See page QQ 44 and 45 for these answers. 
Chapter 8  See page QQ 50 and 51 for these answers. 
Chapter 9  1 and 2) See page QQ 56 for these answers. 3A) .0214 3B) 0 is .7738, 1 is .2036, 2 is .0214, 3 is .0011, 4 is .0000, and 5 is .0000 4A) See page QQ 56. 4B) .1353 4C) .2707 4D) .1804+.0902+.0361+.0120+.0034+.0009+.0002 = .3233 4E) .6767 
Chapter 10  See page QQ 64 for these answers. 
Chapter 11  1 and 2) See page QQ 69. 3A) 48.91 to 61.09 3B) 46.83 to 63.17 
Chapter 12  1) See page QQ 74. 2A) .6214 to .8786 2B) .6530 to .8470 
Part III Inferential Statistics  
Chapter 13 
1) See page QQ 87 and 92. 2A) t = 5, reject H_{0}, bulbs last < 20,000 hours 2B) t = -2.5, reject H_{0}, weekly earnings changed 
Chapter 14 
1) See page QQ 87 and 92. 2) t = 2.63, reject H_{0}, sales time is not the same 3) p = .1185/2 = .059 > .05, accept H_{0}, tires last > or = to 70,000 miles 4) p = .14/2 = .07 > .05, accept H_{0}, loan length did not increase 5) See page QQ 92 and 93. 
Chapter 15 
See page QQ 97. 2) p = .2207/2 = .11 > .05, accept H_{0}, P is not < the national average
3) p = .0148 < .05, reject H_{0} , defined rentals differ at these stores 
Chapter 16 
See QQ 101. 2) t = 1.73 < 3.143, accept H_{0}, sick days did not decrease
3) p = .3692/2 = .1846 > .05, accept H_{0}, women did not do better than men 
Chapter 17  See page QQ 106. 
Chapter 18  3) p = .0109 < .05, reject H_{0} , means are not equal. Other answers on pages QQ 112,113. 
Chapter 19  2B) Use mean square data provided by StataQuest to calculate F. See page QQ 118 for other answers. 
Chapter 20  4) p = .0000 < .01, reject H_{0}, course and grades are dependent. See page QQ 124 for other answers. 
Chapter 21 
See page QQ 130. 2) p = .2188 > .05, accept H_{0}, median customers did not change
3) z = .855 which is not beyond 1.96, accept H_{0}, median delivery times are the same 
Chapter 22 
1) p = .2188 > .10, accept H_{0}, absenteeism did not change
2) Chi-square = 11.795 > 9.21 and p = .0027 < .10, reject H_{0}, median grades are not equal 
 
Chapter 23  4A) r = .8893 4B) .7908 See page QQ 150 and 151 for other answers. 
Chapter 24 
3) a = 1.81 and b = .28 See page QQ 150 and 151 for other answers.
Answers to these problems begin on page T 35 near the end of Quick Notes Statistics 
1-4) See page T 35.
5) A bin width of 4 results in a Histogram similar to the one on pages T 36-37. Do other graphs by hand. 6) Mean = 7, SD = 4.34, Variance = 18.86, Median = 6 (see 50th percentile), and Skewness (not Pearson's) = 1.68. Answer other questions by hand. 
Exam 2 Answers Using StataQuest 
1-10) Do by hand.
11A) .7351 11B) .0328 11C) See page ST 1. 11D) .0305 12A) .0916 12B) See pages T 80-81, which were drawn from page ST 2. 13 and 14) Do by hand. 15) Using the StataQuest calculator, the range is 6.914 to 8.286. 16A) .5934 to .8066 16B) Do by hand.
Exam 3 Answers Using StataQuest 
1) t = 1.14 and p = .2608/2 = .1304 > .01, accept the Null Hypothesis, the mean is greater than or equal to $8.00.
2) t = 1.77 and p = .0771/2 = .0386 < .05, reject the Null Hypothesis, the mean is not greater than or equal to 80%. Too bad, no chocolate flavored shaving cream. 3) Chi square of 8.4 and p = .015 < .05, reject the Null Hypothesis, supplier and defects are dependent 4) t = 4.24 and p = .0240/2 = .0120 < .05, reject the Null Hypothesis, sales performance improved 5) t = 2.97 and p = .0039 < .05, reject the Null Hypothesis, mean service time at these stores is not the same 6) t = -4.0 and p = .0012/2 = .0006 < .01, reject the Null Hypothesis, assembly time has gone down 7) F = 1.58 and p = .5044 > .10, accept the Null Hypothesis, shopping time variability has not increased 8) This problem should be done by hand. 9A) F = 5.77 and p = .0244 < .05, reject the Null Hypothesis, the mean time to assemble computer parts is not equal for these three departments 9B) Do by hand. 10) z = 1.545 and p = .1224 > .01, accept the Null Hypothesis, median assembly times are equal 11) Chi square = 11.23 and p = .0036 < .01, reject the Null Hypothesis, median assembly times for these three methods are not equal. 12) The data set for this problem must have its own file. Therefore, this problem must be done by hand. 13) p = .2188 > .10, accept the Null Hypothesis, brands are viewed equally by consumers 
Exam 4 Answers Using StataQuest 
1 and 2) Do by hand.
3A) r = .8852 3B) r squared = .7836 3C) Do by hand. 4) Do by hand. 5) The StataQuest fitted model graph is similar to the graph on page T 160. 6A) a = 13.51 and b = 6.54 6C) The StataQuest fitted model graph is similar to the graph on page T 160. Do other questions by hand. 