ST131: Introduction to Statistics Assignment, Semester 1, 2022
Due Date: Monday 6th June, 2022, 11.59pm (Fiji time)
Total Marks: 50
Weight: 10%
1. All questions are compulsory.
2. Please complete this assignment in a group of 3 students.
3. Use MS Excel to complete this assignment; that is, all calculations should be done using MS Excel functions.
4. Write your group members' information (name, ID, signature, etc.) on a new sheet and rename it Group Information.
5. Complete each question on a separate sheet, and rename the sheets Q1 for Question 1, Q2 for Question 2, and so on.
6. One member from each group should upload the group's assignment solution (MS Excel file) via the drop box created on Moodle.
7. Plagiarized assignments will be given a mark of 0 (zero) and will be reported for disciplinary action.
Access the Fiji Farm Survey data set from Moodle and use it for all questions.
The Fiji Farm Survey data set is taken from a survey of sugar cane farmers in Fiji conducted by a research team from the University of Queensland in 2005.
Q1. (17 marks)
a. Classify the variables Age, No. of children, Farming status and annual profit from farming. (2 marks)
b. Construct a grouped frequency distribution for the variable Age using 7 classes. (4 marks)
c. Produce a histogram for the grouped frequency distribution in part b. Give the chart an appropriate title and axis labels. (3 marks)
d. Discuss the histogram result. Is there a serious problem in the sugar industry regarding farmers' age? If yes, recommend a suitable way to minimize this problem. (2 marks)
e. Produce an ogive for the grouped frequency distribution in part b. Give the chart an appropriate title and axis labels.
f. Construct a pie chart for the variable Farming status. Note any observations. (3 marks)
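One common way to set up the seven classes in part b is sketched below in Python, purely as a cross-check on the Excel work. It assumes ages are whole numbers and uses inclusive class limits. The `ages` list is made-up illustrative data, not the survey, and `grouped_frequency` is a hypothetical helper name. The cumulative column is the quantity an ogive plots.

```python
import math

def grouped_frequency(values, k=7):
    """Return (lower, upper, frequency, cumulative frequency) for k classes.

    Assumes integer data and inclusive class limits.
    """
    lo, hi = min(values), max(values)
    # Width chosen so that k classes starting at the minimum cover the maximum.
    width = math.ceil((hi - lo + 1) / k)
    table = []
    cumulative = 0
    for i in range(k):
        lower = lo + i * width
        upper = lower + width - 1
        count = sum(lower <= v <= upper for v in values)
        cumulative += count
        table.append((lower, upper, count, cumulative))
    return table

# Hypothetical ages for illustration only (not the Fiji Farm Survey data).
ages = [20, 25, 31, 38, 42, 45, 47, 51, 53, 58, 60, 64, 69, 76]
table = grouped_frequency(ages)
for lower, upper, count, cumulative in table:
    print(f"{lower}-{upper}: frequency={count}, cumulative={cumulative}")
```

Other class-width conventions (for example, rounding to a convenient multiple of 5) are equally acceptable as long as the classes do not overlap and cover every observation.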
Q2. (16 marks)
Create a new column titled “Level of education” using the following criteria:
• If Education(years) is less than or equal to 8, label the farmer as primary.
• If Education(years) is greater than 8 and less than or equal to 13, label the farmer as secondary.
• If Education(years) is greater than 13, label the farmer as tertiary.
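The three criteria above can be sketched as a small Python function, offered only as a way to check the labels produced in Excel. The function name `education_level` is hypothetical; the cut-offs 8 and 13 are taken directly from the criteria.

```python
def education_level(years):
    """Map years of education to the label defined by the criteria above."""
    if years <= 8:
        return "primary"
    if years <= 13:
        return "secondary"
    return "tertiary"

# Boundary values are the ones most worth checking.
for years in (0, 8, 9, 13, 14):
    print(years, education_level(years))
```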
Now create a new variable ‘cane output per acre’ in a new column (cane output per acre = cane output / cultivated area). (1 mark)
a. Calculate the descriptive statistics for cane output per acre using MS Excel, separately for farmers with primary education and for farmers with secondary education. (Hint: ignore non-numeric cells when computing.) Interpret the following statistics (mean, median, mode, standard deviation) obtained for farmers with primary education. (7 marks)
b. What conclusion can be reached from comparing the descriptive statistics of cane output per acre for farmers with primary education and for farmers with secondary education? Justify why there is or is not a significant difference in output. (2 marks)
c. Construct box plots of cane output per acre for farmers with primary education and for farmers with secondary education. Interpret the box plots. (4 marks)
Q3. (14 marks)
Construct a contingency table and a relative contingency table (using the PivotTable tool in Excel) with Farming status in the rows and Land Owned in the columns. (4 marks)
a. What is the probability that a randomly selected farmer does not own the land? (1 mark)
b. What is the probability that a randomly selected farmer is working full time and does not own the land?
c. What is the probability that a randomly selected farmer is working full time or does not own the land?
d. What is the probability that a randomly selected farmer does not own the land, given that the farming status is full time? (2 marks)
e. Are the events “does not own the land” and “farming status is full time” independent? (2 marks)
f. What can you conclude from the above analysis with regard to land ownership in the sugar industry? (2 marks)
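The probability rules behind the Q3 parts (marginal, joint, union, conditional, and the independence check) can be sketched in Python as follows. The counts are invented for illustration and are not taken from the survey; the category labels and the helper `prob` are likewise hypothetical. Exact fractions are used so the independence comparison is not affected by rounding.

```python
from fractions import Fraction

# Hypothetical contingency counts: (farming status, land ownership) -> farmers.
counts = {
    ("full time", "owns"): 40,
    ("full time", "does not own"): 20,
    ("part time", "owns"): 25,
    ("part time", "does not own"): 15,
}
total = sum(counts.values())

def prob(predicate):
    """Probability of the event described by predicate(status, ownership)."""
    favourable = sum(n for (s, o), n in counts.items() if predicate(s, o))
    return Fraction(favourable, total)

p_not_own = prob(lambda s, o: o == "does not own")            # marginal
p_full = prob(lambda s, o: s == "full time")                  # marginal
p_both = prob(lambda s, o: s == "full time" and o == "does not own")  # joint
p_either = p_full + p_not_own - p_both                        # addition rule
p_not_own_given_full = p_both / p_full                        # conditional
independent = p_both == p_full * p_not_own                    # multiplication rule

print(p_not_own, p_both, p_either, p_not_own_given_full, independent)
```

With real data the same quantities can be read straight off the relative contingency table; the events are independent only if the joint probability equals the product of the two marginal probabilities.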
Q4. (3 marks)
Assume that the annual profit from farming is approximately normally distributed with a mean of $1940 and a standard deviation of $1700. An individual earning less than $2000 is considered to be in poverty. Calculate the proportion of sugar cane farmers in Fiji who are in poverty.
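The calculation amounts to standardizing the poverty threshold, z = (x - mean) / standard deviation, and reading off the cumulative normal probability P(X < x). A minimal Python sketch of that step is given below as a cross-check on the Excel result; it uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later), and the variable names are illustrative.

```python
from statistics import NormalDist

mean_profit = 1940      # mean annual profit, from the question
sd_profit = 1700        # standard deviation, from the question
poverty_line = 2000     # threshold below which a farmer is counted as in poverty

# Standardize the threshold, then take the cumulative probability below it.
z = (poverty_line - mean_profit) / sd_profit
proportion_in_poverty = NormalDist(mu=mean_profit, sigma=sd_profit).cdf(poverty_line)

# The same probability via the standard normal distribution and the z-score.
proportion_via_z = NormalDist().cdf(z)

print(round(z, 4), round(proportion_in_poverty, 4))
```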