In this homework, you will do some data analysis in R using the Forest Fires dataset described at https://archive.ics.uci.edu/ml/datasets/forest+fires. The dataset is used to study the relationship between the burned area of forest fires and meteorological data.
Please submit your output in HTML format only. Do not send the .Rmd file.
I have already downloaded the forest fires data and added it to the files section.
https://classroom.ucsc-extension.edu/files/1144259/download?download_frd=1
- Import the data into R.
- How many observations are there in the dataset?
- How many observations are there with a fire (i.e., area > 0)?
- How many observations are there with rain (i.e., rain > 0)?
- How many observations are there with both a fire and rain?
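As a rough illustration of the import-and-count steps above, here is a minimal sketch in base R. It does not use the real file: the temporary CSV, its four rows, and the names csv_path, fires, n_obs, n_fire, n_rain, and n_both are all assumptions made up for this example. Only the column names area and rain come from the questions above.

```r
# Hypothetical stand-in for the downloaded file: a tiny CSV written to a temp path.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "month,day,area,rain",
  "mar,fri,0,0",
  "aug,sun,6.4,0.2",
  "sep,tue,12.1,0",
  "aug,sat,0,1.0"
), csv_path)

# Import the data into a data frame.
fires <- read.csv(csv_path)

# Count rows, then count rows matching each logical condition.
# sum() on a logical vector counts the TRUE values.
n_obs  <- nrow(fires)
n_fire <- sum(fires$area > 0)
n_rain <- sum(fires$rain > 0)
n_both <- sum(fires$area > 0 & fires$rain > 0)

print(c(n_obs = n_obs, n_fire = n_fire, n_rain = n_rain, n_both = n_both))
```

With the real data, csv_path would instead point to wherever you saved the downloaded file.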
2. Show the columns month, day, and area for all observations.
3. Show the columns month, day, and area for the observations with a fire.
4. How large are the five largest fires (i.e., those with the largest area)?
a. What are the corresponding month, temp, RH, wind, rain, and area values?
b. Add one column to the data indicating whether a fire occurred for each observation (TRUE for area > 0 and FALSE for area == 0). Use the mutate() function.
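The column selection, row filtering, sorting, and mutate() steps in questions 2 to 4 could be sketched with dplyr roughly as follows. The five-row data frame and the names fires, all_cols, with_fire, largest, flagged, and fire are assumptions for illustration only, and the sketch keeps the two largest rows rather than five because the toy data is so small.

```r
library(dplyr)

# Hypothetical toy data; the real dataset has more rows and columns.
fires <- data.frame(
  month = c("mar", "aug", "sep", "aug", "jul"),
  day   = c("fri", "sun", "tue", "sat", "mon"),
  temp  = c(8.2, 22.4, 19.1, 24.6, 21.0),
  area  = c(0, 6.4, 12.1, 0, 3.3)
)

# Keep only selected columns for every observation.
all_cols <- fires %>% select(month, day, area)

# Keep only rows with a fire, then the same columns.
with_fire <- fires %>% filter(area > 0) %>% select(month, day, area)

# Sort by area, largest first, and keep the top rows (2 here; use 5 on real data).
largest <- fires %>% arrange(desc(area)) %>% head(2)

# Add a logical column: TRUE when area > 0, FALSE otherwise.
flagged <- fires %>% mutate(fire = area > 0)

print(largest)
print(flagged)
```

Sorting with arrange(desc(area)) and taking head() is one option among several; other dplyr helpers can achieve the same result.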
5. Create the following to display the outliers in the vector below:
- plot
- boxplot
Also list the values that are outliers in this vector.
(1,2,50,45,67,200,230,55,56,49)
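One possible way to approach this kind of outlier check in base R is sketched below. It deliberately uses a different, made-up vector (x) so the pattern is visible without working the exercise. It also assumes the usual boxplot convention, where a point is an outlier if it lies more than 1.5 times the interquartile range beyond the hinges.

```r
# Hypothetical example vector (not the one from the question).
x <- c(10, 12, 11, 13, 12, 95, 11, 10)

# Index plot: each value against its position; unusual points stand apart.
plot(x, main = "Index plot", ylab = "value")

# Boxplot: points beyond the whiskers are drawn individually.
boxplot(x, main = "Boxplot")

# boxplot.stats() returns the values flagged as outliers in its $out element.
outliers <- boxplot.stats(x)$out
print(outliers)
```

When run non-interactively (for example with Rscript), the two plots are written to a file rather than shown on screen. In an R Markdown document they appear in the knitted HTML.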
6. Using the dplyr approach, perform the following actions on the ‘iris’ dataset:
a) select the columns Sepal.Length, Sepal.Width, Petal.Length, Petal.Width
b) filter the iris data for Species = “setosa” or “virginica”
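A minimal sketch of these two dplyr actions on the built-in iris data might look like the following. The result names measurements and two_species are assumptions chosen for this example, and using %in% for the "or" condition is just one reasonable way to express it.

```r
library(dplyr)

# a) Keep only the four measurement columns.
measurements <- iris %>%
  select(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)

# b) Keep rows whose Species is either "setosa" or "virginica".
two_species <- iris %>%
  filter(Species %in% c("setosa", "virginica"))

print(head(measurements))
print(table(two_species$Species))
```

An equivalent filter could be written as Species == "setosa" | Species == "virginica". The %in% form is shorter when there are several values to match.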