[Getting and Cleaning Data] Quiz 3

Question 1

Question 2

Question 3

Question 4

Question 5

More details can be found in the html file here.

Question 1

The American Community Survey distributes downloadable data about United States communities. Download the 2006 microdata survey about housing for the state of Idaho using download.file() from here, and load the data into R. The code book, describing the variable names, is here.
Create a logical vector that identifies the households on greater than 10 acres who sold more than $10,000 worth of agriculture products. Assign that logical vector to the variable agricultureLogical. Apply the which() function like this to identify the rows of the data frame where the logical vector is TRUE: which(agricultureLogical)
What are the first 3 values that result?

59, 460, 474

125, 238, 262

403, 756, 798

25, 36, 45

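# download data
if (!file.exists("./data")) dir.create("./data")
fileUrl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
download.file(fileUrl, destfile = "./data/ACS.csv")
# load data into R
acs <- read.csv("./data/ACS.csv")
# ACR == 3 and AGS == 6 are the code book values matching the two conditions
agricultureLogical <- (acs$ACR == 3 & acs$AGS == 6)
# first three row indices where the logical vector is TRUE
which(agricultureLogical)[1:3]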
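
To see how a logical vector and which() interact, here is a minimal sketch on a made-up data frame. The data frame toy and all of its values are hypothetical; only the column names ACR and AGS follow the code above. It also shows that a comparison against NA yields NA rather than FALSE, and that which() skips those rows.

```r
# Hypothetical toy data: column names mimic the ACS file, values are invented
toy <- data.frame(
  ACR = c(3, 1, 3, NA, 3),
  AGS = c(6, 6, 2, 6, 6)
)

# TRUE only where both conditions hold; the row with NA gives NA, not FALSE
toyLogical <- toy$ACR == 3 & toy$AGS == 6

# which() returns the indices of the TRUE entries and ignores NA
hits <- which(toyLogical)
print(hits)
```

Indexing the data frame with which(toyLogical) instead of toyLogical itself avoids the all-NA rows that a logical index containing NA would otherwise produce.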

Question 2

Using the jpeg package read in the following picture of your instructor into R
https://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpg
Use the parameter native=TRUE. What are the 30th and 80th quantiles of the resulting data? (On some Linux systems the answer for the 30th quantile may differ by 638.)

-16776430 -15390165

-10904118 -10575416

-15259150 -10575416

10904118 -594524

# download fig
library(jpeg)
fileUrl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fjeff.jpg"
# mode = "wb" keeps the binary file intact on Windows
download.file(fileUrl, destfile = "./data/jeff.jpg", mode = "wb")
# load fig into R
jeff <- readJPEG("./data/jeff.jpg", native = TRUE)
# result
quantile(jeff, probs = c(0.3, 0.8))
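
As a rough illustration of what the quantile() call computes, the sketch below applies it to a small integer matrix. The matrix m is a hypothetical stand-in for the image data, not the actual pixel values.

```r
# Hypothetical stand-in for the image: a small integer matrix
m <- matrix(1:10, nrow = 2)

# quantile() treats the matrix as one flat set of values;
# probs = c(0.3, 0.8) requests the 30% and 80% quantiles
q <- quantile(m, probs = c(0.3, 0.8))
print(q)
```

The result is a named numeric vector with names "30%" and "80%". By default quantile() interpolates linearly between neighbouring values, so the quantiles need not be values that occur in the data.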

Question 3

Load the Gross Domestic Product data for the 190 ranked countries in this data set here.
Load the educational data from this data set here.
Match the data based on the country shortcode. How many of the IDs match? Sort the data frame in descending order by GDP rank (so United States is last). What is the 13th country in the resulting data frame?
Original data sources are here and here.

234 matches, 13th country is Spain

190 matches, 13th country is Spain

190 matches, 13th country is St. Kitts and Nevis

189 matches, 13th country is St. Kitts and Nevis

189 matches, 13th country is Spain

234 matches, 13th country is St. Kitts and Nevis

# download data
fileUrl1 <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FGDP.csv"
fileUrl2 <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2FEDSTATS_Country.csv"
download.file(fileUrl1, destfile = "./data/GDP.csv")
download.file(fileUrl2, destfile = "./data/EDU.csv")
# load data into R (skip the header block, keep the 190 ranked countries)
gdp <- read.csv("./data/GDP.csv", skip = 4, nrows = 190, stringsAsFactors = FALSE)[, c(1, 2, 4, 5)]
colnames(gdp) <- c("CountryCode", "Ranking", "Economy", "GDP")
edu <- read.csv("./data/EDU.csv", stringsAsFactors = FALSE)
# merge data on the country shortcode
mergeData <- merge(gdp, edu, by = "CountryCode")
# result 1: number of matching IDs
nrow(mergeData)
# result 2: sort by descending GDP rank and take the 13th country
library(dplyr)
arrangeData <- arrange(mergeData, desc(Ranking))
arrangeData[13, "Economy"]
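
The match-then-sort step can be tried on two made-up tables. This is a minimal sketch in base R: gdpToy, eduToy and their contents are hypothetical, and order() is swapped in for dplyr's arrange() so the example needs no packages.

```r
# Hypothetical toy tables; codes and values are invented for illustration
gdpToy <- data.frame(
  CountryCode = c("AAA", "BBB", "CCC", "DDD"),
  Ranking = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
eduToy <- data.frame(
  CountryCode = c("BBB", "CCC", "DDD", "EEE"),
  Income.Group = c("High", "Low", "High", "Low"),
  stringsAsFactors = FALSE
)

# merge() keeps only the codes present in both tables (inner join)
merged <- merge(gdpToy, eduToy, by = "CountryCode")
nMatches <- nrow(merged)

# Sort by descending rank, so the best-ranked country ends up last
sorted <- merged[order(merged$Ranking, decreasing = TRUE), ]
print(nMatches)
print(sorted$CountryCode)
```

Because merge() defaults to an inner join, the row count of the merged table is the number of matching IDs; "AAA" and "EEE" each appear in only one table and are dropped.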

Question 4

What is the average GDP ranking for the “High income: OECD” and “High income: nonOECD” groups?

30, 37

23, 30

32.96667, 91.91304

23, 45

23.966667, 30.91304

133.72973, 32.96667

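# group data by income group (mergeData comes from Question 3, dplyr is already loaded)
by_income <- group_by(mergeData, as.factor(Income.Group))
# result: mean GDP ranking per income group
summarise(by_income, meanRank = mean(Ranking))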
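
The same per-group mean can be sketched without dplyr. The example below uses base R's tapply() on a made-up data frame; the group labels and rankings are hypothetical.

```r
# Hypothetical toy data: two income groups with invented rankings
toy <- data.frame(
  Income.Group = c("High", "High", "Low", "Low", "Low"),
  Ranking = c(1, 3, 10, 20, 30)
)

# tapply() splits Ranking by Income.Group and applies mean() to each piece
meanRank <- tapply(toy$Ranking, toy$Income.Group, mean)
print(meanRank)
```

The result is a named vector with one mean per group, which carries the same information as the one-row-per-group table that summarise() returns.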

Question 5

Cut the GDP ranking into 5 separate quantile groups. Make a table versus Income.Group. How many countries are Lower middle income but among the 38 nations with highest GDP?

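# cut data into 5 quantile groups by GDP ranking
library(Hmisc)
mergeData$GDP <- cut2(mergeData$Ranking, g = 5)
# cross-tabulate ranking group against income group
table(mergeData$GDP, mergeData$Income.Group)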
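
For readers without Hmisc, the quantile grouping can be approximated in base R. The sketch below swaps in cut() with breaks taken from quantile(); the interval boundaries may differ from those chosen by cut2(), and the vectors ranking and income are hypothetical.

```r
# Hypothetical toy data: 20 rankings with alternating invented income labels
ranking <- 1:20
income <- rep(c("High", "Low"), times = 10)

# Breakpoints at the 0%, 20%, ..., 100% quantiles give 5 groups
breaks <- quantile(ranking, probs = seq(0, 1, by = 0.2))

# include.lowest = TRUE keeps the minimum value inside the first interval
groups <- cut(ranking, breaks = breaks, include.lowest = TRUE)

# Rows are ranking groups, columns are income groups
tab <- table(groups, income)
print(tab)
```

Each cell of the table counts the observations that fall into one ranking group and one income group, which is how the question about Lower middle income countries among the top 38 is read off the real data.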
