Skip to content

Instantly share code, notes, and snippets.

@mGalarnyk
Last active August 26, 2023 17:18
Show Gist options
  • Save mGalarnyk/21695638e94965640c35667e8683642c to your computer and use it in GitHub Desktop.
Save mGalarnyk/21695638e94965640c35667e8683642c to your computer and use it in GitHub Desktop.
R Programming Programming Assignment 3 (Week 4) John Hopkins Data Science Specialization Coursera for the github repo https://github.com/mGalarnyk/datasciencecoursera

R Programming Project 3

github repo for rest of specialization: Data Science Coursera

The zip file containing the data can be downloaded here: Assignment 3 Data

Part 1 Plot the 30-day mortality rates for heart attack (outcome.R)

# install.packages("data.table")
library("data.table")

# Reading in data
outcome <- data.table::fread('outcome-of-care-measures.csv')
outcome[, (11) := lapply(.SD, as.numeric), .SDcols = (11)]
outcome[, lapply(.SD
                 , hist
                 , xlab= "Deaths"
                 , main = "Hospital 30-Day Death (Mortality) Rates from Heart Attack"
                 , col="lightblue")
        , .SDcols = (11)]

Part 2 Finding the best hospital in a state (best.R)

best <- function(state, outcome) {
  
  # Read outcome data
  out_dt <- data.table::fread('outcome-of-care-measures.csv')
  
  outcome <- tolower(outcome)
  
  # Column name is same as variable so changing it 
  chosen_state <- state 

  # Check that state and outcome are valid
  if (!chosen_state %in% unique(out_dt[["State"]])) {
    stop('invalid state')
  }
  
  if (!outcome %in% c("heart attack", "heart failure", "pneumonia")) {
    stop('invalid outcome')
  }
  
  # Renaming Columns to be less verbose and lowercase
  setnames(out_dt
           , tolower(sapply(colnames(out_dt), gsub, pattern = "^Hospital 30-Day Death \\(Mortality\\) Rates from ", replacement = "" ))
  )
  
  #Filter by state
  out_dt <- out_dt[state == chosen_state]
  
  # Columns indices to keep
  col_indices <- grep(paste0("hospital name|state|^",outcome), colnames(out_dt))
  
  # Filtering out unnessecary data 
  out_dt <- out_dt[, .SD ,.SDcols = col_indices]
  
  # Find out what class each column is 
  # sapply(out_dt,class)
  out_dt[, outcome] <- out_dt[,  as.numeric(get(outcome))]
  
  
  # Removing Missing Values for numerical datatype (outcome column)
  out_dt <- out_dt[complete.cases(out_dt),]
  
  # Order Column to Top 
  out_dt <- out_dt[order(get(outcome), `hospital name`)]
  
  return(out_dt[, "hospital name"][1])

}

Part 3 Ranking hospitals by outcome in a state (rankhospital.R)

rankhospital <- function(state, outcome, num = "best") {
  
  # Read outcome data
  out_dt <- data.table::fread('outcome-of-care-measures.csv')
  
  outcome <- tolower(outcome)
  
  # Column name is same as variable so changing it 
  chosen_state <- state 
  
  # Check that state and outcome are valid
  if (!chosen_state %in% unique(out_dt[["State"]])) {
    stop('invalid state')
  }
  
  if (!outcome %in% c("heart attack", "heart failure", "pneumonia")) {
    stop('invalid outcome')
  }
  
  # Renaming Columns to be less verbose and lowercase
  setnames(out_dt
           , tolower(sapply(colnames(out_dt), gsub, pattern = "^Hospital 30-Day Death \\(Mortality\\) Rates from ", replacement = "" ))
  )
  
  #Filter by state
  out_dt <- out_dt[state == chosen_state]
  
  # Columns indices to keep
  col_indices <- grep(paste0("hospital name|state|^",outcome), colnames(out_dt))
  
  # Filtering out unnessecary data 
  out_dt <- out_dt[, .SD ,.SDcols = col_indices]
  
  # Find out what class each column is 
  # sapply(out_dt,class)
  out_dt[, outcome] <- out_dt[,  as.numeric(get(outcome))]
  
  
  # Removing Missing Values for numerical datatype (outcome column)
  out_dt <- out_dt[complete.cases(out_dt),]
  
  # Order Column to Top 
  out_dt <- out_dt[order(get(outcome), `hospital name`)]
  
  out_dt <- out_dt[,  .(`hospital name` = `hospital name`, state = state, rate = get(outcome), Rank = .I)]
  
  if (num == "best"){
    return(out_dt[1,`hospital name`])
  }
  
  if (num == "worst"){
    return(out_dt[.N,`hospital name`])
  }
  
  return(out_dt[num,`hospital name`])

}

Part 4 Ranking hospitals in all states (rankall.R)

rankall <- function(outcome, num = "best") {
  
  # Read outcome data
  out_dt <- data.table::fread('outcome-of-care-measures.csv')
  
  outcome <- tolower(outcome)
  
  if (!outcome %in% c("heart attack", "heart failure", "pneumonia")) {
    stop('invalid outcome')
  }
  
  # Renaming Columns to be less verbose and lowercase
  setnames(out_dt
           , tolower(sapply(colnames(out_dt), gsub, pattern = "^Hospital 30-Day Death \\(Mortality\\) Rates from ", replacement = "" ))
  )
  
  # Columns indices to keep
  col_indices <- grep(paste0("hospital name|state|^",outcome), colnames(out_dt))
  
  # Filtering out unnessecary data 
  out_dt <- out_dt[, .SD ,.SDcols = col_indices]
  
  # Find out what class each column is 
  # sapply(out_dt,class)
  
  # Change outcome column class
  out_dt[, outcome] <- out_dt[,  as.numeric(get(outcome))]
  
  if (num == "best"){
    return(out_dt[order(state, get(outcome), `hospital name`)
    , .(hospital = head(`hospital name`, 1))
    , by = state])
  }
  
  if (num == "worst"){
    return(out_dt[order(get(outcome), `hospital name`)
    , .(hospital = tail(`hospital name`, 1))
    , by = state])
  }
  
  return(out_dt[order(state, get(outcome), `hospital name`)
                , head(.SD,num)
                , by = state, .SDcols = c("hospital name") ])
  
}
@anaidcandido
Copy link

Many thanks for sharing!! I also thought I was the only one struggling with the course but I'm "glad" to see is the course itself.
Now I am able to analyze and compare what I was doing. Thanks a lot!

@yeho-bt
Copy link

yeho-bt commented Jun 13, 2021

Hmmmm, I am not alone feeling like this.

@jdpm93
Copy link

jdpm93 commented Oct 5, 2021

I have the same problem, the gap between the lessons, swirl, and the assigments is horrible and frustrating. some excercises require the use of things never covered in the videos or swirl, I see that many have sent feedback but they are not taking action on this, also the video lessons need to be updated it's 2021 and the videos were recorded in 2015. this is insane. Thank you for sharing.

@codobene
Copy link

truth is, the world is not meant for everyone to be kept alive - it just needs the upper 10% of people that manage this course well. Same is true for jobs in general, stock market, etc etc
get it quickly, or work overtime to compensate for your idiocy, or become social darvinism's fish food.

@codobene
Copy link

i blindly copy code that I find here, fail to reproduce anything, and take drugs. good night. good luck!

@Juanvelz
Copy link

These people should learn how to teach before offering a course. On the other hand, they know very well how to discourage students.

@emakello
Copy link

The essence of taking an online course is to learn skills that can help you in your career or academics. While the courses are stimulating, it makes no sense to bring assignments that discourage rather than encourage learners. Some people like me had never coded before and I spend hours trying to do this thing without success. I wish there was someone who could teach coding from scratch without assuming any prior knowledge.

@cfsobral
Copy link

Thank you for help. I believe that the course already expects us to look for solutions like this one from mGalarnyk, otherwise it would not make sense to pass on an assignment of this complexity for beginners to do, because many like me would get frustrated and give up the course, as I read in some comments made here. Thank you mGalarnyk

@Doc-OmSa
Copy link

Doc-OmSa commented Apr 9, 2023

Thanks mGalarnyk. I have a feeling that this course is definitely not for beginners, unlikely for intermediates as well. The assignments are too advanced. I have been struggling to understand functions. I am a beginner having taken the Google Data Analutics course previously. This is too advanced. Would likely leave. Just wanted to ask if there are better courses more suitable for someone who is a beginner and going to intermediate level? An other platform?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment