Last active
December 4, 2015 21:12
Revisions
-
laurakwiley revised this gist
Dec 4, 2015. 1 changed file with 11 additions and 0 deletions.
@@ -1,3 +1,4 @@
+library(magrittr)
 library(tidyr)
 library(dplyr)
 
@@ -15,3 +16,13 @@ data %>%
 library(stringr)
 data %>%
   mutate(numeric_val = str_match(Var1, "= ?([0-9]+\\.?[0-9]*)")[,2] %>% as.numeric())
+
+# In case your data really is a matrix
+data <- matrix(c("MHSMOCA:MOCATOTAL = 24", "MHSMOCA:MOCATOTAL = 24.5", "MHSMOCA:MOCA7TOTAL = 24", "MsdHSMOCA:MOCA7TOTAL = 26.54654"), nrow = 2, ncol = 2)
+
+# Note: this will make all of your strings factors (in R versions before 4.0.0)
+data %<>% as.data.frame() %>% tbl_df()
+
+# Can use mutate_each to apply as.character() to every column
+data %>%
+  mutate_each(funs(as.character(.)))
-
laurakwiley created this gist
Dec 4, 2015.
@@ -0,0 +1,17 @@
+library(tidyr)
+library(dplyr)
+
+data <- data.frame(Var1 = c("MHSMOCA:MOCATOTAL = 24", "MHSMOCA:MOCATOTAL = 24.5", "MHSMOCA:MOCA7TOTAL = 24")) %>% tbl_df()
+
+# Option 1: Use a variable splitting function from tidyr
+## tidyr::separate takes the column you want to split, the names for the new columns, the separating text, and whether to convert data types after separation (here, rendering the number as a dbl or int)
+data %>%
+  separate(col = Var1, into = c("Text", "Number"), sep = "=", convert = TRUE)
+
+# Option 2: Extract Number using regex
+## Here the regex matches an equals sign, optional whitespace, and then a capturing group: [0-9] at least once, optionally followed by a period and more digits. The "[,2]" selects the capturing group from the str_match() result
+## Note you can pipe within the function to perform the column type change, otherwise it keeps the number as a string
+##
+library(stringr)
+data %>%
+  mutate(numeric_val = str_match(Var1, "= ?([0-9]+\\.?[0-9]*)")[,2] %>% as.numeric())
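For readers without the tidyverse installed, the regex extraction described in Option 2 can be roughly sketched in base R. This is only an illustration, not the gist author's code: `extract_number()` is a hypothetical helper name, and the pattern uses `[0-9]*` after the decimal point on the assumption that all decimal digits should be kept.

```r
# Base-R sketch of the "Option 2" extraction; no packages required.
# extract_number() is a hypothetical helper, not part of the original gist.
extract_number <- function(x) {
  # Equals sign, optional space, then a capture group:
  # one or more digits, an optional period, and any further digits.
  pattern <- "= ?([0-9]+\\.?[0-9]*)"
  # regexec() + regmatches() return, per element, the full match
  # followed by the capture group (or character(0) when nothing matches).
  matches <- regmatches(x, regexec(pattern, x))
  # Keep the capture group as a number; use NA when there was no match.
  vapply(
    matches,
    function(m) if (length(m) >= 2) as.numeric(m[2]) else NA_real_,
    numeric(1),
    USE.NAMES = FALSE
  )
}

values <- c("MHSMOCA:MOCATOTAL = 24",
            "MHSMOCA:MOCATOTAL = 24.5",
            "MsdHSMOCA:MOCA7TOTAL = 26.54654",
            "no number here")
numbers <- extract_number(values)

# If the data really is a character matrix, flatten it, extract,
# and restore the original shape.
m <- matrix(values, nrow = 2, ncol = 2)
numeric_m <- matrix(extract_number(as.vector(m)), nrow = nrow(m), ncol = ncol(m))
```

Strings without an `= <number>` part come back as `NA` rather than raising an error, which makes unparsed rows easy to spot afterwards with `is.na()`.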