Error in `[.data.frame`(dat, , name.Rf): undefined columns selected
Error in data frame: undefined columns selected

Asked by Doug Fir, May 14 '14 at 23:04 (edited Oct 23 '14 at 13:29 by Cerbrus)

I've been working on an assignment where I have to read in some CSV files from a directory "specdata". The files are very similar: there are 332 of them, named 001.csv to 332.csv. They have consistent columns and headers, if that matters. I believe I'm close but am tripping up on the above error message:

    Error in [.data.frame(data1, good) : undefined columns selected

I had expected a data frame to load with all the data specified by the subset of files in the id parameter.

    pollutantmean <- function(directory, pollutant, id = 1:332) {
      files <- list.files(directory)
      subsetFiles <- files[id]
      for (i in subsetFiles) {
        filepaths <- paste(directory, "/", i, sep='')
        data1 <- read.csv(filepaths)
      }
      data1
      good <- complete.cases(data1)
      data2 <- data1[good]
      data2
    }

    # test it out and ignore middle parameter for now
    pollutantmean("specdata", "pass", 1:3)

2 Answers

Accepted answer (5 votes):

Did you mean this?

    data2 <- data1[good, ]

With data1[good] you are selecting columns rather than rows: a single index on a data frame picks columns, and good is a logical vector with one entry per row.

Note that the parameter pollutant is not used. Is it a column name that you want to extract? If so, it should be something like

    data2 <- data1[good, pollutant]

Furthermore, you have to rbind the data frames inside the for loop; otherwise you get only the last data frame (and its complete cases).

Last but not least, I'd prefer generating the file names directly, e.g. with

    id <- 1:332
    paste0(directory, "/", gsub(" ", "0", sprintf("%3d", id)), ".csv")

A slightly modified excerpt from ?sprintf:

    The string fmt (in our case "%3d") contains normal characters, which are passed through to the output string, and also conversion specifications which operate on the arguments provided through .... The allowed conversion specifications start with a % and end wi…
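The accepted answer's three suggestions (row indexing with a trailing comma, combining all files rather than keeping only the last one, and building zero-padded file names) can be combined in a minimal sketch. The helper name read_monitors, the toy CSV files, and the use of sprintf("%03d.csv") with do.call(rbind, lapply(...)) in place of the answer's gsub() call and an explicit loop are assumptions made for illustration, not the poster's code.

```r
# Hypothetical helper (not from the original post): read the monitor files
# selected by `id` and return only the complete rows.
read_monitors <- function(directory, id = 1:332) {
  # Zero-padded file names such as "001.csv"
  filepaths <- file.path(directory, sprintf("%03d.csv", id))
  # Read every file and stack the rows into one data frame
  data1 <- do.call(rbind, lapply(filepaths, read.csv))
  good <- complete.cases(data1)
  # The trailing comma makes `good` select rows, not columns
  data1[good, , drop = FALSE]
}

# Toy files standing in for the "specdata" directory (made-up values)
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  toy <- data.frame(ID = i, sulfate = c(1.5, NA), nitrate = c(0.2, 0.4))
  write.csv(toy, file.path(demo_dir, sprintf("%03d.csv", i)), row.names = FALSE)
}

result <- read_monitors(demo_dir, id = 1:3)
result
```

With the toy data, each file contributes one complete row, so result has three rows. A single pollutant could then be extracted with result[, "sulfate"].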
Thread: https://support.bioconductor.org/p/51276/
Related thread: http://stackoverflow.com/questions/23666611/error-in-data-frame-undefined-columns-selected

Written 3.6 years ago by chawla; modified 3.6 years ago by Maciej Jończyk

(Posting it a second time.)

Hi,

I am new to analyzing gpr files and I am getting an error at the first stage of reading them. Here is the code:

    library(marray)
    myfiles <- dir(pattern="gpr")
    data <- read.GenePix(myfiles)

where I get the error

    Error in `[.data.frame`(dat, , name.Rf) : undefined columns selected

I have also tried

    library(limma)
    myfiles <- dir(pattern="gpr")
    RG <- read.maimages(myfiles, source="genepix")

where I also get the error

    Error in `[.data.frame`(obj, , columns[[a]]) : undefined columns selected

I tried to look for a solution but couldn't find any, except that the 'gpr files should be used as obtained from genepix'. I have only these files. Is there any way to fix the files or find the cause of the error?

Thanks,
Konika Chawla
PhD Student
Department of Biology
NTNU
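Both errors above come from indexing a data frame with a column name it does not contain, so one way to look for the cause is to compare the names a reader expects with the names actually present in the file. The following is a minimal sketch in base R only. The helper missing_columns, the toy table, and the column mapping (including its deliberately mistyped entry) are hypothetical and are not part of marray or limma.

```r
# Hypothetical helper (not part of marray or limma): list the requested
# column names that are absent from a data frame.
missing_columns <- function(df, wanted) {
  setdiff(unlist(wanted, use.names = FALSE), names(df))
}

# Toy table standing in for the data block of a .gpr file (made-up values)
dat <- data.frame(
  "F635 Median" = c(120, 340),
  "B635 Median" = c(30, 28),
  "F532 Median" = c(200, 410),
  "B532 Median" = c(25, 27),
  check.names = FALSE  # keep the spaces in the column names
)

# Illustrative mapping with one mistyped name ("532 Median")
columns <- list(Rf = "F635 Median", Gf = "F532 Median",
                Rb = "B635 Median", Gb = "532 Median")

absent <- missing_columns(dat, columns)
absent
```

Any name reported in absent would trigger "undefined columns selected" if used as a column index, which narrows the problem to either the file's header or the requested names.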
Selecting A List of Columns

Thread: http://r.789695.n4.nabble.com/Selecting-A-List-of-Columns-td4667285.html
Related thread: http://osdir.com/ml/science.biology.informatics.conductor/2004-04/msg00037.html

Dear R Helpers,

I need help with a slightly unusual situation in which I am trying to select some columns from a data frame. I know how to use the subset statement with column names, as in:

    x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
    all.cols<-colnames(x)
    to.keep<-all.cols[1:2]
    Kept<-subset(x,select=to.keep)
    Kept

However, if I want to select some columns based on a selection of the most important variables from a random forest, then I find myself stuck. The example below demonstrates the problem.

    library(randomForest)
    data(mtcars)
    mtcars.rf <- randomForest(mpg ~ ., data=mtcars,importance=TRUE)
    Importance<-data.frame(mtcars.rf$importance)
    Importance
    MSEImportance<-head(Importance[order(Importance$X.IncMSE, decreasing=TRUE),],3)
    MSEVars<-row.names(MSEImportance)
    MSEVars<-data.frame(MSEVars,stringsAsFactors = FALSE)
    colnames(MSEVars)<-"Vars"
    NodeImportance<-head(Importance[order(Importance$IncNodePurity,decreasing=TRUE),], 3)
    NodeVars<-row.names(NodeImportance)
    NodeVars<-data.frame(NodeVars,stringsAsFactors = FALSE)
    colnames(NodeVars)<-"Vars"
    ImportantVars<-rbind(MSEVars,NodeVars)
    ImportantVars<-unique(ImportantVars)
    nrow(ImportantVars)
    ImportantVars<-as.character(ImportantVars)
    ImportantVars
    CarsVarsKept<-subset(mtcars,select=ImportantVars)

    Error in `[.data.frame`(x, r, vars, drop = drop) : undefined columns selected

Any help on how to select these columns from the data frame would be most appreciated.

--John J. Sparks, Ph.D.
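A likely cause of the error in this example is the call as.character(ImportantVars): applied to a one-column data frame, as.character() returns a single deparsed string of the form c("wt", "disp", ...) rather than a vector of names, so subset() is asked for one column that does not exist. A minimal sketch of one way around it keeps the names as plain character vectors throughout. The importance values below are made up so that the example runs without the randomForest package; they are not output from the model in the post.

```r
# Illustrative importance table (made-up numbers, not randomForest output)
Importance <- data.frame(
  X.IncMSE      = c(9.1, 12.4, 3.2, 14.8, 5.0),
  IncNodePurity = c(150, 230, 40, 260, 175),
  row.names     = c("cyl", "disp", "drat", "wt", "hp")
)

# Top three variable names under each importance measure
top_mse  <- row.names(head(Importance[order(Importance$X.IncMSE,
                                            decreasing = TRUE), ], 3))
top_node <- row.names(head(Importance[order(Importance$IncNodePurity,
                                            decreasing = TRUE), ], 3))

# union() works on character vectors directly and drops duplicates,
# so no data frame has to be converted back to character
ImportantVars <- union(top_mse, top_node)

# Index mtcars by the vector of column names
CarsVarsKept <- mtcars[, ImportantVars, drop = FALSE]
head(CarsVarsKept)
```

If the names are already stored in a data frame as in the post, extracting the column itself (ImportantVars$Vars) instead of calling as.character() on the whole data frame should give the same kind of vector.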
> Median", Gf="F532 Median", Rb="B635 Median", Gb ="532 Median"))

Don't you need Gb = "B532 Median"?

Heidi

> and got
>
>     Error in "[.data.frame"(obj, , columns$Gb) :
>       undefined columns selected.
>
> I have no idea how to interpret these error messages and have to say
> that my forays into BioConductor have been a frequent exercise in
> frustration because of constant unintelligible error messages. Could
> someone please help me in solving these issues?
>
> I'm running R 1.8.0 on MacOS X and recently updated limma (1.3?)
>
> thanks
> Bryce

Thread at a glance: Previous Message by Date: RE: [BioC] limma and nested factors

Hi, thanks for the reply. I guess you mean to use the rank as given by topTable (by p-value) for one contrast for a comparison with another contrast? Since the rank is already given (by topTable), would I have to use a Pearson correlation? What I've just tried is a simple Pearson (the data look not too far from normal) and Spearman correlation, separately for up- and downregulated genes.

    > cor.time.spearman
                   down          up
    NEWvsOLD  0.4461673 0.371619276
    NEWvsPRG -0.1675682 0.003799389
    OLDvsPRG -0.1500851 0.057734972

    > cor.time.pearson
                     down         up
    NEWvsOLD  0.723402369 0.49514365
    NEWvsPRG -0.035844726 0.01057829
    OLDvsPRG  0.005358755 0.04535618

Given are the "estimates". The correlation is based on the coefficients in the eBayes fit from limma. Only those coefficients were included for which either one or the other (e.g. NEW or OLD) provides a coefficient >0 (up) with p<=0.01, or <0 for down. For each correlation between 500 and 1000 genes are included. This is not too far from what I expect, since OLD and NEW have been generated using quite similar experimental protocols. It also tells that th…