Bootstrap Standard Error Sample Size
Determining sample size necessary for bootstrap method / Proposed Method
I know this is a rather hot topic to which no one can really give a simple answer. Nevertheless, I am wondering whether the following approach could be useful.

The bootstrap method is only useful if your sample follows more or less (read: exactly) the same distribution as the original population. To be certain this is the case, you need to make your sample size large enough. But what is large enough? If my premise is correct, you have the same problem when using the central limit theorem to determine the population mean. Only when your sample size is large enough can you be certain that the population of your sample means is normally distributed (around the population mean). In other words, your samples need to represent your population (distribution) well enough. But again, what is large enough?

In my case (administrative processes: time needed to finish a demand vs. number of demands) I have a population with a multi-modal distribution (all the demands that were finished in 2011), which I am 99% certain is even less normally distributed than the population I want to research (all the demands that are finished between the present day and a day in the past; ideally this timespan is as small as possible).

My 2011 population consists of enough units to make $x$ samples of size $n$. I choose a value of $x$, say $x=10$. Now I use trial and error to determine a good sample size. I take $n=50$ and check whether my population of sample means is normally distributed using the Kolmogorov-Smirnov test. If so, I repeat the same steps with a sample size of $40$; if not, I repeat with a sample size of $60$ (etc.).

After a while I conclude that $n=45$ is the absolute minimum sample size to get a more or less good representation of my 2011 population. Since I know
(Source: http://stats.stackexchange.com/questions/33300/determining-sample-size-necessary-for-bootstrap-method-proposed-method)

Guidelines for bootstrap samples

Source: http://www.stata.com/support/faqs/statistics/bootstrapped-samples-guidelines/

The following question and answer is based on an exchange that started on Statalist.

How large should the bootstrapped samples be relative to the total number of cases in the dataset?

Title: Guidelines for bootstrap samples
Author: William Gould, StataCorp; Jeff Pitblado, StataCorp

Note: This FAQ has been updated for Stata 14. bootstrap is based on random draws, so results are different from previous versions because of the new 64-bit Mersenne Twister pseudorandom numbers.

Question: I am running a negative binomial regression on a sample of 488 firms. For various reasons [...], I decided to use the bootstrapping procedure in Stata on my data. Are there general guidelines that have been proposed for how large the bootstrapped samples should be relative to the total number of cases in the dataset from which they are drawn?

Answer: When using the bootstrap to estimate standard errors and to construct confidence intervals, the original sample size should be used.
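To illustrate why the resample size matters, here is a minimal Python sketch (not taken from the FAQ) that bootstraps the standard error of a mean twice: once with resamples of the original size and once with resamples half that size. The data are synthetic, and the helper name bootstrap_se, the seed, and the normal distribution are assumptions made only for this illustration. Since the standard error of a mean scales with 1/sqrt(size), halving the resample size should inflate the estimate by a factor of roughly sqrt(2).

```python
import random
import statistics


def bootstrap_se(data, size, reps, rng):
    """Standard deviation of the resampled means (hypothetical helper)."""
    means = [
        statistics.fmean(rng.choices(data, k=size))  # draw `size` values with replacement
        for _ in range(reps)
    ]
    return statistics.stdev(means)


rng = random.Random(12345)  # arbitrary seed, used only for reproducibility
# Synthetic stand-in for a 74-observation dataset (an assumption, not the auto data).
data = [rng.gauss(20, 5) for _ in range(74)]

se_full = bootstrap_se(data, size=len(data), reps=2000, rng=rng)
se_half = bootstrap_se(data, size=len(data) // 2, reps=2000, rng=rng)
ratio = se_half / se_full  # expected to be near sqrt(2), about 1.41

print(f"SE with n = 74: {se_full:.3f}")
print(f"SE with n = 37: {se_half:.3f}")
print(f"ratio: {ratio:.2f}")
```

The half-size resamples describe the variability of an estimate computed from 37 observations, not from the 74 actually in hand, which is why the original sample size is the one to use.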
Consider a simple example where we wish to bootstrap the coefficient on foreign from a regression of mpg on weight and foreign using the automobile data. The sample size is 74, but suppose we draw only 37 observations (half of the observed sample size) in each of 2,000 resamples.

    . sysuse auto, clear
    . set seed 3957574
    . bootstrap _b[foreign], size(37) reps(2000) dots: regress mpg weight foreign
    (running regress on estim
Source: http://www.ats.ucla.edu/stat/r/library/bootstrap.htm

The R program (as a text file) for the code on this page. To see more than just the results of the computations (i.e., to see the functions echoed back in the console as they are processed), use the echo=T option of the source function when running the program.

    source("d:/stat/bootstrap.txt", echo=T)

Introduction

Bootstrapping can be a very useful tool in statistics, and it is very easily implemented in R. Bootstrapping comes in handy when there is doubt that the usual distributional assumptions and asymptotic results are valid and accurate. Bootstrapping is a nonparametric method which lets us compute estimated standard errors and confidence intervals and perform hypothesis tests. Generally, bootstrapping follows the same basic steps:

1. Resample a given data set a specified number of times
2. Calculate a specific statistic from each sample
3. Find the standard deviation of the distribution of that statistic

The sample function

A major component of bootstrapping is being able to resample a given data set, and in R the function which does this is the sample function.
    sample(x, size, replace, prob)

The first argument is a vector containing the data set to be resampled or the indices of the data to be resampled. The size option specifies the sample size, with the default being the size of the population being resampled. The replace option determines whether the sample will be drawn with or without replacement; the default value is FALSE, i.e. without replacement. The prob option takes a vector, of length equal to the data set given in the first argument, containing the probability of selection for each element of x. The default is a random sample in which each element has equal probability of being sampled. In a typical bootstrapping situation we would want to obtain bootstrap samples of the same size as the population being sampled, and we would want to sample with replacement.

    # using sample to generate a permutation of the sequence 1:10
    sample(10)
    [1] 4 8 3 5 1 10 6 2 9 7

    # bootstrap sample from the same sequence
    sample(10, replace=T)
    [1] 1 3 9 4 10 3 5 1 6 4

    # bootstrap sample from the same sequence with
    # probabilities that favor the numbers 1-5
    prob1 <- c(rep(.15, 5), rep(.05, 5))
    prob1
    [1] 0.15 0.15 0.15 0.15 0.15 0.05 0.05 0.05 0.05 0.05
    sample(10, replac
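The three basic steps listed earlier (resample, compute a statistic, take the standard deviation) can also be sketched outside R. The Python version below is only an analogue, not part of the original tutorial: random.choices from the standard library plays the role of R's sample with replace=TRUE, and its weights argument corresponds loosely to prob. The function name bootstrap_standard_error, the seeds, and the choice of the median as the statistic are assumptions made for illustration.

```python
import random
import statistics


def bootstrap_standard_error(data, statistic, reps=1000, weights=None, seed=None):
    """Bootstrap standard error of `statistic` (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(reps):
        # Step 1: resample with replacement, same size as the original data.
        resample = rng.choices(data, weights=weights, k=n)
        # Step 2: calculate the statistic on this resample.
        estimates.append(statistic(resample))
    # Step 3: the standard deviation of the estimates is the bootstrap SE.
    return statistics.stdev(estimates)


values = list(range(1, 11))  # the sequence 1:10 used in the R examples
se_median = bootstrap_standard_error(values, statistics.median, reps=2000, seed=1)

# Weighted draw that favors the numbers 1-5, mirroring the prob1 vector above.
prob1 = [0.15] * 5 + [0.05] * 5
weighted_draw = random.Random(2).choices(values, weights=prob1, k=len(values))

print(f"bootstrap SE of the median: {se_median:.3f}")
print("weighted bootstrap sample:", weighted_draw)
```

Passing the statistic as a function keeps the resampling loop generic, so the same sketch works for a mean, a median, or any other summary computed from a single resample.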