Bootstrapping t-values

Questions about the implementation and application of the PLS-SEM method that are not related to the usage of the SmartPLS software.
derfuss
PLS Junior User
Posts: 6
Joined: Fri Jan 27, 2006 5:09 pm
Real name and title:

Bootstrapping t-values

Post by derfuss »

Hi everybody,

I have a problem regarding the bootstrapping output: every time I re-calculate the bootstrap I get different t-values. How is this possible? And how can I determine the significance of loadings and path coefficients if the t-values keep changing?

I have already set the number of cases per sample to 75 (which is also my N) and the number of samples drawn to 100, but the problem remains the same.

Thanks in advance for your help.

Klaus
cringle
SmartPLS Developer
Posts: 818
Joined: Tue Sep 20, 2005 9:13 am
Real name and title: Prof. Dr. Christian M. Ringle
Location: Hamburg (Germany)
Contact:

Post by cringle »

Hi,

I usually create 500 subsamples when I use the bootstrapping procedure. However, since each case of every subsample is randomly drawn from your overall set of data, it just makes sense that the results for the t-values are never the same.

Best
Christian
stefanbehrens
PLS Expert User
Posts: 54
Joined: Wed Oct 19, 2005 5:53 pm
Real name and title:

Post by stefanbehrens »

Klaus,

you will be able to answer your questions on your own once you understand how the bootstrap procedure works:

For each bootstrap run (100 runs in your case), the program selects a specified number of cases (75 in your case) at random from your sample, with the possibility of drawing the same case more than once, i.e. sampling with replacement ("mit Zurücklegen").

As the cases for each run are drawn at random, each run will likely produce different estimates for loadings and path coefficients. The totality of the estimates obtained this way is used to determine the distribution parameters (mean, standard deviation), which in turn are used to calculate the t-values. Obviously, the set of estimates obtained (and thus the distribution parameters) is likely to be somewhat different every time you run the bootstrap procedure (although not too different). Thus, your t-values will also change every time you run a bootstrap.

However, in my experience, the t-values obtained with a bootstrap of 500 or so runs tend to be quite similar each time the bootstrap is performed. Only rarely does the significance level of an estimate change between different bootstrap runs.
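The procedure described above can be sketched in a few lines of Python. This is a toy illustration only, not SmartPLS code: NumPy is assumed, and a simple regression slope on simulated data stands in for a path coefficient.

```python
import numpy as np

rng = np.random.default_rng()  # deliberately unseeded: like SmartPLS, results vary per run

def bootstrap_t(data, estimator, n_runs=500):
    """t-value = original estimate / stdev of the bootstrap estimates."""
    n = len(data)
    boot = [estimator(data[rng.integers(0, n, size=n)])  # n cases drawn WITH replacement
            for _ in range(n_runs)]
    return estimator(data) / np.std(boot, ddof=1)

# Toy stand-in for a path coefficient: the slope of y on x, N = 75 cases
x = rng.normal(size=75)
y = 0.5 * x + rng.normal(scale=0.5, size=75)
data = np.column_stack([x, y])
slope = lambda d: np.polyfit(d[:, 0], d[:, 1], 1)[0]

print(bootstrap_t(data, slope))  # slightly different on every execution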

Hope this helps,

Stefan
derfuss
PLS Junior User
Posts: 6
Joined: Fri Jan 27, 2006 5:09 pm
Real name and title:

Post by derfuss »

Hi,

thanks a lot for your help. In fact, I had already thought of an explanation along those lines but wasn't sure whether I was right or running into another problem. Anyway, the question is whether the creation of 500 samples is really feasible with 75 cases in the original dataset, or whether this would place too much reliance on too few cases.

Another question regarding the t-values automatically generated by the program: against which criterion do I evaluate them to determine significance? As far as I could gather from the forum entries (if I remember them correctly), they are more or less comparable to the t-values derived from OLS regression. So do I judge them against an ordinary t-statistic table, using my number of cases to determine the degrees of freedom?

Regards,
Klaus
stefanbehrens
PLS Expert User
Posts: 54
Joined: Wed Oct 19, 2005 5:53 pm
Real name and title:

Post by stefanbehrens »

Hi Klaus,

the number of cases in the original data set is (almost) irrelevant, since what you are trying to approximate with the bootstrap is the distribution of the parameter estimates.

Significance is determined against an ordinary t-statistic table using the number of bootstrap runs as the df (i.e. 500). For a typical one-sided test, the following t-values correspond to a given level of significance (df=500):
3.107 ~ p<0.001
2.334 ~ p<0.010
1.648 ~ p<0.050
1.283 ~ p<0.100
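For reference, these critical values can be recomputed from the t-distribution, for example with SciPy (assuming SciPy is available; this just reproduces the table above):

```python
from scipy import stats

# One-sided critical t-values at df = 500, as in the table above
for p in (0.001, 0.010, 0.050, 0.100):
    print(f"t = {stats.t.ppf(1 - p, df=500):.3f}  ~  p < {p}")
```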

Cheers,
Stefan
derfuss
PLS Junior User
Posts: 6
Joined: Fri Jan 27, 2006 5:09 pm
Real name and title:

Post by derfuss »

Hi,

maybe my second question was a bit imprecise, but in fact I was aiming at the evaluation of the t-values in the PLS calculation output (those reported alongside the path coefficients, loadings, R-squares, etc.), not at those in the bootstrapping output. For the latter, it's pretty clear from the literature. Only the former and their evaluation are new to me (I only used PLS-Graph before).

Hope this clarifies my question a bit.

Regards
Klaus
stefanbehrens
PLS Expert User
Posts: 54
Joined: Wed Oct 19, 2005 5:53 pm
Real name and title:

Post by stefanbehrens »

From what I understand, those t-values reported directly in the PLS output are an experimental feature that can be safely ignored. I'm not aware of any useful application of these values.

Cheers,
Stefan
cringle
SmartPLS Developer
Posts: 818
Joined: Tue Sep 20, 2005 9:13 am
Real name and title: Prof. Dr. Christian M. Ringle
Location: Hamburg (Germany)
Contact:

Post by cringle »

Hi,

please check this link:
viewtopic.php?t=136

Best
Christian
saab
PLS User
Posts: 11
Joined: Thu Nov 03, 2005 10:53 am
Real name and title:

number of cases and samples when running bootstrap

Post by saab »

Hi all,

this discussion is quite interesting. I was wondering if anybody knows how to estimate an optimal/reasonable number of cases as well as a reasonable number of samples.
I've heard or read that choosing a number of cases equal to the original sample size does not really make sense, because each time ALL of the available cases are drawn. Thus, half-size or two-thirds-size might be appropriate. What do you think?

I also have the feeling that choosing a high number of samples tends to lead to more "significant" results. On the other hand, what is the argument for choosing 500 rather than 1,000 samples or more?

Best regards,
Samy
stefanbehrens
PLS Expert User
Posts: 54
Joined: Wed Oct 19, 2005 5:53 pm
Real name and title:

Post by stefanbehrens »

Hi Samy,

well, interesting points. Here's what I think:

1) Bootstrap Sample Size
I have not come across any recommendations regarding this parameter in the literature. However, your rationale for choosing an N smaller than the original sample doesn't convince me. Due to the random drawing of cases for the bootstrap sample, even a small sample may contain duplicates (the same case drawn more than once), and even very large samples do not necessarily contain all cases. I would even argue that small bootstrap sample sizes tend to produce greater variance in the parameter estimates of the individual bootstrap runs and thus "deflate" your t-values. Consequently, I would recommend a bootstrap sample size close to, or even larger than, the original sample.

2) Number of Bootstrap Runs
The bootstrap procedure tries to approximate the distribution of the parameter estimates. Thus, you need to do "enough" runs (>200) in order to obtain a "reasonable approximation" of these distributions. However, beyond 500, additional runs only marginally improve the approximation of the "true" distributions and thus have very limited impact on the t-values. Of course, you could argue that more is always better, but there is little point in incurring the extra wait time for another 500+ runs.
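The diminishing returns beyond a few hundred runs can be seen in a small experiment (a NumPy sketch on simulated data; a regression slope again stands in for a path coefficient, which is an assumption for illustration, not SmartPLS code):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=75)
y = 0.5 * x + rng.normal(scale=0.5, size=75)
orig_slope = np.polyfit(x, y, 1)[0]

def t_value(n_runs):
    """Bootstrap t-value of the slope for a given number of runs."""
    boot = []
    for _ in range(n_runs):
        idx = rng.integers(0, 75, size=75)  # resample 75 cases with replacement
        boot.append(np.polyfit(x[idx], y[idx], 1)[0])
    return orig_slope / np.std(boot, ddof=1)

# The t-value jumps around for few runs and settles down beyond a few hundred
for n_runs in (50, 200, 500, 2000):
    print(n_runs, round(t_value(n_runs), 2))
```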

Happy to hear everybody's thoughts in particular on the first issue.

Cheers,
Stefan
saab
PLS User
Posts: 11
Joined: Thu Nov 03, 2005 10:53 am
Real name and title:

Post by saab »

Dear Stefan,

thank you!

Regarding the sample size, I've identified my error in reasoning. Somehow I thought that each draw consisted of the whole population (e.g. 70 cases out of 100) rather than a single case. Of course, only the latter makes sense at all. How embarrassing ;-)

Regarding the number of bootstrap runs, "waiting time" shouldn't really count nowadays (especially when I think of my Commodore C64... back then ;-).
Choosing a high number of runs (>1000) could be reasonable, because then you could take the t-values from the t-distribution table where df = infinity (e.g. 1.645, 1.960, etc.), which seem to me to be the "standard" values used by most researchers.
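The point about df = infinity can be checked directly: the one-sided critical t-values converge to the normal quantiles as df grows (a SciPy sketch, assuming SciPy is available):

```python
from scipy import stats

# One-sided 5% critical value: t-distribution vs. its normal limit
for df in (100, 500, 1000):
    print(f"df={df}:   {stats.t.ppf(0.95, df):.3f}")
print(f"df=inf:  {stats.norm.ppf(0.95):.3f}")  # the 'standard' 1.645
```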

Best Regards,
Samy
stefanbehrens
PLS Expert User
Posts: 54
Joined: Wed Oct 19, 2005 5:53 pm
Real name and title:

Post by stefanbehrens »

Hi Samy,

I'm glad that my comments made sense. However, the issue you raised regarding the bootstrap sample size kept bugging me.

So I ran some tests that basically confirmed my hypothesis regarding the relationship between bootstrap sample size and the standard deviation of the bootstrap estimates (i.e. the t-values): the larger the bootstrap sample, the larger the t-values.

But if the t-values (and thus the significance of the path estimates) can be manipulated this way, a fixed anchor is required in order to make this method of significance-testing meaningful. As the original sample size is the only reference point available, I would strongly argue (contrary to my statement above) for ALWAYS using the original N as the bootstrap sample size.
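This effect can be reproduced in a small simulation (a sketch, not Stefan's actual test; NumPy is assumed, and a regression slope on simulated data stands in for the parameter): the standard deviation of the bootstrap estimates shrinks roughly with the square root of the bootstrap sample size, so larger bootstrap samples inflate the t-values.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=75)
y = 0.5 * x + rng.normal(scale=0.5, size=75)

def boot_std(sample_size, n_runs=500):
    """Stdev of bootstrap slope estimates for a given bootstrap sample size."""
    boot = []
    for _ in range(n_runs):
        idx = rng.integers(0, 75, size=sample_size)  # draw with replacement
        boot.append(np.polyfit(x[idx], y[idx], 1)[0])
    return np.std(boot, ddof=1)

# The stdev (the t-value's denominator) falls as the bootstrap sample grows
for m in (25, 75, 225):
    print(m, round(boot_std(m), 3))
```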

Any thoughts from the experienced PLS-community?

Cheers,
Stefan
saab
PLS User
Posts: 11
Joined: Thu Nov 03, 2005 10:53 am
Real name and title:

Post by saab »

Dear all,

I'd like to quote this internet page (see http://www.stata.com/support/faqs/stat/reps.html). The author basically recommends the following:

1) Choose a sample size equal to the number of cases in the dataset, because "the standard error estimates are dependent upon the number of observations in each replication".

2) "In terms of the number of replications, there is no fixed answer such as “250” or “1,000” to the question. The right answer is that you should choose an infinite number of replications because, at a formal level, that is what the bootstrap requires. The key to the usefulness of the bootstrap is that it converges in terms of numbers of replications reasonably quickly, and so running a finite number of replications is good enough—assuming the number of replications chosen is large enough."

Any comments?

Regards,
Samy
cringle
SmartPLS Developer
Posts: 818
Joined: Tue Sep 20, 2005 9:13 am
Real name and title: Prof. Dr. Christian M. Ringle
Location: Hamburg (Germany)
Contact:

Post by cringle »

Hi,

quite an interesting discussion. I think Samy's citation puts it all together. Regarding the first point, I think this argument is generally made in the literature. The second point just makes sense, because we do not have distributional assumptions for the path coefficients. Such a distribution for the path coefficients is created using the bootstrapping technique. Thus, it is reasonable to create a large number of subsamples.

Best
Christian
derfuss
PLS Junior User
Posts: 6
Joined: Fri Jan 27, 2006 5:09 pm
Real name and title:

Post by derfuss »

Hi,

returning to my question about the t-values in the PLS procedure output (not those in the bootstrapping output): I just evaluated them against the t-table using the rationale given by Backhaus et al., 2003, 10th edition, pp. 73-75. They use df = K - J - 1, with K = number of observations and J = number of independent variables, to determine the critical t-value for an OLS regression equation.

I found results that differ from each other when comparing the calculation-output t-values with the bootstrapping t-values (500 samples): for p<0.05 the same paths are significant, but for p<0.10 the path found to be significant according to the t-values in the calculation output was not significant using the bootstrapping procedure.

Therefore, at present it really seems safer to disregard the values in the calculation output and to use the bootstrapping t-values.

Greetings
Klaus