1 Introduction

This document presents the descriptive Bayesian meta-analysis results for the Written Corrective Feedback (WCF) project.

Although regression-based models can better account for the complexity of this data structure, such models do not produce the descriptive results commonly reported in the field.

The following includes both tables and results. These results are based on the moderators we agreed on in August 2020, all of which XXXX implemented programmatically.

The 15 refined and agreed upon moderators include:

##  [1] "setting"     "cf.type"     "error.key"   "error.type"  "ed.level"   
##  [6] "profic"      "cf.oral"     "cf.training" "cf.scope"    "Length"     
## [11] "instruction" "graded"      "genre"       "cf.revision" "timed"

The final changes to the categories were:

  • profic: replace 77s with 99s
  • cf.type: replace 15s with 99s
  • Length: replace 0s with 1s and replace 3s with 2s (to end up with three categories: 1 & 2 & 99)
  • graded: replace 88s with 99s

2 Inter-rater reliability (IRR) analyses

Following Norouzian (in press), two raters rated 12% of the total study pool. The following shows the inter-rater reliability results.

Inter-Rater Reliability Results (raters = 2)
Sindex lower upper conf.level row.comprd min.cat n.coder study.level
cf.oral 0.628 0.442 0.783 0.95 43 4 2 No
cf.revision 1.000 1.000 1.000 0.95 43 – 2 No
cf.scope 0.581 0.349 0.814 0.95 43 1 2 No
cf.training 0.778 0.333 1.000 0.95 6 1 2 Yes
cf.type 0.946 0.864 1.000 0.95 43 2 2 No
ed.level 0.750 0.250 1.000 0.95 6 0 2 Yes
error.key 1.000 1.000 1.000 0.95 6 – 2 Yes
error.type 0.884 0.767 0.971 0.95 43 0 2 No
genre 1.000 1.000 1.000 0.95 6 – 2 Yes
graded 0.556 0.111 1.000 0.95 6 0 2 Yes
instruction 0.938 0.845 1.000 0.95 43 1 2 No
Length 1.000 1.000 1.000 0.95 6 – 2 Yes
profic 0.250 -0.250 0.750 0.95 6 1 2 Yes
setting 1.000 1.000 1.000 0.95 6 – 2 Yes
timed 0.667 0.000 1.000 0.95 6 0 2 Yes
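For transparency, the agreement index above can be approximated for any moderator with a short R function. The sketch below assumes (it is not confirmed in this report) that Sindex is the chance-corrected agreement coefficient of Bennett, Alpert, and Goldstein (1954) for two raters; the rating vectors are hypothetical:

# a minimal sketch of a chance-corrected agreement (S) index for two raters:
# S = (Po - 1/q) / (1 - 1/q), where Po is the observed proportion of agreement
# and q is the number of categories used by the raters
s_index <- function(rater1, rater2) {
  q  <- length(unique(c(rater1, rater2)))   # number of categories observed
  Po <- mean(rater1 == rater2)              # observed proportion of agreement
  (Po - 1 / q) / (1 - 1 / q)
}

# hypothetical codes assigned by two raters to six study rows
s_index(c(1, 2, 2, 3, 1, 2), c(1, 2, 3, 3, 1, 2))   # 0.75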

The comprehensive IRR analyses above helped us redefine or merge a few categories in our coding scheme, thereby improving the replicability of our meta-analytic data set, and of the subsequent results, by other WCF researchers (see Norouzian, in press).

3 The methodology

3.1 The effect size: dint

We calculated a within-group Cohen’s d effect size for each study group (i.e., control and treatments) in a controlled study as:

\[ \tag{1} d_{D} = \frac{m_{post} - m_{pre}}{sd_D} \] where \(m_{post}\) and \(m_{pre}\) are a study group’s means at the post- and pre-testing occasions, respectively, and \(sd_D\) is the standard deviation of their difference. \(d_D\) has a purely algebraic relation to a corresponding paired t-value targeting the same comparison:

\[ \tag{2} d_{D} = \frac{t_{paired}}{\sqrt{n}} \]

where \(n\) is the number of a study group’s participants. As a result, in several studies (e.g., Jhowry, 2010; Mubarak, 2013) that reported paired t-values, \(d_D\) was directly recovered for the relevant study group.

In other cases, where only \(sd_D\) was reported (e.g., Karim & Nassaji, 2014; Nakazawa, 2006), equation (1) was used to obtain \(d_D\). Finally, in cases where neither \(t_{paired}\) nor \(sd_D\) for a study group’s change between the two testing occasions was reported, we obtained \(sd_D\) using one of the following methods: (a) imputation using a pre-post correlation (\(r_D\)) obtainable from another group in the same study, or (b) assuming a medium-high correlation (\(r_D = .6\)), following the recommendations of Ray and Shadish (1996) and Viechtbauer (2007) for the missing correlation, and then using it in:

\[ \tag{3} sd_D =\sqrt{sd_{pre}^2+sd_{post}^2-2(r_D)sd_{pre}sd_{post}} \]

Once \(sd_D\) was obtained, \(d_{D}\) was calculated using equation (1).
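To illustrate, the R sketch below implements equations (1) through (3) for the three reporting scenarios just described; the numeric inputs are hypothetical, and \(r_D = .6\) is the assumed pre-post correlation:

# (a) a paired t-value was reported: equation (2)
d_from_t <- function(t_paired, n) t_paired / sqrt(n)

# (b) the sd of the pre-post difference (sd_D) was reported: equation (1)
d_from_sdD <- function(m_post, m_pre, sd_D) (m_post - m_pre) / sd_D

# (c) neither was reported: impute sd_D via equation (3), then apply equation (1)
sd_D_impute <- function(sd_pre, sd_post, r_D = .6)
  sqrt(sd_pre^2 + sd_post^2 - 2 * r_D * sd_pre * sd_post)

# hypothetical example values
d_from_t(t_paired = 2.5, n = 25)                                    # 0.50
d_from_sdD(m_post = 14, m_pre = 12, sd_D = 4)                       # 0.50
d_from_sdD(14, 12, sd_D_impute(sd_pre = 4, sd_post = 5, r_D = .6))  # ~0.49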

It is well-known that \(d_{D}\) is a biased estimator of its real population value (Becker, 1988; Morris, 2000; Morris & DeShon, 2002; Viechtbauer, 2007). Therefore, to address the bias, we used Hedges’ (1981) correction factor (\(cf\)):

\[ \tag{4} cf = \frac{\Gamma\bigg(\frac{df}{2}\bigg)}{\sqrt{\frac{df}{2}} \Gamma\bigg(\frac{df-1}{2}\bigg)} \approx 1- \frac{3}{4df-1} \] where \(\Gamma\) is the gamma function, available in many software programs (e.g., R or Excel), and \(df\) is simply \(n-1\). The expression after the \(\approx\) sign is an approximation to the exact correction factor. In our study, the bias-corrected effect size, \(g_{D}\), was computed as:

\[ \tag{5} g_{D} = cf \times d_{D} \]

The unbiased estimator of the sampling variance, \(V(g_{D})\), was computed (see Viechtbauer, 2007; equation 26) as: \[ \tag{6} V(g_{D}) = \frac{1}{n} + \bigg(1-\frac{df-2}{df\times cf^2}\bigg)\times g_{D}^2 \]
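A compact R sketch of equations (4) to (6), using a hypothetical \(d_D\) and sample size, is:

# Hedges' correction factor (equation 4): exact and approximate forms
# (the approximation also avoids numerical overflow of gamma() for very large df)
cf_exact  <- function(df) gamma(df / 2) / (sqrt(df / 2) * gamma((df - 1) / 2))
cf_approx <- function(df) 1 - 3 / (4 * df - 1)

# bias-corrected effect size (equation 5) and its unbiased sampling variance (equation 6)
g_D   <- function(d_D, n) cf_exact(n - 1) * d_D
V_g_D <- function(g, n) {
  df <- n - 1
  cf <- cf_exact(df)
  1 / n + (1 - (df - 2) / (df * cf^2)) * g^2
}

# hypothetical example: d_D = 0.5 from a group of n = 25 participants
g <- g_D(0.5, n = 25)   # ~0.484
V_g_D(g, n = 25)        # ~0.045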

Once the unbiased estimate of effect size (\(g_{D}\)) and its unbiased sampling variance (\(V(g_{D})\)) for each pair of treatment and control groups were obtained, we then calculated their difference (see Morris & DeShon, 2002) to which we refer as a \(dint\):

\[ \tag{7} dint = g_{D_T} - g_{D_C} \] where the \({_T}\) and \({_C}\) subscripts denote the treatment and the control groups, respectively. The unbiased sampling variance of a \(dint\) is computed as:

\[ \tag{8} V(dint) = V(g_{D_T}) + V(g_{D_C}) \]
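In R, the final step for each treatment-control pair reduces to simple arithmetic (hypothetical values shown):

# dint (equation 7) and its unbiased sampling variance (equation 8) for one
# treatment (T) versus control (C) comparison
dint   <- function(g_T, g_C) g_T - g_C
V_dint <- function(V_T, V_C) V_T + V_C

dint(g_T = 0.48, g_C = 0.28)       # 0.20
V_dint(V_T = 0.045, V_C = 0.050)   # 0.095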

3.2 Why this effect size

This effect size, we believe, closely matches the study designs in the WCF literature, which often follow a nonequivalent groups design (NEGD), shown as:

N O X O O

N O – O O

where \(N\) is non-random assignment of participants, \(O\) is an observation collected at a testing occasion, \(X\) is a form of WCF (i.e., treatment), and the dash denotes the absence of treatment.

Given this design, a \(dint\) measures the change within a treatment group across any two times (e.g., pre- to immediate post-test) and compares it to that within the control group. For example, a \(dint\) of \(+0.2\) for a WCF treatment would indicate that the treatment group’s standardized mean change is \(+0.2\) higher than that of the control group in the time period considered (e.g., pre- to immediate post-test).

Therefore, \(dint\) can be thought of as measuring the simple effect of study group membership (e.g., treatment vs. control) at each time interval in the primary studies. We believe this effect size is suited to the longitudinal as well as the causal aspects of WCF studies. This is because measuring a \(dint\) is also equivalent to measuring a difference in differences (DID), an approach known to mitigate some extraneous factors and selection biases arising from non-random assignment of study participants to the study groups (Angrist & Pischke, 2008).

3.3 Handling dependency among effect sizes

We included all complex-structure WCF studies, including multi-outcome, multi-group, and multiple-posttest studies. This stage led to 524 within-group Cohen’s d effect sizes and 262 dints. Adopting an open-science approach, our full dataset for this study is publicly available HERE (a new, updated, shortened dataset for this publication only).

Twenty of the 52 studies contained one or more negative dints before the multiple \(dints\) within each study were averaged. The unbiased sampling variance of the averaged \(dint\) in each study, \(V(\frac{1}{m}\sum_{i=1}^m dint_i)\), was computed as follows (see Borenstein et al., 2009, Chapter 24):

\[ \tag{9} V\bigg(\frac{1}{m}\sum_{i=1}^m dint_i \bigg) = \bigg(\frac{1}{m}\bigg)^2 \times\bigg(\sum_{i=1}^m V_i+\sum_{i\neq j} (r_{ij} \times \sqrt{V_i} \times \sqrt{V_j} ) \bigg) \] where \(m\) is the number of \(dints\) in each study, \(r_{ij}\) denotes the assumed correlation between any two \(dints\) from the same study (we assumed \(r_{ij}\) = .5), and \(V_i\) and \(V_j\) are their individual unbiased sampling variances as obtained in equation 8.
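A small R function mirroring equation (9), under the assumed common correlation of \(r_{ij} = .5\) and with hypothetical variances, is:

# unbiased sampling variance of the average of m dependent dints in one study
# (equation 9), assuming a common correlation r between any two dints
V_avg_dint <- function(V, r = .5) {
  m <- length(V)
  S <- sqrt(V)
  cross_sum <- sum(outer(S, S)) - sum(V)   # sum of sqrt(Vi)*sqrt(Vj) over all i != j
  (1 / m^2) * (sum(V) + r * cross_sum)
}

# hypothetical example: three dints from the same study
V_avg_dint(c(0.09, 0.12, 0.10), r = .5)   # ~0.069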

The averaged effect sizes of eight studies (i.e., Bitc_Yng_Cmrn; Diab_b; Eki_diGnro; Fazio; Jhowry; Trscott_Hsu; VanBe_Jng_Ken; Zhang) became negative. See the Outlier control and Publication bias sections for a discussion of how outlying effect sizes were treated.

4 Meta-analysis model

Our meta-analytic model is a Bayesian random-effects model designed to estimate two parameters, shown diagrammatically in the following figure:


Figure: Random-Effects Meta-Analysis


In the above figure, parameter 1 is the mean (\(\mu\)), or overall effect size, and parameter 2 is tau (\(\tau\)), the standard deviation of the distribution of the study effect sizes.

As explained in Norouzian et al. (2018, 2019), in Bayesian inference all parameters of a model (e.g., \(\mu\) and \(\tau\)) are estimated such that a direct statement about their population values can be conveyed to the research audience.

This is possible because a Bayesian meta-analysis provides a range of credible estimates of the true mean effect (\(\mu\)) and of the standard deviation (\(\tau\)) of the study effect sizes, based on the individual effect sizes and sampling variances fit to the model. Thus, for each meta-analytic parameter, we speak of a (marginal) posterior distribution, which may be described using familiar summary statistics (e.g., mean, SD) as well as a Highest Density Interval (HDI) covering 95% of it (see Norouzian et al., 2018, 2019 for more details).
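As a minimal sketch of how such a model could be fit in R (this is not necessarily the exact implementation used in this project), the bayesmeta package estimates \(\mu\) and \(\tau\) from study-level effects and standard errors; the data and priors below are hypothetical:

# a minimal Bayesian random-effects meta-analysis sketch using bayesmeta
library(bayesmeta)

dint <- c(0.71, 0.46, 0.56, 0.47)   # hypothetical study-level effect sizes
se   <- c(0.09, 0.13, 0.15, 0.16)   # their standard errors (square roots of sampling variances)

fit <- bayesmeta(y = dint, sigma = se,
                 mu.prior  = c(mean = 0, sd = 1),                      # assumed prior on mu
                 tau.prior = function(t) dhalfnormal(t, scale = 0.5))  # assumed prior on tau

fit$summary   # posterior summaries (incl. 95% intervals) for tau and mu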

5 Outlying effect sizes

Following the outlier detection method of removing the effect sizes that fall beyond \(\pm 3\) standard deviations from their mean (Lipsey & Wilson, 2001), five effect sizes were eliminated from the dataset as shown below:

Outlying Effect Sizes
study.name row dint
Nemati_etal 152 5.093062
Nemati_etal 150 4.613584
Al_Ajmi 6 4.376457
Seiff_ElSak 186 4.009361
Al_Ajmi 5 3.444227

Removal of the above effect sizes led to the removal of the following studies:

## [1] "Al_Ajmi"     "Seiff_ElSak"
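The outlier screen itself is simple to express in R; the sketch below assumes a data frame, here called dat, with one row per averaged effect size and a dint column (both names are hypothetical):

# flag effect sizes lying beyond +/- 3 standard deviations of their mean
# (Lipsey & Wilson, 2001)
flag_outliers <- function(x, k = 3) abs(x - mean(x)) > k * sd(x)

# hypothetical usage on a data frame 'dat' with a 'dint' column
# dat[flag_outliers(dat$dint), ]           # inspect the flagged rows
# dat <- dat[!flag_outliers(dat$dint), ]   # drop them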

The positively skewed distribution of our effect sizes before the outliers’ removal and the symmetric distribution after their removal are captured in the figures below.

As noted earlier, a number of effect sizes were also negative. The full list of these effect sizes is as follows:

Negative Effect Sizes
study.name dint SD
Al.Ahm_Al.Jar -0.1553864 0.4272827
Bitc_Yng_Cmrn -0.4935793 0.3802870
Bitc_Yng_Cmrn -0.5156586 0.3467887
Bitc_Yng_Cmrn -0.6975506 0.3574931
Brown -0.1324652 0.2208191
Brown -0.2091514 0.2219060
Diab_b -0.0241685 0.3318732
Diab_b -0.6605675 0.3429860
Eki_diGnro -0.5477378 0.3487669
Eki_diGnro -0.2352528 0.3435118
Elis_Sh_Mur_Tak -0.0130216 0.4873301
Fazio -0.1308920 0.3642787
Fazio -0.5467906 0.3915234
Hartshorn -0.2508190 0.2994754
Jhowry -0.0618444 0.4750960
Karim_End. -0.3485250 0.3275155
Karim_End. -0.2824068 0.3230654
Mubarak -0.1189049 0.4172375
Mubarak -0.0883446 0.3988416
Mubarak -0.4926680 0.4258486
Mubarak -1.3695479 0.5306183
Mubarak -0.1587712 0.3887653
Mubarak -0.3610557 0.4024929
Munoz -0.1428038 0.3358399
Nusrat.etal. -0.2210411 0.2840090
Nusrat.etal. -0.1870419 0.2983482
Parreno -0.2291886 0.3701226
Pash. -0.0758471 0.3323983
Sheen_etal -0.3736234 0.3363308
Sheen_etal -0.1539893 0.3416138
Sheen_etal -1.1416347 0.5460815
Sheen_etal -0.9534229 0.5584261
Sheen_etal -0.6841150 0.4297241
Sheen_etal -0.6391715 0.5002314
Sheen_etal -0.7079216 0.4999521
Stefan_Revesz -0.4877348 0.2737911
Stefan_Revesz -0.1259811 0.2674519
Trscott_Hsu -0.0827458 0.3412078
VanBe_Jng_Ken -0.8911023 0.3758384
VanBe_Jng_Ken -1.3161183 0.4266433
VanBe_Jng_Ken -0.1756171 0.4101802
VanBe_Jng_Ken -1.0457147 0.3657065
VanBe_Jng_Ken -0.2850271 0.3514128
VanBe_Jng_Ken -0.7674492 0.3761727
VanBe_Jng_Ken -0.0067616 0.3622921
VanBe_Jng_KenB -0.0583938 0.2515945
VanBe_Jng_KenB -0.0023592 0.2552378
Zhang -0.8209664 0.3198808
Zhang -0.7342465 0.3112923

5.1 Outlier control

From our initial 52 studies, two studies (Al_Ajmi and Seiff_ElSak) were excluded because (a) they included extremely large effect sizes (e.g., 5.09) that had an extreme influence on the overall effect, (b) a close inspection of the studies’ treatments, settings, operational definitions, and detailed procedures did not suggest any notable difference that would justify such drastically large effects relative to other studies in the WCF literature, and (c) they made the funnel plot of the study effect sizes extremely asymmetric, causing Egger’s test of funnel plot symmetry to become highly significant (p < .001).

Egger’s Test Before Outliers Removal
b1 z.value p.value b1.lower b1.upper result
4.5877 4.4219 0 2.5542 6.6211 ***

For the sake of transparency, our dataset before the treatment of outlying effect sizes is publicly available HERE and our dataset after the treatment of outlying effect sizes is also publicly available HERE.

6 Publication bias

6.1 Funnel plot

After the treatment of the outlying effect sizes as described in the previous section, our first method of detecting publication bias was a visual funnel plot. In the funnel plot below, while not perfectly symmetrical, the average effect sizes from each study appear moderately symmetrical on both sides of the overall mean effect (the middle dashed line). It is worth noting that the gray points in the plot are the shrinkage estimates; that is, more extreme (in absolute value) and less precise (larger-SE) effect sizes are shrunk more toward the mean effect via the random-effects model and Bayesian shrinkage (i.e., the use of a prior on the mean effect size to blunt extreme effect sizes from individual studies).

Because the detection of publication bias via a funnel plot is mainly visual, it can be somewhat subjective. As a result, we used several other statistical methods to examine publication bias in the WCF literature.

6.2 Egger’s Test

We also conducted an Egger’s test (Egger, Smith, Schneider, & Minder, 1997) of funnel plot symmetry. Using this test, we examined the extent to which the standard error (i.e., precision) of the effect sizes collected from the WCF literature related to the effect sizes’ magnitude. If such a relationship rises to a statistically significant level, that could suggest publication bias and asymmetry in the funnel plot of effect sizes.

In our case, given that the p-value for Egger’s test was larger than .05, we concluded that our funnel plot was symmetric and that the likelihood of publication bias in the collected sample of WCF studies is small.

Egger’s Test After Outliers Removal
b1 z.value p.value b1.lower b1.upper result
1.6447 1.647 0.0996 -0.3125 3.6019 :-)
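A sketch of how such a test can be run in R, assuming the metafor package and hypothetical study-level effects and sampling variances, is:

# Egger's regression test of funnel-plot asymmetry (metafor)
library(metafor)

dint <- c(0.71, 0.46, 0.56, 0.47, 0.30, 0.64)         # hypothetical effects
V    <- c(0.008, 0.017, 0.022, 0.026, 0.031, 0.040)   # hypothetical sampling variances

res <- rma(yi = dint, vi = V, method = "REML")   # random-effects model
regtest(res, model = "lm", predictor = "sei")    # classical Egger's test: effects vs. their SEs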

6.3 Trim and Fill Method

We also assessed the risk of publication bias via a third method known as Trim and Fill (Duval, 2005; Duval & Tweedie, 2000). In this non-parametric method, the goal is to impute (i.e., fill) the effect sizes possibly missing from the literature, due to a particular selection mechanism, in order to achieve a more symmetric funnel plot of effect sizes on either side of the plot.

As shown in the following figure, in our case the Trim and Fill method suggested that no fill studies are needed on either side of the plot to achieve a more symmetric view of the WCF literature. Therefore, these results concur with the previous Egger’s test result as regards the issue of publication bias.
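Continuing the hypothetical metafor sketch from the Egger’s test above, the Trim and Fill diagnostics can be obtained as:

# Trim and Fill on the random-effects model 'res' fitted in the previous sketch
tf <- trimfill(res)
tf           # reports how many "fill" studies were imputed and the adjusted estimate
funnel(tf)   # funnel plot showing any imputed studies as open points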

6.4 Vevea and Hedges Method

We also used a fourth, confirmatory method to detect publication bias. Vevea and Hedges (1995) proposed a method in which the original meta-analysis is compared with a bias-assumed meta-analysis that uses weights associated with p-value intervals for the effect sizes (see Vevea & Hedges, 1995, and Vevea & Woods, 2005, for details).

Once both the original and bias-assumed meta-analytic models are obtained, a likelihood-ratio test compares the fit of the two models to the effect sizes. A statistically significant likelihood-ratio test would indicate that the bias-assumed model better fits the study effect sizes, and hence the presence of publication bias. Otherwise, the risk of publication bias is minimal.

Vevea & Hedges Method for Publication Bias
Df X^2 p.value
Likelihood Ratio Test: 1 0.1079 0.7426
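One way to run this comparison in R is with the weightr package (an assumption about tooling; the original analysis may have used different software). The effects and variances below are hypothetical, and the single p-value cutpoint at .05 (one-tailed .025) mirrors the selection model described above:

# Vevea & Hedges (1995) selection-model comparison via weightr
library(weightr)

dint <- c(0.71, 0.46, 0.56, 0.47, 0.30, 0.64)         # hypothetical effects
V    <- c(0.008, 0.017, 0.022, 0.026, 0.031, 0.040)   # hypothetical sampling variances

# fits the unadjusted and the bias-assumed (weighted) models and prints the
# likelihood-ratio test comparing them
weightfunct(effect = dint, v = V, steps = c(0.025, 1))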

6.5 Orwin’s Fail-safe N method

Finally, we conducted Orwin’s (1983) Fail-safe N analysis. Using this method, we examined the number of missing WCF studies needed to decrease our meta-analytic WCF effect to the point of triviality. We set the threshold of triviality for our meta-analytic effect at \(0.05\).

The likelihood of publication bias increases if only a handful of studies would be needed to render our meta-analytic effect trivial, because it is possible that a handful of small-effect studies did not make their way into the published WCF literature. Conversely, the likelihood of publication bias decreases if a relatively large number of studies would be needed to render our meta-analytic effect trivial, because it is less likely that such a large number of small-effect studies, constituting a trend in a domain of research, failed to make their way into the published WCF literature.

Fail-safe N Method for Publication Bias
studies.needed.to.triviality trivial.effect.size
478 0.05

Once again, based on the relatively large number of missing studies suggested by the Fail-safe N Method to deem our meta-analytic effect trivial, the likelihood of publication bias seems to be low.
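For reference, Orwin’s Fail-safe N can be computed in R with the metafor package (again an assumption about tooling); the effects, variances, and the 0.05 triviality target below are illustrative:

# Orwin's Fail-safe N with a target (trivial) average effect of 0.05
library(metafor)

dint <- c(0.71, 0.46, 0.56, 0.47, 0.30, 0.64)         # hypothetical effects
V    <- c(0.008, 0.017, 0.022, 0.026, 0.031, 0.040)   # hypothetical sampling variances

fsn(dint, V, type = "Orwin", target = 0.05)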

As can be seen, the methods of detecting publication bias often complement each other, each viewing bias from a unique perspective. Overall, our visual method (funnel plot) and statistical methods (Egger’s test, Trim and Fill, Vevea and Hedges’ test, and Orwin’s Fail-safe N) collectively suggested no major indication of publication bias in the present pool of studies.

7 The Effect of WCF

7.1 Definition of tabular output

  • In the following tables, Mu denotes the posterior mean for the mean effect in a Bayesian random-effects meta-analysis conducted in each row of the table.

  • In the following tables, Low and Up denote the lower and upper bounds of the 95% Bayesian highest-density credible interval for the mean effect. They give an indication of the 95% most credible effect sizes had we collected the entire population of WCF studies ever conducted.

  • In the following tables, Perc.mu denotes the percentage interpretation of the mean effect in each row of the table. Any percentage larger than 0% shows an improvement for treatment groups over control groups in the WCF literature in each row of the table.

  • In the following tables, K denotes the number of studies. However, because a single study could possess group-level features (e.g., some groups receiving oral feedback but not others), it may be counted more than once in the K column. Therefore, the K column total could exceed 50, the total number of studies, for group-level moderators. For study-level moderators (e.g., proficiency), the K column total always equals 50.

  • In the following tables, BF01.mu denotes a Bayes Factor (see Norouzian et al, 2019). It tells us if the mean effect in each row of the table could provide evidence in favor of the null hypothesis that mean effect = 0. The smaller the value, the larger the evidence against the null hypothesis (and in favor of the alternative hypothesis i.e., mean effect \(\neq\) 0).

  • In the following tables, BF01.tau denotes a Bayes Factor (see Norouzian et al, 2019). It tells us if the heterogeneity between studies’ mean effects (\(\tau\)) in each row of the table could provide evidence in favor of the null hypothesis that heterogeneity = 0. The smaller the value, the larger the evidence against the null hypothesis (and in favor of the alternative hypothesis i.e., heterogeneity \(\neq\) 0).

  • Closely related to tau (heterogeneity in effect sizes, in SD units) is I2 (Higgins & Thompson, 2002; Higgins, Thompson, Deeks, & Altman, 2003). I2 denotes the percentage of the total variability in a set of effect sizes that is due to between-study heterogeneity. Higgins and Thompson (2002) proposed a tentative classification of I2 values in which percentages of around 25% (I2 = 25), 50% (I2 = 50), and 75% (I2 = 75) denote low, medium, and high heterogeneity, respectively (see the sketch after this list).
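To make the I2 definition concrete, the following R sketch computes I2 from a between-study variance (tau squared) and a set of within-study sampling variances, using Higgins and Thompson’s (2002) “typical” within-study variance; the inputs are hypothetical:

# I-squared: share of total variability due to between-study heterogeneity
I2 <- function(tau2, V) {
  w  <- 1 / V                                             # inverse-variance weights
  s2 <- (length(V) - 1) * sum(w) / (sum(w)^2 - sum(w^2))  # "typical" within-study variance
  100 * tau2 / (tau2 + s2)                                # expressed as a percentage
}

I2(tau2 = 0.27^2, V = c(0.008, 0.017, 0.022, 0.026, 0.031, 0.040))   # ~79%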

7.2 WCF Overall Effect (disregarding time)

7.3 Overall effectiveness (with time to posttest)

Bayesian Meta-Analytic Summaries
code K mu low up I2 tau BF01.mu BF01.tau perc.mu
immediate time.1 22 0.7143 0.5296 0.8986 0.4219 0.2653 0 0.8955 26.25%
short time.2 25 0.4645 0.2076 0.7245 0.813 0.5537 0.0558 0 17.89%
medium time.3 20 0.5586 0.2647 0.857 0.7938 0.5568 0.0372 1e-04 21.18%
long time.4 12 0.4744 0.1574 0.7905 0.6359 0.4034 0.2625 0.1546 18.24%

7.4 Setting

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
SL 16 0.4528 0.195 0.7143 0.6183 0.3805 0.0957 0.1268 17.47%
FL 34 0.5276 0.3331 0.7234 0.773 0.4819 4e-04 0 20.11%
Bayesian Meta-Analytic Difference
lower upper diff.
SL vs FL -0.3963687 0.2504415 FALSE

7.5 Educational level

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
high School 6 0.5128 -0.1536 1.1917 0.8749 0.6623 1.5846 0.0113 19.6%
university 39 0.4871 0.345 0.6299 0.6011 0.3318 0 4e-04 18.69%
institute 4 0.8757 -0.3399 2.0819 0.9096 1.0696 1.0695 0 30.94%
Bayesian Meta-Analytic Difference
lower upper diff.
high School vs university -0.6534242 0.7182970 FALSE
high School vs institute -1.7192808 1.0267806 FALSE
university vs institute -1.5898565 0.8506671 FALSE

7.6 Proficiency

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
beginner 3 0.9819 -0.4869 2.4201 0.9091 1.1294 1.021 2e-04 33.69%
advanced 32 0.5172 0.3378 0.6975 0.7027 0.409 2e-04 0 19.75%
intermediate 6 0.5789 0.1947 0.9651 0.2464 0.1967 0.2262 3.5024 21.87%
mixed/not reported 9 0.2711 -0.0622 0.5974 0.6705 0.3521 2.8172 0.5472 10.68%
Bayesian Meta-Analytic Difference
lower upper diff.
beginner vs advanced -1.0403291 1.8880292 FALSE
beginner vs intermediate -1.1362802 1.8650883 FALSE
beginner vs mixed/not reported -0.8149559 2.1581149 FALSE
advanced vs intermediate -0.4859261 0.3631019 FALSE
advanced vs mixed/not reported -0.1222468 0.6236818 FALSE
intermediate vs mixed/not reported -0.1959408 0.8161790 FALSE

7.7 WCF Type

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
direct 29 0.4479 0.2329 0.6627 0.744 0.4827 0.0154 0 17.29%
location 16 0.5277 0.1365 0.9176 0.8466 0.6769 0.378 0 20.11%
error coding 17 0.4804 0.2815 0.6895 0.4709 0.2603 0.004 0.7905 18.45%
metalinguistic 9 0.4953 0.0978 0.9133 0.6274 0.4269 0.4983 0.3534 18.98%
direct+metalinguistic 13 0.7206 0.4038 1.0454 0.6695 0.4318 0.0113 0.1214 26.44%
other 2 0.1971 -0.863 1.2455 0.5253 0.3791 5.2457 1.9406 7.81%

7.8 Scope

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
unfocused 19 0.3448 0.1006 0.5903 0.6738 0.4118 0.4337 0.0026 13.49%
mid-focused 12 0.3478 0.0941 0.6117 0.5842 0.2926 0.5068 1.2041 13.6%
focused 23 0.6979 0.4569 0.9402 0.7114 0.4669 3e-04 7e-04 25.74%

7.9 Error Type

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
mixed 26 0.3013 0.1145 0.4873 0.6331 0.3571 0.2287 0.0045 11.84%
articles 16 0.5772 0.3361 0.8214 0.6535 0.3673 0.0063 0.0262 21.81%
prepositions 5 0.5581 -0.1285 1.2307 0.7994 0.571 1.2586 0.0787 21.16%
verb tense 5 0.86 -0.2769 1.988 0.9152 1.1417 1.0677 0 30.51%
other 6 0.4233 -0.1065 1.0127 0.734 0.4699 1.6878 0.7215 16.4%

7.10 Key

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
no key 4 0.4068 -0.1669 1.0319 0.4737 0.3078 2.028 2.0862 15.79%
key provided 16 0.4665 0.2652 0.6771 0.4705 0.2553 0.0086 0.9161 17.96%
N/A 40 0.5026 0.3016 0.704 0.8121 0.5586 0.001 0 19.24%
Bayesian Meta-Analytic Difference
lower upper diff.
No key vs key provided -0.6592533 0.6011986 FALSE
No key vs NA -0.6918739 0.5648004 FALSE
key provided vs NA -0.3184139 0.2562545 FALSE

7.11 Revision

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
not required 6 0.6749 0.2274 1.1468 0.5973 0.3468 0.1983 1.3845 25.01%
required 31 0.4863 0.307 0.6685 0.6838 0.3924 5e-04 0.0032 18.66%
feedback reviewed 9 0.427 -0.0982 0.9533 0.8305 0.6585 1.9778 0 16.53%
not reported 5 0.4685 -0.2003 1.1203 0.8185 0.5612 1.8635 0.0337 18.03%

7.12 Oral supplemental CF

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
not provided 50 0.494 0.3408 0.6479 0.7349 0.4523 0 0 18.93%
provided 5 0.7662 0.0558 1.5169 0.712 0.5723 0.4869 0.5217 27.82%

7.13 Genre

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
academic 16 0.3712 0.0974 0.6461 0.7389 0.4372 0.5088 0.0021 14.48%
visual description 19 0.6341 0.4117 0.8574 0.5642 0.3454 8e-04 0.1541 23.7%
narrative/journal 9 0.3778 0.0486 0.7038 0.5864 0.3323 0.9299 0.6502 14.72%
reading summary 2 0.6306 -0.5663 1.7672 0.815 0.5276 1.3293 0.9064 23.58%
not reported 3 0.8049 -0.7244 2.3157 0.935 1.2202 1.5145 0 28.96%
Bayesian Meta-Analytic Difference
lower upper diff.
academic vs visual description -0.6143674 0.0890175 FALSE
academic vs narrative/journal -0.4286789 0.4201417 FALSE
academic vs reading summary -1.3964923 0.9867740 FALSE
academic vs not reported -1.9497568 1.1388604 FALSE
visual description vs narrative/journal -0.1347535 0.6523789 FALSE
visual description vs reading summary -1.1250137 1.2422710 FALSE
visual description vs not reported -1.6792161 1.3946423 FALSE
narrative/journal vs reading summary -1.4011992 1.0023588 FALSE
narrative/journal vs not reported -1.9525921 1.1538611 FALSE
reading summary vs not reported -2.0572747 1.7039127 FALSE

7.14 Timed

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
no limit 8 0.4182 -0.0276 0.8643 0.6983 0.4656 1.4257 0.1171 16.21%
limited 41 0.5225 0.3511 0.6951 0.7531 0.462 0 0 19.93%
Bayesian Meta-Analytic Difference
lower upper diff.
no limit vs limited -0.580076 0.3714433 FALSE

7.15 Grading

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
not graded 20 0.4684 0.284 0.6543 0.5657 0.2908 0.0033 0.1714 18.03%
graded 10 0.2554 -0.0928 0.6008 0.7759 0.4317 3.7852 0.0217 10.08%
not reported 21 0.6656 0.3703 0.963 0.7695 0.5722 0.0065 0 24.72%
Bayesian Meta-Analytic Difference
lower upper diff.
not graded vs graded -0.1756498 0.6066272 FALSE
not graded vs not reported -0.5466881 0.1499102 FALSE
graded vs not reported -0.8660679 0.0404682 FALSE

7.16 Length of writing

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
short 17 0.5076 0.3056 0.7073 0.5596 0.2902 0.0057 0.3134 19.41%
long 12 0.3111 -0.0215 0.6436 0.7136 0.4455 2.1368 0.0073 12.21%
not reported 21 0.6315 0.3325 0.9347 0.813 0.5908 0.0115 0 23.61%
Bayesian Meta-Analytic Difference
lower upper diff.
short vs long -0.1908825 0.5827446 FALSE
short vs not reported -0.4888517 0.2325511 FALSE
long vs not reported -0.7693140 0.1235888 FALSE

7.17 Instruction on grammar

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
no instruction 22 0.5924 0.3439 0.8434 0.8046 0.495 0.0042 0 22.32%
targeted instruction 10 0.5004 0.1498 0.8857 0.6035 0.387 0.2585 0.8404 19.16%
untargeted instruction 3 -0.0506 -1.1374 1.0305 0.8871 0.736 5.6187 0.0266 -2.02%
not reported 16 0.5375 0.2729 0.8023 0.6112 0.3913 0.0291 0.0789 20.45%
Bayesian Meta-Analytic Difference
lower upper diff.
no instruction vs targeted instruction -0.3722645 0.5159173 FALSE
no instruction vs untargeted instruction -0.4628794 1.7539212 FALSE
no instruction vs not reported -0.3062954 0.4181211 FALSE
targeted instruction vs untargeted instruction -0.5800523 1.6935815 FALSE
targeted instruction vs not reported -0.4690977 0.4352727 FALSE
untargeted instruction vs not reported -1.7007655 0.5197469 FALSE

7.18 Training students in responding to feedback

Bayesian Meta-Analytic Summaries
K mu low up I2 tau BF01.mu BF01.tau perc.mu
no training 12 0.7112 0.2238 1.2013 0.8737 0.7346 0.192 0 26.15%
received training 7 0.4673 0.1892 0.7522 0.4014 0.2001 0.1751 2.5393 17.99%
not reported 31 0.4337 0.2552 0.6132 0.66 0.3885 0.0022 2e-04 16.77%
Bayesian Meta-Analytic Difference
lower upper diff.
no training vs received training -0.3185255 0.8061647 FALSE
no training vs not reported -0.2403194 0.7978463 FALSE
received training vs not reported -0.2952711 0.3689505 FALSE