Academic literacy of South African higher education level students: Does vocabulary size matter?

This study explores the extent to which vocabulary size matters in academic literacy. Participants (first-year students at North-West University) were administered the Vocabulary Levels Test (Schmitt, Schmitt and Clapham 2001). Scores from the test were used to estimate students’ vocabulary size and were subsequently mapped onto the levels distinguished by the Test of Academic Literacy Levels (TALL). Estimates show that, on average, the vocabulary size of first-year students at North-West University is approximately 4,500 word families, a size large enough to allow them to follow lectures in English. Furthermore, students with large vocabularies were found to have higher academic literacy proficiency, which establishes a strong relationship between vocabulary size and academic literacy. This relationship was also observed at the different word frequency bands the Vocabulary Levels Test consists of. These results support previous findings which established a relationship between vocabulary size and reading (cf. Nation 2006), and between vocabulary size and overall language proficiency (cf. Beglar 2010, Meara and Buxton 1987, Meara and Jones 1988, Nation and Beglar 2007), which could be extended to academic literacy. Furthermore, a stronger relationship between vocabulary size and academic literacy was found towards more infrequent word bands, indicating that infrequent word bands may best predict academic literacy. On the basis of these findings, we discuss possible strategies to adopt in order to assist some first-years with expanding their vocabularies.


Introduction
Over the past three decades, vocabulary research has largely focused on testing vocabulary growth. Specifically, measuring vocabulary size, i.e. how many words a student knows (Henriksen 1999, Meara 1996), has had considerable implications for both teaching and research. For instance, recent estimates suggest that 8,000 word families¹ is the number needed for reading authentic texts and newspapers (Nation 2006), while 4,000-5,000 word families are required to follow lectures at undergraduate level (Laufer and Ravenhorst-Kalovski 2010, Nation 2006, Schmitt 2008, 2010, Schmitt and Schmitt 2012). Lists of words that frequently appear in academic contexts have also been compiled, including Coxhead's (2000) Academic Word List (AWL), Gardner and Davies' (2013) Academic Vocabulary List (AVL), and Simpson-Vlach and Ellis' (2010) Academic Formula List (AFL). The AWL especially constitutes an excellent general reference tool for both teaching and research (Coxhead 2011, Gardner and Davies 2013, Hyland and Tse 2007, Simpson-Vlach and Ellis 2010).
Internationally, the vocabulary tests on the basis of which the above thresholds were determined are used today as placement indicators, i.e. to determine linguistic proficiency. These include the Vocabulary Levels Test (Nation 1990, Schmitt et al. 2001) and the Vocabulary Size Test (Nation and Beglar 2007). The general tendency today is to test, among others, first-year students' knowledge of vocabulary, overall proficiency or academic literacy; the latter concept is discussed in section 2.3, but it should be noted already that vocabulary is considered to be one of the fundamentals comprising academic literacy. Such testing is intended particularly for placement purposes and for identifying the academic support that some students may need (Scholtz 2012, Weideman 2006).
Within this line of thought, first-year students at North-West University annually sit the Test of Academic Literacy Levels (TALL) and its Afrikaans counterpart "Toets van Akademiese Geletterdheidsvlakke" (TAG); cf. Van Dyk and Weideman (2004a) for an explication of the construct of this test. The scores are used as placement indicators and assist in assigning students to different groups, with two modules (AGLA/E 111 and AGLA/E 121) suggested according to students' performance on the test(s). AGLA/E 111 is an awareness-raising module offered to students who are at risk of not completing their studies in the desired time. Matters such as study skills, listening and note-taking strategies, and an introduction to academic reading and writing are addressed in this module. AGLA/E 121 is a compulsory module taken by all first-year students. The focus of this module is on basic research skills, critical thinking, finding and using applicable sources for different purposes in an acceptable manner, the development of a lucid written (and to a lesser extent spoken) argument, as well as information and computer literacy development. It is credit bearing and is embedded in the curriculum. Note that there is close collaboration with discipline experts, and knowledge and skills acquired and developed in these modules are applicable to different fields of study.
The respective AGLA/E modules taught at North-West University introduce students to a number of interrelated skills and a set of knowledge to assist them in becoming academically literate and make overt what is usually covert, i.e. enhancing epistemological access. Developing academic vocabulary in particular is one of the aims of these modules, which is achieved by presenting the AWL to students. However, the AWL covers only about 10% of a running text, whereas higher education second language (L2) and foreign language (FL) students need a vocabulary size of at least 4,000-5,000 word families (cf. Laufer and Ravenhorst-Kalovski 2010, and Nation 2006). While first-year students at North-West University, as in any other higher education institution, could be expected to master the AWL (Nation 1990), they need a vocabulary size large enough to allow them to follow lectures and to read, and respond to, academic textbooks. However, the exact vocabulary size of these students remains unknown. Furthermore, the available literature indicates that vocabulary size predicts reading comprehension, which is an important component of academic literacy (cf. Nation 2001, 2006, among others). It remains to be seen, however, whether this relationship holds for overall academic literacy, particularly since adequate levels of academic literacy are important to persist and prosper with studies in higher education (Carstens 2013, Van de Poel and Van Dyk 2014, Van Dyk and Van de Poel 2013). Students' academic performance is seemingly mediocre despite different efforts to address their inadequate levels of preparedness. In many cases, no tangible results of these efforts are evident. Although this is a global issue, South Africa is a case in point where a recent report by the Council on Higher Education (2011) has shown that approximately 49% of registered students for a three-year degree programme managed to graduate only after five years. It is our opinion that this gap (a possible relationship between academic literacy and vocabulary size) should be bridged, which the present study sets out to do. It focuses on students who study through the medium of English but who are L2 users of English (henceforth ESL students). It tests first-year ESL students' vocabulary size in relation to their academic literacy level in an attempt to answer the following questions: (i) What is the vocabulary size of first-year ESL students at North-West University? (ii) Is there any relationship between ESL students' vocabulary size and their academic literacy levels? (iii) Is the relationship between vocabulary size and academic literacy levels (if there is one) maintained at the different word bands, and which of the frequency band(s) could best predict academic literacy?

Dimensional approach to vocabulary knowledge
The 1970s saw increasing research attention accorded to vocabulary (Read 1997), with a very influential classification of what knowing a word entails coming from Richards (1976).
According to Richards (1976), researching vocabulary should consider all the aspects that knowing a word entails. These aspects include form, meaning, the syntactic behaviour of a word, and the word's associations. However, not only was this approach considered to be too theoretical (Nation 2001), but studying the many aspects involved was also perceived as a daunting task and even impracticable (Meara 1996). Consequently, Meara (1996) suggested studying vocabulary knowledge through a small number of manageable and thus measurable features that could reflect the lexicon at a more global level. He proposed two dimensions: size and organisation. While these two dimensions were (and still are) considered the foremost qualities of vocabulary knowledge, Henriksen (1999: 304) added a third, but adopted a different terminology. Meara's "size" was referred to as the "partial-precise knowledge" dimension, while "organisation" was referred to as the "depth of knowledge" dimension. Today, other terms which are used interchangeably in the literature include "breadth" or "vocabulary size" when referring to the first dimension, and "depth", "depth knowledge", "deep word knowledge" or "quality of vocabulary knowledge" when referring to the second dimension. The third dimension Henriksen proposed is known as the "receptive-productive" dimension.
The dimensional approach to vocabulary gained popularity and was adopted by L2/FL scholars as a construct of word knowledge. "Vocabulary size" refers to how many words one knows irrespective of how well they are known (cf. Anderson and Freebody 1981, Henriksen 1999, Meara 1996, Nation 1990, Read 1993, 2000), whereas "depth knowledge" refers to how well a word is known (cf. Greidanus and Nienhuis 2001, Henriksen 1999, Meara 1996, Qian and Schedl 2004, Read 1993, 2000, Vermeer 2001, Wesche and Paribakht 1996, Wolter 2001). "Depth" refers to those associates of a word at the paradigmatic (synonyms or close in meaning, e.g. puppy and dog), analytic (one word being a key word of the dictionary definition of the other, e.g. canine and dog), and syntagmatic (collocations, e.g. own/hunting and dog) levels (Read 1993). The "receptive-productive" distinction is based on the premise that comprehension of a word does not necessarily imply its correct use (Gairns and Redman 1986, Laufer and Paribakht 1998, Van de Poel and Swanepoel 2003, Zareva, Schwanenflugel and Nikolova 2005), and some scholars consider it a bridge between lexical competence and performance (cf. Melka 1997).

Vocabulary size: Threshold and text coverage
Increased research attention given to vocabulary over the past few decades has brought the latter to the forefront of L2/FL teaching. Schmitt (2008: 329), among others, acknowledged that "one thing that students, teachers, materials writers, and researchers can all agree upon is that learning vocabulary is an essential part of mastering a second language". With empirical evidence pointing to a significant relationship between vocabulary size and overall language proficiency, today most L2 and FL practitioners agree that the number of words learners know (vocabulary size) characterises their language proficiency (also note here the link between language proficiency and academic literacy; cf. Van Dyk and Van de Poel 2013). Studies that explored this relationship include Beglar (2010), Meara (1996), Meara and Buxton (1987), Meara and Jones (1988), Nation (1983, 1990), Nation and Beglar (2007), and Schmitt et al. (2001). All of these studies point to the same conclusion: the larger the vocabulary size of L2/FL students, the more proficient the students. In Meara's (1996: 37) terms: "all other things being equal, learners with big vocabularies are more proficient in a wide range of language skills than learners with smaller vocabularies, and there is some evidence to support the view that vocabulary skills make a significant contribution to almost all aspects of L2 proficiency".
Meara's statement emphasises the role a larger vocabulary size plays in the different aspects of overall L2/FL proficiency. Vocabulary size tests developed and used to demonstrate this relationship can serve as placement indicators and can also be used for diagnostic purposes. Determining the words that L2/FL learners need in order to follow lectures, watch a movie, read newspapers, etc. without external support (Nation 2006), and considering this as a learning goal (Schmitt 2010), is a direct pedagogical consequence that arose from this development. Nation (2006) and Schmitt (2008) suggested considering text coverage as a decisive factor in determining the learning goals. Even though scholars still do not have commonly agreed-upon thresholds for reading authentic texts comfortably and following lectures, recent estimates have suggested a minimal threshold of 4,000-5,000 word families (Laufer and Ravenhorst-Kalovski 2010, Nation 2006, Schmitt 2008, 2010, Schmitt and Schmitt 2012). This vocabulary size allows text coverage of about 95% of a running text. Some scholars suggest that students should also master the AWL, in addition to this threshold, in order to meet the demands that higher education may pose (cf. Nation 1990). If text coverage of 98% is the aim, Laufer and Ravenhorst-Kalovski (2010) propose an optimal threshold of 8,000 word families, echoing Nation's (2006) suggestion. Nation (2006: 59) argues that: "[i]f 98% coverage of a text is needed for unassisted comprehension, then a 8,000 to 9,000 word-family vocabulary is needed for comprehension of written text and a vocabulary of 6,000 to 7,000 for spoken text".
This view was supported by Schmitt (2008), according to whom a large vocabulary is necessary in order to function in English: 8,000-9,000 word families for reading and as many as 5,000-7,000 word families for oral comprehension. Both Schmitt (2008) and Nation (2006) made a distinction between written and oral texts and, as can be seen from their respective suggestions, oral text uses more frequent words.
On the basis of the most widely referred to text coverage classification (that of Nation (2006)), Schmitt and Schmitt (2012: 1) revisited the notion of 'frequency' and suggested three word boundaries: high-frequency vocabulary (3,000 words), low-frequency vocabulary (9,000+), and the vocabulary between high-frequency and low-frequency, which they refer to as "mid-frequency vocabulary" for pedagogical purposes. They agreed with Nation's suggestion to teach the high-frequency words explicitly, but raised the boundary to the 3,000-word level instead of the 2,000-word level in Nation's suggestion. They also agreed with Nation (2006) and Schmitt (2008) that the mid-frequency vocabulary is needed for proficient language use, while acknowledging that words from this vocabulary pose a pedagogical challenge that should be addressed. Schmitt (2008) added that the goal of 8,000 words seems to be realistic. This is a view confirmed by Schmitt, Ng and Garras (2011).
Given the general observation that, even at an advanced level, L2 and FL speakers of English usually achieve a vocabulary size of less than 5,000 word families (Waring and Nation 1997), the present study tests the minimal vocabulary size of participants up to the 5,000-word band.
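The notion of text coverage used in this section can be illustrated with a minimal sketch. The function name `text_coverage`, the sample sentence and the set of known words below are hypothetical and serve only as an illustration. The sketch also simplifies matters by matching exact word forms, whereas the thresholds cited above are expressed in word families.

```python
import re

def text_coverage(text, known_words):
    """Return the percentage of running words in `text` found in `known_words`.

    Simplifying assumption: words are matched by exact form, not by word family.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return 100 * known / len(tokens)

# Hypothetical example: four of the five running words are known.
sample = "The students follow the lecture"
print(text_coverage(sample, {"the", "students", "follow"}))  # 80.0
```

On this view, a reader at the 95% coverage level meets roughly one unknown word in every twenty running words, and a reader at the 98% level roughly one in every fifty.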

Academic literacy
The concept of 'academic literacy' has been approached from many different angles over the last couple of decades. Moreover, many scholars have given definitions and outlined what they view as the major components of academic literacy. These definitions were informed by rationales popular at that moment in time and usually addressed limitations in previous attempts to define the concept. Of importance today are those that shaped our understanding of the concept and are still applicable to, and useful in, our contexts. They include the New Literacy Studies movement (including the Study Skills movement), the Academic Literacies movement, and the more linguistically inclined movement (inclusive of studies in English for Academic Purposes with a particular focus on the target language use context, Systemic Functional Linguistics, and Corpus Linguistics). As it is not the purpose of this article to provide an exhaustive definition of academic literacy, readers are referred to Van Dyk and Van de Poel (2013) for an extensive and recent account of what this concept entails. It should be noted, though, that being academically literate is considered nowadays to be more than just being able to read and write. It is about being multiliterate and combining a range of abilities that are conducive to making meaning as well as mediating and negotiating knowledge (Carstens 2012); note again the way in which the AGLA/E modules are structured at North-West University. Becoming multiliterate, through a process of acculturation and integration, enables students to understand and transfer knowledge and skills from one context to another and to move between different discourse communities. We can then broadly define academic literacy as "the knowledge and skills required to communicate and function effectively and efficiently in different academic communities and achieve well-defined academic goals" (Van Dyk and Van de Poel 2013: 47). Implied in this attempt at a definition is the notion that academic literacy has three different dimensions: a social (exchange information), a cognitive (understand, organise and reason about information) and a linguistic (language) dimension.
As previously indicated, due to limited levels of preparedness, assessing academic literacy of first-year students in particular has become imperative at universities. This practice seems to have become a norm worldwide and is meant for placement and diagnostic purposes, especially for at-risk students (Scholtz 2012, Weideman 2006). According to Scholtz (2012), the main tests used to this end in the South African context are:
• The Test of Academic Literacy Levels (TALL) or its Afrikaans counterpart "Toets van Akademiese Geletterdheidsvlakke" (TAG)²;
• The National Benchmark Test (NBT);
• The Placement Test in English for Educational Purposes (PTEEP);
• The Standardised Test for Access and Placement (SATAP);
• The English Literacy Skills Assessment for Higher Education and Training (ELSA Plus);
• The Assessment Access Battery (AAB).
It is worth noting that some of these tests are used on a small scale only. The TALL/TAG and the NBT are the two most frequently used. Note that all of the tests mentioned above are based on roughly the same construct or definition of "academic literacy" (Scholtz 2012, Weideman 2006), which Van Dyk and Weideman (2004a: 10) summarised in the following ten competencies: (i) Understand a range of academic vocabulary in context; (ii) Interpret and use metaphor and idiom, and perceive connotation, word play and ambiguity; (iii) Understand relations between different parts of a text, be aware of the logical development of (an academic) text, via introductions to conclusions, and know how to use language that serves to make the different parts of a text hang together; (iv) Interpret different kinds of text type (genre), and show sensitivity for the meaning that they convey, and the audience that they are aimed at; (v) Interpret, use and produce information presented in graphic or visual format; (vi) Make distinctions between essential and non-essential information, fact and opinion, propositions and arguments; distinguish between cause and effect, classify, categorize and handle data that make comparisons; (vii) See sequence and order, do simple numerical estimations and computations that are relevant to academic information, that allow comparisons to be made, and can be applied for the purposes of an argument; (viii) Know what counts as evidence for an argument, extrapolate from information by making inferences, and apply the information or its implications to other cases than the one at hand; (ix) Understand the communicative function of various ways of expression in academic language (such as defining, providing examples, arguing); and (x) Make meaning (e.g. of an academic text) beyond the level of the sentence.
These competencies are what Butler (2013) calls a "functional oriented definition" as they summarise the abilities students are expected to have in higher education, an observation that echoes Van de Poel and Van Dyk's (2014) overall characterisation of academic literacy as students' "linguistic ability". The above construct of academic literacy inspired the design of the TALL/TAG which "is to a large extent determined and required by the higher education context of South Africa, in which larger numbers of potentially underprepared students have found their way into tertiary studies" (Weideman 2006: 81). As Van der Slik and Weideman (2008) and Van Dyk (2010) put it, the test was proven to be efficient, valid, and highly reliable. For instance, while Van der Slik and Weideman (2008) report an average Cronbach's Alpha of .90 for the 2005 and 2006 data, ICELDA³ (2014) reports an average Alpha of .93. The test is a placement indicator, which allows institutions using it to allocate students "into appropriate academic literacy support courses" (Weideman 2006: 82).

Test of Academic Literacy Levels
At North-West University, the Test of Academic Literacy Levels (TALL) and its Afrikaans counterpart "Toets van Akademiese Geletterdheidsvlakke" (TAG) are administered annually to assess the academic preparedness of first-year students. The tests consist of the following seven sections that are described in Van Dyk and Weideman (2004a, 2004b), which Weideman (2006: 85-86) put in the following terms:
• Section 1: Scrambled text (in which a scrambled paragraph is presented which students have to restore to its original order).
• Section 2: Interpreting graphs and visual information (which tests, among other things, the student's ability to interpret either a graph or a diagram, and to demonstrate a capacity for quantitative literacy [numeracy] related to academic tasks).
• Section 3: Text type. Here the students are presented with a number of sentences or phrases taken from a variety of text types or genres, which they have to match with a list of sentences or phrases from the same text types.
• Section 4: Academic vocabulary. Even though academic vocabulary is tested separately (and fairly traditionally) here, there are also vocabulary questions in some of the other sections.
• Section 5: Understanding texts. This section normally consists of one or more extended reading passage or passages, followed by questions focusing on critically important aspects of the construct, such as distinguishing between essential and non-essential information, or cause and effect, as well as inferencing, sequencing, defining, handling metaphor and idiom, and so forth.
• Section 6: Text editing. This part of the test, which relies on cloze procedure, normally has three sub-sections, though the text continues from the first to the last […]. In the first, a word is omitted, and students have to indicate the place where it is missing. In the second, the place where the missing word has been taken out is indicated, and students have to choose the appropriate word. In the third and final part, students have to indicate both place and missing word.
• Section 7: Writing. This section is used to test the ability of the student to make a short argument, which is normally connected to the theme of the text(s) that the student has read, as well as the topic of the scrambled paragraph and the text edit question. The test therefore contains (academic) information that is potentially useful in completing the writing section.
On the basis of the test scores achieved, students are grouped according to their level of risk, of which there are five, with level 1 being the most at-risk group while level 5 is the group the least, if at all, at risk:
1. Extremely high risk
2. High risk
3. Borderline case (after a second assessment, either identified as At risk, or Low risk)
4. Low risk
5. Low to no risk⁴

The Vocabulary Levels Test
The Vocabulary Levels Test (VLT; Nation 1983, 1990) is a receptive vocabulary test which involves word-definition or definition-word matching. It measures vocabulary size and was designed on the basis of four word-frequency levels, i.e. the 2,000-word, 3,000-word, 5,000-word, and 10,000-word levels, and the University Word List (UWL) that evolved into today's Academic Word List (AWL; Coxhead 2000). At each level, 18 words are randomly selected (proper and compound nouns excluded) and presented with their corresponding definitions in clusters of six words and three definitions. Test-takers are instructed to match a word with its definition. In order to avoid giving clues to test-takers, care is taken so that all the words in the same group belong to the same class. The test has been revised specifically for validation purposes (see Beglar and Hunt 1999, Schmitt et al. 2001).
The VLT is a matching test, the major advantage of which is that it is easy to design (Nation 1983, 2001; Read 2000), take, mark and interpret (Nation 2001). The test indeed provides information about how many words learners know at each level (Schmitt 1994). Today, the VLT is the most widely used vocabulary size test (Ishii and Schmitt 2009, Read 2007, Schmitt et al. 2001). Schmitt et al.'s (2001) VLT, like other tests of this nature, is a receptive test and was used for this study. It is worth noting that it uses 30 items instead of 18 at each of the five frequency bands involved in the test design. Furthermore, two versions of the test exist, with version B being adopted in this study.

Participants
Participants in this study came from different faculties and institutes of North-West University's Potchefstroom Campus⁵ (N = 345). In terms of their study subjects, they constituted a diverse population of first-year students from different fields of study such as commerce, law, engineering, natural sciences, etc. All of them enrolled for both the AGLE 111 and 121 academic literacy modules, both subjects being taught in English in this case. (Note that being enrolled for both the modules is an indication of high risk as measured by the TALL.) Their other subjects are taught either in English or Afrikaans. While English is the L2 for most of the participants, their native languages are mainly Afrikaans and Setswana. The participants sat the VLT at the beginning of the second semester in 2012.

Vocabulary size of first-year students
The first research question addressed in this study concerns the vocabulary size of students in their first year of study, which was estimated up to the 5,000-word band. Laufer and Ravenhorst-Kalovski's (2010) formula was used to this end. Laufer and Ravenhorst-Kalovski (2010: 21) state that the score at the 1,000-word band is the same as that at the 2,000-word band, and that the score at the 4,000-word band is obtained by averaging scores at the 3,000- and 5,000-word bands. They agree with Laufer's (1998) suggestion that the AWL be excluded from the calculation because this list consists of words that belong either to the 4,000- or 5,000-word band. For the total score, they state that, "[s]ince each frequency level has 30 items, the maximum score, which represents knowledge of 5,000 words, would be 30×5 = 150". For instance, in the data analysed for this study, a student who scored 30 at the 2,000-word band, 28 at the 3,000-word band, and 21 at the 5,000-word band would have the following score: 30 + 30 + 28 + 24.5 + 21 = 133.5.⁶ This student's vocabulary size would then be 4,450 word families (133.5 × 5,000 ÷ 150 = 4,450).
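The size estimate described above can be expressed as a short function. This is a minimal sketch of the calculation as it is described in this section; the function and parameter names are our own and are not taken from Laufer and Ravenhorst-Kalovski (2010).

```python
def estimate_vocabulary_size(score_2000, score_3000, score_5000,
                             items_per_band=30, max_words=5000):
    """Estimate vocabulary size (in word families) from VLT band scores."""
    # The 1,000-word band is taken to equal the 2,000-word band score.
    score_1000 = score_2000
    # The 4,000-word band is the mean of the 3,000- and 5,000-word band scores.
    score_4000 = (score_3000 + score_5000) / 2
    total = score_1000 + score_2000 + score_3000 + score_4000 + score_5000
    max_score = items_per_band * 5  # 30 x 5 = 150
    # The AWL score is deliberately left out of the calculation.
    return total * max_words / max_score

# The worked example from the text: 30, 28 and 21 at the three tested bands.
print(estimate_vocabulary_size(30, 28, 21))  # 4450.0
```

A student with full marks at all three tested bands would receive the maximum estimate of 5,000 word families, which is the ceiling of this procedure.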
Results indicate that, on average, the vocabulary size of participants in this study is approximately 4,500 word families.This finding answers the first research question and shows that first-year students have achieved the required vocabulary size for following lectures in English.

The relationship between academic literacy and vocabulary size
The second research question explores the relationship between the vocabulary size of first-year students and their level of academic literacy. In order to answer this question, scores from the TALL and the size estimates from the VLT were compared. The classification established by the TALL scores was used as a baseline onto which the size estimates were mapped. As indicated in section 3.2, the TALL distinguishes between five different "risk" levels. A one-way ANOVA of the size estimate was performed in order to see if the same groups were reflected in the vocabulary size. As Figure 1 shows, the levels identified by the TALL scores⁷ (2-5) are also reflected in the VLT.

Figure 1. Vocabulary size mapped onto TALL levels
Means of size estimates and standard deviations of the different groups identified by the TALL are presented in Table 2. The means are 4,242.11 in level 2; 4,360.53 in level 3; 4,797.51 in level 4; and 4,906.25 in level 5. This clearly indicates that the size increases from one level of academic literacy to the next. The implication is therefore that the closer students are to the Extremely high risk category, the smaller their vocabulary size, while the closer they are to the Low to no risk category, the larger their vocabulary size. This is yet another piece of evidence that could be added to the validation argument of the TALL. The observed differences between groups were found to be statistically significant, as determined by a one-way ANOVA [F(3,269) = 76.01, p < 0.001]. The data were analysed further by running multiple comparison tests with the Scheffé post-hoc test. These comparison tests aimed to complement the one-way ANOVA by showing where significant differences occurred, i.e. which two groups could be considered different in terms of their vocabulary size. The results are presented in Appendix A and summarised in Table 3 below.
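For readers less familiar with the procedure, the F ratio reported by a one-way ANOVA can be computed as sketched below. This is a generic textbook calculation and not the statistical package used for this study; the function name and the toy groups are hypothetical, and the toy values are not the study's data.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(group) / len(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean) ** 2
                    for group, mean in zip(groups, means) for x in group)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Toy data only (three invented groups of three scores each).
f_ratio, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_ratio, 2), df_b, df_w)  # 13.0 2 6
```

With four TALL levels in the comparison, the between-groups degrees of freedom equal 3, which corresponds to the first value in the reported F(3,269).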
Students identified as High risk (risk level 2 according to the TALL, N = 83) and Borderline (risk level 3 according to the TALL, N = 76) were found to belong to the same group as far as their vocabulary size was concerned, while those identified as Low risk (risk level 4 according to the TALL, N = 127) and Low to no risk (risk level 5 according to the TALL, N = 32) were found to belong to another group.This implies that even though Borderline students have a larger vocabulary size than High risk students, the difference between the two groups is not statistically significant.
The same holds for Low to no risk students, who have a larger vocabulary size than Low risk students, yet with a difference that is not statistically significant. What this means is that the vocabulary size of risk levels 2-3 (TALL) does not differ significantly, with significant differences only starting to emerge at level 4. A Pearson correlation between performance on the TALL and the VLT was also performed. As Figure 2 indicates, the correlation is positive and linear, with a correlation coefficient of .513, which is statistically significant at the 0.01 level (p < 0.001, 2-tailed; cf. Appendix B). According to Cohen (1988), the strength of a correlation might be interpreted as either (i) small (.10 to .29), (ii) medium (.30 to .49) or (iii) large (.50 to 1), with higher values indicating a stronger relationship. In this case, the correlation is large, which is another argument in favour of a strong relationship between vocabulary size and academic literacy. These findings indicate that students with higher academic literacy levels also have a larger vocabulary size, therefore answering the second research question from yet another angle. The findings also confirm the validity arguments of both the TALL and the VLT.
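The interpretation of the correlation coefficient can be made explicit with a small sketch. The function names are our own, the paired scores in the usage example are invented for illustration, and the label "negligible" for coefficients below .10 is an assumption, since Cohen's bands as cited above start at .10.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

def cohen_strength(r):
    """Label a correlation coefficient using the bands attributed to Cohen (1988)."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumed label; the cited bands start at .10

print(cohen_strength(0.513))  # large

# Invented paired scores, for illustration only.
print(cohen_strength(pearson_r([40, 55, 60, 72], [3900, 4300, 4500, 4900])))
```

Applied to the coefficient reported here (.513), the classification returns "large", in line with the interpretation given above.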

The relationship between academic literacy and vocabulary size at word bands
In order to gain more insight into the link between vocabulary size and academic literacy (related to the second research question), the relationship between the two was explored further by mapping each frequency word-band score onto TALL levels. The aim was to examine the relationship between vocabulary size and academic literacy at the level of word bands, and to find out which frequency word bands were completely mastered by the students. This was achieved by running a one-way ANOVA at each of the frequency word bands. The mean scores and standard deviations are presented in Table 4, while Table 5 presents the ANOVA results.

Table 4: Mean scores at word bands
As can be seen from Table 4, scores at each word band vary from one level to another, which entails that the same levels identified by the TALL are also identified by the VLT scores. Furthermore, on average, the mean scores achieved out of 30 are satisfactory: we found a mean score of 28.05 at the AWL, 29.03 at the 2,000-word band, 28.20 at the 3,000-word band, and 25.26 at the 5,000-word band. These scores were weighed against Schmitt et al.'s (2001) cut-off point.
According to Schmitt et al. (2001), a vocabulary band is mastered if the score at that band is at least 24 out of 30. Considering the above scores, it appears that a large majority of the participants have achieved the suggested threshold. Note, however, that as many as 11.30% of the students (i.e. 39 students out of 345 participants) did not reach the minimum required score.
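The mastery criterion can be expressed as a simple check. The sketch below is illustrative only: the cut-off of 24 out of 30 and the four mean scores come from the text above, whereas the individual student scores and the helper names `has_mastered` and `share_below` are hypothetical.

```python
MASTERY_CUTOFF = 24  # minimum score out of 30 (Schmitt et al. 2001)


def has_mastered(score, cutoff=MASTERY_CUTOFF):
    """A band counts as mastered when the score reaches the cut-off."""
    return score >= cutoff


def share_below(scores, cutoff=MASTERY_CUTOFF):
    """Return (count, percentage) of scores that fall below the cut-off."""
    below = sum(1 for s in scores if s < cutoff)
    return below, 100 * below / len(scores)


# Mean scores per band as reported in the text.
reported_means = {
    "AWL": 28.05,
    "2,000-word": 29.03,
    "3,000-word": 28.20,
    "5,000-word": 25.26,
}
for band, mean in reported_means.items():
    print(band, "mastered on average:", has_mastered(mean))

# Hypothetical individual scores at one band (not the study's data).
sample_scores = [30, 27, 24, 23, 19, 28, 25, 22]
count, percent = share_below(sample_scores)
print(f"{count} of {len(sample_scores)} below the cut-off ({percent:.1f}%)")
```

The contrast between the two uses is the point of the passage: a mean above the cut-off does not guarantee that every individual student has reached it.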
The number of students below 5,000 word families increases dramatically when each word band is analysed separately. For instance, the score at the 5,000-word band analysed separately shows that as many as 29.85% of the participants (i.e. a total of 103 students out of 345) do not seem to master this frequency band. At the 2,000- and 3,000-word bands respectively, it was found that 1.44% (5 participants) and 3.46% (12 participants) of the students do not seem to have complete mastery of these bands. About 6% of the students (i.e. 21 participants out of 345) were found to be below the threshold at the AWL.
As Table 5 indicates, there is a statistically significant difference at each of the frequency bands [F(3,271) = 24.60, p < 0.001, for the AWL; F(3,271) = 6.05, p = 0.001, for the 2,000-word band; F(3,270) = 32.03, p < 0.001, for the 3,000-word band; F(3,270) = 80.62, p < 0.001, for the 5,000-word band]. This entails that academic literacy level has a significant effect on vocabulary performance at each frequency word band. However, these results do not specify which academic literacy levels differ significantly from one another. To this end, post-hoc comparisons were performed using the Scheffe test, and the results are presented in Appendix C and summarised in Appendix D for each of the frequency word bands in the order AWL, 2,000-word, 3,000-word, and 5,000-word band levels. As the Scheffe test results show, all risk levels (except levels 4 and 5) identified by the TALL performed significantly differently at the AWL and 3,000-word band level, at the 0.05 significance level. The Scheffe test thus classifies the levels in three clearly different groups: group one, High risk; group two, Borderline; and group three, Low risk and Low to no risk. At the 2,000-word band, however, the groups do not seem to perform differently. Overall, two groups are identified, namely group one (High risk and Borderline) and group two (Low risk and Low to no risk). It is worth noting, however, that Low risk students can form part of both groups. At the 5,000-word band level, no significant differences occur between High risk and Borderline students or between Low risk and Low to no risk students, which are the two groups identified at this level.
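The logic of the post-hoc step can be sketched as follows. This is a hedged illustration of a Scheffe-type pairwise comparison, not a reproduction of the analysis reported in Appendix C: the group scores are hypothetical, the function name `scheffe_pairs` is our own, and the critical F value must be supplied by the user from an F table or a statistics package (3.89 is the tabled value for 2 and 12 degrees of freedom at the 0.05 level, which matches the hypothetical example only).

```python
from itertools import combinations


def scheffe_pairs(groups, f_crit):
    """Flag which pairs of groups differ under Scheffe's criterion.

    groups: dict mapping a group name to its list of scores.
    f_crit: critical F for (k - 1, N - k) degrees of freedom, supplied by caller.
    """
    k = len(groups)
    n_total = sum(len(scores) for scores in groups.values())
    means = {name: sum(scores) / len(scores) for name, scores in groups.items()}

    # Pooled within-groups variance (mean square within).
    ss_within = sum(
        (x - means[name]) ** 2 for name, scores in groups.items() for x in scores
    )
    ms_within = ss_within / (n_total - k)

    # A pair differs when its statistic exceeds (k - 1) times the critical F.
    threshold = (k - 1) * f_crit
    results = {}
    for a, b in combinations(groups, 2):
        contrast = (means[a] - means[b]) ** 2
        scale = ms_within * (1 / len(groups[a]) + 1 / len(groups[b]))
        results[(a, b)] = contrast / scale > threshold
    return results


# Hypothetical band scores for three risk levels (not the study's data).
example = {
    "High risk": [14, 15, 16, 15, 15],
    "Borderline": [15, 16, 17, 16, 16],
    "Low risk": [24, 25, 26, 25, 25],
}
for pair, differs in scheffe_pairs(example, f_crit=3.89).items():
    print(pair, "significantly different:", differs)
```

In this hypothetical example the first two groups are not separated while the third stands apart, which is the kind of grouping pattern summarised above for the individual word bands.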
The data were analysed further in order to determine the frequency band(s) which could best predict academic literacy proficiency9. Consequently, a multiple regression analysis was performed to predict academic literacy from the AWL, the 2,000-word, 3,000-word, and 5,000-word bands. Results (see Tables 1 and 2 in Appendix E) indicate that, statistically, these frequency word bands significantly predict academic literacy [F(4,268) = 53.97, p < 0.001; R² = 0.446]. The R² of 0.446 means that performance at the different frequency bands accounts for 44.6% of the variance in academic literacy. Furthermore, three of the four variables (frequency bands) added significantly to the prediction; the 2,000-word band did not (see Table 3 in Appendix E). The frequency bands could be ranked in the following descending order with regard to their predictive power over academic literacy: 5,000-word band, AWL, and 3,000-word band. These findings answer the third research question by indicating that the relationship established between vocabulary size and academic literacy level is preserved at the different word frequency bands, which predict academic literacy to varying degrees. The predictive relationship between vocabulary size and academic literacy also seems to be stronger towards infrequent word bands.
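The link between the reported R² and the overall F statistic can be checked with a short calculation. The sketch below uses the standard identity F = (R²/k) / ((1 - R²)/df_residual) for a regression with k predictors; the function name `f_from_r2` is our own, and the inputs are the rounded values reported in the text.

```python
def f_from_r2(r2, n_predictors, df_residual):
    """Overall regression F statistic implied by R-squared."""
    explained = r2 / n_predictors          # explained variance per predictor
    unexplained = (1 - r2) / df_residual   # residual variance per degree of freedom
    return explained / unexplained


# Values reported in the text: R² = 0.446, four predictors, 268 residual df.
f_value = f_from_r2(0.446, n_predictors=4, df_residual=268)
print(round(f_value, 2))  # about 53.94
```

The result of about 53.94 is in line with the reported F(4,268) = 53.97; the small gap is what one would expect from R² having been rounded to three decimals.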

Discussion
The present study measures the vocabulary size of first-year students at North-West University in comparison with their academic literacy risk levels determined by the TALL, in order to (i) estimate the size of their vocabulary, (ii) test the relationship between their vocabulary size and their academic literacy, and (iii) test this relationship between vocabulary size and academic literacy at each of the word frequency bands and thereby determine which band(s) may best predict academic literacy. The vocabulary size of participants estimated using Laufer and Ravenhorst-Kalovski's (2010) formula shows that, on average, students master the 5,000 most frequent words. Even though they are ESL users, this vocabulary size is large enough to allow them to follow lectures in English (cf. Laufer and Ravenhorst-Kalovski 2010, Nation 2006, Schmitt 2008, 2010, Schmitt and Schmitt 2012).
The relationship between academic literacy and vocabulary size was explored by mapping the size achieved by participants onto the TALL. The levels set by the TALL were also identified in the vocabulary size. This relationship between vocabulary size and academic literacy was also found at each of the frequency word bands, establishing a strong link between vocabulary size and academic literacy. This relationship and the strong correlation10 between the TALL and vocabulary size imply that students with bigger vocabularies are those with a greater chance of being successful in their studies (if one considers academic literacy a reliable and valid predictor of academic success; cf. Van Rooy and Coetzee-Van Rooy 2015, and Van Dyk's forthcoming publication, for reports on this). These results also confirm previous findings that pointed to the conclusion that vocabulary size predicts reading ability (cf. Nation 2006 among others) and potentially all four language skills (Milton and Treffers-Daller 2013), if one desires to interpret language in terms of skills at all. The present study extends this relationship to academic literacy as a whole and confirms the predictive relationship between vocabulary size and overall language proficiency (Beglar 2010, Meara 1996, Meara and Buxton 1987, Meara and Jones 1988). Furthermore, the more infrequent the word band, the stronger the predictive relationship between vocabulary size and academic literacy. This finding lends empirical support to Laufer and Nation (1995) and Meara and Fitzpatrick (2000) in their observation that more proficient students use words from infrequent word bands, which they consider a good discriminator of linguistic proficiency levels.
The vocabulary size achieved by participants in this study is, furthermore, large enough for them to follow lectures in English. However, some students do not seem to have achieved the required vocabulary size, as demonstrated by an in-depth analysis performed to track individual students' vocabulary size. We strongly recommend identifying individual first-year students who could be below this suggested threshold and giving them coaching and encouragement aimed at expanding their vocabulary size. For the word bands which are mastered receptively by the students (i.e. the 2,000-word and 3,000-word bands as well as the AWL), the help students might require should be on a productive rather than a receptive level. We therefore call for productive-oriented teaching of the words, especially those from the AWL. With regard to bands which are more problematic for many students, such as the 5,000-word band, students should be encouraged to read more and expand their vocabulary size and possibly aim at optimal understanding (which requires 8,000 word families). Given that students still struggle even though their vocabulary size seems to be large enough, we are tempted to believe that minimal understanding is not the ideal option. We suggest aiming at optimal understanding and using Nation's graded readers11, which could help in this regard.
It should also be noted that not all risk groups as set by the TALL seem to differ significantly in terms of their vocabulary size. The 5,000-word band, for instance, sets the students in two clearly distinct groups, i.e. High risk and Borderline students belong to one group while Low to no risk and Low risk students belong to another group. The AWL and the 3,000-word band set the students in three groups, i.e. High risk students constitute group one, Borderline students constitute group two, and Low risk and Low to no risk students constitute group three. At the 2,000-word band, only the most proficient group (Low to no risk) performs significantly better than all of the other groups barring the Low risk group. What we learn from this finding is that students from different academic literacy proficiency levels may need different support in terms of vocabulary. We suggest considering these levels when deciding on the vocabulary to provide to students, in the same way as the construct of the TALL is used to inform teaching-learning materials for the course followed by first-year students. This might, however, be unrealistic in the sense that classes consist of students with mixed abilities, which complicates differentiated learning. Nevertheless, additional differentiated teaching-learning materials might be designed in such a way as to promote autonomous learning within the blended teaching-learning approach currently followed at North-West University.

Conclusion
The present study reports on the results of an investigation conducted on first-year students at North-West University which estimated their vocabulary size and the extent to which it relates to their academic literacy. First-year students seem to have an overall vocabulary size large enough to allow comprehension of lectures in an L2. They also master the individual word bands, and the more infrequent word bands seem to "best" predict academic literacy. This implies that students with a high level of academic literacy have larger vocabularies, both general and academic. These results answer the research questions examined in this study, but they also raise other important ones, described below, which are worth considering in further studies.
Firstly, even though the results reported here suggest that students have the required vocabulary size to follow lectures, empirical evidence seems to suggest that using vocabulary, especially collocations, is more problematic among L2/FL students (Laufer and Waldman 2011, Nesselhauf 2005). Given that students' productive use of collocations was found to fall short of expectations (the students were found to master collocations of words from the 2,000-word band only; cf. Nizonkiza, Van Dyk and Louw 2013), we agree with the suggestion to explore the relationship between academic literacy as a whole and productive knowledge of collocations.
Secondly, the results of this study show that students have a large (enough) academic vocabulary. For the same reason as explained above, we suggest comparing the receptive knowledge of academic vocabulary with its productive use. As a follow-up study, we intend to explore the relationship between the academic vocabulary from the VLT and productive knowledge of collocations selected from the AWL. We are of the opinion that productive knowledge of collocations, based on words in the AWL, will also benefit students. We are, accordingly, investigating the possibility of including a collocation component in the AGLA/E modules. Participants will be tested in a pre- and post-experimental design, and we hope to gain more insights into the relationship between the productive knowledge of collocations and academic literacy.

Figure 2. Correlation between academic literacy (measured by the TALL) and vocabulary size

Table 2: Vocabulary size means estimate

Table 3: Size groups as identified by the Scheffe test

Table 5: ANOVA at word bands

Appendix A: Multiple comparisons of size estimates
The mean difference is significant at the 0.05 level.

Appendix B: Correlations between TALL and VLT scores
Correlation is significant at the 0.01 level (2-tailed).

Appendix C: Multiple comparisons of VLT scores at each frequency band
The mean difference is significant at the 0.05 level.