#75, Article: ‘Survey research: How to develop a questionnaire for ESL/EFL research’ by David Ockert

Abstract

Language researchers may want to create their own survey to collect the data they intend to write up and publish. This paper explains the process by reporting the author’s development, piloting, administration, and analysis of a substantive scale survey for research purposes. A substantive scale uses questions and a rating system (e.g. a Likert scale) to gather data for analysis. Each section of this paper is illustrated with the author’s own research project as an example. The survey was designed to determine whether respondents could be classified as intrinsically or extrinsically motivated. Exploratory factor analysis confirmed that the students tended to adhere to one or the other motivational type.

Introduction

To begin a research project survey, there are critical questions that must be answered: How do I formulate a hypothesis? How do I choose the questions, and how many? In what order should they be listed? What type of scale should be used? The most common type is the Likert scale, which generally uses a 1-to-5, 1-to-6, or 1-to-7 numeric system corresponding to a series of answers such as always, sometimes, never, etc. There are more than a dozen different scale types (Alreck & Settle, 2003); however, sometimes a simple Yes/No question will suffice (Stone, 2003).

What are good sources of information to read before designing the survey? Not only are the statistical methods quite complicated and intimidating, but the terminology itself can be difficult to comprehend. Before beginning, have a clear goal. For this research project, I wondered how much research had been done at the tertiary level in Japan on: 1) Japanese student motivation, 2) pedagogical activity preferences, 3) learning strategies, and 4) the relationships between the three. It would have been possible to administer a survey by another author, known as replication, but a brief literature review showed that this is not always the best choice (Dornyei, 2001), and that making one’s own survey to suit the learning environment is perfectly acceptable.

Data analysis options should be decided before developing the survey, since the form the survey data takes will determine what types of analysis are possible (Brown, 2001; Dornyei, 2001).

What to do with research findings? Most teacher/researchers hope to publish their findings in a professional journal. In fact, a proven record of publications on a job applicant’s curriculum vitae can make the difference in landing a job or not (McCasland & Poole, 2004; McCrostie, 2007). There are many journals to choose from on almost any topic.

Terminology and definitions

The terms substantive scale, instrument, and survey are synonymous. An item refers to a question on the survey. The items attempt to measure a construct, a way of thinking that exists within the minds of the participants (Brown, 2001). Since we cannot actually see a construct, we must test for it and attempt to measure it; therefore, items are created to measure the construct.

Once the survey has been administered and the data collated, the accumulated results of the test items are referred to as variables, since they now represent the pooled responses from all the surveys and are analyzed item by item. More simply, a variable is the quantified means to measure an observable characteristic of a phenomenon (Voelker et al., 2001). For example, item number five on a questionnaire administered to one hundred students will yield one hundred responses; these are pooled and analyzed using any of a number of statistical methods, and the result is the value of variable five for all of the participants.
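
The pooling described above can be sketched in a few lines of plain Python. The data below are hypothetical, invented purely for illustration:

```python
# A minimal sketch of how individual item responses become a "variable":
# each respondent answers item five, and the collected answers are then
# summarized as one value for analysis.
from statistics import mean

# Hypothetical Likert responses (1-5) to item five from ten respondents
item5_responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

# One summary value for "variable five" across all respondents
variable5_mean = mean(item5_responses)
print(round(variable5_mean, 2))
```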

Before scale construction: Background reading

First, decide on an area in our field that is interesting. Next, begin reading the relevant literature. The literature review serves three purposes: 1) to see if the research question(s) have already been answered; 2) to provide the reader with the necessary background concepts; and 3) to show how this prior research supports the present research endeavor. If the questions have not already been answered, how has the work done by other researchers led up to them? The survey for this article was designed to test student motivation; therefore, I read publications by Dornyei (2001), Tremblay & Gardner (1995), and Gardner & Lambert (1959). Since their work has primarily been in English as a second language (ESL) environments, and this article’s questionnaire was designed for students in an English as a foreign language (EFL) environment, this was taken into consideration in constructing the instrument.

Research questions and hypotheses

Much research has been reported on the motivation of French Canadian learners of English as a Second Language (ESL) (Gardner & Lambert, 1959; Tremblay & Gardner, 1995; Noels, et al., 2000). Regarding extrinsic, intrinsic, and amotivated (unmotivated) orientations, experts generally accept that they are not categorically different, but rather exist along a continuum of self-determination (Deci & Ryan, 2002; Noels, et al., 2003). Ryan and Connell’s research (in Ryan & Deci, 2000) tested for “different types of motivation, with their distinct properties” to confirm that they do indeed “lie along a continuum of relative autonomy” (p. 73). This theory has tremendous value for educators, since the notion that motivation lies along a continuum and varies depending on circumstances could help us learn about our learners as persons. Intrigued, I wanted to know whether the students in my classes could be divided between an intrinsic and an extrinsic motivational orientation, and I wrote items intended to measure one or the other (Ockert, 2005; 2007).

Questionnaire construction

Questionnaire items

Brainstorming works well to start writing items (Griffee, 1999). Reading the instruments constructed by the authors mentioned previously also made it clearer how statements should be worded. Constructing an instrument to measure a group of learners’ motivational attitudes toward language learning remains difficult; therefore, Stone (2003) offers some rules to keep in mind when choosing questionnaire statements. These include:

1. Avoid factual statements.

2. Do not mix past and present. Present is preferred.

3. Avoid ambiguity.

4. Do not ask questions that everyone will endorse.

5. Keep wording clear and simple.

6. Keep statements short and similar in length.

7. Express only one concept in each item.

8. Avoid compound sentences.

9. Assure that reading difficulty is appropriate.

10. Do not use double negatives.

11. Do not use “and” or “or” or lists of instances. (p. 288)

Furthermore, in Teaching and researching motivation, Dornyei (2001) lists several items from his research, providing a wealth of ideas. Following the advice above, I began writing the survey items (see Appendix).

Next, how many questions are enough to test the hypothesis? Getting specific answers to a set of questions requires only simple statistical analysis (see below). However, a factor analysis groups similar questions together, making it possible to test for relationships between specific variables. The former may be easier, but finding relationships between variables with factor analysis helps create a stronger and more valid instrument after removing items that do not “fit in”. Start with more questions than may be necessary and discard those that do not fit. Working with my M.Ed. professors, the survey statements were selected using the expert rating approach (Brown, 2001: 179-80). The first eight items are intended to measure intrinsic motivation and the latter eight extrinsic motivation.

Survey organization and item selection

How much information should be written on the questionnaire? It’s best to keep the instructions clear, simple, and concise. As mentioned above, the survey in the Appendix has two parts: the first eight items test for an intrinsic motivational orientation and the second eight test for an extrinsic orientation. If the respondents perceive a difference between the two sections and answer differently than they would if the items were arranged randomly, the result is a response bias: the answers given do not reflect the students’ true beliefs because of the wording or ordering of the questions. Therefore, care should be taken to avoid presenting the questions in a manner that has a “pattern”, in order to avoid collecting biased data (Gendall & Hoek, 1990; Lynch, 2007).
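
One simple safeguard is to shuffle the two blocks of items together before printing the questionnaire, so that no intrinsic/extrinsic pattern is visible. A minimal sketch in Python, using placeholder item labels rather than the actual survey statements:

```python
# Shuffle intrinsic and extrinsic items together so respondents cannot
# detect two ordered blocks; a fixed seed keeps the mixed order identical
# on every printed form.
import random

intrinsic = [f"I{i}" for i in range(1, 9)]  # placeholder intrinsic items
extrinsic = [f"E{i}" for i in range(1, 9)]  # placeholder extrinsic items

items = intrinsic + extrinsic
random.seed(42)        # same mixed order for every respondent
random.shuffle(items)  # no visible intrinsic/extrinsic pattern remains
print(items)
```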

Response and rating scale formats

When using a Likert scale, consider what kind of Likert scale will work best. Originally designed by Rensis Likert (1932), this scale usually consists of four, five, six, or seven points. However, there are advantages and disadvantages to not only the number of choices, but also whether or not the number of points is odd or even. For example, the advantage of an evenly numbered scale is that it removes the neutral answer option, which would tell us nothing regarding a positive or negative attitude toward the survey question or statement (Stone, 2003).

It is necessary to view each item from the perspective of a respondent, and the actual responses from the pilot testing stage (see below) need to be examined carefully. If an item is easily answered with a dichotomous option, then that choice should be offered instead. For example, if the answers converge on 1 (never) or 5 (always) of the scale on a specific item, then the other options need not be made available, since they would yield little analytical value. It is crucial that the actual responses be analyzed and not just the average (see mean, below): on a five-point scale, an average of 3 may mask responses that are in fact mostly 1s and 5s, indicating that the middle options 2, 3, and 4 are of no real value. A Yes/No question format would best suit such an item. It’s best not to construct a large number of questions and assume from the start that every item will fit into a standard five- or seven-point scale (Stone, 2003). Careful analysis is required to understand the underlying item data that compose the variable data.
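
The point about averages hiding polarized answers can be illustrated with a short Python sketch on invented data:

```python
# Two hypothetical items share a mean of 3 on a five-point scale, yet one
# is genuinely neutral and the other is polarized between 1s and 5s:
# only the raw distribution reveals the difference.
from statistics import mean
from collections import Counter

item_a = [3, 3, 3, 3, 3, 3]  # genuinely neutral responses
item_b = [1, 5, 1, 5, 1, 5]  # polarized: a Yes/No item in disguise

print(mean(item_a), Counter(item_a))
print(mean(item_b), Counter(item_b))  # same mean, very different spread
```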

Furthermore, Stone (2003) says rating scales should follow a graded response format such as never, sometimes, frequently, always; or none, some, a lot, and all. While this may seem easy, these terms are actually ambiguous: what is the difference in meaning between usually and frequently? Do the terms none and always mean absolutely and without exception? The meaning will differ according to how each participant uses the terms in everyday life. Nevertheless, rating scales can be constructed that solicit information without confusion (Stone, 2003). Finally, depending on the analytical method used to sift through the data, what is the minimum number of respondents necessary for a representative sample? Most experts agree that twenty randomly selected surveys per 1,000 potential respondents is acceptable; Brown (2001: 74) suggests 28-30 as sufficient.

Methods

Pilot testing

The survey should be pilot tested with a smaller sample group before using it for research purposes. Asking native speakers (NS) to review the questions first will ensure that the instrument items make sense to your peers; therefore, ask colleagues to review the questionnaire items beforehand. Any ambiguities in the instructions should be caught during the pilot phase. Researchers may also wonder: Should the survey be translated into the respondents’ mother tongue (L1)? Or should the items be written in both the second language (L2) and the L1 (in this case, English and Japanese)? How can researchers handle issues of low L2 proficiency? Certainly it is a good idea to ask a small representative sample group of non-native speakers (NNS) to check the instrument for clarity. Any problem areas that are difficult to comprehend should be corrected and re-checked (Griffee, 1999).

Students

The students (N=104) who took this survey were members of my Communication I class in a private university in Japan. This means they are a sample of convenience and the results may not be applicable to the general population of Japanese university students (Brown, 2006). Most of the respondents were male, so gender was not taken into consideration when analyzing the results. Participation was voluntary, anonymous and had no influence on student grades.

Administering the instrument

For this project, I decided to administer the Motivation Survey to the students in my communication classes. Teacher bias and external validity (see below) were eliminated as negative influences since the respondents were all my students in the same environment.

Statistical analysis

The Statistical Package for the Social Sciences, version 13 (SPSS v13), can simplify data analysis. There are several statistical procedures available to interpret collected survey data according to your research objectives. Calculating the average (the mean), determining the most frequent response (the mode), and determining the central cut-off point (the median) are commonly used processing methods (Brown, 2001: 119-21). These are the simplest methods of reporting data. Depending on what information the researcher wishes to report, there are more sophisticated procedures such as factor analysis (see below). When reporting the number of students, use (N=???), where N stands for “number”; use SD for standard deviation; use M for only one of mean, mode, or median, clearly indicated, and write the word out for the others (Kachigan, 1991).
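
For those without SPSS, the same descriptive figures can be computed with Python’s standard library; the responses below are hypothetical:

```python
# Descriptive statistics for one survey item: mean (M), median, mode,
# and sample standard deviation (SD), computed with the stdlib.
from statistics import mean, median, mode, stdev

responses = [2, 3, 3, 4, 4, 4, 5]  # hypothetical answers to one item

print("N =", len(responses))
print("M =", round(mean(responses), 2))
print("median =", median(responses))
print("mode =", mode(responses))
print("SD =", round(stdev(responses), 2))
```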

Validity

According to Brown (1988; 1996; 2001) and Nunan (1992), there are several types of validity and ways to test them. We will look at the three most commonly cited types here. Internal validity refers to whether or not the questionnaire is in fact measuring what it claims to measure. External validity refers to whether or not those taking the survey answered the questionnaire under similar conditions. Finally, Brown (2001) explains construct validity as the “degree to which the survey can be shown experimentally to be measuring whatever construct you are trying to measure” (p. 181). This can be tested rather easily with factor analysis (see below).

Reliability

As important as the validity of the instrument is its reliability: does the instrument measure what it purports to measure in a consistent manner at different times (Brown, 1988; 2001; Griffee, 1999)? In other words, do different groups of persons who answer the survey give similar responses? To test the reliability of the instrument, the researcher can use an internal-consistency statistic such as Cronbach’s alpha coefficient of reliability, which is closely related to the split-half method (for more information, see Brown, 2001).
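
Cronbach’s alpha can also be computed by hand. The sketch below uses plain Python and invented data for three items; the formula is alpha = k/(k-1) x (1 - sum of item variances / variance of total scores):

```python
# Cronbach's alpha from first principles, using population variance
# (pvariance) for both the per-item and total-score variances.
from statistics import pvariance

# Rows are respondents; columns are three hypothetical Likert items.
data = [
    [4, 4, 5],
    [3, 3, 3],
    [5, 4, 5],
    [2, 3, 2],
    [4, 5, 4],
]

k = len(data[0])                                   # number of items
items = list(zip(*data))                           # responses grouped per item
item_vars = sum(pvariance(col) for col in items)   # sum of item variances
total_var = pvariance([sum(row) for row in data])  # variance of total scores

alpha = (k / (k - 1)) * (1 - item_vars / total_var)
print(round(alpha, 2))
```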

Factor analysis

In order to find underlying relationships between variables, a multivariate statistical calculation known as factor analysis can be used. Exploratory factor analysis can answer the question, “Will the variables fit together as hypothesized?” (Nunan, 1992; Brown, 2001). Factor analysis organizes the responses into variable groups, and analysis of these groups yields the answer. The survey for this project (see Appendix) was created with the hypothesis that the first eight items measure intrinsic motivation and the second eight measure extrinsic motivation. Ideally, they should “cluster together” in two sets of eight. These “clusters” are referred to as “factors”, and the author gets to name them. The responses clustered together as hypothesized (see Ockert, 2005; 2007).

Conclusions

As Griffee (1999) has noted, validation of a survey instrument requires months if not years before administering it for research results. It is a specialized business and should not be undertaken lightly (Nunan, 1992). However, the dedicated pursuit of an answer to a hypothesis remains a worthy goal and provides the foundation for growth and learning in our field, and statistical analysis can help even those of us who are novices gain a better understanding of language learners (Ockert, 2008). Since getting published remains a vital need for most educators, there is no better time to start than the present.

The author thanks the column editor, Joe Falout, for his professional encouragement and prompt feedback in bringing this manuscript to publication; and David Carlson and Fred Carruth for proofreading the original manuscript.

About David Ockert

Mr. Ockert was born in Michigan, USA. His research interests range from student motivational orientation, learning strategies, and their relationship to specific classroom activities, whether traditional or task-based, to educational system development. He has a B.A. in Political Theory and Constitutional Democracy (PTCD) and East Asian Studies (Japanese) from James Madison College, Michigan State University, and an M.Ed. from Temple University Japan. He can be contacted at davidockert1@gmail.com

For a simple explanation of factor analysis and how it works please visit <http://www.janda.org/workshop/factor%20analysis/factorindex.htm>.

References

Alreck, P.L. & Settle, R.B. (2003). The survey research handbook. New York: McGraw-Hill.

Brown, J.D. (2006). Generalizability from second language research samples. Shiken, 10(2), 24-27.

Brown, J.D. (2001). Using surveys in language programs. Cambridge: Cambridge University Press.

Brown, J.D. (1996). Testing in language programs. Upper Saddle River, New Jersey: Prentice Hall Regents.

Brown, J.D. (1988). Understanding research in second language learning. Cambridge: Cambridge University Press.

Deci, E.L. & Ryan, R.M. (2002). Overview of self-determination theory: An organismic dialectical perspective. In Deci, E.L., & Ryan, R.M. (Eds.), Handbook of self-determination research. New York: University of Rochester Press.

Dornyei, Z. (2001). Teaching and researching motivation. London: Pearson Education.

Gardner, R. C. & Lambert, W. E. (1959). Motivational variables in second language acquisition. Canadian Journal of Psychology, 13, 266-272.

Gendall, P. & Hoek, J. (1990). A question of wording. Marketing Bulletin, 1, 25-36. [Online] Available: http://marketing-bulletin.massey.ac.nz/V1/MB_V1_A5_Gendall.pdf

Griffee, D.T. (1999). Questionnaire construction and classroom research. The Language Teacher, 23(1). [Online] Available: http://www.jalt-publications.org/tlt/articles/1999/01/griffee

Kachigan, S. K. (1991). Multivariate statistical analysis: A conceptual introduction. New York: Radius Press.

Likert, R. (1932). A technique for the measurement of attitude. Archives of Psychology, 140, 1-55.

Lynch, S.M. (2007). Introduction to Bayesian statistics and estimation for social scientists. New York: Springer.

McCasland, P. & Poole, B. (2004). The person, the package, the presentation: Lessons from a recent job hunt. The Language Teacher, 28(10). [Online] Available: <www.jalt-publications.org/tlt/articles/2004/10/mccasland>.

McCrostie, J. (2007). Perishing to publish: An analysis of the academic publishing process. OnCue Journal, 1(1), 74-79. [Online] Available: <jaltcue-sig.org/node/57>.

Noels, K.A., Pelletier, L.G., Clément, R., & Vallerand, R.J. (2000). Why are you learning a second language? Motivational orientations and self-determination theory. Language Learning, 50, 57-85.

Nunan, D. (1992). Research methods in language learning. Cambridge: Cambridge University Press.

Ockert, D. (2008 June). Statistical analysis of an online survey questionnaire of student motivation and classroom activity preferences. Paper presented at the June 22 meeting of the Shinshu Chapter of JALT (Japan Association of Language Teaching), Matsumoto, Japan.

Ockert, D. (2007). Computer assisted learning about language learners’ learning styles. IATEFL CALL Review, 2007(Spring), 17-20.

Ockert, D. (2005). Student motivation and pedagogical activity preferences. [Online] Available: <http://jalt-publications.org/archive/proceedings/2005/E151.pdf>

Pedhazur, E. J. & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. New York: Lawrence Erlbaum.

Ryan, R.M. & Deci, E.L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55(1), 68-78.

Stone, M. (2003). Substantive scale construction. Journal of Applied Measurement, 4(3), 282-297.

Tremblay, P. F. & Gardner, R. C. (1995). Expanding the motivation construct in language learning. Modern Language Journal, 79, 505-520.

Voelker, D., Orton, P., & Adams, S. (2001). CliffsQuickReview Statistics. New York: Wiley.

Appendix

What is your attitude toward learning English? Circle the number of the answer that best matches your opinion:

1 = strongly disagree   2 = disagree   3 = neutral   4 = agree   5 = strongly agree

1) I enjoy studying English. 1 2 3 4 5
2) English is important to me because I want to make friends with foreigners. 1 2 3 4 5
3) English is important to me because I want to study overseas. 1 2 3 4 5
4) English is important to me because I want to read books in English. 1 2 3 4 5
5) Language learning often makes me happy. 1 2 3 4 5
6) Language learning often gives me a feeling of success. 1 2 3 4 5
7) I study English because being able to use English is important to me. 1 2 3 4 5
8) English is important to me because I like English movies or songs. 1 2 3 4 5
9) I study English because it will make my teacher proud of me/ praise me. 1 2 3 4 5
10) I study English because it will make my parents proud of me/ praise me. 1 2 3 4 5
11) I study English because I want to do well on the TOEIC test. 1 2 3 4 5
12) I study English because I want to do well on the TOEFL test. 1 2 3 4 5
13) In the future, English will be helpful/ useful to me. 1 2 3 4 5
14) English is important to me because I might need it later for my job. 1 2 3 4 5
15) I study English because all educated people can use English. 1 2 3 4 5
16) I study English because I must study English. 1 2 3 4 5
