Dilemmas of the datasphere:
Issues impacting information gathering in physical and virtual sites

Michele Knobel (Montclair State University)

Colin Lankshear (University of Ballarat)


Paper presented at the American Education Research Association Annual Meeting,

San Diego, April 15, 2004.



This paper grows out of our experiences during the past decade of undertaking different kinds of qualitative research about out-of-school literacy practices of adolescents and children within physical and virtual settings, as well as of surveying work done by other researchers with similar interests across a similar range of settings. Drawing partly on our own work and partly on the work of others, then, we want to identify and briefly describe what we see as some significant issues associated with collecting data about out-of-school literacy practices. Some of these involve genuine dilemmas, in the sense that risks of one kind or another are involved whichever option we decide to take. Others, however, are more or less ‘self-contained’ issues, in the sense that researchers can avoid or avert unwanted risks or consequences without landing themselves in some other kind of difficulty as a consequence of making a particular decision.

We will draw on a range of recent studies where the researchers involved have experienced and reported issues associated with data collection to get a broad spread of significant issues. These include:

1. In-depth ethnographic case studies of the in-school and out-of-school literacy practices of four adolescents. These case studies focused almost entirely on literacy practices in physical rather than virtual settings (Knobel 1999).

2. A study based on short face-to-face interviews and focused conversations, supported with artifact collection, about stories collaboratively produced by a preschool child and his father using a range of digital technologies and HTML programming language, and which were published on his highly successful website (Lankshear and Knobel 1997).

3. A small study using email interviews and online archival sources to investigate aspects of participation in an online community for buying and selling (netgrrrl (12) and chicoboy26 (32) 2002; Lankshear and Knobel 2003).

4. An ethnomethodological investigation, using membership categorization analysis, of how social order is maintained in internet relay chat rooms through talk (Vallis 2001).

5. A substantial online interview-based study of heavy internet users – including adolescents and young adults – focusing on their experiences of cyberspace (Markham 1998).

6. An in-depth ethnographic study of literacy practices among Grades 8 and 9 youth in on- and offline settings (Leander 2003a; Leander and McKim 2003), which aims to enhance understanding of (a) the use of literacy practices for identity and social networks, (b) the ‘situatedness’ of literacy, or in other terms, how to understand literacy in relation to space-time, and (c) reflexively, new methodologies for conducting research across online and offline contexts by tracing how the participants use a range of new ICTs. These include instant messaging, chat, e-mail, searching the Internet and building web sites, and gaming.

The authors of these and similar studies describe a range of issues associated with data collection that can usefully be handled under four broad types of concern:

Ethical issues

Ethical issues and dilemmas are concerned with consequences for the good or harm of human beings within areas of human activity where people can reasonably be assigned rights and obligations (cf. May 1995, Warnock 1970). The things we do and say, believe and pursue - or refrain from saying, doing and pursuing - have, ‘intentionally or not, consciously or not, and in tandem with others’ beliefs [actions, behaviors, pursuits, omissions, etc.] and the institutions in our society, effects on other people’ (Gee 1993: 292, our italics). Effects can be described as harmful to the extent that they deprive others of ‘goods’ (ibid.), and beneficial to the extent that they endow others with ‘goods’. Goods range over such things as health, dignity, status, economic resources, power, esteem, pleasure, material possessions, security, integrity, and so on.

Collecting data from other people falls within the range of things humans do that (can) have consequences for human good and harm. Since ethical conduct aims to maximize beneficial effects and minimize harmful effects, both the negative and positive ‘sides’ to ethical issues are relevant to considerations of data collection. Notwithstanding this fact, the explicit ‘ethical considerations’ dimension of research conducted under institutional auspices is often confined to minimizing risk of harmful consequences. Furthermore, a number of researchers have noted that ‘ethical research practices’ are often designed and imposed more to protect universities from litigation than to seriously address considerations of participants’ wellbeing. For example, obtaining written consent from participants is not an automatic guarantee that the study will be ethical; indeed, ‘consent forms have become like "rental car contracts" (Hilts 1995)’, aimed at protecting the company but not necessarily engaging with the possible moral consequences of participating in a study (Denzin 1997: 288).

Collecting data about out-of-school literacy practices may run into one or more of several significant issues with respect to implications or consequences for the wellbeing of people associated more or less directly with a study. Some of these are associated particularly with data collection in physical sites; others with data collection in virtual sites; and some pertain to both kinds of sites.

The diverse ethical issues associated with data collection in studies of out-of-school literacy practices include the following examples of ‘negative’ and ‘positive’ considerations.


(a) ‘Negative’ – avoiding harm – considerations

All research involving participation by human subjects, whether conducted in physical settings, virtual settings, or both, is inescapably intrusive to a greater or lesser extent. Participants’ routines and, indeed, to some extent their lives, are changed from what they otherwise would have been by the researcher’s intrusion. Obtaining informed consent does not obviate intrusion; it simply gives researchers permission to intrude. Intrusion may be minimal, as when a sensitive and ‘skilled’ researcher conducts observations almost unnoticed in a community setting where a child or adolescent is involved outside of school hours. Even here, however, social dynamics are impacted to some extent. The degree and kind of intrusion are, of course, greatly amplified when data collection involves interviewing, surveying, measuring, artifact collection, and so on.

While it is impossible to eliminate intrusion altogether, researchers should aim to minimize intrusion by avoiding impositions that do not contribute productively to their research purposes. ‘Minimizing’ needs to be understood carefully in this context. It involves a balancing of costs and benefits, and does not equate with ‘the least possible amount’. For example, too little intrusion (an odd notion at first blush) may result in a genuine subtraction of value from the project. This would be the case when a marginal increase in intrusion would have generated a considerable increase in significant findings. Making such judgements in the field is difficult, and the concept can easily become interest-serving. Capacity to make good judgements generally improves with research experience. At all times, however, it is better to err on the side of ‘under-intruding’.

Intrusion has several dimensions. It is not just the physical ‘taking up of space’ or ‘being in the way’. It can be ‘non-spatial’, as when data collection pries into areas where it does not need, or has no right, to go. Researchers need to keep their data collection purposes very clear, and be vigilantly honest about the information they need for addressing their question well. Our own preference for a rule of thumb here is: ‘If in doubt, don’t ask (gather, observe, etc.) it’ – especially if the data in question are in a recognized area of sensitivity (e.g., religious, political, sexual, cultural, ethnic areas, and so on).

Intrusion can also be ‘temporal’, in the sense that whether or not ‘being there’ is experienced as an (undue/unwanted) intrusion may be a function of the researcher’s ‘timing’ – such as, ‘now is not good, but the same thing later would be fine’. This is why it is important to establish agreed times and places with participants and to stick to them punctually unless participants initiate changes. Even then, however, an agreed time and place could turn out to be intrusive because of contingencies. It is important that participants genuinely feel comfortable about negotiating changes. Often they will not. Hence, it is important that researchers strive to be as alert as possible to signals of inconvenience or unease and, if in doubt, to ask participants whether they are uncomfortable about where things are going and/or whether another time/place would suit them better.

The explosion of participation by children and adolescents in diverse social practices within ‘virtual’ environments greatly enlarges the potential scope for researchers to investigate out-of-school literacy practices, and the ease with which this might be done. At the same time, this expansion and ease is associated with ethical issues and pitfalls to which researchers are less readily and less obviously prone – although by no means immune – when they are researching in physical spaces. Commentators widely acknowledge the ease with which it is possible to participate fully within a virtual world without alerting others to one’s research status and intentions (Knobel 2003; Leander and McKim 2003). Furthermore, it is possible for researchers to go to what are seemingly ‘thoroughly public’ domains, such as discussion lists, chat and user group archives, and the like, and simply download data for analysis, without the original ‘message posters’ being aware of this, or of the use to which their words might be put. For such reasons, research in cyberspace requires researchers to be especially alert and sensitive to issues of consent and disclosure. Indeed, some investigators of online practices insist that obtaining informed consent from participants is an inalienable researcher responsibility. Amy Bruckman (2001), for example, proposes that consent should be given via e-mail if the participant is over 18, but that signed parental consent needs to be mailed or faxed to the researcher if participants are less than 18 years old. However, her position presupposes that it is possible to ascertain beyond doubt that the person targeted as a participant in a study is indeed aged 18 years or more. The internet is, of course, rife with children masquerading or avatar-ing as adults and vice versa. Bruckman also argues that if consent cannot be obtained readily, then the researcher should either change the study’s design, or abandon the project altogether.

Other researchers suggest studying the online practices of young people to whom one has in-person access and can obtain their written consent in a face-to-face mode before observing their interactions online (e.g. via strategically placed video cameras, tracking and recording software; see, for example, Leander 2003b). Kevin Leander’s study of adolescents and their online practices involves observing them in person in meatspace while the participants are communicating online, as well as conducting online participant observations of their public and semi-public online group interactions (e.g. in chatspaces). In the online participant observations, Leander and his colleagues obtain signed consent from their target participants, and from friends with whom the participants interact online and offline. In addition, in those cases where ‘e-friends’ (those people key participants know only online) participate in ways deemed important to the project, they are contacted online and asked to sign post hoc research consent forms that give permission to Leander and his research team to use their postings as data (Leander, 2003b; see also Leander and McKim 2003).

Opinions vary about what counts as appropriate conduct around such issues as engaging in observation as an invisible ‘lurker’ or, even, as a participant in the practice who has not openly declared a research interest (Knobel 2003). Views differ as well about the acceptability of simply appropriating online conversation as data for analysis.

In most cases, arguments over whether informed consent should or should not be obtained from participants in an online community boil down to arguments over which online spaces are public and which are private. Within the humanities, most people agree that research conducted within public spaces (e.g. parks, shopping centres, in the street) does not require the researcher to obtain informed consent from all observed participants (cf. Goffman 1963, 1974). However, few researchers of online practices appear to agree on what criteria should be brought to bear on a space in order to judge it ‘public’ or ‘private’. Some argue that the publicness or privateness of an online space should be judged according to how it is perceived by the people who interact within it. Allison Cavanagh, for example, points out that public space metaphors abound online – village, cafe, town hall, town square – and indicate the non-private status of these different spaces (1999: 1). Cavanagh also points out that ‘lurkers’, or non-contributors to online interactions, are tolerated, if not expected or assumed, in online communities or discussion groups. She observes that when lurkers change their status to more active participation, they generally are welcomed warmly by the community or group. Cavanagh attributes this to a shared cultural assumption about life online that ‘internet interactions occur within a public arena and are therefore matters for public consumption’ (3).

Some researchers suggest the best response to the public–private dilemma is to create purpose-built research spaces online, such as a room in a MOO or an e-mail discussion list established explicitly for collecting interactional data, with the purpose of the room written into its publicly available description (Bruckman 2001). Other strategies include setting up websites and the like that signal the researcher’s status and to which participants in an online community can be directed. Other researchers, however, call for the physical nature of the space to be taken into account when judging whether an interaction is public or private (Frankel and Siang 1999). For example, password-protected communities – such as some online cafes and salons – are generally assumed to be private spaces. On the other hand, archivable discussions – such as those generated on web-based discussion boards – are generally presumed to be public spaces (cf. AOIR 2001). However, these distinctions do not always hold, and it is the responsibility of the researcher to make reasoned judgements concerning the nature of the space.

In our own study of the eBay ‘feedback’ discussion boards, we treated the spaces as public (see netgrrrl (12) and chicoboy26 (32) 2002; Lankshear and Knobel 2003). Many posters’ messages to this discussion board are clearly directed to a wide and anonymous audience, and include, for example, unsolicited open letters to newcomers about how to participate effectively in eBay transactions, general calls for comments or advice on a problem encountered within a transaction, and so on. However, we obtained written consent from those eBayers we contacted for email-based interviews.

Barbara Sharf (1999: 254) describes one commonly held view when she suggests researchers should (i) introduce themselves clearly to online groups or individuals who are the intended focus of study with respect to their identities, roles and purposes; (ii) make concerted efforts to contact directly and obtain consent from individuals who have posted messages they want to use as data; and (iii) ‘seek ways to maintain an openness to feedback from the e-mail participants who are being studied’. Sharf’s ‘optimal’ position was partly influenced by the fact that she was investigating participation in a discussion list established for breast cancer sufferers. Hence, potential for harmful consequences existed of kinds that are not necessarily present in cases where the object of research is out-of-school literacies. At the same time, there is obviously plenty of scope for harm of different kinds to arise from research into young people’s online practices. In the end, it seems that cases need to be dealt with individually, on their own merits.

In her study of how order is maintained via talk within internet relay chat rooms, Rhyll Vallis (2001) adopted and defended a different position from that advocated by Sharf. At the same time, as she notes, formal procedures for ethical clearance from a university provided safeguards that seemed quite adequate for the contexts she was investigating. Notwithstanding the fact that the use of aliases offers considerable protection to chat room participants, and the fact that participants are aware that their talk is public and recordable, Vallis promised to delete all port addresses (a number which identifies each computer logged onto the system and which is logged by the chat room server) from her data, both in publications and daily from her hard drive. With respect to the debate over whether it is ethical to record chat room proceedings without participants’ knowledge, Vallis judged that the chat rooms she was investigating did not deal with sensitive topics, and that the nature of the talk involved entailed insignificant risk of harm to participants.

Seeking, obtaining, documenting, and honoring informed consent are indissolubly related to issues of trust and honesty, and entail researcher obligations to protect the privacy and respect the dignity of every study participant and of others whose interests are impacted through collecting data. In various studies in which we have been involved, we have been told of nervous breakdowns, childhood traumas, family secrets and the like that have not been prefaced as ‘secrets’ in the data collection. This raises questions about what is to be done subsequently with such information, especially if it has significant import for interpretations and findings. To include it in reported work risks exposing participants; to leave it out may compromise rigor; to negotiate it might risk pain or intrusiveness, or eat into time participants had not reckoned on having to give when they consented to be researched. Consequently, at the point of data collection itself the question arises as to what the researcher should do when interviews or conversations start to go down such a line.

This can present a genuine dilemma for the researcher, especially when considerations of reciprocity are taken into account. For the researcher, data collection is about obtaining information pertinent to addressing a question and/or problem. From the interviewee’s perspective, however, ‘talking’ with a person they may see as potentially having knowledge about or insights into some matter of urgent concern to them or, perhaps, simply being someone they have come to trust and who may offer a sympathetic ‘ear’ to the airing of such concerns, can be a welcome ‘opportunity’. On one side of the researcher’s dilemma are the legitimate issues about where the decision to open oneself up to such lines of ‘data’ may subsequently lead, and the risks associated with going there. On the other side is the fact that if she or he resists going into such territory she or he may withhold from the participant a potentially fruitful and meaningful opportunity to practice the ethics of reciprocity (see below).

(b) ‘Positive’ – promoting good – considerations

Maintaining trust involves reciprocity during and after data collection. Within qualitative research, reciprocity is often best enacted through exchange of ‘favors and commitments’ that ‘build a sense of mutual identification’ (Glazer 1982: 50, cited in Glesne and Peshkin 1992: 122; see also Lather 1991: 60). This is not an easy task. On the one hand, there seem to be few things a researcher can offer that can even begin to repay the generosity of participants who open their homes and recreational spaces to observations and inventories; or who endure seemingly interminable questions about processes, rituals, habits, and other practices in the course of investigating aspects of learners’ out-of-school lives. In our own research we aim partially to enact reciprocity by recording all of participants’ actions and utterances diligently and meticulously – taking special pains to do so when we do not agree with their views or actions – and respecting their reasons for acting and speaking as they do. Reciprocity also includes completing seemingly mundane – but often appreciated – tasks, such as lending a hand with drying the dishes, chopping vegetables, child minding, acting as a sounding board for ideas, actively listening as a participant talks through a problem he is facing, writing referee statements, conducting short professional development workshops, and so on (see also Glesne and Peshkin 1992). Here again, these are not simply individual acts whose enactment or omission results in more or less direct or immediate effects, but are also simultaneously part of what our construction of qualitative research lends to the moral character of human life as a whole.

The kinds of reciprocity that online researchers can offer study participants include helping them with some online task such as writing ‘bot’ programs (e.g. a small program that acts as butler in a MOO room, welcoming people as they enter the door), helping solve HTML dilemmas encountered in setting up a personal website or a personal profile page within an online community, offering lists of URLs for relevant information on a topic or issue needed by a participant, and suchlike. In our eBay research, we only interview people we have met either face to face or from whom we have bought something. Some researchers could regard this approach as problematic, in that it limits the scope of interviews, and in that having purchased something from someone risks making them feel obliged to respond to interview questions via e-mail. However, researchers do not have an inalienable right to expect people to want to be researched for nothing in return. In many ways, the reciprocity factor reminds the researcher to appreciate the time and effort outlaid by each participant in responding to questions, agreeing to be observed while using a computer or the internet, and suchlike.
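To make the first kind of reciprocity concrete, the ‘butler bot’ mentioned above can be sketched in a few lines. The following Python fragment is purely our own hypothetical illustration – the event format and the welcome message are invented, and not the protocol of any particular MOO system:

```python
# Minimal sketch of a 'greeter bot' of the kind described above: it watches
# a stream of room events and issues a welcome whenever someone enters.
# The event string format ("<name> enters the room.") is a hypothetical
# example for illustration only.

ENTER_SUFFIX = " enters the room."

def greet_new_arrivals(events):
    """Return the bot's responses to a sequence of room-event strings."""
    responses = []
    for event in events:
        if event.endswith(ENTER_SUFFIX):
            name = event[: -len(ENTER_SUFFIX)]
            responses.append(f"Welcome, {name}! Make yourself at home.")
    return responses

if __name__ == "__main__":
    log = [
        "netgrrrl enters the room.",
        "netgrrrl says: hi all",
        "chicoboy26 enters the room.",
    ]
    for line in greet_new_arrivals(log):
        print(line)
```

A real MOO bot would, of course, hook into the server's event system rather than read a list, but the underlying logic – matching an 'arrival' event and emitting a greeting – is of this order of simplicity, which is what makes it a feasible favor for a researcher to offer.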

To complicate matters, however, the very nature of the sites of data collection for studies of out-of-school practices – homes, clubs, churches and so on – can present issues and dilemmas to conscientious researchers who want to practice reciprocity. These include running the risk of getting in the way, of unwittingly offending people, and so on, in one’s efforts to reciprocate. One’s best intentions might fall over badly where shortfalls of ‘sensitivity’ (on either ‘side’) result in participants feeling ‘inadequate’ as a result of what one gives or does in attempts to reciprocate.

Institutional issues

These are issues associated with the institutional affiliations of either or both the researcher and the research subjects. Institutional issues can arise with respect to such things as funding arrangements, research timelines in relation to the routines and rhythms of an institution, academic requirements and criteria, approval to conduct research, institutional policy implications for research, and so on.

Kevin Leander (in Lankshear and Leander in press) describes two institutional issues arising with respect to data collection at the outset of his study of youth networks in online and offline spaces. First,

It took several months to move this study through our Human Subjects Research Board at the university, which included having to educate the board on ICTs and the possibility of [conducting] research online. Some participants on the … Board were wrestling with the idea of whether or not this kind of research should be and could be done.

A second institutional issue impinging on data collection emerged as soon as Kevin and his colleagues began looking for appropriate sites.

The largest urban school district in our area is rejecting research proposals that do not line up with their current goals of improving standardized test scores. While I went to lengths to convince the administration of the value of the work, and even charted how the study would support their testing goals (feeling much like a sell-out), the proposal was rejected, and I had to move to a more distant and less socioculturally diverse school district.

A third institutional issue – which also has ethical and logistical facets – has arisen as the researchers seek to expand the range of participants in accordance with the project aim to investigate youth networks. This issue, which was still being resolved at the time we were informed about it, involves ‘enrolling online and offline friends in the study so as to better trace social networks’:

We have had some success in that some of our original key participants are friends, but we have not moved far in truly mapping online and offline social relations in a way that can be institutionally authorized by gaining consent from every participant. As such, in many cases what we have is one-sided ‘authorized’ data in interactions. A good part of this problem was my sense as a researcher that we were pushing our participants already in terms of their involvement (being in their classes and out, online and offline) and that we did not want any of their social relationships to feel under pressure from their involvements with us. We continue to work on this issue and think about it during the follow-up stage and as we establish more history with the key participants (Kevin Leander, in Lankshear and Leander in press).

This third issue constitutes a genuine dilemma for the researchers, compounded by the fact that the study has been funded by a major foundation, whose interests would be best served if data could be collected from more rather than less extensive social networks. At the same time, the researchers are ‘pinched’ between the institutional norms for ‘authorised’ data – which presupposes consent from all informants – and their own ethical concern not to impose still further on participants by putting undue pressure on their social relationships. There is no obvious way to resolve this dilemma, which applies to research in online and offline environments alike.

Logistical issues

Logistical issues of data collection are concerned with how to organize the complex tasks of research in a way that allows the research purposes to be met effectively and successfully. As such, logistical issues involve juggling competing demands, negotiating compromises, meeting the unexpected, and so on.

In her doctoral thesis, Life Online, Annette Markham (1998: Ch 2) describes the gulf that emerged between her initial expectations and plans for collecting data over a three-week period using online interviews – and the assumptions underlying these – and the ‘reality’ of data collection once the process got underway. Markham describes the adjustments she had to make when it became apparent that collecting data via online interviews differed on almost every dimension from her initial expectations.

One of the problems we have run into in our own research has been the vagaries of email as a research tool. On the one hand, using email to interview study participants has meant we have been able to contact a rich range of participants from a number of different countries, which has greatly added to the depth and breadth of our data. On the other hand, however, we have also found that email interview responses are often short to the point of being cryptic. For example, in our case study of Alex, the (then) preschool child whose web site ‘Alex’s Scribbles: Koala trouble’ and its stories about Max the koala and his friends (Lankshear and Knobel 1997) notched up in excess of 10 million ‘hits’ over a three-year period, we conducted a follow-up interview with Alex’s father five years after we had collected our data for the original study. We had come to know Alex’s father – a webpage designer – quite well during the case study and knew him to be a voluble and articulate man. We had every expectation that his replies would be detailed and lengthy. This was not the case, however, and the following responses from two different email interviews were fairly typical:

5. What is Alex currently doing in school with new technologies, as far as you know?

Alex always gains top marks in the IT area. He is well ahead of the rest of his class through his natural interest in discovering and exploring all Windows applications on the computer.

8. What kind of graphics does he create [asked in response to a comment made by Scott earlier]? Does he create them directly on the internet, or on his computer and then uploads them?

Alex creates graphics using computer packages - he particularly enjoys using clip art and cutting and pasting those images into an overall scene... he is getting very good at that.

This pattern of brief responses in email interviews may be due to a range of reasons. The questions themselves may not encourage elaborated answers (a problem that besets in-person interviews, too). The nature of email itself promotes brevity and on-the-fly communication, and its asynchronous nature may make the task of responding onerous for interviewees. It may well be that online interviewing is much more effective when conducted as a conversation – using Internet Relay Chat channels or MOOs, as Markham did, or instant messaging software that allows synchronous discussion to take place – so that the researcher can prompt interviewees to respond more fully to each question.

In her study of how order is maintained through ‘talk’ in internet relay chat rooms, Rhyll Vallis (2001) faced a number of logistical issues associated with ‘turns’ in conversation. At the beginning of the study she planned to use mainly Conversation Analysis (CA) techniques and procedures. Her data collection approach was necessarily confined, however, to downloading text-based chat room logs. This approach inescapably generates uncertainties with respect to the turns in conversation.

First, there is no way of knowing when extra turns (taken via private messaging between two or more people) are occurring. Furthermore, it is not always possible to establish from log transcriptions that non-adjacent turns are pairs. This makes the application of CA difficult and, in the event, forced Vallis to use Membership Categorisation Analysis rather than Conversation Analysis: ‘Because up to 300 people can (and do) occupy a single chat room at once, even participants themselves may have trouble identifying turn pairs and just who is talking to whom. As a result of these difficulties, I decided to focus more on the use of categories and only used CA to help validate the claims I made using MCA’ (email communication with authors).

Second, Vallis had to take into account two timing considerations unique to chat rooms. She says:

Because the chat room setting is quasi-synchronous (neither completely synchronous like spoken conversation nor completely asynchronous like email or text messaging), I felt it important to be able to record the timing of turns in seconds in order to be able to have a fairly accurate record of the timing of turns. However, because the MIRC chat room software only allows users to record the timing of turns in hours and minutes, I had to have a special small software program (a script) written for me to enable me to record the timing of turns in seconds.

The other issue which arose was that participants in chat rooms could be connected to the chat room via different chat servers, and the sequence of timing of turns on their screens would thus differ (although only marginally) from the sequence recorded on my computer. In the end I decided that since possible slight differences in the sequentiality of the talk made little difference to how participants established the meaning of talk, it made little difference to my own interpretation, which relied on the same methods as participants’ own for making sense of turns. In cases where differences in the sequentiality of talk between chat room participants became pronounced, participants always drew attention to this in their talk, and used a software command called a ‘ping’ to measure the difference, so that I had a record of large differences in sequentiality between participants in the data (personal communication).
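The kind of per-second timestamping Vallis commissioned can be sketched as follows. This is a hypothetical illustration in Python, not her actual mIRC script: each incoming chat line is stamped to the second before logging, so that turns arriving within the same minute remain distinguishable.

```python
import datetime

def timestamp_turn(line, now=None):
    """Prefix a chat 'turn' with an HH:MM:SS timestamp.

    Chat clients of the period typically logged only hours and minutes;
    recording seconds lets the analyst order quasi-synchronous turns
    more precisely.
    """
    now = now or datetime.datetime.now()
    return f"[{now:%H:%M:%S}] {line}"

# Two turns arriving within the same logged minute stay distinguishable:
t1 = datetime.datetime(2001, 5, 4, 14, 30, 7)
t2 = datetime.datetime(2001, 5, 4, 14, 30, 22)
print(timestamp_turn("<jo> hi all", t1))   # [14:30:07] <jo> hi all
print(timestamp_turn("<kim> hey jo", t2))  # [14:30:22] <kim> hey jo
```

As the quotation above notes, timestamps on the researcher's machine can still diverge slightly from what each participant's own server delivered; a script of this kind fixes the record only from the logging computer's point of view.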

Issues of a different order arose in our small-scale study of the ratings and reputation system in place on Plastic.com. Plastic—whose tagline reads, "Recycling the Web in real time"—is an online forum devoted to posting the "best content all over the Web for discussion" (Plastic 2002: 1). People who posted to this forum fell into two sharply-defined categories: those who had registered as members with Plastic and those who had not. Those who had registered were able to select an alias or screen name that would be displayed at the start of each post. The rest were given a default screen name, "Anonymous Idiot", which likewise appeared at the start of their posts. This made it impossible to differentiate between postings made by different non-members, and we were forced to pay more attention to members’ postings as we traced posters’ contribution patterns across time (cf. Lankshear and Knobel 2003). Even focusing on members’ postings proved a nebulous task—it transpired in one forum discussion that some members used two or more aliases (e.g., one for posting news items and one for commenting on others’ posted news items). In the end, we were forced to ‘read’ comments and postings as any user might, and had to give up on trying to explore individual patterns of posts and comments in any depth.

Issues involved in verifying one’s data from online environments: Is the data credible and trustworthy?

Deciding whether or not the data researchers collect is trustworthy involves having grounds for believing that what their informants told them is as ‘bona fide’ as possible. The point here is not so much that respondents can deliberately give researchers false information – although that can happen. It is more that respondents may give us information that is authentic in terms of how they see themselves, but different from how other people see the situation. Alternatively, the way a respondent is thinking or feeling at the time may prompt a response different from how they would have responded at a different time, under different conditions, or in a different context.

Consequently, it can be very useful for qualitative researchers to think about developing and implementing their data collection tools in ways that build in opportunities to check for consistency between responses, as a way of gauging how far they might have confidence in the data provided. It might be a matter of asking another person a question that provides a kind of check on what a respondent has said. For example, a 13-year-old boy might say that he does not read very much at home, and that he only spends 15 minutes a night on homework if he does it at all. He might genuinely believe this information is correct because it is integral to the image he has constructed for himself as a boy with a particular kind of identity (not a ‘scholarly’ boy; but rather, a ‘tough’ boy, a ‘cool’ boy). Yet, if one were to ask his mother, she might say that he reads quite a lot out of school, and that he typically spends at least 45 minutes a night on homework. This is a form of ‘triangulating’ data. The higher the level of similarity between what an informant tells us and what someone who knows the circumstances of the informant tells us about the same thing, the stronger the researcher’s grounds are for believing the data is trustworthy.

Evaluating the trustworthiness of data is not a matter of surveilling responses. It is a way of maximizing one’s chances of obtaining data one can trust by getting different perspectives on a response. Asking another informant who is close to the original informant is one kind of procedure, and a very simple one. Another procedure is to ask an informant similar things at different points in time (later in the same interview, in a subsequent interview, or casually in a conversation) and/or from slightly different angles or perspectives, and then to compare the data provided (another kind of ‘triangulation’ check).

Two brief examples from Kevin Leander’s study indicate what is at stake here, even when researchers can physically observe online social practices. He reports that

… one of our participants at the wireless network school was willing to participate in the study, including home visits and the like, but she would not allow us access to certain parts of her online world. In particular, this participant (‘Angie’) was continually involved in playing some sort of online game, and likely involved in other online interactions during the school day. While we continually observed her participating in this activity (during classroom observations), and ‘mode-switching’ between gaming and coursework, she would not discuss this with us. Rather, Angie seemed set on presenting more of a student persona about her work in school. In somewhat similar vein, ‘Brian,’ who is an intense gamer at home, allowed us access to his gaming (but little to his IMing), and offered us a somewhat sanitized version of his online life. My hunch about this stems from bits of interaction captured by the screen capture software that were of an entirely different discourse than the interactions that Brian made available to us. For instance, Brian nearly never would curse online (or offline) while being observed but would use screen passwords something like ‘Ufuckoff’ (Leander in Lankshear and Leander in press).

At a different level of interest in verifiability, in some of our own work – e.g., the small-scale studies of eBay participants (netgrrrl Ü (12) and chicoboy26 (32) 2002; Lankshear 2003), and of Alex, mentioned earlier – we have collected data which it has been impossible to verify with respect to particular details. For example, in our study of ratings on eBay, one of our email interview informants told us of being prepared to be ‘duped’ by buyers rather than run the risk of receiving negative feedback that would lower their ratings. Bea1997 told us:

Sometimes I lose money from customers who break an item and ask for money back. I just don’t want to risk having my good reputation ruined for a few lousy bucks so I just take the blame and send their money back (e-mail interview, 25/09/2000).

While we were unable (and unwilling – since it would not be appropriate) to pursue corroboration of this claim at a level that might be convincing in each case, we noted that Bea1997’s claim tallies well with what other researchers report. Erik Sherman (2001: 63) reports that

Both buyers and sellers get burned from time to time, but usually not badly. Shamus remembers someone who bought a $25 trading card from him on eBay then returned it, but with a corner newly bent. "He said, ‘That’s what you sent me’ ", says Shamus, who didn’t argue because the amount was too small and negative feedback would hurt his future sales.

Such corroborating accounts lend ‘the ring of authenticity’ to data like that reported by Bea1997, in the light of which we felt justified in accepting its credibility and trustworthiness.

In the case of Alex, it was impossible to gauge the precise role he played in the overall production of the different stories – such as the extent to which the text rendered his storylines as distinct from faithfully reporting his ‘own’ words. We did know that he could not write the stories himself in the early stages of his website, and we knew he did not mark up the webpages himself. At the same time, we were interested in understanding as well as we could ‘who Alex was’ within the overall ‘look and feel’ and production of ‘his’ site. Although we could not be there when his moments of inspiration came, it was easy enough to get a sense of his narrative imagination and to relate this to published storylines by talking with him about the characters and the tales. Alex would often ad lib potential storylines as we were talking to him, and he was forthcoming about how boring the Net was for him before he began posting his stories online. This cohered well with interview data about the site originating in Alex’s frustration that there was ‘nothing on the Net for kids like him’.


The kinds of issues and dilemmas we have reported are by no means exhaustive. We think, however, that they are typical and among the most important issues and dilemmas that arise in the process of investigating aspects of the out-of-school literacy practices of children and adolescents. Furthermore, such issues and dilemmas will always be with researchers; they are inherent in the very act of researching. For this reason it behooves educational researchers to anticipate them and prepare for them in ways that will ameliorate as far as possible their potential to impede and diminish our research.


References

AOIR (Association of Internet Researchers) (2001). AOIR Ethics Working Committee: A Preliminary Report. aoir.org/reports/ethics.html (accessed 16 March 2002).

Bruckman, A. (2001). Ethical guidelines for research online: A strict interpretation. Unpublished position paper. www.cc.gatech.edu/~asb/ethics (accessed 28 February 2002).

Cavanagh, A. (1999). Behaviour in public? Ethics in online ethnography. Cybersociology, 6. www.socio.demon.co.uk/magazine/6/cavanagh.html (accessed 28 February 2002).

Denzin, N. (1997). Interpretive Ethnography: Ethnographic Practices for the 21st Century. Thousand Oaks, CA: Sage.

Frankel, M. and Siang, S. (1999). Ethical and Legal Aspects of Human Subjects Research on the Internet. Washington, DC: American Association for the Advancement of Science.

Gee, J. (1993). Postmodernism and literacies. In C. Lankshear and P. McLaren (eds), Critical Literacy: Politics, Praxis and the Postmodern. Albany, NY: State University of New York Press, pp. 271-95.

Glazer, M. (1982). The threat of the stranger: Vulnerability, reciprocity and fieldwork. In J. Sieber (ed.), Ethics of Social Research: Fieldwork, Regulation, and Publication. New York: Springer-Verlag, pp. 49-70.

Glesne, C. and Peshkin, A. (1992). Becoming Qualitative Researchers: An Introduction. White Plains: Longman.

Goffman, E. (1963). Behavior in Public Places: Notes on the Social Organization of Gatherings. New York: Free Press/Macmillan.

Goffman, E. (1974). Relations in Public: Microstudies of the Public Order. Harmondsworth: Penguin.

Hilts, P. J. (1995, January 15). Conference is unable to agree on ethical limits of research. New York Times, p. A12.

Knobel, M. (1999). Everyday Literacies. New York: Peter Lang.

Knobel, M. (2003). Rants, ratings and representations: Issues of validity, reliability and ethics in researching online social practices. Education, Communication and Information, 3(2): 187-210.

Lankshear, C. and Knobel, M. (1997). Chapter 7. In C. Lankshear Changing Literacies. Buckingham and Philadelphia: Open University Press.

Lankshear, C. and Leander, K. (in press). Social science research in virtual realities. In B. Somekh and C. Lewin (eds), Research Methods in the Social Sciences. London: Sage.

Lather, P. (1991). Getting Smart: Feminist Research and Pedagogy with/in the Postmodern. London: Routledge.

Leander, K. (2003). Writing travellers’ tales on new literacyscapes. Reading Research Quarterly 38 (1): 392-397.

Leander, K. (2003) Researching digital literacies as situating practices. Article in preparation.

Leander, K. and McKim, K. (2003). Tracing the everyday ‘sitings’ of adolescents on the internet. Education, Communication and Information 3 (1): 11-30.

Markham, A. (1998). Life Online. Walnut Creek, CA: AltaMira Press.

May, T. (1995). The Moral Theory of Poststructuralism. Albany, NY: SUNY Press.

netgrrrl Ü (12) and chicoboy26 (32) (2002). What am I bid?: Reading, writing and ratings at eBay.com. In I. Snyder (ed.), Silicon Literacies. London: Routledge-Falmer, pp. 15-30.

Plastic (2002). Plastic. www.plastic.com (accessed 13 February 2002).

Sharf, B. (1999). Beyond netiquette: The ethics of doing naturalistic discourse research on the Internet. In S. Jones (ed.), Doing Internet Research: Critical Issues and Methods for Examining the Net. Thousand Oaks, CA: Sage, pp. 243-256.

Vallis, R. (2001). Applying membership categorization analysis to chat-room talk. In A. McHoul and M. Rapley (eds), How to Analyse Talk in Institutional Settings: A Casebook of Methods. London: Continuum, pp. 86-99.

Warnock, G. (1970). The Object of Morality. London: Methuen.