Editor’s Note: For our Virtual Identity edition, contributing editor Heather Ford (@hfordsa) explores the complications of attribution and identification in online research. Are members of online communities research subjects, research participants, amateur artists? When is online participation public, private, or something in between?
When I published one of my first studies of online communities as part of my master’s research, I came up against one of the most challenging aspects of online research: how to reflect the identity of one’s research participants. I had been observing an open educational content community and quoted one of the participants’ missives from the publicly available mailing list without referring to his name or username. I had thought that this was the right thing to do: to anonymize the data, thus protecting the subjects. But the “subject” was angry that he had been quoted “without attribution”. And he was right. If I was really interested in protecting the privacy of my subjects, why would I quote his sentence when anyone could probably Google it and find out who wrote it?
Since then, my process has evolved a lot, but I still send my research participants a draft of my paper before it gets published so that they can choose whether I (a) anonymize their statements, (b) attribute them to their usernames, or (c) attribute them to their full (“real”) names. But the process becomes unwieldy when doing detailed content analysis (or “trace ethnography” as per Geiger and Ribes) on Wikipedia, where only some editors accept email and where other editors may have left the project. These are publicly available statements on a website that is explicitly open for copying and remixing, but I’m also taking those statements out of the context in which they were written. This is technically a “remix” but may make some editors uncomfortable.
So, do I quote users and attribute their comments to their username on publicly accessible websites like Wikipedia? Or do I need to get their written permission where they choose whether they want me to attribute their name, username, both or neither?
Two papers about Internet ethics have been really helpful to me in trying to work through these issues. The first is a paper by Amy Bruckman (2002), who reviews her own experiences of disguising material in her research accounts and argues that the problem with Internet research ethics is that neither the “human subjects” approach nor the humanities approach to research is appropriate. Rather, Bruckman argues that Internet users are more “amateur artists” than “human subjects”. “Amateur artists” captures the spirit of the Web in that users are often people who are using the Internet to express themselves. It acknowledges that their usernames carry a reputation and that we can’t treat them the same way that “human subjects” are treated in medical trials, for example. Bruckman explains using an example of her own experience as an amateur photographer:
One of my photographs hangs in the hall of my house. My husband had a print framed as a surprise birthday gift. Were you to choose to study amateur photography before the Internet age and wished to comment on this photo, you would clearly need my permission to do so. You could not gain access to my house without my permission, and in the process we would naturally negotiate the terms of that access. Whether the researcher construes this as human subjects research or not, a negotiation would take place and I could hopefully make a choice about how much access to grant to either my work or discussion of personal details relating to its creation. Were I to submit the work to a gallery, however, it’s clear that a critic from the local paper could comment on the work without my knowledge or consent. The professional public display of the work changes the rules.
Bruckman notes, however, that the photograph on her own website exists in a somewhat different space from the binary distinction between public and private. She argues that “(m)ost work on the Internet is semi-published”. If her photograph of Bryce Canyon was on the front page of CNN.com it would clearly be “published”, she writes, but if it was posted on a “private” web page, not linked to any “public” web page, “it is somewhere in between the traditional categories of published and unpublished”. All this is important because the analogies we use to explain the Internet determine the conclusions that we reach.
We may come to very different final results if we say that “an Internet chatroom is like a public square” than if we say that “an Internet chatroom is like my front porch” or “an Internet chatroom is like a telephone party line.” In this paper, I have argued that the metaphor that the Internet is like a playground for amateur artists is useful in reasoning about some ethical dilemmas, because it highlights key features of the environment. However, it’s worth noting that ultimately the Internet is neither a public square, porch, telephone line, nor playground – it is the Internet.
I like Bruckman’s characterisation of four states of “disguise” that researchers can engage with: from “no disguise” (where “pseudonyms and real names may be used with permission of the individual”) to “heavy disguise” (where “names, pseudonyms and other identifying details are changed”). The problem is that the guidelines do not make clear whether they apply to interviews or only to publicly accessible online content, and they do not cover the sticky problem of what to do when one cannot obtain permission from those one might be quoting.
Here, a piece by David Berry (2004) in Internet Research is really helpful. Berry argues for an “open source” approach to ethics guided by an “ethics of care” which he takes from Capurro and Pingel (2002). An ethic of care is one that “responds to the concerns of others not out of a sense of duty, but from a feeling of responsive mutuality” (p. 329). It responds to questions including: “Is the researcher responding to the needs of others? Do they care about the activities of members of online groups as people with feelings like themselves?” This approach urges researchers to take account of participant interests and include research subjects in a project in a way that really accords with the principles of free/libre and open source software (FLOSS) projects.
This would… encourage open and participatory research methodologies, promote an ethics of care, and return research results to the community and the researched groups (p. 330).
I really like this approach because it sees users as research participants rather than subjects. In my experience, involving Wikipedians that I’ve interviewed in the process and evolution of my research projects has been illuminating and extremely rewarding. It has made me review the traditional process of research where a supposedly more learned individual extracts data from their subjects, goes away into a dark room to analyse the data and then delivers the results, usually to a completely different audience. The way I want to do research is to consider myself as a mediator of knowledge that members of the group inherently hold within themselves. The final product will ultimately reflect my own perspective, but the process should involve working iteratively rather than sequentially, feeding results back to the community from which they come, and enabling research participants to speak back to the data.
For now, that means that I’ll use my best efforts to contact those whose statements and conversations on Wikipedia I want to quote. More generally, I’m going to continue to talk to Wikipedians about what they think about these issues. It’s not perfectly informed consent, since not every Wikipedian could be involved, but one thing is for sure: those whom we study should have a say in the ways in which this issue evolves.
Bruckman, A. (2002). Studying the Amateur Artist: A Perspective on Disguising Data Collected in Human Subjects Research on the Internet. Ethics and Information Technology, 4(3), 217–231.
Berry, D. M. (2004). Internet research: privacy, ethics and alienation: an open source approach. Internet Research, 14(4), 323–332.
Nissenbaum, H. F. (2010). Privacy in context: technology, policy, and the integrity of social life. Stanford, Calif.: Stanford Law Books.
Featured image: lel4nd on Flickr, CC BY