Tag Archives: Wired

Big Data Needs Thick Data

Tricia Wang

Tricia Wang

Editor’s Note: Tricia provides an excellent segue between last month’s “Ethnomining” Special Edition and this month’s on “Talking to Companies about Ethnography.” She offers further thoughts building on our collective discussion (perhaps bordering on obsession?) with the big data trend. With nuance she tackles and reinvents some of the terminology circulating in the various industries that wish to make use of social research. In the wake of big data, ethnographers, she suggests, can offer thick data. In the face of derisive mention of “anecdotes” we ought to stand up to defend the value of stories.

__________________________________________________

image from Mark Smiciklas at Intersection Consulting

image from Mark Smiciklas at Intersection Consulting

Big Data can have enormous appeal. Who wants to be thought of as a small thinker when there is an opportunity to go BIG?

The positivistic bias in favor of Big Data (a term often used to describe the quantitative data that is produced through analysis of enormous datasets) as an objective way to understand our world presents challenges for ethnographers. What are ethnographers to do when our research is seen as insignificant or invaluable? Can we simply ignore Big Data as too muddled in hype to be useful?

No. Ethnographers must engage with Big Data. Otherwise our work can be all too easily shoved into another department, minimized as a small line item on a budget, and relegated to the small data corner. But how can our kind of research be seen as an equally important to algorithmically processed data? What is the ethnographer’s 10 second elevator pitch to a room of data scientists?

…and GO!

Big Data produces so much information that it needs something more to bridge and/or reveal knowledge gaps. That’s why ethnographic work holds such enormous value in the era of Big Data.

Lacking the conceptual words to quickly position the value of ethnographic work in the context of Big Data, I have begun, over the last year, to employ the term Thick Data (with a nod to Clifford Geertz!) to advocate for integrative approaches to research. Thick Data uncovers the meaning behind Big Data visualization and analysis.

Thick Data: ethnographic approaches that uncover the meaning behind Big Data visualization and analysis.

Thick Data analysis primarily relies on human brain power to process a small “N” while big data analysis requires computational power (of course with humans writing the algorithms) to process a large “N”. Big Data reveals insights with a particular range of data points, while Thick Data reveals the social context of and connections between data points. Big Data delivers numbers; thick data delivers stories. Big data relies on machine learning; thick data relies on human learning.

Read More…

Why Wikipedia is no ‘proxy for culture’ (Part 1 of 3)

Culture close up Bomedical scientist, Nathan Reading on Flickr (CC BY)

Culture close-up by biomedical scientist, Nathan Reading on Flickr (CC BY)

Last month’s Wired magazine showed an infographic with a headline that read: ‘History’s most influential people, ranked by Wikipedia reach’ with a group of 20 men arranged in hierarchical order — from Jesus at number 1 to Stalin at number 20. Curious, I wondered how ‘influence’ and ‘Wikipedia reach’ was being decided. According to the article, ‘Rankings (were) based on parameters such as the number of language editions in which that person has a page, and the number of people known to speak those languages’. What really surprised me was not the particular arrangement of figures on this page but the conclusions that were being drawn from it.

According to the piece, César Hidalgo, head of the Media Lab’s Macro Connections group, who researched the data, made the following claims about the data gathered from Wikipedia:

a) “It shows you how the world perceives your own national culture.

b) “It’s a socio-cultural mirror.

c) “We use historical characters as proxies for culture.

And finally, perhaps most surprising is this final line in the story:

Using this quantitative approach, Hidalgo is now testing hypotheses such as whether cultural development is structured or random. “Can you have a Steve Jobs in a country that has not generated enough science or technology?” he wonders. “Ultimately we want to know how culture assembles itself.”

It is difficult to comment on the particular method used by this study because there is little more than the diagram and a few paragraphs of analysis, and the journalist may have misquoted him, but I wanted to draw attention to the statements being made because I think it represents the growing phenomenon of big data analysts using Wikipedia data to make assumptions about ‘culture’. Read More…