For my dissertation, to get a general sense of the themes and trends of conservation discourse, I did a pilot topic modeling project in R on over 3,000 publications from the World Wildlife Fund, including everything from informal pamphlets to their most high-profile publications: their Living Planet Reports and Annual Reports. In very basic terms, a topic model works by grouping words that tend to co-occur. You put in your texts, decide the length of the segments you want it to divide each document into, and choose the number of topics to sort words into. The output is a series of “topics” composed of words that tend to co-occur. By adjusting your “chunk size” and your number of topics, you can get the model to produce topics that seem meaningful: coherent and discrete, rather than nonsensical. In this case, I found I got meaningful results when I set the text chunk size to 400 words and the number of topics to 50. (You can see my R code here.)
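A minimal base R sketch of the chunking step described above. It is an illustration, not the code used for this project: `chunk_text` is a hypothetical helper name, and splitting on whitespace is an assumed, very simple form of tokenization.

```r
# Split one document into consecutive chunks of at most `chunk_size` words.
# `chunk_text` is a hypothetical helper, not a function from any package.
chunk_text <- function(text, chunk_size = 400) {
  # Assumption: words are separated by whitespace
  words <- strsplit(trimws(text), "\\s+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) == 0) return(character(0))
  # Chunk index for each word: 1 for the first `chunk_size` words, 2 for the next run, ...
  chunk_id <- ceiling(seq_along(words) / chunk_size)
  # Re-join the words of each chunk into a single string
  unname(vapply(split(words, chunk_id), paste, character(1), collapse = " "))
}

# Toy usage: a 1,000-word "document" yields chunks of 400, 400, and 200 words
toy_doc <- paste(rep("forest", 1000), collapse = " ")
chunks <- chunk_text(toy_doc, chunk_size = 400)
length(chunks)
```

Each chunk would then be handed to the topic model as its own small "document," which is why the chunk size affects how coherent the resulting topics look.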


I find that some of the most intellectually interesting work in topic models is in naming the topics, so let’s take a look at some of the topics I got.

Energy: Fossil and Alternative

Fishing and Overfishing

Contact Us!


Status of Major Biomes

Funds, Fundraising, Where Money is Spent

Forests and Deforestation

My topic model even grouped boilerplate together, as you can see from the “Contact Us!” topic. In addition to recognizing and naming topics, you can do things like arrange your documents by date and then look at the probability of finding a particular topic in each document.

This is a graph of the “Forests and Deforestation” topic through time. The downward trend in its presence fits my general understanding that deforestation, as a topic of environmental concern, has been a waning paradigm, and that climate change rather than deforestation is more and more often treated as the primary threat to the environment.
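A minimal base R sketch of how a topic can be tracked through time: sort the documents by date, then summarize the direction of the trend with a fitted line. The data frame below is hypothetical toy data, invented for illustration; the column names `year` and `forests_prob` are assumptions, and none of the numbers come from the WWF corpus.

```r
# Hypothetical toy data: one row per document, with a publication year and the
# estimated probability of one topic in that document (invented numbers).
doc_topics <- data.frame(
  year = c(2010, 1998, 2004, 2014, 2000, 2008),
  forests_prob = c(0.04, 0.12, 0.08, 0.02, 0.11, 0.05)
)

# Arrange documents chronologically
doc_topics <- doc_topics[order(doc_topics$year), ]

# Fit a straight line to summarize the direction of the trend
trend <- lm(forests_prob ~ year, data = doc_topics)
slope <- unname(coef(trend)["year"])
slope < 0  # TRUE for this toy data: the topic is less probable in later documents

# Plot topic probability through time, with the fitted line dashed
plot(doc_topics$year, doc_topics$forests_prob, type = "b",
     xlab = "Year", ylab = "Topic probability")
abline(trend, lty = 2)
```

A straight line is only a rough summary; with real data it is worth looking at the plotted points before trusting the slope.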





One topic that looks very similar to the “Funds, Fundraising, Where Money is Spent” topic is this one:

Corporate Partners


However, a few important words make this topic very different. Rather than the internal money flows and structures of the WWF (where money is spent, who donates it), this topic is about corporate partners. Not only are the words “company” and “partner” prominently associated with this topic, but “Nokia” (the telecommunications company), “HSBC” (the world’s fourth largest bank), and “Lafarge” (a “world leader in building materials,” according to their website) indicate specific companies that support and partner with the WWF. This topic points to the sticky associations between conservation and capitalism, which I examine in the first chapter of Eden’s Endemics.




It is also a topic that is increasing through time:

Additionally, this topic shows that certain topics in my model were very highly associated with a particular type of document. In this pilot experiment I included Living Planet Reports, which are published every other year, and Annual Reports, which are published every year. You can see from the pattern above that this topic has a much higher probability of occurring in the Annual Reports.
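One simple way to check an association between a topic and a document type is to compare the topic's average probability across types. The base R sketch below uses hypothetical toy data; the column names `type` and `corporate_prob` and all of the numbers are invented for illustration.

```r
# Hypothetical toy data: document type and the estimated probability of one
# topic in each document (invented numbers, not results from the WWF corpus).
reports <- data.frame(
  type = c("Annual Report", "Living Planet Report", "Annual Report",
           "Annual Report", "Living Planet Report", "Annual Report"),
  corporate_prob = c(0.09, 0.01, 0.11, 0.08, 0.02, 0.12)
)

# Mean topic probability within each document type
mean_by_type <- tapply(reports$corporate_prob, reports$type, mean)
mean_by_type

# The document type in which the topic is most probable on average
names(which.max(mean_by_type))
```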

A topic of particular interest to me, with my dissertation on biodiversity discourse, is endangered species and extinction. It’s a topic that’s pretty present throughout time, in both types of report, though with a higher probability of occurring in the Annual Reports. It is a topic that is not passé, and it is a topic in which the word “Africa” has a high probability of co-occurring, perhaps reflecting the imaginative association of endangered species and Africa in the WWF's conceptualization of global biodiversity.

Sustainable Development


You can also query R to return the topics that are most highly associated with certain keywords. The topic to the right, for example, is the topic most associated with the words "indigenous," "local," and "native." It is also pretty clearly about development. The most highly probable words in the topic are "natural," "environmental," "economic," "sustainable," and "development," and the topic is pretty evenly shared among them. (One of the advantages of using the much-maligned word cloud to display topics is that it immediately indicates not only which words have the strongest association with the topic, as a list would do, but also how evenly the relative weight is distributed across words.) "Local" does feature prominently in this topic as well. This topic gestures toward the idea of conservation as development: the idea that conservation might not be an action in which land is set aside to exist outside the market economy, to produce nothing and make no money. Here, the linking of "natural," "environmental," and "sustainable" with "development" and "management" reveals how conservation can instead be an avenue to bring new populations into the market economy. Biodiversity conservation may be a route through which “indigenous, local” populations who may not have participated much in market capitalism are recruited into it.
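A minimal base R sketch of one way such a keyword query could work: normalize each topic's word weights, then score every topic by the total probability it gives to the keywords. The small matrix below is hypothetical toy data with invented weights and vocabulary, and scoring by summed probability is an assumption made for illustration, not necessarily how the query was run for this project.

```r
# Hypothetical toy topic-word weights: rows are topics, columns are words
# (invented numbers, not results from the WWF corpus).
topic_words <- matrix(
  c(30,  2,  1,  1, 25,
     1, 20, 18,  1,  2,
     5,  3,  2, 28,  4),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("topic_1", "topic_2", "topic_3"),
    c("forest", "indigenous", "local", "company", "development")
  )
)

# Normalize each row so that every topic's word weights sum to 1
topic_word_probs <- topic_words / rowSums(topic_words)

# Score each topic by the total probability it assigns to the keywords;
# keywords missing from the vocabulary ("native" in this toy data) are skipped
keywords <- c("indigenous", "local", "native")
present <- intersect(keywords, colnames(topic_word_probs))
scores <- rowSums(topic_word_probs[, present, drop = FALSE])
best_topic <- names(which.max(scores))
best_topic

# The three most probable words in that topic, to help with naming it
head(sort(topic_word_probs[best_topic, ], decreasing = TRUE), 3)
```

Looking at the sorted weights also shows how evenly a topic is shared among its top words, which is the same information a word cloud conveys through relative word size.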

You can see that the last document, which happens to be the most recent Living Planet Report, features the highest probability of the words in this “Sustainable Development” topic co-occurring. So, here’s where my topic model helps me find the passages in my corpus that may be relevant to my particular interests in biodiversity conservation. If I’m interested in conservation’s imbrication with development, Document 25 (the 2014 Living Planet Report) might be a good document to take a closer look at.
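A minimal base R sketch of how a model's output can point to passages worth reading closely: average a topic's probability over each document's chunks to rank documents, then find the chunk within the top document where the topic is strongest. The data frame is hypothetical toy data; the column names `doc_id`, `chunk_id`, and `dev_prob` and all of the numbers are invented for illustration.

```r
# Hypothetical toy data: one row per text chunk, with the estimated probability
# of one topic in that chunk (invented numbers, not results from the WWF corpus).
chunk_topics <- data.frame(
  doc_id   = c(23, 23, 24, 24, 25, 25, 25),
  chunk_id = c(1, 2, 1, 2, 1, 2, 3),
  dev_prob = c(0.02, 0.05, 0.04, 0.03, 0.10, 0.35, 0.12)
)

# Average the topic probability over each document's chunks
doc_means <- aggregate(dev_prob ~ doc_id, data = chunk_topics, FUN = mean)

# Rank documents from highest to lowest average probability
doc_means <- doc_means[order(-doc_means$dev_prob), ]
top_doc <- doc_means$doc_id[1]

# Within the top document, find the chunk where the topic is most probable
in_top_doc <- chunk_topics[chunk_topics$doc_id == top_doc, ]
top_chunk <- in_top_doc$chunk_id[which.max(in_top_doc$dev_prob)]

c(document = top_doc, chunk = top_chunk)
```

The chunk identified this way is a starting point for close reading, not a finding in itself.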

A section of this report that has a relatively high probability of this topic occurring is a case study on ecotourism and the Mountain Gorilla in Virunga National Park in the Democratic Republic of the Congo and Uganda.

This section is devoted to enumerating how the community has benefited from conservation. It is one of the jubilant success stories celebrated in the report, which states that gorilla tourism has “transformed” the region. This section describes the “Clouds Mountain Gorilla Lodge, a community owned boutique hotel that welcomes 1,200 guests a year.  It directly employs more than 40 people, but the benefits extend to more than 30,000 others living in nearby villages” (107). There are also “restaurants, bars and other accommodation [that] are opening up, while craft shops sell carved wooden gorillas, t-shirts and baskets made by local artisans” (107).

Here it’s clear how previously “untapped” populations are being brought into the market economy through conservation. Conservation is also generating new market items through images and representations of gorillas carved in wood and printed on t-shirts. You can now purchase baskets made of plant material that you saw in the park. These claims also appear at least a little exaggerated. A hotel that employs 40 people directly somehow has benefits that extend to 30,000 people in nearby villages? This sounds to me like the WWF just took the total populations of nearby towns and plugged them in, assuming that everybody living near the hotel was benefiting from it. This is despite the fact that development is a heterogeneous process that is more often characterized by uneven benefits than by blanket ones. Also, while the Clouds Mountain Gorilla Lodge is owned locally, by a white family that has been working in the region since 1993 and does good work for the community and for other species, “community owned” is a stretch. "Community owned" implies some sort of communal project in which both responsibilities and revenue are shared.

The Living Planet Report states in this section that “Ultimately, local people gain more from preserving their natural resources than from exploiting them in the short term” (107).  But what did this “exploitation” look like?  The only answer the Living Planet Report provides is this: "women and children used to collect water from streams within the national parks.  Not only was this an arduous and potentially dangerous chore, but the presence of large numbers of people posed a threat to the gorillas and other wildlife.  Now, many women and children have more time to spend on education and improving their livelihoods, and fewer people need to enter the gorilla’s habitat” (108).

So here, one type of use, getting drinking water, is defined as “use” of the environment, and another type of use, ecotourism, is called “nonuse.” This definition holds despite ecotourism’s obvious costs to the planet in terms of plane flights, carbon, roads, and infrastructure. Similarly, water retrieval led to a human presence that was a “threat to the gorillas and other wildlife,” but ecotourism does not.

Biodiversity conservation discourse can position certain groups as “users” of the environment and others as virtuous protectors of the environment who are helping nature and local communities even as they travel for their own enjoyment. These nonprofit organizations are carrying out projects to protect biodiversity, so they are remaking the world in their image of biodiversity. If they define biodiversity as gorillas with international visitors but not local water-fetchers, then that is the kind of biodiverse space we will be left with in the future. The ways in which NGOs and others discuss biodiversity have very real effects on how our future biodiversities will look and which human voices are counted in conservation decisions. We need careful consideration of these types of discourse, and we need to ask if they are promoting the kinds of biodiversities that we actually want. Digital methods can help us find these moments and these types of discourse, help determine how widespread they are, and help us see the connections between topics, key terms, and how discourse gets carried out on the material world.