Posted 20 October, 2015 by Christopher Wilson

Research, discretion and sharing: our experience with a tricky balancing act

Nature in the Balance (Credit: Mr.TinDC)

Using technology for social change means contesting power, and some of the most interesting uses of technology tend to be the riskiest. People innovating with tools in repressive states, on the fringes of conflict or shining a light on deep corruption have a lot to lose, and there’s a novel tension built into researching these efforts. This is especially true when the researcher believes wholeheartedly in the principles of open research, and wants their work to reach and help as many other change-makers as possible. How do you balance the absolute need to protect the people whose work you are trying to understand with the desire to share what you learn as widely as possible?

The engine room was originally envisioned as a research collective. Interviewing people involved in the Egyptian uprising in 2011 (so smart, determined and well-resourced, yet making decisions about technology by happenstance), we were struck by what an imperfect information market there is for tech-driven advocacy, and decided we wanted to fill that gap. That has remained the engine room’s mission, and though the form has changed a lot in the last five years, research to understand how technology is actually anticipated, adopted and used by people pushing for change has stayed central to what we do.

As I finish my 5-year stint as knowledge lead for the engine room, this blogpost outlines some of what we’ve learned about balancing discretion with openness, and how we’ve navigated the ethical and pragmatic challenges we’ve faced.

What have we done? (!)

Research at the engine room has been an iterative experiment, constantly adjusting to find how we can best produce useful information that still meets basic methodological standards. But this has always involved a degree of discretion that goes a bit beyond standard research consent.

Sometimes this extra discretion is about professional courtesy, such as when we are interviewing groups that are discreet by virtue of their operations. Our research on responsible data practices among donors was premised on very careful protection of interview material and strict commitments to maintain confidentiality. Because some individuals in this community tend to be sceptical about discussing operations outside of their peer groups (and especially with potential grantees), we often had to make special commitments. We have a rigorous policy of sharing and approval built into the consent process, which means that detailed outputs of this research might never be seen by anyone but the participants.

Our interviews with the financial investment community aimed to understand what kind of NGO-produced monitoring data could be useful in supporting socially responsible investment decisions. But we couldn’t do that without understanding the processes through which investment decisions are made; a process that many investors consider proprietary, a business secret to be protected from other investors at all costs. This means that this research might remain very firmly walled off from many of the small organizations who could most use it.

More often, discretion has to do with protecting actual groups and individuals in precarious contexts. Our research on the technology use of NGOs in seven countries was in-depth and inevitably helped to identify opportunities for individual organizations using tech. It also surfaced the weak spots at which they could be attacked and harmed, and these two sets of insights were difficult to disentangle in any meaningful way. As a result, individual organizational reports were shared with those organizations, and country reports with the donor who commissioned the work, but the vast majority of what we learned wasn’t shared at all.

This is a shame, because we spent lots of time, money, brainpower and social capital developing the methods and capturing a rigorous data set that won’t see the light of day. Worse, this information could have been tremendously useful in each of the countries: for coordinating advocacy, for sharing insights and warnings about what might go wrong, and for generally helping groups that aren’t tech-born to do their work smarter and more efficiently. But, faced with even the slightest potential of actual harm to the people we were researching, we clearly had to adopt a conservative approach to data protection. In the end, we were only able to release one country report, heavily redacted and with the country unnamed, and none of the detailed data has been shared beyond the donor network.

Discretion in our research usually falls somewhere between these two poles, like our research on the strategies used to train digital security trainers, which sought to advance a field without outing the individuals who do digsec work, or any of the strategies that might compromise it. There is some component of both security and professional discretion in all our decisions about sharing engine room research. Anytime you’re learning about people who are challenging power, whether they are social movements or accountability organizations, it’s something you have to think about carefully. And I can’t think of a single research project where we managed to share what we learned as much as we would have liked.

Ethics – responsibility vs utility?

At bottom, this might represent a consistent and fundamental tension between two competing norms: #protection vs #open.

Protection is a fundamental principle of any research, whether it’s about protecting people’s security, data, organizational interests, personal information, or simply the consent of research participants. This is well established as the principle of informed consent to participate in research, but we tend to take it extra seriously because of the type of stuff we’re studying, because the people we’re researching are also the people we’re trying to help, and because we spend so much time thinking about responsible data in other contexts. This last bit is especially important, because we’ve come to recognize that responsibly handling (research) data isn’t just about keeping personally identifiable information secret. We have to think about how the protection and sharing of data impacts people’s data agency. We have to think about how data can be re-used and re-purposed for contradictory objectives. We have to think about how affiliation with an international research project might damage credibility. We have to think about a whole host of other eventualities along a data pipeline of infinite possibilities and dependencies. Sigh.

But the norm of open is important too. The way research has been conducted over the last few decades (research learning trapped behind paywalls, incentives to hoard data in order to publish first and secure tenure, power asymmetries between researchers and researchees) isn’t just annoying, it’s wrong.

We firmly believe in open by default. This translates into an ethical imperative to share what we learn as widely as we can, and to make what we learn as accessible and useful as we can, in the interest of helping people at the front lines of social change to do their work safely and efficiently. We have to think about who we’re not helping when we don’t release data and methods into the wild.

These two principles are almost always at loggerheads when we start looking at how much of our research we want to publish. And while we have consistently treated #protection as the primary norm (it beats #open every time the two conflict), we’ve tried hard to think of creative ways to reconcile them, and have been careful not to use protection as an excuse to avoid the work and time required to share (redacting, disassociating, pseudonymizing, oh my).
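To make that last parenthesis slightly more concrete: pseudonymization is one of the more mechanical pieces of that sharing work. The snippet below is a minimal, hypothetical sketch (not a description of our actual tooling) of one common technique, replacing identifying strings with keyed hashes so that a shared dataset can’t be trivially linked back to participants, as long as the key stays offline.

```python
import hashlib
import secrets

def make_pseudonymizer(key: bytes | None = None):
    """Return a function mapping an identifier to a stable pseudonym.

    Hypothetical sketch of keyed hashing for pseudonymization. The key must be
    stored securely and never published, or discarded entirely if the mapping
    never needs to be repeated or reversed.
    """
    key = key or secrets.token_bytes(16)

    def pseudonymize(identifier: str) -> str:
        digest = hashlib.blake2b(identifier.encode("utf-8"), key=key, digest_size=8)
        return digest.hexdigest()

    return pseudonymize

pseudo = make_pseudonymizer()
print(pseudo("Example Organization"))  # stable within a run, e.g. '91c3b2f0a6d14e77'
```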

Pragmatics – the importance of process

So how to do it? If there’s one thing we’ve learned across all these projects, it’s that there’s no one thing. We have tried to find and apply rules and tools, but they’re consistently a poor fit for the individual project context. At the end of the day, like so many other responsible data challenges, we’ve been stuck spending time thinking carefully about the contexts and risks, and crafting a strategy for sharing on that basis. This is a lot of work, which is annoying, but we haven’t found any way around it. Most importantly, we’ve become convinced that any structured process is a win. How one does it matters a lot less than the fact that one does anything at all. And just getting people to set aside time and energy to think through these issues can make a huge difference.

One way to systematize our approach would be as follows:

  1. Review the consent and other agreements you have with people reflected in the research
  2. Map the data
    What kinds of data are there? Can you break down different variables? Who has access? How is it stored?
  3. Map who the research could help
    Make a list of who might be interested in the data and what they might do with it. Don’t forget to look beyond decision-makers and think about people who might want to replicate or expand on the data in their own research.
  4. Divide the data into sensitivity categories
    Evaluate each type of data (all the way down to individual indicators if you can) and think about what harm might result from sharing it. Tools like the Data Risk Checker can be useful for this.
  5. Map the risks
    What could go wrong? Brainstorm risks for the different kinds of data you’ve mapped. Think about what other data is already out there, and how contexts might change in the next year, or the next five. Consider holding a pre-mortem. Adjust your sensitivity categories.
  6. Consider alternative solutions for the data in the middle categories
    Could limited sharing with a closed group work? Can the sensitive data be adjusted somehow to make it less sensitive? (One way to structure steps 4–6 is sketched below, after the list.)
  7. Check your plan with some people reflected in the research data
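For small projects, steps 4–6 can live in a spreadsheet; for larger datasets it can help to encode the categories and sharing rules so they are applied consistently. The sketch below is a minimal, hypothetical illustration in Python: the tier names, field attributes and sharing decisions are assumptions made for the sake of the example, not part of our actual process.

```python
from dataclasses import dataclass
from enum import Enum

class Sensitivity(Enum):
    """Hypothetical sensitivity tiers (step 4); the names are illustrative."""
    OPEN = 1        # safe to publish as-is
    LIMITED = 2     # share only with a closed, agreed-upon group
    RESTRICTED = 3  # keep with participants and the commissioning donor only

@dataclass
class DataField:
    name: str                # e.g. "country", "organization_name", "tool_category"
    identifying: bool        # could this field identify a person or group?
    harmful_if_linked: bool  # could it cause harm if combined with other data?

def classify(field: DataField) -> Sensitivity:
    """Assign a sensitivity tier to a single field (steps 4-5)."""
    if field.identifying and field.harmful_if_linked:
        return Sensitivity.RESTRICTED
    if field.identifying or field.harmful_if_linked:
        return Sensitivity.LIMITED
    return Sensitivity.OPEN

def sharing_plan(fields: list[DataField]) -> dict[str, str]:
    """Map each field to a sharing decision (step 6)."""
    decisions = {
        Sensitivity.OPEN: "publish",
        Sensitivity.LIMITED: "share with a closed group, pseudonymized",
        Sensitivity.RESTRICTED: "do not share beyond participants and donor",
    }
    return {field.name: decisions[classify(field)] for field in fields}

if __name__ == "__main__":
    fields = [
        DataField("country", identifying=False, harmful_if_linked=True),
        DataField("organization_name", identifying=True, harmful_if_linked=True),
        DataField("tool_category", identifying=False, harmful_if_linked=False),
    ]
    for name, decision in sharing_plan(fields).items():
        print(f"{name}: {decision}")
```

Even a toy version like this makes the decisions reviewable: step 7 becomes a matter of walking participants through a short, explicit table rather than a gut feeling.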

Balancing act

This tension between protection and sharing will be familiar to anyone researching vulnerable populations or working in security-conscious contexts, and hopefully also to anybody reading or writing about technology and accountability. We’ve spent a lot of time thinking about it precisely because there are no easy tools or solutions (we believe that things like academic ethics boards and standardized consent forms are woefully inadequate, not only for grey-zone researchers like us, but for the academics who rely on them).

And though we might not have any answers or turnkey solutions, we do believe that supporting community conversations about these challenges and how we deal with them is critical for managing them better and avoiding shortcuts. A broader conversation about how difficult this balancing act is might also make it easier for many of us to invest in a careful and systematic assessment. Making that investment is hard. Carefully assessing risk and opportunities takes resources and time which are often in short supply. But doing so can be critical to make sure you’re handling data responsibly, and essential to avoid defaulting to data hoarding.

We’ve been learning as we go, and don’t always get it right, but would love a broader discussion about how to strike that balance, even when over budget and behind schedule, even when it’s hard to know what the risks are, and even when it would just be so much easier to cite “privacy” and leave your data messy and secret.
