
: Comparing Codes of Conduct to Copyleft Licenses (My FOSDEM Speech): "Comparing codes of conduct to copyleft licenses": written notes for a talk by Sumana Harihareswara, delivered in the Legal and Policy Issues DevRoom at FOSDEM, 31 January 2016 in Brussels, Belgium. Video recording available. Condensed notes available at Anjana Sofia Vakil's blog.

Good afternoon. I'm Sumana Harihareswara, and I represent myself, and my firm Changeset Consulting. I'm here to discuss some things we can learn from comparing antiharassment policies, or community codes of conduct, to copyleft software licenses such as the GPL. I'll be laying out some major similarities and differences, especially delving into how these different approaches give us insight about common community attitudes and assumptions. And I'll lay out some lessons we can apply as we consider and advocate various sides of these issues, and potentially to apply to some other topics within free and open source software as well.

My notes will all be available online after this, so you don't have to scramble to write down my brilliant insights, or, more likely, links. And I don't have any slides. If you really need slides, I'm sorry, and if you're like, YES! then just bask in the next twenty-five minutes.

I. Credibility

I will briefly mention my credentials in speaking about this topic, especially since this is my first FOSDEM and many of you don't know me. I have been a participant in free and open source software communities since the late 1990s. I'm the past community manager for MediaWiki, and while at the Wikimedia Foundation, I proposed and implemented our code of conduct, which we call a Friendly Space Policy, for in-person Wikimedia technical spaces such as hackathons and conferences.

I wrote an essay about this topic last year, as a guest post on the social sciences group blog Crooked Timber, and received many thoughtful comments, some of which I'll be citing in this talk.

I am also a contributor to several GPL'd pieces of code, such as MediaWiki and GNU Mailman, on code and non-code levels. And I am the creator of Randomized Dystopia, a GPL'd web application that helps you in case you want to write scifi novels about new dystopian tyrannies that abrogate different rights.

And I have been flamed for suggesting codes of conduct; for instance, one Crooked Timber commenter called me "a wannabe politician, trying to find a way to become important by peddling solutions to non-problems." Which is not as bad as when one person replied to me on a public mailing list and said, "Deja Vue all over again. I finally understand why mankind has been plagued by war throughout its entire history...." So maybe I'm the cause of all wars in human history. But I probably won't be able to cover that today.

II. The basic comparison

So let's start with a basic "theory of change" lens. When you're an activist trying to make change in the world, whether it's via a boycott, a new app, a training session, founding an organization, or some other approach, you have a theory of change, whether it's explicit or implicit. You have an assessment of the way the world is, a vision of how you want the world to look, and a hypothesis about some change you could make, an activity or intervention you could perform to move us closer from A to B. There's a pretty common theory of change among copyleft advocates and a couple of theories of change that are common to code of conduct advocates.

A. GPL

The GPL restricts some software developers' freedom (around redistributing software and around adding code under an incompatible license) so as to protect all users' freedom to use, inspect, modify, and hack on software.

The copyleft theory of change supposes that more people will be more free if we can see, modify, and share the source code to software we depend on, and so it's worth it to prohibit enclosure-style private takeovers of formerly shared code. Because in the long run, this will enable free software developers to build on each others' work, and incentivize other developers to choose to make their software free.

B. Codes of Conduct

Now, codes of conduct, antiharassment policies, friendly space policies: They restrict some people's behavior and require certain kinds of contributions from beneficiaries, so as to increase everyone's capabilities and freedom in the long run.

One pretty popular theory of change goes like this: we will make better software and have a greater impact if more people, and more different kinds of people, find our communities more appealing to work in. One thing making an unpleasant environment and driving away contributors, especially contributors with perspectives that are underrepresented in our communities, is hurtful misbehavior in community spaces. So we'll make the trade and say that it's worth it to restrict some behavior, in order to make the environment better so more, and more varied, people can do work in our communities, and thus make more free software and make it better.

And here's another related one, very similar to the one above, but focusing on the day-to-day freedom of community participants who are marginalized. If the constraint stopping me from, for instance, speaking in an IRC channel is that I strongly suspect I'll be harassed if they know I'm a woman, and that I don't have any reason to believe I can avoid or usefully complain about that harassment, how free am I to participate in that community? Is there perhaps a way to understand a certain level of safety as a necessary prerequisite to liberty?

I realize that this is probably the one room in the world where I have the highest chance of getting into a multi-hour "what does freedom mean" bikeshedding session, so I'm going to avoid focusing on the second model there and focus more on the first one, which emphasizes the end result of more free software.

Photo of me at FOSDEM 2016, CC BY-SA Luis García Castro, https://twitter.com/luiyo/status/700938185115836416

C. Assumptions

So I am not assuming that everyone in this room is a copyleft advocate, but I am going to assume from this point forward that we in this room fundamentally understand the restrictive license argument, that we have a handle on the theory of change that it's operating on. And similarly, I'm sure there are people here who aren't so big on codes of conduct, but I'm going to assume that we fundamentally understand the theory of change behind that approach, regardless.

D. Similarities

Now let's talk about similarities. Chris Webber calls both of these approaches "added process which define (and provide enforcement mechanisms for) doing the right thing." I agree. Without this kind of gatekeeping we see free rider incentives, on other people's software work and on other people's attention and patience and emotional labor.

They are written-down formalizations of practices and values that some community members think should be so intuitive and obvious that asking people to formally offer or accept the contract is an insult, or at least an unnecessary inconvenience. And so some people counterpropose sort-of-humorous policies, such as the "Do What the Fuck You Want to" software license and "don't be a jerk" codes of conduct.

They are loci of debate and fragmentation.

Some people agree to them thoughtfully, some agree distractedly as they would to corporate clickthrough EULAs, some disagree but click through anyway (acting in bad faith), some disagree and silently leave, some disagree and negotiate publicly, some disagree and fork publicly. Some people won't show up if the agreement is mandatory; some people won't show up UNLESS it's mandatory; some people don't care either way. And, by the way, good community management requires properly predicting the proportions, and navigating accordingly.

Both copyleft licenses and codes of conduct are approaches to solving problems that became more apparent along with different people realizing they have different expectations and needs, and consider different outcomes or processes to be "fair."

These kinds of codes and licenses usually cover specific bounded events and spaces or sites, and their scope covers interpersonal or public interactions. Codes of conduct usually don't cover conversations outside community-run spaces or the beliefs you hold in your head; open source licenses' restrictions usually kick in on redistribution, not use, so they don't constrain anything you do only on your own computer.

Neither one of these approaches can rely on self-enforcement. There is some self-enforcement of both, of course. There's a perception that -- as Harald K. commented on my blog post -- "licenses more or less police themselves (or in extreme instances, are policed by outsiders) whereas codes of conduct need an internal governing structure, a new arena where political power can be exercised." My personal understanding, which I share with people like Matthew Garrett, is that there's a ton of license-breaking happening, and we need to support existing organizations like the Software Freedom Conservancy to police that misbehavior and litigate to defend the GPL. As Conservancy head Karen Sandler points out in her December essay "From a lawyer who hates litigation", "I've seen companies abuse rights granted to them under the GPL over and over again. As the years pass, it seems that more and more of them want to walk as close to the edge of infringement as they can, and some flagrantly adopt a catch-me-if-you-can attitude." And you see enough individuals in our communities acting similarly that I don't think I need to belabor this point; codes of conduct are much more productive when they're actually, you know, enforced.

And both copyleft licenses and codes of conduct restrict freedom regarding certain acts, over and above what is restricted by the law, in the interest of a long-term good, which can in both cases be construed as greater freedom. As Belle Waring says to one skeptic in the Crooked Timber comments, paraphrasing their argument: "part of your reasonable resentment is, 'I don't want to be forced to do freedom-restricting things in support of a very uncertain outcome, just because the final proposed outcome is a good one.'" I will go into that bit of argument later.

E. Differences

But these kinds of agreements are different on a few different axes, which I think are worth considering for what they tell us about open stuff community values and about our intuitions on what kinds of freedom restrictions we find easier to accept.

One is that many codes of conduct focus on in-person events such as conferences, rather than online interactions. Many of the unpleasant incidents that caused communities to adopt CoCs -- or that communities see as "let's not let that happen here" warning bells -- happen at face-to-face events. And face-to-face spaces have a much longer history and context of ways of dealing with bad behavior than do online spaces. After all, a pretty widespread reading of the core function of government and law enforcement is that they keep Us Good Guys safe by stopping The Bad Guys from committing face-to-face (or knife-to-face or chair-to-face) assault.

But there's another axis I want to explore here: whether the behavior constraint feels like a contract or whether it feels like governance. Of course, we toss around phrases like "the social contract" and use the metaphor of contract to talk about the legitimacy of government, but to an ordinary citizen, contracts and governance feel like significantly different things. To oversimplify: to a non-lawyer like me, something that feels like a contract formalizes a specific trade, something discrete and finite and a bit rare. A copyleft license feels that way to me; it specifies that if I distribute a certain artifact -- which is something I would only do after some amount of thought and work -- I then also undertake certain obligations, namely, I must also redistribute the software's source code, under the same license. And, notwithstanding edge cases, it is often easy to examine the artifact, follow a decision procedure, and determine that I have complied with the terms of the license. If I meant to comply in the first place.

On the other hand, when we make rules constraining acts, especially speech acts, it feels more like governance.

Codes of conduct serve as part of a community's infrastructure to fulfill the first duty of a government — to protect its citizens from harm — and in order to make them work, communities must develop governance processes. That is to say, "governance" is what we call it when we're explicit about who gets to make and implement rules that affect everyone in a community, and how we choose those people or get rid of them. And a governance body does not necessarily have to be a legal entity. For instance, in MediaWiki governance, there's an architecture committee that decides on large technical architectural changes, and it has no standing in the eyes of the United States government.

It takes work to evaluate whether actions have complied with rules, and that work might require asking questions of suspects, bystanders, and targets. Enforcing a code of conduct, even a narrowly scoped anti-harassment policy, often requires that someone act on behalf of a community to do this, and to implement the outcome -- be it informed by retributive, rehabilitative, transformative, or some other justice model. And it feels more like governance than contract to me if a rule applies to actions I take many times a day without deliberate planning -- such as saying something in my project's live internet chat room.

One way of thinking about this is: is there some kind of authority that the community acknowledges as having legitimate power over everyday behavior, over and above existing government with a capital G? Because, again, licenses affect certain coding and architectural decisions, but they don't preclude, for instance, everyday discussion. In fact, the social and digital infrastructure it takes to make robust and usable software, including our bug reports, our automated tests, our conversation on mailing lists, and so on, is often not covered by any particular open license -- if it were, maybe we'd be seeing a different level of pushback even from developers who are happy with copyleft as applied to their code.

F. Shortcomings of the contract model

But I think another interesting thing that happens when you compare a governance model to a contract model, regarding approaches we take to improving behavior in our communities, is seeing how governance wins. It takes a lot of work, but it has a lot of advantages.

1. Flexibility

Contracts are binary where ongoing dialogue and governance can be more flexible and responsive. If I were going to be really annoying I would compare them to compiled bytecode and to interpretable scripts. Contracts have to sort of self-contain the tests for what the contract permits, mandates, and prohibits, whereas governance mechanisms and bodies can use more general standards, which might change over time. To quote one of the commenters on my essay, Stephenson-quoter kun:

contracts explicitly restrict acts which are simply unpardonable -- not sharing the source code to your modified version of a GPL-licensed project, sexually assaulting someone at a conference -- because everyone agrees that those things are wrong and we feel confident that we can agree up-front that there can never be any extenuating circumstances in which those things are actually OK. Governance, however, can serve to 'nudge' people away from bad behaviours – poor coding standards, rudeness on mailing lists -- by giving us a standard to measure those things against without enumerating every possible violation of the standard. A governance procedure can take context into account, and is much more easily subject to improvement and revision than a contract is.

Sometimes it's the little stuff, more subtle than the booth babe/groping/assault/slur kind of stuff, that makes a community feel inhospitable to me. When I say "little stuff" I am trying to describe the small ways people marginalize each other: dominance displays, cruelty in the guise of honesty, the use of power in inhospitable ways, feeling unvalued, "jokes", clubbiness, watching my every public action for ungenerous interpretation, nitpicking, and bad faith.

Changing these habits requires a change of culture, and that kind of deliberate change in culture requires people who take up the responsibility in stewarding the culture.

And a governance approach has a lot more ability to affect culture than a contracts-only approach does.

2. Contracts give us an illusion of equality and self-containedness

As Tim McGovern said in the comments to my Crooked Timber post:

contracts have taken over as a primary way of negotiating relationships: a EULA is a replacement for a legal understanding of the relationship between two parties who are doing business. I don't, in other words, sign a EULA when I buy a pair of socks -- or even when I buy a car (Teslas excepted) because the purchase relationship is legally defined; even the followup on what can and can't be in your warranty is legally defined. But companies would rather be bound by an agreement they write than a body of law based on either commonlaw or constitutional concepts, or legislation.

Contracts presume an equality between the parties; in theory, both sides can take a breach of contract to court. In practice, of course, a EULA is a contract that masks radical inequality in power between the parties.... Governance requires wrestling with equality in a real way, on the other hand, and voluntarily submitting to an authority constituted in some fashion (over time, by people, etc.), as opposed to preserving a contractual illusion of equality.

3. Contract pretends you have choices

I recommend that, if you haven't, you check out the article "Mothering versus Contract" by Virginia Held, from Beyond Self-Interest in 1990. It suggests that perhaps we should fundamentally conceive of our interactions with others as following a paradigm of motherhood rather than of contract -- one truth this approach acknowledges is that by default most interactions in your life are opt-out rather than opt-in, if there's any opting or choice at all.

Yes, there's the freedom to fork. But realistically, if you want to get things done, you have to collaborate with others and accede to other people's demands, in terms of interface compatibility, learning and speaking fluent English, and all sorts of other needs. A FLOSS project with a thriving ecology of contributors is far more valuable than a nearly identical chunk of code with only a couple of voices available to help out, and thus the finite amount of human attention limits our ability to make effective forks. We're more interdependent than independent, and acknowledging that as a fundamental truth complicates the contracts-y libertarian narrative potentially beyond usefulness.

III. Lessons

I hope that my analysis helps give some vocabulary and frameworks for understanding arguments around these issues, and that we can use them to develop more effective arguments.

A. Freedom tradeoff comparisons

The first step might be — if you're trying to get your community to adopt a code of conduct, you might benefit by looking at other freedom-restricting tradeoffs the community is okay with, so you can draw out that comparison.

Or with UX (user experience) -- design is the art of taking things away, and when you're advocating for better user experience, which often involves reducing the number of visible ways to do things, consider comparing your approach to one of the freedom tradeoffs that your interlocutor is already okay with, such as the fact that your community has standardized on a single version control system. A single way for that kind of user to interact.

B. Artifacts

And if you're trying to build a code of conduct consensus in your community, it might help to start by talking, not about day-to-day behavior, but about artifacts that people think of as artifacts. Talk about the things we make, like slide decks for presentations, or articles on your wiki. That can get people on the same page as you, in case they're not yet ready to think of the community itself as an artifact we make together.

C. Theory of change

If you're an advocate for a new initiative, licensing, code of conduct, or something else, understand your own theory of change, and build mental models to help you understand the people who disagree with you. Understand what part of the theory of change they disagree with, and gather data to counter it.

And, incidentally, this lens will also help you appreciate other complementary approaches that will help you achieve your goals. As Mike Linksvayer says: "Of course I think that copyleft advocates who really want to ensure people have software freedom rather than just being enamored of a hack should be always on the lookout for cheaper and/or socialized enforcement (as implied above, control of distribution channels that matter, and state regulation)."

So why might people oppose codes of conduct? Here are a few ideas:

As Chris Webber notes, "there's an argument that achieving real world social justice involves a certain amount of process, laying the ground for what's permitted and isn't, and (if you have to, but hopefully you don't) a specified direction for requiring compliance with that correct behavior." The addendum is that, as Alberto Brandolini said "The amount of energy necessary to refute bullshit is an order of magnitude bigger than to produce it." So part of the mental model you're trying to understand is what the person you're arguing with is trying to maximize, and another part is whether you agree on how to maximize it.

Paul Davis, the Ardour BDFL, commented on my Crooked Timber post, "The dilemma for a mid-size project like mine is that the overhead of developing and maintaining a CoC seems like just another thing to do amidst a list of things that is already way too long, and one that addresses a problem that we just don't have (yet)." He said he's more worried about technical, architectural decisions causing developer loss.

So, for instance, you could argue with Paul: what genuinely causes developer loss? And what priorities should you have, given your goals?

D. A fresh set of governance needs and questions

CoC adoption drives the adoption of explicit governance mechanisms, as Christie Koehler has recently explored in depth in her post "The complex reality of adopting a meaningful code of conduct" .... but we have many open questions that the legal and policy community within free and open source could really help with.

For instance, it's great that we have people like Ashe Dryden and organizations like Safety First PDX helping develop standards and advising organizers on developing and enforcing codes of conduct, but should we actually be centralizing this kind of reporting, codification and enforcement across the FLOSS ecosystem? Different subcommunities have different needs and standards, but just as OSI has helped us stave off the worst possibilities of license proliferation, maybe we should be avoiding the utter haphazardness of Code of Conduct proliferation.

And -- given how interconnected our projects are -- what if single open source projects are the wrong size or shape or scope for this particular aspect of stewardship and governance?

I'd very much appreciate thoughts on this from other folks in future devroom talks or blog posts -- if you tell me this is the kind of thing we talk about on the FLOSS Foundations mailing list then maybe I'll have to bite the bullet and go ahead and subscribe.

IV. Other thoughts + Conclusion

A. Comments on my CT piece

The comments on my Crooked Timber piece had many fine insights, on enforcement, culture, exit, voice and loyalty, fairness, and the consent of the governed. They're worth reading.

B. Hospitality to liberty spectrum

In addition to the contract-governance contrast, I think it's also worth thinking about the spectrum of liberty versus hospitality. The free software movement really privileges liberty, way over hospitality. And for many people in our movement, free speech, as John Scalzi put it, is the ability to be a dick in every possible circumstance. Criticize others in any words we like, and do anything that is not legally prohibited.

Hospitality, on the other hand, is thinking more about right speech, just speech, useful speech, and compassion. We only say and do things that help each other. The first responsibility of every citizen is to help each other achieve our goals, and make each other happy.

I think these two views exist on a spectrum, and we are way over to one side, the liberty side, as a community, and moving closer to the middle would help everyone learn better and would help us keep and grow our contributor base, and help make it more diverse. And to the extent that comparing codes of conduct to copyleft licenses helps some people put new initiatives in perspective, balancing the relationship between rights and responsibilities, perhaps that can also help shift our culture into one that's more willing to be hospitable. I hope.

C. This feels like a potentially insoluble problem

William Timberman said in Crooked Timber comments, "how does a socialist persuade a libertarian that coherence and the common good is sometimes a legitimate constraint on individual freedom?" And the answer is that I don't know, but I hope it is a soluble problem, and I hope I've opened up some avenues for exploration on that topic. Thank you.



: What Should We Stop Doing? (FLOSS Community Metrics Meeting keynote): "What should we stop doing?": written version of a keynote address by Sumana Harihareswara, delivered at the FLOSS Community Metrics Meeting just before FOSDEM, 29 January 2016 in Brussels, Belgium. Slide deck is a 14-page PDF. Video is available. The notes I used when I delivered the talk were quite skeletal, so the talk I delivered varied substantially on the sentence level, but covered all the same points.

Photo of me at FLOSS Metrics meeting, public domain by ben van't ende, https://photos.google.com/share/AF1QipMGh90Jfl8uVKH3U-e4CGF93i-vbHvhjbWVOvkn3ZlOeBAoc5PX_n_augA9v-cvPQ/photo/AF1QipNmQJdbqw2TrhcX6HuooqUuQmLfFbRkn73QW_Aq?key=NXFqUUVqdlN6MWdQSFdSNEFBSVFKajRpQVVQNnpn

I'd like to start with a story about my excellent boss at the Wikimedia Foundation, Rob Lanphier, and what he told me when I'd been on the job about eight months. In one of our one-on-one meetings, I mentioned to him that I felt overwhelmed. And first, he told me that I'd been on the job less than a year, and it takes a year to ramp up fully in that job, so I shouldn't be too worried. And then he reminded me that we were in an amazing position, that we would hear and get all kinds of great ideas, but that in order to get anything done, we would have to focus. We'd have to learn to say, "That's a great idea, and we're not doing it." And say it often. And, he reminded me, I felt overwhelmed because I actually had the power to make choices, about what I did with my time, that would affect a lot of people. I was not just cog #15,000 doing a super specialized task at Apple.

So today I want to talk with you about how to use the power you have, in your open source projects and organizations, and about saying no to a lot of things, so you can focus on doing fewer things well -- the Unix philosophy, right? I'll talk about a few tools and leave you with some questions.

Tool 1: Remember to say no to the lamppost fallacy

The lamppost fallacy is an old one, and the story goes that a drunk guy says, "I dropped my keys, will you help me look for them?" "OK, sure. Where'd you drop them?" "Under that tree." "So why are you looking for them under this lamppost?" "Well, the light is better here."

A. Quantitative vs qualitative in the dev data

The first place we ought to check for the lamppost fallacy is in overvaluing quantitative metrics over qualitative analysis when looking at developer workflow and experience. Dave Neary said, in the FLOSSMetrics meeting in 2014, in "What you measure is what you get. Stories of metrics gone wrong": Use qualitative and quantitative analysis to interpret metrics.

When it comes to developer experience, you can be analytical while being both quantitative and qualitative. And you rather have to be, because as soon as you start uncovering numbers, you start asking why they are what they are and what could be done to change that, and that's where the qualitative analytical approach comes in.

Qualitative is still analytical! Camille Fournier's post, "Qualitative or quantitative but always analytical", goes into this:

qualitative is still analytical. You may not be able to use data-driven reasoning because you're starting something new, and there are no numbers. It is hard to do quantitative analysis without data, and new things only have secondary data about potential and markets, they do not have primary data about the actual user engagement with the unbuilt product that you can measure. Furthermore, even when the thing is released, you probably have nothing but "small" data for a while. If you only have a thousand people engaging with something, it is hard to do interesting and statistically significant A/B tests unless you change things drastically and cause massive behavioral changes.

This is applicable to developer experience as well!
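To make that concrete, here is a minimal sketch of my own (not Fournier's, and with invented numbers) of a two-proportion z-test, showing why a small user or contributor base makes A/B-style comparisons inconclusive: the same one-point difference in conversion rate that is indistinguishable from noise at a few hundred people per arm only becomes statistically significant at much larger scale.

    # A small illustration (mine, not from the talk) of why "small data" makes
    # A/B testing hard: the same observed difference in conversion is nowhere
    # near significance at n=500 per arm, but clearly significant at n=50,000.
    import math

    def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
        """Two-sided two-proportion z-test; returns (z, p_value)."""
        p_a, p_b = successes_a / n_a, successes_b / n_b
        p_pool = (successes_a + successes_b) / (n_a + n_b)
        se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
        z = (p_a - p_b) / se
        p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
        return z, p_value

    # Hypothetical: variant B converts at 11%, variant A at 10%.
    for n in (500, 50_000):
        z, p = two_proportion_z_test(round(0.10 * n), n, round(0.11 * n), n)
        print(f"n={n:>6} per arm: z={z:+.2f}, p={p:.3f}")
    # At 500 per arm the difference is indistinguishable from noise (p ~ 0.6);
    # at 50,000 per arm the same difference is highly significant (p < 0.001).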

For help, I recommend the Wikimedia movement's Grants Evaluation & Learning team's table discussing the quantitative and qualitative approaches you can take: ethnography, case studies, participant observation, and so on, to deepen understanding. The qualitative side is complementary with the quantitative side, which is about generalizing findings.

B. Quantifiable dev artifacts-and-process data versus data about everything else

Another place to check for the lamppost fallacy is in overvaluing quantifiable data about programming artifacts and process over all sorts of data about everything else that matters about your project. Earlier today, Jesus González-Barahona mentioned the many communities -- dev, contributor, user, larger ecosystem -- that you might want to research. There's lots of easily quantifiable data about development, yes, but what is actually important to your project? Dev, user, sysadmin, larger ecology -- all of these might be, honestly, more important to the success of your mission. And we also know some things about how to get better at getting user data.

For help, I recommend the Simply Secure guides on doing qualitative UX research, such as seeing how users are using your product/application. And I recommend you read existing research on software engineering, like the findings in Making Software: What Really Works and Why We Believe It, the O'Reilly book edited by Andy Oram and Greg Wilson.

Tool 2: know what kind of assessment you're trying to do and how it plays into your theory of change

Another really important tool that will help you say no to some things and yes to others is knowing what kind of assessment you're trying to make, and how that plays into your hypothesis, your theory of change.

I'm going to mess this up compared to a serious education researcher, but it's worth knowing the basics of the difference between formative and summative assessments.

Formative assessment or evaluation is diagnostic, and you should use it iteratively to make better decisions to help students learn with better instruction & processes.

Summative assessment is checking outcomes at the conclusion of an exercise or a course, often for accountability, and judging the worth/value of that educational intervention. In our context as open source community managers, this often means that this data is used to persuade bosses & community that we're doing a good job or that someone else is doing a bad job.

As Dawn Foster last year said in her "Your Metrics Strategy" speech at the FLOSSMetrics meeting:

METRICS ARE USEFUL Measure progress, spot trends and recognize contributors.
Start with goals: WHY FOCUS ON GOALS? Avoid a mess: measure the right things, encourage good behavior.

Here's Ioana Chiorean, FLOSS Community Metrics meeting, January 30th 2015, "How metrics motivate":

Measure the right things... specific goals that will contribute to your organization's success

Dave Neary in 2014 in "What you measure is what you get. Stories of metrics gone wrong" at the Metrics meeting said:

be careful what you measure: metrics create incentives
Focus on business and community's success measurements

And this is tough. Because it can be hard to really make a space for truly formative assessment, especially if you are doing everything transparently, because as soon as you gather and publish any data, people will use it to argue that we ought to make drastic changes, not just iterative changes. But it might help to remember what you are truly aiming at, what kind of evaluation you really mean to be doing.

And it helps a lot to know your Theory of Change. You have an assessment of the way the world is, a vision of how you want the world to look, and a hypothesis about some change you could make, an activity or intervention you could perform to move us closer from A to B.

There's a chicken and egg problem here. How do you form the hypothesis without doing some initial measurement? And my perhaps subversive answer is, use ideas from other communities and research to create a hypothesis, and then set up some experiments to check it. Or go with your gut, your instinct about what the hypothesis is, and be ready to discard it if the data does not bear it out.

For help: Check out educational psychology, such as cognitive apprenticeship theory - Mel Chua's presentation here gives you the basics. You might also check out the Program/Grant Learning & Evaluation findings from Wikimedia, and try out how the "pirate metrics" funnel -- Acquisition, Activation, Retention, Referral, Revenue, or AARRR -- fits with your community's needs and bottlenecks.
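And to make the funnel idea concrete, here is a minimal sketch of my own (not from Wikimedia tooling or the presentations cited above), with invented stage interpretations and counts, that maps a contributor pipeline onto the AARRR stages and computes stage-to-stage conversion to surface the biggest drop-off, which is a candidate bottleneck to investigate.

    # A sketch of applying the AARRR "pirate metrics" funnel to a contributor
    # pipeline. Stage interpretations and counts are hypothetical examples.
    funnel = [
        ("Acquisition", 1200),  # e.g. visited the "how to contribute" page
        ("Activation",   300),  # e.g. submitted a first patch or edit
        ("Retention",     60),  # e.g. still contributing three months later
        ("Referral",      25),  # e.g. brought in another contributor
        ("Revenue",       10),  # e.g. contributor's employer now sponsors work
    ]

    def conversion_report(stages):
        """Print conversion from each stage to the next; return the worst drop-off."""
        worst_step, worst_rate = None, 1.0
        for (name_a, count_a), (name_b, count_b) in zip(stages, stages[1:]):
            rate = count_b / count_a if count_a else 0.0
            print(f"{name_a:>11} -> {name_b:<11} {rate:6.1%}  ({count_a} -> {count_b})")
            if rate < worst_rate:
                worst_step, worst_rate = f"{name_a} -> {name_b}", rate
        return worst_step, worst_rate

    bottleneck, rate = conversion_report(funnel)
    print(f"Biggest drop-off: {bottleneck} at {rate:.1%}")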

Tool 3: if something doesn't work, acknowledge it

And the third tool is that when we see data saying that something does not work, we need to have the courage to acknowledge what the data is saying. You can move the goalposts, or you can say no and cause some temporary pain. We have to be willing to take bug reports.

Here's an example. The Wikimedia movement likes to host editathons, where a bunch of people get together and learn to edit Wikipedia together. We hoped that would be a way to train and retain new editors. But Wikipedia editathons don't produce new long-term editors. We learned:

About 52% of participants identified as new users made at least one edit one month after their event, but the percentage editing dropped to 15% in the sixth month after their event

And, in "What we learned from the English Wikipedia new editor pilot in the Philippines":

Inviting contribution by surfacing geo-targeted article stubs was not enough to motivate or help users to make their first edits to an article. Together, all new editors who joined made only six edits in total to the article space during this experiment, and they made no edits to the articles we suggested.

Providing suggestions via links to places users might go for help did not appear to sufficiently support or motivate these new editors to get involved. 50 percent of those surveyed later said they didn’t look for help pages. Those who did view help pages nevertheless did not edit the suggested articles.

But over and over in the Wikimedia movement I see that we keep hosting those one-off editathons. And they do work to, for instance, add new high-quality content about the topics they focus on, and some people really like them as parties and morale boosters, and I've heard the argument that they at least get a lot of people through that first step, of creating an account and making their first edit. But that does not mean that they're things we should be spending time on, to reverse the editor decline trend. We need to be honest about that.

It can be hard to give up things we like doing, things we think are good ideas and that ought to work. As an example: I am very much in favor of mentorship and apprenticeship programs in open source, like Google Summer of Code and Outreachy. Recently, researchers Adriaan Labuschagne and Reid Holmes raised questions in "Do Onboarding Programs Work?", published in 2015, about whether these kinds of mentorship programs move the needle enough in the long run to bring new contributors in. It's not conclusive, but there are questions. And I need to pay attention to that kind of research and be willing to change my recommendations based on what actually works.

We can run into cognitive dissonance if we realize that we did something that wasn't actually effective. Why did I do this thing? Why did we do this thing? There's an urge to rationalize it. The Wikimedia FailFest & Learning Pattern hackathon 2015 recommends that we try framing our stories about our past mistakes to avoid that temptation.

Big 'F' failure framing:
  1. We planned this thing: __________________________
  2. This is how we knew it wasn't working: __________________________
  3. There might have been some issues with our assumption that: __________________________
  4. If we tried it again, we might change: __________________________

Little 'f' failure framing:

  1. We planned this thing: __________________________
  2. This is how we knew it wasn't working: __________________________
  3. We think that this went wrong: __________________________
  4. Here is how to fix it: __________________________

For help with this tool, I suggest reading existing research evaluating what works in FLOSS and open culture, like "Measuring Engagement: Recommendations from Audit and Analytics" by David Eaves, Adam Lofting, Pierros Papadeas, Peter Loewen of Mozilla.

Priorities

I have a much larger question to leave you with.

One trend I see underlying a big chunk of FLOSS metrics work is the desire to automate the emotional labor involved in maintainership, like figuring out how our fellow contributors are doing, making choices about where to spend mentorship time, and tracking a community's emotional tenor. But is that appropriate? What if we switched our assumptions around and used our metrics to figure out what we're spending time on more generally, and tried to find low-value programming work we could stop doing? What tools would support this, and what scenarios could play out?

This is a huge question and I have barely scratched the surface, but I would love to hear your thoughts. Thank you.

Sumana Harihareswara, Changeset Consulting



: Leadership Crisis at the Wikimedia Foundation: This week, the Wikimedia Foundation, the main organization supporting Wikipedia and several other free knowledge projects, is at the peak of a leadership crisis more than a year in the making. Molly White's timeline of the crisis is a useful guide to the facts, and I feel compelled to speak publicly for the first time about the problem, and share my personal perspective, with a bit of context to help non-Wikimedians understand.

I left the Wikimedia Foundation in September 2014 after four years. I mentioned the reasons then, in that post, around learning new things and working on projects with less of a public spotlight. I'm happy with my new direction, Changeset Consulting LLC, but still have so many fond memories of working with fantastic people and making a difference.

Wikimedia Hackathon 2013, by Sebastiaan ter Burg from Utrecht, The Netherlands [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons

I left WMF thinking that it was fine -- in fact, that's a reason I felt okay about deciding to leave a place I cared about so much, because I thought WMF could cope without me. As I perceived it, former Executive Director (nonprofit-speak for "CEO" basically) Sue Gardner had led the organization to a stable enough place that she felt free to move on. For years, when I was at Wikimedia Foundation, our top priority was reversing the decline in the number of active Wikipedia contributors and other Wikimedia contributors; Sue Gardner articulated this priority and ensured everyone knew what we were aiming at and why. Lila Tretikov, the new executive director, was settling in and I perceived WMF to be on the right track, iteratively moving closer to reversing the editor decline, with solid management and plans in place to keep positive momentum going. I thought the conflicts and stumbles from summer 2014 were normal temporary pains, not unusually worrying.

A few months after I left, when I caught up with old Foundation colleagues, I started hearing wariness about the new high-level management (the ED and some other newer executive hires). The worries progressed into stronger and stronger concerns, getting more and more disturbing. For instance, in November 2015, a committee that disseminates Wikimedia funds to other Wikimedia institutions (such as chapters) wrote a scathing critique:

...the [Funds Dissemination Committee] laments that the Wikimedia Foundation's own planning process does not meet the minimum standards of transparency and planning detail that it requires of affiliates applying for its own Annual Plan Grant (APG) process....

The FDC is appalled by the closed way that the WMF has undertaken both strategic and annual planning, and the WMF's approach to budget transparency (or lack thereof).

I nearly could not believe my eyes when I saw this. For those of you who don't follow these bureaucracies, let me assure you that the FDC does not throw around words like "appalled" lightly. (Followup on the FDC recommendations.)

Early this year, it became public knowledge that the conflicts within the Foundation had led to an employee survey with a 93% response rate. The results included:

I have confidence in senior leadership at Wikimedia: 10% agree

It is a miracle if 90% of Wikimedia Foundation personnel agree on anything beyond the fact that WMF's commitment is to "a world in which every single human being can freely share in the sum of all knowledge." And to be clear, "confidence in senior leadership" here means that the employees trust that the C-level executives have the basic competence to run the organization. This isn't about agreeing or disagreeing on particular choices of method; when I was at WMF, the Executive Director and the Chief ____ Officers made decisions that some employees disagreed with, but they explained their reasoning, they encouraged feedback and responded to it (example), and we fundamentally knew they were aiming to collaborate with us in achieving the mission. It sounds like some big pieces of that trust are now missing.

Also:

1. In late December the Board of Trustees dismissed a well-liked community-elected trustee, Dr. James Heilman, for reasons that remain somewhat mysterious...
3. Revelations about newly appointed Board trustee Arnnon Geshuri's involvement in an illegal anti-poaching scheme while at Google has drawn community outcry
4. Besides failing to vet Geshuri, the WMF's increasing tilt toward the Silicon Valley and focus on (perhaps) the wrong technology projects have come into sharper relief

So, Arnnon Geshuri. You know that scandal where it came to light that big Silicon Valley tech firms were colluding to suppress wages and reduce employee mobility with illegal "no-poaching" agreements? With evidence including super damning emails? Guess who sent some of that email, perpetuating that pact? Arnnon Geshuri. The WMF Board of Trustees appointed him as one of the trustees. He's since stepped down but the incident damaged already-shaky trust between the Board and the larger Wikimedia community.

Wikimedia Foundation all-hands meeting 2015, by Myleen Hollero (Wikimedia Foundation) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

And as of last week it's clear that the situation's gotten even more dire. Fantastic colleagues are voting with their feet and leaving (and do you know how hard it is to find and hire the right people for an org this weird and this important?). People who would rather walk with rocks in their shoes than throw their coworkers under the bus are compelled to speak in public about dysfunction at the top: Ori Livneh, Anna Stillwell, and Greg Grossmeier, for instance, and Brion Vibber, who was the first employee the Foundation ever hired. Faidon Liambotis, Principal Operations Engineer at the Foundation & a longtime Debian Developer. Gayle Karen Young, former WMF Chief Talent & Culture Officer, who has a world-class ability to fuse compassion and systems-level thinking in the management of people processes, writes publicly about "dysfunction at the top" and "the enormous toll" it's taking on the staff. Erik Möller, who served on the Board of Trustees before he served as WMF second-in-command for more than seven years and then left in April 2015, a guy who has seen a thousand Wikimedia thunderstorms come and go and could probably charge for calm-as-a-service, says that "the situation is very much out of control" and "this is a crisis". This is not just the ordinary grumblings of a transparent organization. This is dire.

Executive Director Lila Tretikov said on Monday that Wikimedia has now managed to stem the editor decline. Möller replies, asking: is that so? He reviews the current stats, which do not reflect this claim.

But overall, it seems premature of speaking of "stemming the decline", unless I'm missing something (entirely possible).

There have been a thousand thunderstorms before. The Image Filter, SOPA, the transition from Toolserver to Tool Labs, Narrowing Focus, paid editing, the first VisualEditor rollout, the India Education Program, I could keep going and the point is that we Wikimedians have big ideas and big passionate arguments and we know some things about how to get through them. The movement, thank goodness, is bigger than the Foundation. The volunteers, the chapter staff, the teachers and photographers and coders and editors and everyone in the hundreds of subcommunities in our ecology have some buffers against the ripples coming out of the Foundation. There's frustration, sometimes, in how hard it can be for one subcommunity or organization to persuade or lend a hand to another, but right now it's a good thing because there is short-term resilience in that loosely federated structure.

This has been a singularly destructive time, but we still have time to keep the leadership problem from further damaging the Wikimedia movement.

As I said back in September 2014,

One of the things I admire about Wikimedia's best institutions is our willingness to reflect and reinvent when things are not working.

I don't know what the answer is.

The choice between exit and voice is conditioned on loyalty. We know that Wikimedians have been exercising the "voice" option; the Board and the ED have heard these criticisms loud and clear. And we know they've witnessed the stream of talented employees exiting, their steel-clad loyalty finally succumbing to the pressure. If unaltered, this is the kind of dynamic that leads to schisms and forks. I would hate for the movement to have to pay that kind of cost but, unless the Wikimedia Foundation Board of Trustees and Executive Director change course, I think that's a potential outcome.



