I just finished a day at the SIAM CSE 2019 conference, where I
gave a talk as part of a mini-symposium on software sustainability
(my
abstract, and my talk
slides; see the 'cpr' tag for all my
recent blog posts on this topic.)
When I was outlining the talk, I spent a fair amount of time
noodling about how I wanted to approach the subject. I have a lot
of disorganized thoughts that I think can be put together in
interesting ways, but for a 20 minute talk, I really needed to pick
a narrow focus.
Here's what I ended up with. I'm curious for reactions!
Defining a term, "communities of effort"
I'll start by defining "communities of effort" as a community
formed in pursuit of a common goal. The goal can be definite or
indefinite in time, and may not be clearly defined, but it's
something that (generally speaking) the community is aligned
on.
The term "effort" here refers to focused
or engaged attention, and in this sense in particular, I mean
the focused attention applied towards the common goal.
One rational goal of such a community is to achieve the goal
without wasting effort through duplication or redundancy in work.
This connects with my earlier blog post on the
open source anti-Sisyphean League, a term coined by Cory
Doctorow: the idea is that there are a number of rocks to be rolled
up hills, and (in an open community) there is no reason for people
to roll those rocks up the hill independently, since they can take
advantage of each other's efforts.
This community of effort directs itself towards achieving the
goal, applying the available effort to the task. Here, effort is a
finite resource that is consumable - you cannot apply the
same effort to more than one task, and the effort that is applied
towards one task is not available to be applied to another task.
(Of course, the available effort can be renewed or
increased - more on that later.)
Effort as a common pool resource
The trickiest and most uncertain link is this: I think that the
effort applied towards the common goal is, to some extent, directed
by the community. That is, the available effort - which consists of
work by individuals towards the collective goal - is at the very
least loosely coordinated with the community, if not coordinated
more closely.
(This may be because the community needs to be involved in order
to decrease redundancy. Not sure.)
If this is true - that effort is coordinated by the community
rather than the individual, and so is non-excludable, and also is a
finite resource that can be consumed, and is thus rivalrous, this
turns it into a common pool resource.
Common pool resources are well known to anyone who has heard of
the tragedy of the commons: they are resources that are subject to
this tragedy, of being consumable by many in an unregulated
way.
What are some examples of these "communities of effort"?
A prime example is open source projects like Python. They're
rooted in a community approach; they're not not run by a
corporation or a government agency; and any structure (like a
nonprofit) is created after they already exist (and usually after
they are successful!)
I think the Carpentries training community is another good
example. This is a community of people interested in teaching and
training in data science and software engineering that essentially
self-assembled, and is aligned around their mission (of teaching
and training). The non-profit structure around it is, again, an ex
post facto creation.
Data analysis commons, in which methods, data, compute
resources, and data analysis interfaces are coordinated to address
the data analysis needs of a community, would be another
example.
(Wikipedia might be another, but I'm less familiar with how it
works.)
Why do we care about these communities?
Well, these communities are amazingly effective, in at
least some cases. For example, Python and R between them are
essentially the modern data science languages - both are
open source, both are community coordinated.
More generally, it is probably not an exaggeration to say that
the products of open source communities of effort underly the vast
majority of Silicon Valley software, as well as most research
software.
Sustaining, growing, and supporting these communities is pretty
important!
How do these communities get started, and why are they
effective?
One feature of successful communities of effort - those that
seem to succeed in growing their pool of available effort - is that
they are often very organic in their approach to tackling their
mission. This is probably an effect of the community-based
approach, in that the members of these communities are to a
reasonably significant extent self-motivated and self-directed to
solve their problems, and so the solutions are often bottom-up
created with only a light level of coordination on top. (I'll
revisit this in terms of governance in a bit.)
The other kind of fun thing is that these days it's pretty easy
to bootstrap a community of effort: with some enthusiasm and a site
like GitHub, you can spin up a new community project quite
quickly.
Last but not least, many (most? all?) communities of effort have
at least one person who has placed their effort at the service of
the community mission. These are the leaders and/or maintainers of
the project.
So what's the problem? It's all good, right?
Well... there are a few things I don't really understand.
For one, the formation of large groups of people who sustain a
collective to pursue a common goal violates basic tenets of
collective action - at least, as I understand them. The idea here
is that, if there is a large group of people pursuing a common
goal, then the smart (economically rational) thing for someone to
do is ...not do any work at all, because the individual will reap
the benefits of the group work. So, what's different with these
communities of effort?
Sustainability and in particular maintenance is a big
question, too; these communities often rely on one or a few core
maintainers to make things happen, and it is really unclear why
these maintainers (who are often unpaid or underpaid) would take on
these tasks. Yes, they get kudos and reputation, but kudos and
reputation do not put food on the table... why do they do it?
(One thought - perhaps the creation of a successful community of
effort really depends on there being at least one person who
ignores short-term economic rationality? So then you just don't see
all the failed attempts where someone decides not to be irrational
and hence not bother? Another thought is that perhaps the key
aspect of many of these communities being open means that
the maintainer-type folk realize that no one else is tackling the
common goal, and since they need the goal met as well, they might
as well do it?)
Does framing the problem as a common pool resource problem
yield any solutions?
I think it does.
First, once you recognize effort as the limiting resource, the
question of how to maintain and increase that resource comes to the
forefront. There are a number of possible mechanisms, including
investing in making the community easy or rewarding to join,
welcoming new contributors, and/or providing special methods or
data or access to community members. In this view, these activities
become more central than they are if you are thinking only about
the overall goal or mission of the community.
Second, Elinor Ostrom outlined some design principles for
sustainability of common pool resources based on empirical studies,
in
Governing the Commons. One of these principles is about making
collective choice arrangements that allow most of the appropriators
(members of the community) to participate in the decision making
process.
Basically, this boils down to rewarding people who invest effort
with some level of influence in how that effort is applied towards
the community goals. This both incentivizes participation with
collective ownership, and also seems to allow a form of organic
communication where the people applying effort feed results from
their work back into the overall community direction. This is, to
my mind, one of the things that leads these communities to be so
effective.
This mode of governance by members of the
community for the community goal leads to another
interesting thought. Funders participate in these communities in
indirect ways, by seeking to fund (or being sought out to fund)
effort within the community. Rarely is the direction of this
support directly dictated by the funder; it's usually laundered
through the community member(s) being supported. This is both good
and bad - it limits the degree to which funders (and companies) can
directly influence the project, but also means that funders may not
be able to easily identify the uses to which their money will be
put.
Who is part of the community of effort?
Anyone who contributes their effort is part of the community,
and hence should get some form of influence over governance (by the
above design principle).
Extractive contributors - contributors who do not contribute to
the overall effort, especially the maintenance effort -
would not, however, be considered part of the community. See
How
open is too open? for this argument.
People who are using the product of the community but not
costing the community any effort (e.g. consumers of the source
code) would also not be part of the community, unless they
contribute in some way to the project.
One interesting result of this kind of thinking is that, for
data analysis commons, people who provide data or methods, or
training people, or contributing documentations, are contributing
effort. This provides a level of rational inclusion of this kind of
work within the community, and also in governance; they are in a
direct sense contributing to the sustainability of the community of
effort.
Is academia a good home for these communities of effort?
I note that the leadership and governance model in basic
research, at least, is often not inclusive of the people who are
doing the work, and instead centers on reputation and hierarchy. I
don't think universities and colleagues focused on basic research
are likely to be a good part of the support network for communities
of effort, in general.
I have been quite impressed with what I've seen of extension
efforts at universities, which are faculty-level investments of
time and energy in communities. I'm planning to look more in to the
idea of a digital extension model.
Some final thoughts
I think it's important to recognize that (these days at least)
there are lots of competing projects in which people can invest
their time and effort, and it's probably not a bad thing to frame
it as a competition between these communities for people's time and
attention. Communities that do a good job of attracting
contributors and incentivizing the inclusion of effort can win out
and potentially be more sustainable than communities that do a
lousy job. (This has potentially dire implications for some
scientific research communities, which are not always very
welcoming or inclusive. I'm not sad about this.)
This framing also puts soft skills front and center in
the equation, and I think this is also a good outcome.
Open / unaddressed questions
Two open questions.
First, what are communities of effort not good at? I
would venture that any boring or maintenance level jobs would tend
to be addressed poorly by these communities, due to how human
enthusiasm works.
Second, I want to return to the missing link mentioned above -
that these communities seemingly depend on one or more people
placing their effort in service of the community. What are the
reasons why people do this, and how do we support and maintain it?
Inquiring minds want to know... It would be nice if we had a
reasonably comprehensive picture of why this occurred, because it
doesn't seem like rational behavior on the face of it. (I'm very
thankful that people do this, of course, which is why I want to
better support this path!)
Acknowledgements
I gratefully acknowledge Adam Resnick, Matter Trunnel, Josh
Greenberg, Nadia Eghbal, Luiz Irber, and Tracy Teal, with whom I've
had inspiring conversations on these fronts.
The NIH (via the Data Commons funding) and the Moore Foundation
provided funding to me to think about, read about, and explore
these issues.
Comments welcome!
--titus