Deliberative Polls, Citizen Assemblies, and an Online Deliberation Platform
An exploration of Meta’s giant deliberative process. What worked. What didn't. What we could be doing differently.
Meta (formerly Facebook) recently made two major announcements:
They released results from the ~6500-person, 32-country, 19-language deliberative process I mentioned in my initial newsletter piece (technically their partners at Stanford released it).
Meta will convene another process on generative AI!
As readers of this publication know, I had been advocating for such processes and was the sole third-party embedded observer for the design and project execution of this first semi-global process (and part of the larger group that was able to observe the actual deliberations). I published a piece in Wired—‘Meta Ran a Giant Experiment in Governance. Now It’s Turning to AI’—that is out today with my high-level reactions to the process and the potential next steps.
In this post, I’ll first give a brief overview of what happened and my role, and then go into significant detail on some of the design decisions, benefits, and failures—focused specifically on the process design and online execution. (Consider the following an adapted excerpt from a working paper; for now at least, this is the canonical citation.)
Overview
Why does Meta’s deliberative process matter?
We absolutely need new ways for companies, governments, and transnational bodies to make informed and democratic decisions on critical issues that move much faster than traditional transnational policy-making can keep up with—and especially around the AI systems that will transform our world.
This was one of the most ambitious and global attempts so far to implement deliberative democracy processes that might make this possible.
My perspective
I am excited to see a company give people more agency to make collective decisions about their products and policies.
I am fairly concerned about many of the details around the process and follow-through.
This is an incredible first step—it shows that such things are possible—but there is a lot that should be done differently going forward, both within the structure of the deliberative process itself and in its connection to product and policy.
My 'observer' role:
In my (intentionally unpaid) third-party observer role, I had a carveout to the standard Meta NDA (which I needed to sign for other areas of my work). That theoretically enables me to speak fairly freely about what I observed in the actual implementation of the deliberative ~6500-person process. (I mostly just can't share the meeting documents or anything about particular participants that is covered under privacy law.)
I was able to observe many of the meetings where the deliberation was planned, including all of the (very intense) logistics of running a process across every region of the globe—and of adopting the results across the company. I also separately spoke with many of the key actors involved in running and adopting the results of the deliberation. (Beyond being an embedded observer, I did have indirect influence on the process: through informal advice on the pilots; through the questions that I asked; through briefings that I gave to the team; and through discussions I had around options and tradeoffs for different process approaches.)
This is not just about Meta
Meta has done many terrible things either through action or inaction—and I’ve called it out for this repeatedly over the past seven years. However, this approach shouldn't be judged negatively solely because it was done by Meta. Before Meta even considered this, I had reached out to many companies and organizations arguing for this approach (as did a few others, as I later discovered). I've linked to this before, but if you haven't already, you can check out this working paper I put out while at Harvard Kennedy School’s Belfer Center and my last piece pulling together applications of deliberative processes to (and with) generative AI.
Meta may not be the ideal first mover—but we shouldn't judge the baby by the bathwater!
Process details
Most of the rest of this piece will get into the details of deliberative process design and execution—feel free to skip to the last few paragraphs if you're just interested in understanding the direct impacts on technology writ large (and check out the Wired article). Also, if you haven't already read my previous pieces on platform democracy and are not familiar with citizen assemblies or deliberative polls, I recommend checking those out to get a bit more context first.
Design & framing: deliberative polls versus citizen assemblies
There are a variety of different approaches to ‘representative deliberative processes’. Meta chose a deliberative poll, which is somewhat different from the citizen assembly processes that I had previously written the most about. In order to understand the Meta process, we will dive deep into the distinctions between these two competing approaches for representative deliberation.
Deliberative polls and citizen assemblies are similar in many key respects, especially that they both:
Prioritize considered judgments (i.e. deliberation) over gut opinions (i.e. the way that many people approach referendums, elections, traditional polls, and surveys). This deliberation leads to better-informed results.
Involve people selected via something like sortition (representative sampling) to form a microcosm of the population—instead of self-selection (e.g. in normal participatory processes) or delegation (e.g. elections). This randomness provides some robustness against powerful interest groups and malicious actors.
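To make the selection mechanics concrete, here is a minimal sketch of quota-based sortition in the two-stage style mentioned further below: a large pool is invited at random, and the panel is then drawn from those who accept so that it mirrors known population proportions. The strata, quotas, and panel size here are invented for illustration; they are not the criteria used in Meta's process or any other.

```python
import random
from collections import Counter

POPULATION_QUOTAS = {   # illustrative age quotas (fractions of the panel)
    "18-29": 0.25,
    "30-49": 0.35,
    "50-64": 0.25,
    "65+": 0.15,
}
PANEL_SIZE = 30  # illustrative panel size

def sortition(pool, panel_size=PANEL_SIZE, quotas=POPULATION_QUOTAS):
    """Stage 2 of a two-stage sortition: randomly fill each demographic
    stratum of the panel up to its quota, drawing from those who accepted
    the stage-1 invitation. Seat rounding is approximate."""
    seats = {stratum: round(frac * panel_size) for stratum, frac in quotas.items()}
    panel = []
    for stratum, n_seats in seats.items():
        candidates = [p for p in pool if p["age_bracket"] == stratum]
        panel.extend(random.sample(candidates, min(n_seats, len(candidates))))
    return panel

# Hypothetical pool: people who accepted a broad random invitation (stage 1).
pool = [{"id": i, "age_bracket": random.choice(list(POPULATION_QUOTAS))}
        for i in range(5000)]
print(Counter(p["age_bracket"] for p in sortition(pool)))
```

Real processes stratify on many more attributes at once (gender, region, education, sometimes attitudes on the issue), which turns this into a harder constraint-satisfaction problem; the sketch conveys only the core idea of random selection under representativeness quotas.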
While there are significant similarities, and both would be considered representative deliberations, their differences in both framing and process have very significant impacts. In theory, they are roughly analogous to differences in procedural rules and processes across legislative bodies (e.g. whether or not a chamber has a filibuster, as in the US Senate), but in practice, the differences in framing have many downstream impacts.
In particular, deliberative polls, in practice, are primarily seen by participants, external observers, and even conveners as a data-gathering endeavor that may inform governance, while (best-practice-following) citizen assemblies are seen as themselves an alternative form of governance. In deliberative polls, the people are often called participants1 and often view it as an experiment or a study, in sharp contrast to citizen assemblies, where the people are called assembly members and treated as such (the preferred word is ‘delegates’). Perhaps the lines between these should not be as distinct as they currently are throughout society—but the ramifications in our current world are potentially significant, with implications for agency, external communications, and privacy. Moreover, as we will see, the currently implemented online system for running deliberative polls can particularly feel like a study, and is far more limited even than what I've outlined above—the process is extremely ‘cookie cutter’, with very little participant agency.
One of the primary focuses of deliberative polls—and one that illustrates the difference between data gathering and governance—is opinion change before and after deliberation. Opinion change can be extraordinarily helpful for a decision-maker to know about, is a valuable frame for deliberative democratic legitimacy, and has other important implications—but it isn't the sort of thing that people tend to focus on for a governance process such as a legislature. For the purpose of governance, people care about the ultimate decision of the legislature, not where it started. Moreover, in many cases, no one even knows in advance which questions should be tracked; those are identified in the process of creating the recommendations or legislation.
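To make the data-gathering frame concrete, here is a minimal sketch of the before/after measurement that deliberative polls report, including the control-group comparison mentioned in the list below. All of the numbers are invented for illustration.

```python
# A minimal sketch of the opinion-change measurement at the heart of a
# deliberative poll: the same question is asked before and after the
# deliberation, and a control group answers without deliberating.
# All numbers below are invented for illustration.

def mean(xs):
    return sum(xs) / len(xs)

# Agreement with a policy statement on a 0-10 scale (hypothetical data).
deliberators_pre  = [4, 5, 3, 6, 5, 4, 7, 5]
deliberators_post = [6, 7, 5, 7, 6, 6, 8, 6]
control_pre       = [4, 5, 4, 6, 5, 4, 6, 5]
control_post      = [4, 5, 5, 6, 5, 4, 6, 5]

treatment_shift = mean(deliberators_post) - mean(deliberators_pre)
control_shift   = mean(control_post) - mean(control_pre)

# The difference-in-differences isolates the shift attributable to the
# deliberation itself, rather than to outside events in the same period.
print(f"shift among deliberators:     {treatment_shift:+.2f}")
print(f"shift among control group:    {control_shift:+.2f}")
print(f"attributable to deliberation: {treatment_shift - control_shift:+.2f}")
```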
Here is a more detailed overview of the differences in their goals and structures:
Citizen Assembly:
Treated as a sort of temporary, scoped legislature (usually advisory, but often requiring an action or response).
Convened to recommend policy changes.
Outputs are recommendations drafted by assembly members and approved through ‘rough consensus’ methods (often aiming for 80% approval).
Participants are selected through a variety of means, particularly 2-stage sortition, focusing on demographic representativeness (and sometimes including attitudinal representativeness—i.e., opinions on the issue at hand).
Prioritizes agency for members and creating good conditions for collaborative problem-solving in order to develop realistic and actionable recommendations.
Longer, deeper deliberation (5+ days), including access to the ultimate decision-makers, and often the freedom for assembly members to identify additional experts and stakeholders to bring in to inform their deliberations.
Non-standardized structure, albeit with best practice guidelines. There are processes using the term even though they don't live up to the best practices.
Deliberative Poll:
Treated as an improved form of polling (often referred to as an experiment or study).
Convened to understand how an entire population would answer a set of questions if they had more time to think about it (with statistical rigor, including control groups).
Outputs are the participants' answers to polling questions (both before and after the deliberation, focusing on opinion change) and recordings of the small group deliberations, for decision-makers to interpret.
Participants are selected through a variety of means depending on the location, including: traditional polling firms, survey panels, and GIS-based sampling; with the goal of ensuring both attitudinal and demographic representativeness.
Prioritizes approaches for mitigating groupthink dynamics such as domination by the more advantaged, and the common tendency of small groups to go to more extreme positions.
Shorter deliberations (1-2 days), with pre-chosen experts and stakeholders and limited opportunity to interact with them (but the shorter length enables some people to participate who might not be able to join a longer process, even with significant compensation).
Highly standardized structure, and control over the term via trademark.
This isn't a perfect description, and there are notable exceptions; e.g. citizen assemblies that did not live up to the descriptions provided above, or deliberative polls where the outputs were directly tied to decisions instead of simply informing decision-makers. However, it allows us to summarize two of the often contentious ‘camps’ within the world of deliberative democracy and representative deliberations that have very strong opinions on these two kinds of processes. A deliberative poll ‘camp’ perspective is that citizen assemblies have sample sizes that are too small, don’t confirm whether their members are truly representative in terms of their perspectives, are too long to be inclusive (because of the burden of participating in longer processes), give facilitators too much control, provide insufficient evidence that they avoid groupthink dynamics, etc. A common citizen assembly ‘camp’ perspective is that deliberative polls don't do governance at all and are close to giant focus groups (overly focused on statistical significance), don't give participants agency, don't let them provide anything more than poll outputs, don't involve much deliberation given their shorter length, etc.
From my perspective, deliberative polls provide significant value due to their empirically backed demonstration that one can, with a specific process design, overcome many of the challenges that have been predicted in deliberative decision-making—for example, ensuring that men, the more educated, etc. do not dominate the small group discussions. I have tremendous respect for the deliberative polling team for developing processes focused on addressing these issues. As a result, I see processes that involve key aspects and lessons from deliberative polling as being useful at the very end of a more elaborate decision-making process, to enable a sort of informed referendum of a microcosm, after a set of policies or principles has been narrowed to a small set or menu of options. However, given my observations, I believe that aspects of the current deliberative polling processes have significant weaknesses (particularly with the current online platform), and there is much that should be incorporated from the citizen assembly world even for that limited application of an ‘informed microcosm referendum’.
Citizen assemblies provide a whole host of additional ‘capabilities’ that are not present in deliberative polls—including enabling the development of that set of options that might be decided by a microcosm referendum (and they also provide alternative ways to structure a microcosm referendum). However, they also have something to learn from the rigor with which deliberative polls have been studied, e.g. quantifying the extent to which dominant voices might impact the proceedings under different process designs. I would like to see significantly more investment in the study of assembly processes and their subcomponents to see what works and what doesn't.
In both cases, there is evidence to suggest that deliberators feel more empowered as a result of the process. Also in both cases, there is an opportunity to augment these very human deliberation processes, both internally and externally, with AI tools that elicit and distill perspectives at scale, at specific stages where they can be particularly helpful (for example, for participatory input from the broader public beyond the deliberators, and as a way to source responses and follow-up questions within the deliberation plenaries).
In sum, I believe that deliberative polls are an effective mechanism for doing exactly what they were intended to do: ‘determine what people would decide if they had more time to think—and to consider some competing arguments in an evidence-based way.’ However, aspects of the deliberative poll framing, process design, and execution (including the current online platform) may over-optimize for statistical and scientific rigor over other factors that matter for deep deliberation, high-quality decisions, and democratic legitimacy.
Execution
Meta ultimately chose to use a deliberative poll, working with the Deliberative Democracy Lab at Stanford and supported by the Behavioral Insights Team, and opted to use the online deliberative polling platform also created at Stanford.
There are many aspects to running a deliberative process, including, for example, a mechanism for governing the overall process, identifying the question or questions at hand, development of briefing material, identifying experts and stakeholders to provide context, participant recruitment, communications to participants, and the management of the deliberations themselves. Here I'll focus primarily on participant recruitment and the online platform used to run and moderate the deliberations themselves.
Participant recruitment
An ideal deliberative process would involve participants who are perfectly representative of the entire group being ‘governed’, and several best practices have been developed around using sortition for influential governance processes. From what I understand, these were not used in many of the regions in this process; instead, Meta’s process recruited participants using 14 survey panel providers across the 32 countries. There are a variety of reasons that Meta and its partners chose to go this direction, many of which are understandable given logistical constraints. If there had been a plug-and-play alternative for best-practice sortition processes across countries, that would have been ideal, and I see this as another critical piece of infrastructure that we need to collectively invest in.
This is particularly important, as the choice of using survey panel providers likely increased the extent to which the process was seen as an experiment or a study, as opposed to a meaningful governance exercise (I personally heard participants referring to the process as an experiment or a study).
Meta did compensate participants for their time, and some of the panel providers did provide ‘concierge’ services to help ensure that they were all set up for success.
The online deliberative poll platform
The online deliberation system was impressive—and had significant issues. It was developed by the Stanford Crowdsourced Democracy Team and did exactly what it was meant to do from a process perspective at scale (with just a few minor bugs here and there)—which is incredible for this space, and I hold the team that developed it in high regard. However, it is not yet at the level of polish that I believe is needed for processes of significant societal import, and there are a number of design and process decisions that make it deeply impersonal and, in some cases, even dysfunctional for meaningful deliberation. I see this not as any failure of the team, which has created something novel and valuable, but as a result of our failure to fund democratic process innovation at an appropriate level given our technological capabilities.
A core goal of the system was to scale deliberative processes far beyond what would be possible with human facilitators. There were still human moderators with relevant language expertise available and watching the deliberations as they progressed, coordinating over Slack. However, they were not “at the table” directing the conversation as a facilitator would; instead they watched for issues and addressed them as they came up (in theory this is somewhat similar to what excellent citizen assembly facilitators do, e.g. Mosaic Lab; but in practice, the online platform experience did not live up to that standard). So humans were still theoretically a critical part of moderating the discussions, but far fewer people were needed to moderate the same number of participants, which enabled the process to be run at the scale it was, given the timeline and costs.
The online deliberative poll system was essentially a heavily customized video chat system that runs in the browser. Here is a short overview of the kinds of things that worked well, some of which could be adopted in other systems:
Functionality:
It showed participants instructions for each stage of the process, including a reference to the questions being asked.
It allowed groups to progress to the next stage through a voting process (though there were some downsides, as I will cover below).
It integrated auto-playing videos to introduce new topics at each stage.
It directly integrated the process of writing, editing, and voting on a set of questions for the experts.
Moderation:
It encouraged participants who hadn't spoken to speak up.
It provided a ‘speaking queue’ so people could request to speak in a structured way, eliminating interruptions by others (a minimal sketch of this mechanism follows the list).
It provided a powerful backend for non-participant moderators to see transcripts in real time (with potentially problematic segments emphasized), listen to segments of them, and very rapidly get a sense of the entire conversation through a well-designed expert user experience (so that they could take corrective action if need be).
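As an illustration of the structured turn-taking above, here is a minimal sketch of a speaking queue with a fixed turn length. The 45-second limit matches the one I observed (discussed below); everything else, including all names, is a hypothetical reconstruction, not the platform's actual code.

```python
from collections import deque
import time

class SpeakingQueue:
    """Minimal sketch of structured turn-taking: participants request the
    floor, speak in order, and are timed out after a fixed turn length.
    A hypothetical reconstruction, not the platform's actual code."""

    def __init__(self, turn_seconds=45):
        self.turn_seconds = turn_seconds
        self.queue = deque()
        self.current = None
        self.turn_started = None

    def request_floor(self, participant_id):
        # Ignore duplicate requests from the current speaker or queued people.
        if participant_id != self.current and participant_id not in self.queue:
            self.queue.append(participant_id)

    def next_speaker(self):
        self.current = self.queue.popleft() if self.queue else None
        self.turn_started = time.monotonic()
        return self.current

    def turn_expired(self):
        return (self.current is not None and
                time.monotonic() - self.turn_started >= self.turn_seconds)

q = SpeakingQueue()
q.request_floor("participant_17")
q.request_floor("participant_4823")
print(q.next_speaker())  # participant_17 gets the floor, uninterrupted
```

A real implementation would also need to drive the mute/unmute state of the underlying video session and surface the queue in the interface.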
However, the system, at least as executed within the processes that I observed, also had many significant flaws:
Deliberation quality:
While there was some real deliberation—e.g. weighing of trade-offs across issues—the extent of deeper deliberation, versus simply the sharing of opinions, appeared significantly lower than I have seen in some other processes, and other facilitators who were also able to observe the process shared a similar perspective.
Impersonality:
In some of the small groups that I observed, many or most people were identified only by numbers, which resulted in impersonal interactions such as participants saying “I agree with 4823” (perhaps this was intentional for privacy, or due to participants not following directions to set their name).
The system prevented people from speaking for more than 45 seconds at a time, regardless of how much time they had taken in the past, and cut them off mid-sentence. (There are good reasons to limit speaking time in some cases, but the cutoff should be handled much more delicately—one possible approach is sketched after this list!)
Larger 'plenary' sessions were designed as one-way webinars, with participants having no way to interact with each other or with the experts (the small groups did come up with the questions, so they do get to hear what other groups found important; and in most deliberative polls the person who came up with the first draft of a question does get to ask it).
Lack of meaningful agency:
Participants could only interact with each other in the small groups that they were randomly assigned, and had no interaction across groups even within their country or region—not to mention across languages. Beyond impacting agency, it also dramatically limited the collective intelligence of the group.
People were mostly randomly assigned to a new small group every session (though this was a logistical limitation that will be fixed in future sessions).
Participants had no way to directly suggest modifications to proposals being provided to them that they needed to choose between. (In theory, this might be picked up in the qualitative analysis of the transcripts of the conversations, but that is different in kind from agency or democratic decision-making; and it would have required that participants be very aware that this was a potential impact of their deliberations—potentially counterproductively leading to more performative deliberations.)
Process short-circuiting:
Each small group deliberation was meant to be an hour and a half, but it appeared to me that in some small groups, participants realized that they could leave early (while still being compensated) if they all agreed to move to the next stage fairly quickly. I saw this happen multiple times, leading to minimal deliberation on the topic and sessions ending in less than half of the allotted time—in contrast to citizen assemblies and offline deliberative polls, where participants often wanted more time to make better decisions. (One guard against this is sketched after this list.)
Confusion and non-participation:
Participants were initially thrust into the video meeting room with instructions to make introductions—but they did not always know how to use the interface. I saw groups where it took several minutes for someone to finally speak up.
Some people never made an introduction. In one group I observed, only 3 people introduced themselves.
I saw multiple people who were entirely silent across an entire small group deliberation, with their cameras turned off, just listening (or completely ignoring it).
Disruption:
Some participants took up significant amounts of time by repeatedly adding themselves to the question queue. As an extreme case, one participant babbled incoherently throughout the entire small group session that I witnessed, repeatedly disrupting the conversation.
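Several of these failure modes look addressable with small process guards. As a purely hypothetical sketch (not a description of the actual platform), a group's vote to advance could be gated on a minimum deliberation time, and the speaking timer could warn the speaker before muting rather than cutting them off mid-sentence:

```python
import time

# Hypothetical guards against two failure modes above (not features of
# the actual platform): a minimum deliberation time before a group's
# vote to advance takes effect, and a spoken-turn timer that warns the
# speaker before cutting them off.

MIN_STAGE_SECONDS = 30 * 60   # can't skip ahead in under 30 minutes
WARN_BEFORE_CUTOFF = 10       # warn 10 seconds before muting

def may_advance(stage_started, votes_for, group_size, now=None):
    """A group vote alone shouldn't end a stage almost immediately."""
    if now is None:
        now = time.monotonic()
    elapsed = now - stage_started
    return votes_for > group_size / 2 and elapsed >= MIN_STAGE_SECONDS

def timer_state(turn_started, turn_seconds=45, now=None):
    """Return 'ok', 'warn', or 'cutoff' for the current speaker."""
    if now is None:
        now = time.monotonic()
    remaining = turn_seconds - (now - turn_started)
    if remaining <= 0:
        return "cutoff"
    if remaining <= WARN_BEFORE_CUTOFF:
        return "warn"   # e.g., show a countdown instead of hard-muting
    return "ok"
```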
This is far from a complete list of the issues that came up with the process and system. I share all this not to argue that Meta was terrible due to their use of this system—this is a system that Stanford professors and students poured their hearts and souls into long before Meta dreamed up their Community Forums. It has also enabled a new approach to group communication—with far higher quality than, for example, most Facebook groups! (And the experience was still positive enough that 82% of the participants said they would “recommend this event to Meta as a way to make decisions in the future” and ‘78% thought the members of their group “participated relatively equally”’.) Any new modality will likely start out far from perfect, and I think the general approach holds promise for specific uses within a larger process; in spite of these issues, the majority of groups did function reasonably effectively with a reasonable level of quality of deliberation.
While there are real flaws that need to be addressed, many of these issues can be overcome given sufficient resources for implementation and operational support, and the team is actively working on addressing them. I would also like to see significant resourcing to decompose systems like this into modules, enabling experimentation with a variety of processes across applications and organizations (if you are doing or are aware of solid work characterizing these sorts of ‘modules’—ideally even more general than this and this paper—please reach out). My expectation is that the best processes in the future will have some smaller-scale components with very hands-on human facilitation, and also larger-scale components with vastly improved automated facilitation. We can invest now in ensuring that even the automated aspects of facilitation that enable broader participation are as humane as possible.
Beyond the deliberation process
The deliberation itself is only one piece of the much larger operation. There is much more to say about the gritty details of getting such a process off the ground, making it as neutral as possible on the issues at stake, and then having the desired impact on the organization and society. That will wait for another time since you've already read (or skimmed) over 3,000 words. Suffice it to say for now that I did not see Meta’s hands on the scale in any meaningful way (aside from the implicit impacts of choosing what questions to ask).
Looking forward
Meta did not just announce the results of one massive deliberative poll—it also announced that it would be running another one on generative AI. I've also been developing a set of recommendations to inform future processes for Meta and other organizations that are starting to explore this space. I hope to share more on this shortly, but for now, I’ll just throw one concrete proposal into the mix—I believe that it would be valuable to run a global high-quality deliberative process on the question of “Under what conditions should powerful foundation models be open sourced?” Meta is particularly well placed to help convene such a deliberation given its central role in open-sourcing such models.
Beyond just Meta and its process, above I alluded to a set of challenges where we will need additional investment to develop better deliberative processes. I expect that you'll hear more from me on that soon, as one of my current core goals is to help scope out the critical ingredients (and resources) we will need for effective transnational governance of transformative AI.2
Thanks to Jessica Yu, Andrew Konya, Kyle Redman, James Fishkin, and others for reading drafts and providing constructive feedback, regardless of differences of perspective on the conclusions.
Please share this with people who might find it interesting—and tag me if you share on social media: I’m @metaviv, aviv@mastodon.online, and Aviv Ovadya on LinkedIn.
Stay in the loop by following me on any of those platforms, reaching out at aviv@aviv.me, and of course, subscribing.
The deliberators in Meta’s Deliberative Poll were referred to as participants and this is also how they were referred to in the most recent book on Deliberative Polls; however, some deliberative polls do use the term delegate instead—a much better choice if they are being granted significant influence or power.
I’m using governance here to refer to policy and product decisions, and also alignment and democratic fine-tuning. Reach out if you would be interested in being involved in this scoping of critical ingredients and modules—or would like to see how your existing processes or systems relate to a framework for understanding such modules.