
Automatic text summarization of results #915

Open
colinmegill opened this issue Mar 20, 2021 · 11 comments

@colinmegill (Member) commented Mar 20, 2021

Thanks to @micahstubbs and @amyxzhang for spurring this.

Interpretability of Polis results has been, and continues to be, a critical issue for Polis as a platform, and a challenge to the usage of the method by various stakeholders and user archetypes. Interpretability is a hard and foundational problem that will require ongoing development: as more advanced analytic methods are added to the system, interpretability by those without backgrounds in data science and statistical methods will suffer, and the burden of producing interpretable results will fall increasingly on those running conversations.

In a sense, the summary the platform already has is the visualization: "here are the groups, what differentiates them, and what unifies them". But the visualization is necessarily limited to a handful of comments, and how those comments are chosen is opaque. We have always thought about interpretability in ‘tiers’ with varying levels of human-in-the-loop involvement: a simple list, the visualization, the procedurally generated report, human-written reports, news articles summarizing the results, and so on. Consider the following examples.

Biodiversity in NZ

  1. The comments chosen procedurally for the biodiversity visualization: https://www.scoop.co.nz/stories/HL1908/S00014/scoop-hivemind-protecting-and-restoring-biodiversity.htm
  2. The procedurally generated report https://pol.is/report/r3epuappndxdy7dwtvwpb
  3. And what PEP ultimately delivered to the government (PDF): https://www.scoop.co.nz/stories/PO1911/S00063/biodiversity-hivemind-report-plenty-of-common-ground.htm
  4. Direct link to PDF above: https://img.scoop.co.nz/media/pdfs/1911/Biodiversity_HiveMind_Final_Report_Scoop.pdf
  5. The debrief https://pep.org.nz/2020/12/01/doc-tries-to-restore-e-democracy/

Bowling Green Civic Assembly

  1. The comments chosen procedurally for the list, an option distinct from the visualization: https://pol.is/9wtchdmmun
  2. The report https://pol.is/report/r2xcn2cdbmrzjmmuuytdk
  3. The final report written by Columbia University & the University of Kentucky: http://www.civic-assembly.org/bowling-green-report/
  4. The news article from The Bowling Green Daily News https://www.bgdailynews.com/news/first-ever-civic-assembly-gives-residents-chance-to-be-heard/article_0a17254e-a8bb-5f4f-884f-9d0617ab9c08.html

The following image is from a town hall event in Bowling Green, KY, where the entire report was shared with citizens during the town hall, posted online, and printed and distributed:

[image]

DEMOS

  1. https://demos.co.uk/project/polis-and-the-political-process/
  2. Direct link to PDF: https://demos.co.uk/wp-content/uploads/2020/08/Polis-the-Political-Process-NEW.pdf

Engage Britain

https://engagebritain.org/your-opinion/results/

These examples provide reasonable references for what has been accomplished with human-in-the-loop summarization and the system as it exists at present.

In each of the cases above, dozens of hours were spent by multiple highly trained facilitators, statisticians, journalists, academics and/or data scientists to derive meaning and make comprehensible ‘what happened.’ Polis as a system produces and surfaces a latent space, but what it produces is an intermediate representation that really serves as an input to subsequent tasks and decision making. It is worthwhile to continue to bridge ‘the whole picture of what happened’, which Polis definitely assumes as an output, and ‘what will be communicated to busy and/or non-technical people who need a quick takeaway’, because doing so democratizes the method as a whole and demonstrates an enhanced ability to make meaning of the space. This will be greatly aided, we anticipate, by #217.

It’s possible that Polis will only ever be able to get so far, as it is effectively "platform-itized data science". Any learning in this direction, however, should serve to illuminate the boundaries of what is possible, and may generate ideas for future methods.

It is a worthwhile goal to bring interpretability down to the individual citizen, in the way a newspaper column might report on a sports match: this makes the entire exercise more accessible, reduces the burden of reporting out what happened, and increases confidence.

There is also potential benefit in 'automated conversations': if there is no human in the loop, conversations can be triggered procedurally, say with 3,000 randomly selected citizens, five years after a law was passed, to assess whether or not the law had an impact, should be revisited, etc. (cc https://twitter.com/marcidale); or news stories about a conversation that itself arose from a news story could be generated automatically (cc https://twitter.com/chrismoranuk).

NLP

A notable strength of the system, which has facilitated its spread around the world, has been the complete rejection of natural language processing. cc @ceteri

It seems attractive to consider ‘assembling’ narratives from data and translatable building blocks. If the building blocks are sufficiently atomic, perhaps they could be assembled and displayed given any detected browser language string.
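
As a minimal sketch of that assembly idea (not an existing Polis feature): pre-translated sentence fragments keyed by locale are filled with conversation statistics and joined, with the locale chosen from the browser's Accept-Language string. The fragment keys, statistics, and translations below are illustrative placeholders.

```python
# Hypothetical illustration: assemble a narrative from atomic, pre-translated
# building blocks, chosen by the browser's Accept-Language string.
# Fragment keys and statistics are made up for the example.

FRAGMENTS = {
    "en": {
        "groups": "Participants sorted into {n_groups} opinion groups.",
        "consensus": "{pct:.0%} of participants agreed that \"{statement}\".",
        "divisive": "The most divisive statement was \"{statement}\".",
    },
    "fr": {
        "groups": "Les participants se sont répartis en {n_groups} groupes d'opinion.",
        "consensus": "{pct:.0%} des participants étaient d'accord que « {statement} ».",
        "divisive": "La déclaration la plus clivante était « {statement} ».",
    },
}

def pick_locale(accept_language: str) -> str:
    """Pick the first supported language from an Accept-Language header."""
    for part in accept_language.split(","):
        code = part.split(";")[0].strip().split("-")[0].lower()
        if code in FRAGMENTS:
            return code
    return "en"

def assemble_summary(stats: dict, accept_language: str) -> str:
    """Fill each atomic fragment with conversation statistics and join them."""
    frags = FRAGMENTS[pick_locale(accept_language)]
    return " ".join([
        frags["groups"].format(n_groups=stats["n_groups"]),
        frags["consensus"].format(pct=stats["consensus_pct"],
                                  statement=stats["consensus_statement"]),
        frags["divisive"].format(statement=stats["divisive_statement"]),
    ])

print(assemble_summary(
    {"n_groups": 3, "consensus_pct": 0.87,
     "consensus_statement": "We need more public transit",
     "divisive_statement": "Parking should be free downtown"},
    "fr-FR,fr;q=0.9,en;q=0.8",
))
```

Because each fragment is atomic and purely data-driven, supporting a new language means translating the fragments once, with no NLP in the loop.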

@jucor (Contributor) commented Mar 20, 2021

A good starting point would be the natural-language generation work behind "The Automatic Statistician": https://link.springer.com/chapter/10.1007/978-3-030-05318-5_9
I know the authors; I'm sure they'll be happy to have a chat.
This predates large language models (LLMs), which might seem "dated" but is in my opinion a very good thing: the biases in LLMs (as highlighted by the parrots paper) are too high a risk for something as crucial as the takeaway from a public consultation.
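
For flavour, a toy sketch in that pre-LLM spirit (an illustration only, not the Automatic Statistician's actual method): deterministic, auditable rules inspect vote statistics and emit only the sentences whose patterns are actually present. The thresholds and field names are assumptions for the example.

```python
# Toy rule-based report generation: no learned language model, just explicit,
# auditable rules that turn vote statistics into sentences.
# All thresholds and field names here are illustrative assumptions.

def describe_statement(stats: dict) -> list[str]:
    """Return zero or more sentences describing one statement's voting pattern."""
    sentences = []
    agree, disagree, by_group = stats["agree"], stats["disagree"], stats["group_agree"]
    if agree >= 0.7:
        sentences.append(f'Broad agreement ({agree:.0%}) on "{stats["text"]}".')
    if disagree >= 0.7:
        sentences.append(f'Broad disagreement ({disagree:.0%}) with "{stats["text"]}".')
    spread = max(by_group.values()) - min(by_group.values())
    if spread >= 0.5:
        hi = max(by_group, key=by_group.get)
        lo = min(by_group, key=by_group.get)
        sentences.append(
            f'"{stats["text"]}" splits the groups: {hi} agrees at {by_group[hi]:.0%}, '
            f'{lo} at only {by_group[lo]:.0%}.'
        )
    return sentences

example = {
    "text": "Pesticide use should be phased out",
    "agree": 0.55, "disagree": 0.30,
    "group_agree": {"Group A": 0.85, "Group B": 0.25},
}
for line in describe_statement(example):
    print(line)
```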

@colinmegill (Member, Author) commented:

Generally, I love this and am printing it to read; thanks for sharing. Obvious point to restate, but this is in line with previous choices to build on PCA over more black-box options.

@jucor (Contributor) commented Mar 20, 2021 via email

@ThenWho commented Mar 20, 2021

A solid and worthwhile goal 💯. I'm skeptical as to how close to the target we can shoot, but this should be the aspiration, 100% yes 👍

I wouldn't dismiss further visualizations as intermediate waypoints towards this goal, though. Visualization is a huge strength of Polis, after all. For example, keeping snapshots of the conversation as it unfolds and visualizing a meaningful subset of them at the end. Or calculating and visualizing refinement-type relationships between statements, i.e. "statement A is a refinement/rephrasing/evolution of statement B", using well-understood, explainable methods such as Levenshtein distance or similar ( #913 (reply in thread) ).

These kinds of historic/timeline data could later be used for textual summaries too.
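
A minimal sketch of that kind of refinement detection, assuming the python-Levenshtein package; the similarity threshold and example statements are illustrative, not Polis defaults.

```python
# Flag pairs of statements that look like refinements/rephrasings of each other,
# using plain, explainable edit distance (normalized Levenshtein similarity).
# The 0.75 threshold is an illustrative assumption, not a Polis constant.
import Levenshtein  # pip install python-Levenshtein

def refinement_pairs(statements: list[str], threshold: float = 0.75):
    """Yield (i, j, similarity) for statement pairs above the similarity threshold."""
    for i in range(len(statements)):
        for j in range(i + 1, len(statements)):
            sim = Levenshtein.ratio(statements[i].lower(), statements[j].lower())
            if sim >= threshold:
                yield i, j, sim

statements = [
    "We should plant more native trees in the city",
    "We should plant more native trees in our city centre",
    "Dog parks need better lighting at night",
]
for i, j, sim in refinement_pairs(statements):
    print(f"{sim:.2f}: statement {i} <-> statement {j}")
```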

@amyxzhang commented Mar 22, 2021

Thanks for writing this up @colinmegill! Given my orientation as an HCI researcher, I would first want to understand the goals and experiences of the people who are doing the work to create a report. In my previous work, I've interviewed people who generate reports and make decisions from deliberative processes, specifically Wikipedia RfC closers (written up in my thesis starting around page 94) and town hall meeting reporters (in this CSCW paper). There are a number of considerations people balance and trade off as they make editorial decisions while writing, including transparency, inclusivity, and brevity, as well as different audiences they write for. I suspect there are parts of the work that are tedious and highly automatable, and parts that must or should be conducted by humans, whether that's a centralized report author or additional collective signaling by participants.

@ceteri commented Apr 5, 2021

I'd like to help. From what I've seen, you've got relatively brief, semi-conversational snippets of text, which are obtained from comment threads. Is that roughly correct as a description? From that, I don't quite see where text summarization comes in.

OTOH, if you had annotations for these comments, then it makes sense to generate some text-ish report/narrative describing the aggregates, segmentation, trends, and so on. That would entail a different kind of tooling. Definitely, reaching a "well annotated" state is expensive, and fleeting :) Some of the HITL approaches for active learning and weak supervision can help cut the costs dramatically, and there can be ways to leverage self-supervised learning to make this less expensive too.
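
To make "active learning" concrete, here is a minimal uncertainty-sampling sketch using scikit-learn (an illustration only; the topic labels, comments, and selection size are placeholders rather than any Polis annotation schema):

```python
# Minimal active-learning loop sketch: label a few comments, train a cheap model,
# and ask a human to annotate only the comments the model is least sure about.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled = [("We need more bike lanes", "transport"),
           ("Protect the wetlands from development", "environment"),
           ("Buses should run later at night", "transport"),
           ("Plant native species along the river", "environment")]
unlabeled = ["Extend the tram line to the suburbs",
             "Restore the old-growth forest reserve",
             "Lower fares for students"]

texts, labels = zip(*labeled)
vec = TfidfVectorizer().fit(list(texts) + unlabeled)
clf = LogisticRegression().fit(vec.transform(texts), labels)

# Uncertainty = 1 - max predicted class probability; annotate the top-k uncertain.
proba = clf.predict_proba(vec.transform(unlabeled))
uncertainty = 1.0 - proba.max(axis=1)
for idx in np.argsort(-uncertainty)[:2]:
    print(f"please annotate: {unlabeled[idx]!r} (uncertainty {uncertainty[idx]:.2f})")
```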

For the articles listed, is the intent to work with comments from them? Are you parsing these articles, representing them into some larger structure (e.g., entity linking)?

Summarizing a collection of articles makes sense. For an excellent example, see these fully generated COVID-19 reports https://covid19primer.com/dashboard

Otherwise, while I do understand natural language work, modeling the semantics of parsed text, summarization, and other areas of language generation, I'm not quite understanding what the ask is here. I guess what I'm asking is: from a "product" perspective, how are the comments and articles intended to generate some result? What's the use case definition, other than "do stuff with this"? :) That seems to be missing above, or perhaps I've misunderstood much of it.

One point to consider is that there are a couple of categories of natural language work referenced above:

  • parsing/summarizing articles and managing the result (our team and our partners do lots of this)
  • understanding conversational threads (check with RASA, etc.)

FWIW, it's good to be cautious about mixing and matching approaches. The semantics of these categories of language have vastly different properties, and their rhetorical structure is also generally quite different. That can lead to trouble, although there can be ways around it. Also, it's perhaps wise to be a bit skeptical about promises made in research papers; those are mostly about beating benchmarks to get the authors published in NeurIPS, ACL, etc., while the likelihood of running their published code tends to be quite low.

Also, NLP-progress is a good source for checking SOTA in related research, such as question answering http://nlpprogress.com/english/question_answering.html

There are also tools that consider some intersection of the above. For instance, a new API from IBM for identifying claims and their support from within a corpus – or within conversational snippets of text – then generating narratives from that: https://early-access-program.debater.res.ibm.com/terms Think of automating debate, based on a corpus of discussion about topics.

@NewJerseyStyle (Contributor) commented:

@colinmegill I love the idea of automated conversations

(Quotes the 'automated conversations' paragraph from the issue description above.)

@NewJerseyStyle (Contributor) commented:

(Quotes @ceteri's comment above in full.)

@ceteri How is the progress there? I am doing a demo on summarization with the open data collected from Polis conversations. I saw this issue and wondered whether you have already made some progress, say a question-answering model, so that I could ask a question about a topic and figure out what is going on without reading all the comments, or ask the model how people think about the topic?
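
As a rough sketch of what that could look like with an off-the-shelf extractive question-answering model from the Hugging Face transformers library (the model choice and comments are illustrative, and extractive QA only returns a single supporting span, so this is at best a starting point):

```python
# Extractive question answering over a batch of comments: concatenate the comments
# into a context and let a pretrained reader pull out a supporting span.
# Model name and comments are illustrative; any extractive QA checkpoint would do.
from transformers import pipeline

comments = [
    "Pesticides are wiping out native insects and should be phased out.",
    "Farmers need support to transition away from intensive pesticide use.",
    "Predator control is more urgent than pesticide rules for protecting birds.",
]
qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")
answer = qa(question="What do participants say about pesticides?",
            context=" ".join(comments))
print(answer["answer"], answer["score"])
```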

@compdemocracy deleted a comment from @patcon on Dec 16, 2022
@colinmegill (Member, Author) commented:

https://www.openrightsgroup.org/publications/democratic-innovations-polis-and-the-political-process/
