We are the Data Science Team at Idibon. We just published a study on the tone of conversation across Reddit communities that’s been covered by Vice, The Guardian, VentureBeat and others. AUA!

To be clear, there were NO keywords involved in labeling comments as Toxic or Supportive. We did use sentiment analysis (which is machine learning based, not keyword based) to choose which comments were picked for annotation, but it was not used to do any of the labeling, simply because the task is too complex for machine learning and requires human annotation. Each comment included in the study was labeled 3 times by human annotators, and comments were considered Neutral unless at least 2 out of 3 annotators agreed that the comment was Toxic or Supportive. The definitions for Toxic and Supportive were given in the original blog post.
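A minimal sketch of the 2-out-of-3 rule described above, for illustration only. The function name `aggregate_label`, the `min_agreement` parameter, and the exact label strings are assumptions made for this example, not the code used in the study.

```python
from collections import Counter

def aggregate_label(annotations, min_agreement=2):
    """Combine one comment's annotator labels into a final label.

    Hypothetical helper: a comment is Neutral unless at least
    `min_agreement` annotators chose the same non-neutral label.
    """
    counts = Counter(annotations)
    # With 3 annotators and a threshold of 2, at most one label can qualify.
    for label in ("Toxic", "Supportive"):
        if counts[label] >= min_agreement:
            return label
    return "Neutral"

final = aggregate_label(["Toxic", "Neutral", "Toxic"])
```

The point of the default is that a single annotator's opinion is never enough to move a comment out of Neutral.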

That's not to say that human annotators are infallible; they certainly can make mistakes. In fact, one of the areas Idibon specializes in is getting the most out of crowd annotation - our co-founders, Robert Munro and Tyler Schnoebelen, co-wrote a seminal paper on the topic. For the Reddit study, we employed a number of techniques to minimize annotator error, including:

1) Gold: we created 150 "gold" questions, which we personally annotated as Toxic or Supportive. In order to participate in the study, annotators first had to pass a quiz in which they needed to correctly annotate at least 8 out of 10 of these questions. From there on out, 1 gold question was embedded for every 14 other comments they annotated, and they needed to keep a high pass rate on these gold questions in order to continue annotating.

2) Pilots: we ran 2 pilot studies of about 1,000 comments each before running the final study with 10,000 comments. We monitored annotator agreement rates and looked at annotator feedback to refine our instructions, definitions, and test questions for the final study. In fact, it was through this process that we decided to break apart Toxicity into its component parts in order to clarify the task for our annotators.
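The gold-question mechanics in point 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the study's actual tooling: all function names are hypothetical, and the ongoing pass-rate threshold (`min_rate=0.8`) is a placeholder, since the post only says the rate had to stay "high".

```python
import itertools

def passes_quiz(answers, gold, min_correct=8):
    """Entry quiz: `answers` and `gold` map question id -> label.
    The annotator must match the gold label on at least `min_correct`."""
    correct = sum(1 for qid, label in answers.items() if gold.get(qid) == label)
    return correct >= min_correct

def interleave_gold(comments, gold_ids, every=14):
    """Insert one gold question after every `every` regular comments."""
    gold_cycle = itertools.cycle(gold_ids)
    queue = []
    for i, comment in enumerate(comments, start=1):
        queue.append(("comment", comment))
        if i % every == 0:
            queue.append(("gold", next(gold_cycle)))
    return queue

def may_continue(gold_results, min_rate=0.8):
    """`gold_results` is a list of booleans, one per embedded gold question.
    min_rate is an assumed value; the real threshold is not stated."""
    if not gold_results:
        return True
    return sum(gold_results) / len(gold_results) >= min_rate

queue = interleave_gold(list(range(28)), ["g1", "g2"])
```

Because the gold questions look like any other comment in the queue, an annotator cannot tell which items are being checked.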
