Inauguration speech analysis by IBM's Watson. Credit to Jeremy Waite.

This is true. I'm on mobile so didn't run the gamut.

This last area is a bit tricky. It could be as simple as classifying words, listening for those words, incrementing a counter for each class, then reporting the class with the highest count. Again, Watson would be overkill for this.
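To make the counter idea concrete, here's a rough sketch that works on a transcript rather than live audio. The word lists and the `dominant_emotion` name are made up for illustration; a real lexicon would be much bigger.

```python
import re
from collections import Counter

# Hypothetical word lists, invented for this example.
EMOTION_WORDS = {
    "angry": {"furious", "outrage", "hate"},
    "happy": {"great", "wonderful", "love"},
    "sad": {"loss", "grief", "tragic"},
}

def dominant_emotion(text):
    """Count word hits per category and return (top label, all counts)."""
    counts = Counter({label: 0 for label in EMOTION_WORDS})
    for word in re.findall(r"[a-z']+", text.lower()):
        for label, vocab in EMOTION_WORDS.items():
            if word in vocab:
                counts[label] += 1
    label, hits = counts.most_common(1)[0]
    # No hits at all means there is nothing to report.
    return (label if hits > 0 else None), dict(counts)

label, counts = dominant_emotion("We love this great, wonderful country despite the loss.")
print(label, counts)
```

This only sees vocabulary, so it misses sarcasm, negation ("not great"), and everything about delivery.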

Advanced NLP goes into the subtleties of human speech. This includes volume, speech speed, tone, patterns like emphasizing certain words, pausing, and so forth. And this is where ML can really shine with training - especially with something as powerful as Watson.

You give the software some labeled variables with base values, such as angry, shy, confident, nervous, happy, and sad. You play it some audio. You manually grade the audio against these variables on a scale from 0 to 1 (as granular as you'd like) so that it knows what each value should sound like. Then you train the software on that graded data. If it correctly classifies new audio on its own, you have a good model. If it's off, you tweak your values. Rinse and repeat. Watson would be phenomenal for something like this.
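The grade, train, check loop can be sketched with a toy model. To be clear, this is not Watson's API. It's plain logistic regression on a single label ("confident"), and it assumes volume, speech rate, and pause ratio have already been extracted from each clip and scaled to 0-1. The clips and their grades are invented.

```python
import math

# Hypothetical hand-graded clips: (volume, speech rate, pause ratio)
# paired with a manual "confident" score between 0 and 1.
CLIPS = [
    ((0.9, 0.7, 0.1), 0.9),
    ((0.8, 0.6, 0.2), 0.8),
    ((0.3, 0.4, 0.7), 0.2),
    ((0.2, 0.3, 0.8), 0.1),
]

def predict(weights, bias, features):
    """Squash a weighted sum of the features into a 0-1 score."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(clips, epochs=2000, lr=0.5):
    """Nudge the weights toward the manual grades, one clip at a time."""
    weights, bias = [0.0] * len(clips[0][0]), 0.0
    for _ in range(epochs):
        for features, target in clips:
            error = predict(weights, bias, features) - target
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

def mean_abs_error(weights, bias, clips):
    """Average gap between the model's scores and the manual grades."""
    return sum(abs(predict(weights, bias, f) - t) for f, t in clips) / len(clips)

weights, bias = train(CLIPS)
print("error on graded clips:", mean_abs_error(weights, bias, CLIPS))
print("score for a new clip:", predict(weights, bias, (0.85, 0.65, 0.15)))
```

If the error stays high or new clips score badly, that's the "tweak your values" step: regrade, add clips, or change the features, then train again. A real setup would also check against clips held out from training.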

/r/compsci