The spring semester is quickly drawing to a close, and I was not able to get a chatbot working with Jeb. Though my end goal wasn’t met, I don’t feel as if my time was wasted. Shortly after starting work on this project, I found an API that would let me interact with a few different chatbots from Java, and I added it as a dependency to Jeb.
At this point, I was only able to communicate with the chatbots by typing to them, but my speech synthesizer could successfully speak their responses back to me. The first bump in the road came, as I expected, when I tried to hook the chatbots up to the speech recognizer. As detailed in some of my previous blog posts, the version of Sphinx4 I’m currently using requires a grammar file that explicitly defines the words and phrases Sphinx4 should listen for. Since a conversation with a chatbot is fairly dynamic, I had to find an alternative.
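To give a sense of why this is a problem: Sphinx4 grammar files use the JSGF format, where every recognizable phrase has to be spelled out ahead of time. A minimal example might look like the following (these particular phrases are just illustrative, not Jeb’s actual grammar):

```
#JSGF V1.0;

grammar commands;

// Only these exact phrases can ever be recognized.
public <command> = hello jeb | what time is it | goodbye;
```

Anything the user says outside this fixed list simply won’t be recognized, which is why a free-form chatbot conversation doesn’t fit this model.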
I reread some of the Sphinx4 documentation and saw that a new version was out which supported something new: dictation. As opposed to explicitly listening for certain commands, with dictation Sphinx4 would be able to listen for anything. This sounded like a perfect solution. I got the new version of Sphinx4 working with my project, but the accuracy was so low that it made my program unusable. After reading more of the documentation, I found that I could do something called acoustic model adaptation to improve Sphinx4’s accuracy. After doing this adaptation, I did see a noticeable increase in the recognizer’s accuracy; however, it was still low enough that my program remained unusable. I’m not sure why my accuracy was so bad. Sphinx4 is supposed to be the best free speech recognizer for Java, and many people online speak highly of it. It’s a definite possibility that I set something up incorrectly or did the acoustic model adaptation wrong, but it’s hard to be sure.
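For anyone curious what the switch to dictation looks like in code, here’s a minimal sketch using the newer Sphinx4 API: instead of setting a grammar path and grammar name, you point the configuration at a statistical language model (the bundled US English model in this sketch). The class and variable names are my own for illustration, not Jeb’s actual code.

```java
import edu.cmu.sphinx.api.Configuration;
import edu.cmu.sphinx.api.LiveSpeechRecognizer;

public class DictationSketch {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();

        // Bundled US English acoustic model and pronunciation dictionary.
        configuration.setAcousticModelPath(
                "resource:/edu/cmu/sphinx/models/en-us/en-us");
        configuration.setDictionaryPath(
                "resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");

        // Dictation mode: use a statistical language model instead of
        // a fixed JSGF grammar, so (in theory) anything can be recognized.
        configuration.setLanguageModelPath(
                "resource:/edu/cmu/sphinx/models/en-us/en-us.lm.bin");

        LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
        recognizer.startRecognition(true); // true = clear any buffered audio
        // Blocks until an utterance is heard, then prints its best guess.
        System.out.println("You said: " + recognizer.getResult().getHypothesis());
        recognizer.stopRecognition();
    }
}
```

Since this needs the Sphinx4 jars and a microphone, it’s a configuration sketch rather than something you can drop in and run; but the key difference from the grammar-based setup is just that one `setLanguageModelPath` call.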
By this point, time was running out. It was clear that I wasn’t getting anywhere with Sphinx4, so I decided to look for other options. I didn’t find much. The few resources that I did find either didn’t work at all or weren’t free.
So where does that leave me now? Well, I think I’ve decided to look outside of Java. If I can find a better speech recognizer in another language, I think it would be worthwhile either to hook it up to Jeb or even to port Jeb to that language. By itself, Jeb’s code is extremely simple; the third-party API configuration is the hard part. My college career may be ending in five days, but I think my work with Jeb will continue for years to come. The biggest thing I’ve learned from this project is that, at least among free and open source options, speech recognition still has a long way to go. There are some great commercial solutions out there, but, for now at least, those are kept behind locked doors. However, I don’t think it’ll be too long before the open source software catches up. I look forward to that day.