How feasible would it be to integrate a digital assistant like Mycroft into the phone? I think Mycroft has the potential to be a stronger and more useful assistant than Siri or Google Assistant. Is there any possibility to work with them to have it work with the phone?
Technically it should be quite simple. The issue is that Mycroft uses remote services to do the voice processing, speech <> text and intent parsing. Remote services of this nature don’t fit with the Purism goals. Running locally is always an option but then it’s a question of battery usage and processing power of the device.
I think this would be very cool if we could get this to work.
I have been playing with mycroft for the past couple weeks with the goal of running it on this phone.
You do not need to use their service. I have successfully run the entire mycroft system on my laptop, so it should be trivial to run it on your own hosted server.
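For what it's worth, Mycroft reads its settings from mycroft.conf, so pointing a device at your own hosted backend should mostly be a matter of overriding the server section. A rough sketch (the URL is a placeholder for a self-hosted instance, and I haven't verified every key):

```json
{
  // mycroft.conf override — point the device at a self-hosted
  // backend instead of the default Mycroft Home service.
  // "https://mycroft.example.lan" is a placeholder URL.
  "server": {
    "url": "https://mycroft.example.lan",
    "version": "v1",
    "update": true,
    "metrics": false
  }
}
```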
The only weakness in the mycroft setup so far is that it relies on google for speech to text. Supposedly, that is pluggable with other engines, but promises to be painful.
I do think that mycroft is the best bet among all the other options out there. Their approach to solving the problem is very pragmatic. Build what you can, then start chipping away at the hard problems. Their goals are admirable and afaics in line with purism.
I am planning to keep experimenting with mycroft to see what can be done with it.
This is definitely the Purism way. E.g. distribute a laptop with the Intel ME running and an AMI BIOS, then port Coreboot, disable the ME, and then work on the Intel FSP.
That, unfortunately, should be an absolute deal breaker. Streaming all of our commands to google servers seems absolutely against all “freedom” and “privacy” goals. At this point we might as well use Android.
The difficulty of speech <> text + floss is just a part of the topic, which is AI+floss.
GAFAMs have put enormous resources into training neural networks for speech synthesis and recognition.
While AI is often claimed to be the most open-source-friendly field in computer science, what is actually floss there? The neural network topology and creation tools? Yes. Its inference? Sure. But what about the huge training sets in between, and the processing power needed for training, which all but excludes any individual? Data may become free; I'm not so sure about hardware.
The quality of your final AI software reflects the quality of the data you fed it. That problem might be overcome by the democratization of (yes, google inside…) self-training neural nets, and eventually AI building AI.
Yes, using google’s speech to text is not exactly ideal. However, you can substitute another engine; it is just that google’s is the default.
Found a thread in their forum at https://community.mycroft.ai/t/replacing-cloud-services-for-stt-by-local-ones/1692/41
Probably more discussion there if you want to really dig.
The point is that mycroft is pluggable and there is no need to use google services.
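To make that concrete: swapping the STT engine comes down to changing the stt section of mycroft.conf. A hedged sketch, assuming a locally hosted DeepSpeech-style STT server plugin is installed (module name and URI here are illustrative, not guaranteed):

```json
{
  // mycroft.conf override — replace the default google STT
  // with a self-hosted engine. "deepspeech_server" and the
  // localhost URI are example values for a local STT service.
  "stt": {
    "module": "deepspeech_server",
    "deepspeech_server": {
      "uri": "http://localhost:8080/stt"
    }
  }
}
```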
Okay, so we provide Mycroft minus google. Speech recognition won’t be the best, but it will be better than nothing. I’m liking this idea.
Well, I had missed this October post: Google Assistant currently relies on WaveNet, a DeepMind-based convolutional neural net, for speech synthesis.
To cut the off-topic short: do you know where one can look for actively developed floss TTS/STT engines?
I am collaborating by validating voice donations in every bit of free time I can find.
It would be great to plug this Common Voice STT/TTS project into Mycroft and get Mycroft onto Librem 5 phones.
Mimic may address the Text-to-speech problem
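Mimic is Mycroft's own TTS engine and runs entirely on-device, so selecting it avoids any cloud round-trip. A minimal mycroft.conf sketch (the "ap" voice name is just an example and may vary by install):

```json
{
  // mycroft.conf override — use the local Mimic engine for TTS.
  // "ap" is an example voice name; available voices vary.
  "tts": {
    "module": "mimic",
    "mimic": {
      "voice": "ap"
    }
  }
}
```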
Has anyone had experience with Snips? I’ve been playing with it on and off for a while and would consider it a better option. Snips.ai
It runs on-device.
The Raspberry Pi version will probably fit the Librem 5.
I didn’t check if it is open source.
I am looking at wit.ai:
Used by over 180,000 developers
Wit.ai makes it easy for developers to build applications and devices that you can talk or text to. Our vision is to empower developers with an open and extensible natural language platform. Wit.ai learns human language from every interaction, and leverages the community: what’s learned is shared across developers.
It is used in a Telegram bot, @voicybot.
The source is on GitHub.
What I like is the offline version and privacy focus.