Community-led Voice Assistant Integration for Ubuntu Desktop

As we transition to an AI-first world, more and more devices come with some form of personal AI assistant. Windows 10 has Cortana, Apple has Siri, and Amazon has Alexa, so Ubuntu Linux is currently behind the competition in this race. I think the community needs to invest more resources in innovative features like this. I hope that for 18.10 or 19.04 Canonical will introduce some form of AI integration. I personally would like the Google Assistant. However, Canonical has clearly shifted its efforts away from the regular desktop session of Ubuntu. Thus, it is up to US, the community, to take the wheel and continue Ubuntu’s great reputation. I’d like to start a discussion below about this. Also, if there are any critics or official Ubuntu developers who know of technological roadblocks that prevent this addition, please let us know. :wink:

Here’s a concept of how we can give users the choice to use it or not.

Considering how some people reacted when Canonical introduced a data request at Ubuntu installation time, I fear that many people won’t take this very well :smiley:

Personally I use assistants only on the phone, but have you ever checked the Watson project?

I don’t know anyone who uses a personal assistant on their desktop. Rarely do I see people use them on their phones. “All the others have it” and “it’s the future” aren’t really convincing arguments…

If you believe in this and want to push this forward, it might help by making it clear what benefits these have for users. Do users actually use this or want this?

2 Likes

Hey Merlijn,
I think the real question for this discussion is “Why shouldn’t we implement it?”.

For the average user, voice input is an excellent way to interface with the computer instead of punching in terminal commands like you and I are used to. For example, a good AI assistant will have application integration, so I can ask it, “Play my late night coding playlist on Spotify.” Think of it this way: just as we enter a terminal command with arguments to increase speed and accessibility (for example, sudo apt install gedit), the user wants quick access to the hidden information. In that command, we accessed the package manager, entered authentication, and initiated a download of a specified program all in one instance. Currently, the average user cannot do that without interfacing through a terminal emulator.

A voice assistant will provide this same capability in a user-friendly fashion, without having to navigate through the dreaded GUI to search for & access a package to install. Then they could, in theory, ask the assistant to “install gedit” and watch the results. Remember, everyone thought GUIs were useless until they brought computing to the billions, and some still think that terminals are the most efficient way to interface with a machine, myself included. Because of these reasons, I feel a voice assistant is useful for accessibility, and “innovation”. Not to mention, many people absolutely love using their Amazon Alexa, etc. In conclusion, an AI-powered voice assistant, like the Google Assistant, should be implemented as a new way to interface with an Ubuntu Linux device.
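To make the “install gedit” idea concrete, here is a minimal sketch of the step that turns already-transcribed speech into a command. It assumes speech-to-text has happened elsewhere, and the intent table and the function name utterance_to_argv are hypothetical, made up for illustration. It only builds the argument list and does not run anything.

```python
import re

# Hypothetical intent table: each entry pairs a spoken pattern with an
# argv template. The package-name pattern is deliberately strict so that
# nothing other than a single package name can slip into the command.
INTENTS = [
    (re.compile(r"^install (?P<package>[a-z0-9][a-z0-9+.-]*)$"),
     ["apt", "install", "{package}"]),
    (re.compile(r"^remove (?P<package>[a-z0-9][a-z0-9+.-]*)$"),
     ["apt", "remove", "{package}"]),
]


def utterance_to_argv(utterance):
    """Map transcribed speech to an argv list, or None if nothing matches."""
    # Normalise case and drop trailing sentence punctuation from the transcript.
    text = utterance.strip().lower().rstrip(".!?")
    for pattern, template in INTENTS:
        match = pattern.match(text)
        if match:
            # Fill the template with the captured package name.
            return [part.format(**match.groupdict()) for part in template]
    return None


if __name__ == "__main__":
    print(utterance_to_argv("Install gedit"))
    print(utterance_to_argv("Play some music"))
```

The sketch leaves out sudo on purpose: how the user authenticates is a separate question, and anything the table does not recognise simply returns None instead of being guessed at.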

See also:
Pew Research poll discovers nearly half of Am…

I actually posted a concept image of the settings page. Using the assistant at all should require a Google sign-in (an opt-in option). As passionate as I am, I believe in giving people a choice :wink:

I like the idea of an assistant, and I use the Google one on my Android phone, but I think Ubuntu could promote an assistant more focused on users’ privacy. Maybe something to do with Mozilla and its “Common Voice” project?

https://voice.mozilla.org/

1 Like

I raised this feature request once. Will pointed me to Mycroft.

They currently have some voice-based actions (skills) which work on Ubuntu.

If you are technically capable, you can create and add more skills to this repo.

If Mycroft and Ubuntu decide to work together, then I believe the outcome would be great.

3 Likes

My doubts are only about the use of voice. AI can be used in a GUI as well, and it would help a lot.

I think the real question for this discussion is “Why shouldn’t we implement it?”.

Sorry if I say something you already know, since I don’t know how much experience you have in software development and in AI specifically.

I assume that you meant something like “it is evident that we need it, so there is no reason to explain why”. However, before investing time and effort in anything, a precise reason needs to be written down.

“Play my late night coding playlist on Spotify.”

This requires Spotify, like any other software, to expose an API so that this and any other command can be executed through interprocess calls instead of through user interaction. Spotify might have one already; I didn’t check.
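As an illustration of what such an interprocess call could look like on a Linux desktop, here is a small sketch built around MPRIS, the freedesktop D-Bus interface that many media players implement. Whether the Spotify client exposes it under the name “spotify” is an assumption here, and mpris_command is a hypothetical helper. The sketch only builds a dbus-send command line and does not send anything.

```python
# Transport controls defined by the MPRIS Player interface.
MPRIS_METHODS = {"Play", "Pause", "PlayPause", "Next", "Previous", "Stop"}


def mpris_command(player, method):
    """Build a dbus-send argv that calls an MPRIS method on a media player.

    `player` is the last part of the bus name, e.g. "spotify"
    (an assumption: the player must actually register that name).
    """
    if method not in MPRIS_METHODS:
        raise ValueError(f"unsupported MPRIS method: {method}")
    return [
        "dbus-send",
        "--session",
        "--type=method_call",
        "--print-reply",
        f"--dest=org.mpris.MediaPlayer2.{player}",
        "/org/mpris/MediaPlayer2",
        f"org.mpris.MediaPlayer2.Player.{method}",
    ]


if __name__ == "__main__":
    print(" ".join(mpris_command("spotify", "PlayPause")))
```

This covers only simple controls such as play and pause. Picking a specific playlist by name, as in the quoted example, would need more from the application than this sketch assumes.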

In that command, we accessed the package manager, entered authentication, and initiated a download of a specified program all in one instance.

This raises the question of what level of security is involved. I do not expect anyone to speak their passwords aloud.

Currently, the average user cannot do that (install an application) without interfacing through a terminal emulator.

There is GNOME Software.

A voice assistant will provide this same capability in a user-friendly fashion

This is way easier said than done. In a general-purpose environment like a desktop, people usually don’t even know what they can ask for or how to ask it (in fact, Google Now on Android has a sort of always-on guide to show what it can do, but it is still not complete, and a phone is more limited than a desktop).

On the other hand, a GUI is a lot more common and easier to use (but just because people have already been using it for years).

For the average user, voice input is an excellent way to interface with the computer instead of punching in terminal commands like you and I are used to.

The average user is not expected to use the terminal at all. When people are forced to use it (rather than choosing it because it might be faster or offer more control), that is a flaw in the desktop, from my point of view.

2 Likes

Because time, energy and resources are limited. If by “we” you mean “the existing Ubuntu contributors”, then you must know that you are talking about a rare species of homo-I’mAlreadyDoingWayTooMuchius. Most contributors do this in their already-limited spare time, and most contributors have a long list of stuff they want to do, of which they can only actually do the top 2-3 things. There are sooooo many things that would obviously make Ubuntu a lot better that can’t be done because of lack of time and resources. If you want Ubuntu contributors to spend time on your idea, you’ll need to convince them that it is more important than all the other things on their “list of things to do”.

The second option is to contribute your own time. You already convinced yourself, so that’s a plus! Thankfully, becoming an Ubuntu contributor is extremely easy, and we’re happy to help you get started.

This is actually an issue with voice assistants. Just like the commandline, you need to know what commands are available and how to say those commands before you can do anything. This is the advantage of a GUI: it shows you features you didn’t know existed and it guides you to do things when you forgot how. So a voice assistant and the commandline are pretty similar in that respect. The commandline is a little bit less accessible, but it can be used in a lot more situations, since every single program already interfaces with the commandline and you don’t need to talk. (Many people would rather not talk to their computers, especially when they are not alone.)

I disagree that this is a very user-friendly way to interface with technology. At least not with the current very limited intelligence of assistants.

This is good research and shows that it’s useful, but mainly because it’s hands-free. I think the biggest use case for these assistants is in cars and while cooking, when you can’t use your hands.

  • Note that the research shows people find it “more natural” not “easier”.
  • Also note that even with giants like Apple, Google and Amazon behind them, these assistants are reported to respond to commands only “most of the time” (39%) or “sometimes” (42%). Reliability is a very important feature for desktops and it seems assistants can’t really provide that yet.
  • And lastly, only 14% of the public has used a voice assistant on a computer or tablet, while 8% say they use them on a stand-alone device such as an Amazon Echo or Google Home. Even though THE most popular desktop operating system vendors have been shipping a voice assistant with their desktops and tablets for YEARS, only 14% have actually used them.

To me, this research shows that voice assistants are useful for cases where people can’t use their hands, but people clearly still prefer traditional interfacing methods (mouse, touch and keyboard) over voice assistants when they can, especially for desktops.

5 Likes

True, lack of time and resources is an obvious downside of open-source (“free”) software. I just wanted to share this idea with the community because Windows 10 already has AI, along with every other major OS, but I guess if I need an AI assistant I’ll just use those platforms instead. Considering it took Ubuntu 7-8 years to implement a UI redesign, what was I thinking, asking for an actually innovative feature? Not to mention, the only reason it has any form of redesign is because it is backed by a real company, unlike the Communist Linux alternatives. Because at the end of the day, you get what you pay for.
Thanks for your help.
We should now get @ian-weisser in here before this discussion gets too off-topic.

Remember (Ubuntu in 2015)

VS

Windows 10 in 2015

Windows 10 in 2018

Thanks to everyone for a great set of perspectives and context on the current real challenges.

Since the topic seems to have run its course for this month…