Speak, Google! (speech/audio to home assistant)

jtara92101 · February 27, 2017

I'm exploring how to make the Home Assistant speak asynchronously. Thought I would start a thread, since this is of general interest to those with an ISY and Home Assistant.

The most obvious use case is to speak some timer, alert, alarm, etc. (My first use case is to let me know when my espresso machine is ready, based on a time-out from turn-on using an OutletLinc with sense enabled.)

Since the Home Assistant can serve as a Chromecast Audio device, there should be several options for doing this locally. (Would be nice to avoid the cloud if possible, but if I found a cloud service, it certainly would be convenient for starters.)

I will look over what others have done with Alexa to get ideas.

I'm most interested in something that I can install on macOS or on Linux on my router. I'm sure many others will be interested in Raspberry Pi solutions.

I did a few searches, but mostly found:

- apps for casting macOS system audio (needs Soundflower)

- plugins for various home automation apps that can TTS and cast

I guess the latter could be adapted to stand-alone.

Suggestions?

jtara92101 · February 27, 2017

A Node.js Google Home notifier:

https://github.com/noelportugal/google-home-notifier

Jimbo.Automates · February 27, 2017

I had played with this, but I didn't find a way to continue what was already playing after the announcement, since I use it a lot for playing music. Hopefully there's a way to do that? It would be easy to write a Python script to do it; I was just using curl to test.


jtara92101 · February 27, 2017

Jimbo.Automates said: "I had played with this, but I didn't find a way to continue what was already playing after the announcement, since I use it a lot for playing music. Hopefully there's a way to do that? It would be easy to write a Python script to do it; I was just using curl to test."

Can this be done with Alexa? That is, make an announcement, and then return to previous audio source?

I think this is something that these devices will need to address. I commented on the general need peripherally a while back, with regard to my AV amplifier. An emerging need for AV amps is to have an input that will "override" whatever the current audio source is. Like a DJ mic. If my amp had such a feature, I could plug a Chromecast Audio device into it for announcements while playing audio through the amp.

It is certainly something I could kludge up for specific use cases, e.g. command the AV receiver to switch inputs, then command the assistant to cast to a dongle on the AV receiver. (Or just mute the AV receiver so that the assistant can be heard.) But it needs to be more integrated.

Of course, in the future, many AV amps will have built-in chromecast audio, Alexa, Siri, Cortana, etc. But that is in the future. And there will still need to be some way to interrupt and continue.

I think the industry still needs to fully think-through how these devices will be used. At minimum, there needs to be some small number of "classes" of audio, e.g.:

- media

- announcements

- dialog

Dialog and announcements need to be able to cut-through or pause media, and then resume media when done. And it would be great if further configuration were possible. e.g. I don't want to hear THESE announcements if I am watching a movie, or between these hours, etc. Really what is missing is a specific announcement service, and ways to configure it.
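To make the idea concrete, here is a rough Python sketch of what such a configurable rule might look like. Nothing like this exists in Google Home as far as I know; the class names, the priority ordering (dialog above announcements above media), and the quiet-hours rule are all assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import time
from enum import IntEnum


class AudioClass(IntEnum):
    # Assumed priority order: higher values may cut through lower ones.
    MEDIA = 0
    ANNOUNCEMENT = 1
    DIALOG = 2


@dataclass
class QuietHours:
    start: time
    end: time

    def covers(self, now: time) -> bool:
        if self.start <= self.end:
            return self.start <= now < self.end
        # The window wraps past midnight, e.g. 22:00 to 07:00.
        return now >= self.start or now < self.end


def may_interrupt(incoming: AudioClass, playing: AudioClass,
                  now: time, quiet: QuietHours) -> bool:
    """Decide whether `incoming` audio may pause or duck what is `playing`."""
    # Announcements are suppressed entirely during quiet hours.
    if incoming is AudioClass.ANNOUNCEMENT and quiet.covers(now):
        return False
    return incoming > playing
```

With quiet hours of 22:00 to 07:00, an announcement would cut through media at noon but be dropped at 23:30, while dialog would still get through at any hour.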

Although Google Home "hears" very well, she doesn't hear so well over a movie. It would be great if she could have especially good ears for the trigger phrase, and once heard override media.

I note that Google Assistant already does this for dialog. If you are playing music through the assistant, and start a dialog (e.g. ask the weather) the volume level of the media is reduced during the dialog.

Edited February 27, 2017 by jtara92101

Jimbo.Automates · February 27, 2017

I never looked at how this was done with Alexa, so I'm not sure. And yes, agreed, but it is a deal killer for me until it can be figured out.


jtara92101 · February 27, 2017

So, I just ran some tests. I installed the google-home-notifier npm module and started the little example server it includes, after changing the port.

It uses Google TTS service, so it is NOT a purely local solution. The server runs on my Mac Mini, receives text on a port, sends it off to Google TTS which sends back a .mp3 file. It then sends the MP3 file to the Google Assistant.
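The flow just described can be sketched roughly as follows in Python. This is not the google-home-notifier code (which is JavaScript); the function name, the `text` form field, and the two callables standing in for the TTS round trip and the cast step are assumptions made for illustration.

```python
from typing import Callable, Tuple
from urllib.parse import parse_qs


def handle_notification(
    body: bytes,
    synthesize: Callable[[str], bytes],
    cast: Callable[[bytes], None],
) -> Tuple[int, str]:
    """Parse a form-encoded POST body, turn its text into audio, and cast it.

    `synthesize` stands in for the request to the online TTS service and
    `cast` for sending the resulting MP3 to the speaker; both are hypothetical.
    Returns an (HTTP status, message) pair for the caller to send back.
    """
    fields = parse_qs(body.decode("utf-8"))
    text = fields.get("text", [""])[0].strip()
    if not text:
        return 400, "missing 'text' field"
    mp3 = synthesize(text)  # a network call in the real setup, so not local-only
    cast(mp3)
    return 200, f"announced: {text}"
```

Keeping the TTS and cast steps as plain callables means either one could be swapped out (a different TTS provider, a different cast target) without touching the request handling.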

The voice is not great, which is strange, given it's Google TTS. One would think it would be as good as the assistant's own voice, but it is not. It doesn't have the smoothness. I think Google is giving the Assistant voice generation some extra AI love.

I did some experiments to see if I could prompt Google Home to continue playing previous music, and failed. I asked it to play some elevator music, and it picked an upbeat-but-sedate channel. I sent a notification, and the music just stopped. I tried "continue playing music" and it told me there was nothing playing. I tried "play previous channel" and it did indeed play some music, but it was a different channel, which it claimed was "soft rock". But, in fact, it was VERY raucous and loud hard rock.

Which pointed out a problem or two.

- It could not hear me over the loud music. I had to tap the top of the device to stop it. One would think it could subtract its own output from the mic input!

- There is no volume equalization between sources, not even between Google's own music channels.

It is a very new product, and I'm sure these things will be worked-out.

In any case, I have a working solution for notifications in case Home won't be used for playing music.

I will research whether there is some way to change the voice, get better TTS processing, etc. Realize, though, that part of the solution has nothing to do with Google Home, as it is just using the Google TTS service separately to make an MP3.

If you just want to cast something, I think you should be able to find plenty of Python or node.js solutions. The little server that comes with google-home-notifier could be adapted, to, say, accept the name of a local MP3 file instead of text to TTS, and then just skip the TTS part.

Actually, it would make sense to take that code, and just expand it so that you can either give it text, or the location of an MP3.
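A minimal Python sketch of that "text or MP3" idea, for illustration only. The field names `text` and `url` and the helper `tts_url_for` are hypothetical; they are not taken from google-home-notifier.

```python
from typing import Callable, Mapping


def resolve_audio_url(
    fields: Mapping[str, str],
    tts_url_for: Callable[[str], str],
) -> str:
    """Return the URL the cast device should play for one request.

    A request carrying `url` skips TTS entirely; one carrying `text` goes
    through `tts_url_for`, a hypothetical helper that returns the URL of the
    synthesized MP3.
    """
    if fields.get("url"):
        return fields["url"]
    if fields.get("text"):
        return tts_url_for(fields["text"])
    raise ValueError("request needs either 'text' or 'url'")
```

Everything after this point (handing the URL to the cast device) would be identical for both kinds of request.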

BTW, I did not have to make the modification to the Express server that is indicated in the GitHub readme. It seems to work without it. I think the modification has something to do with limiting to IPv4, and of course both Google Home and macOS have IPv6, so it works without the mod.

The server sets up an ngrok connection as well. So it gives you the ability to send notifications from the LAN or from the Internet, without having to open up any port on your router, assuming there is some way of setting up a "permanent" ngrok DNS name. If you are not using ngrok, that code should be commented out.

Maybe I will fork that project, and tailor it more toward ISY needs and provide some options so that code doesn't have to be edited for addresses, ports, use ngrok or not, etc.

BTW, there really isn't much to either the notifier or the server. They are both tiny bits of JavaScript code - it's a good demonstration of the power of Node.js. It's just two little files - which pull in a boatload of dependencies upon installation!

Edited February 27, 2017 by jtara92101

jratliff · February 27, 2017

That is neat, I'll have to try it out. I wanted some way to push speech to an Echo last year, but when I looked into it at the time they did not allow it. The only way was pushing speech to a Bluetooth device connected to the Echo that would play its own speech/audio through the Echo. Or being creative with one of the remotes and having a Pi or something talk through a headphone into the remote, saying "Simon Says whatever you want to say", and then the connected Echo would repeat your speech.

jtara92101 · February 27, 2017

Google Home Assistant is also a Chromecast Audio receiver, so whatever you can do with any Chromecast Audio receiver will work.

A bit more review reveals that google-home-notifier is using a redneck duct-tape lash-up solution. It's fine for fiddling around with your home system, though. It actually uses Google Translate, which has the ability to make an MP3. So, it's set to translate English to English. There are some tricks implemented, as Google really doesn't want people using it in this way. It could stop working at any time, as it's a cat-and-mouse game, but the cat doesn't seem very interested in catching the remaining mice right now.

The TTS used by Google Home itself is way better and uses AI.

There are many alternative online and offline TTS engines/APIs that might be used for this. google-home-notifier is a good prototype that shows how to get the audio to Home (or any Chromecast Audio device) once you have it, in any case.

I think you could use the same TTS used by Home Assistant by using Assistant APIs. But that is not something for the average user who is not a developer. Maybe it is something that UDI could eventually include in their service.

I haven't yet found any info on an official standalone Google TTS service - only STT (voice recognition). For that, you can get 60 minutes/month free.

jtara92101 · March 8, 2017

FYI, I've forked google-home-notifier and contacted the author to see what his intentions are - i.e. does he plan on continuing to support it, does he want pull requests, etc. I haven't heard back, so thinking I will want to rename this and go my own way with it.

As naming things is one of the two hard problems in software engineering (really!), I am asking for help!

Things to consider:

1. It is really not only for Google Home Assistant. It should work with any Google Cast device (or devices), with very little change to the code.

2. It is really not just a notifier, at least with the changes I have made. It can (now) also play local audio files (from the computer running the service) or play audio files from the Internet. You can play a barking dog, or you can tune your favorite Internet radio station.

3. It is a simple web service that runs on some always-on device in your home. You can use it to easily send announcements or play some audio, by sending a POST request from some home automation controller, AV system controller, etc. etc. I'm not interested in going beyond that charter e.g. by making it into some Internet radio tuner or playlist tool, etc.
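As a sketch of what "sending a POST request" could look like from a Python script on the controller side: the URL, path, and `text` field name below are placeholders, not the service's documented interface, so check the repo's readme for the real ones.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_announcement(base_url: str, text: str) -> Request:
    """Build a form-encoded POST carrying the text to be spoken."""
    data = urlencode({"text": text}).encode("ascii")
    return Request(base_url, data=data, method="POST")


def announce(base_url: str, text: str, timeout: float = 5.0) -> int:
    """Send the announcement and return the HTTP status code."""
    with urlopen(build_announcement(base_url, text), timeout=timeout) as resp:
        return resp.status
```

A call such as `announce("http://192.168.1.20:8080/announce", "Espresso is ready")` (hypothetical address and path) would then be the whole integration.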

----

I'll post a link to the repo in a couple of days for those who'd like to play along.

So far, I've made some usability improvements (a config file, plus config overrides from the command line or environment variables) and gotten it to play local and Internet files.
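For anyone curious what layered configuration of this kind tends to look like, here is a small Python sketch. It is not the code in the repo: the option names, the `GCPA_` environment prefix, the defaults, and the precedence order (defaults, then file, then environment, then command line) are all assumptions.

```python
import argparse
import json
import os
from typing import Mapping, Optional, Sequence

# Hypothetical defaults, used when no other layer sets a value.
DEFAULTS = {"port": 8080, "device": "Google Home", "use_ngrok": False}


def load_config(
    file_text: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    argv: Optional[Sequence[str]] = None,
) -> dict:
    """Merge settings: defaults < config file < environment < command line."""
    config = dict(DEFAULTS)
    if file_text:
        config.update(json.loads(file_text))  # contents of a JSON config file
    env = os.environ if env is None else env
    if "GCPA_PORT" in env:
        config["port"] = int(env["GCPA_PORT"])
    if "GCPA_DEVICE" in env:
        config["device"] = env["GCPA_DEVICE"]
    parser = argparse.ArgumentParser()
    parser.add_argument("--port", type=int)
    parser.add_argument("--device")
    # Pass sys.argv[1:] in a real program; an empty list means "no overrides".
    args = parser.parse_args([] if argv is None else list(argv))
    for key, value in vars(args).items():
        if value is not None:
            config[key] = value
    return config
```

The point of the layering is that nobody has to edit source code just to change an address or a port.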

I plan on adding optional caching, so that static texts only have to be converted to audio once. (Of course, texts with variables plugged in would not benefit from caching.) For static texts, though, with the ability to play local files, you can also just use whatever TTS you want to create a file, then drop it into the /public directory on the server. Or simply record a file with your own or another voice. BTW, there are MANY demos on the net where you can type in some text and get TTS back. I was playing with the Watson demo today. It's still not as nice as the Google Home's own voice, but certainly much better than Google Translate (which is the current TTS provider).
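The caching idea could be sketched like this in Python. It is only an illustration of the technique (keying the cached file on a hash of the text), not the planned implementation; `synthesize` is a hypothetical stand-in for whichever TTS provider is in use.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_tts(text: str, cache_dir: Path,
               synthesize: Callable[[str], bytes]) -> Path:
    """Return the path of an MP3 for `text`, synthesizing only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Identical text always maps to the same file name.
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.mp3"
    if not path.exists():
        path.write_bytes(synthesize(text))  # the only step that needs the network
    return path
```

A text with a variable plugged in (a temperature, say) hashes differently each time, which is why only static texts benefit.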

I *think* it should be possible to pause/resume current audio, though it might be a bit clunky. But it should be possible to determine the current media and position, and then restore it after an announcement is made.
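In outline, the save-and-restore approach might look like the Python sketch below. The `Speaker` interface is hypothetical; a real version would sit on top of a Cast client library, and whether a given stream can actually be resumed at a position is an open question.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Playback:
    url: str
    position: float  # seconds into the stream


class Speaker(Protocol):
    """Hypothetical minimal interface to a cast device."""

    def current(self) -> Optional[Playback]: ...
    def play(self, url: str, position: float = 0.0) -> None: ...
    def wait_until_done(self) -> None: ...


def announce_and_resume(speaker: Speaker, announcement_url: str) -> None:
    """Play an announcement, then restore whatever was playing before it."""
    before = speaker.current()  # snapshot the media and position first
    speaker.play(announcement_url)
    speaker.wait_until_done()
    if before is not None:
        speaker.play(before.url, before.position)
```

The clunky part is that the previous media is restarted rather than truly paused, so there may be an audible gap or re-buffering.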

Finally, I should be able to add an additional TTS provider or two. If anyone knows of free services I can use, I would like to hear about them. Or a GOOD local TTS service that is free and open-source, hopefully with a Node.js module available.

I don't want to get too crazy with this, because (hopefully) it is just a band-aid until Google provides a "speak" service in the cloud for Home!

OTOH, some people have an allergy to the cloud (I'm looking at YOU Teken!) so maybe it has some lasting value even once Google provides a solution. And, yes, I was thinking of Teken when I dreamed-up the idea of caching the TTS audio so that it is less dependent on an Internet connection, and so won't have to "phone the cloud" every time it speaks.

Edited March 8, 2017 by jtara92101

ScottAvery · March 9, 2017

Nice work, I will definitely be interested if I do invest in the google ecosystem.

gnat54 · March 18, 2017

I too would be interested in any fork of google-home-notifier you're working on. (And did the original developer ever get back to you?)

(I've been using g-h-n for a couple of months & decided to buy an ngrok basic plan so I wouldn't have to worry about the tunnel changing. However, I can't get it to work!)

Edited March 18, 2017 by gnat54

jtara92101 · May 10, 2017

Well, how embarrassing! I just went to GitHub to make the repo public, and then I realized that when you fork a public project, it is public by default, and you can't even change that! (If you want to make it private, you have to make a NEW repo...) So, it's been public since I created it.

I changed the name to distinguish it from the original, but any similarities that have not already faded away probably will.

Here is the GitHub link:

https://github.com/watusi/google-cast-public-address

I won't have much time to work on this for the next couple of months, but it is certainly much more functional than the original.

Edited May 10, 2017 by jtara92101
