Over Christmas, I caved and ordered myself an Amazon Alexa device: the Echo Dot. Granted, this was on the heels of a conversation with a friend who mentioned that there was a way for me to build my own “voice command” support. This intrigued me enough (really, it was just an excuse) to drop $50 and then see if I could build something.

A month prior, I had finally put one of my Raspberry Pis to work in my apartment. For those who have never seen it, I have a pair of KRK’s Rokit 6 monitors. I’ve been a huge fan of the “rp” series for years (ever since a college friend showed me his pair). Once I had sole say over what goes into my living room, these obviously became the centerpiece. To take advantage of this, I picked up a little Chromecast Audio to let me, or anyone else visiting, throw on some tunes without any wires.

One big problem I had was that I didn’t want to leave these expensive studio monitors on all of the time, as they would obviously have no use when I’m working or sleeping. The best solution I had for a while was a light switch. While effective, it always generated weird looks when I said “flip the switch”. So I posed the question: could I control power to the monitors via the Raspberry Pi?

After some searching, I remembered that my good friend Eric from my days at RPI had mentioned the PowerSwitch Tail II. This device has the standard US 120V plug with 5V DC switching. This ultimately means that the GPIO pins on the Raspberry Pi can directly control 120V power flowing to some device. Then the question becomes: how do I interact with the device?

I believe one of the first libraries I found to interact with the GPIO pins on the Raspberry Pi was a Python library. As such, my next idea was to see if I could build a web-server using Flask. Within the hour, I had a trivial webpage up with unstylized “on” and “off” buttons. Putting this all together, I now had the ability to visit a webpage on my phone (while on my home WiFi) and turn the monitors on and off!
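
For a sense of how small this can be, here is a minimal sketch of an on/off server. It is not the actual code from this project: it uses the standard library's `wsgiref` instead of Flask so it runs with no extra installs, it assumes the `RPi.GPIO` library (falling back to a dry run when that isn't available), and the pin number, port, and `/on` and `/off` paths are made up for illustration.

```python
from wsgiref.simple_server import make_server

try:
    import RPi.GPIO as GPIO  # assumption: RPi.GPIO, only present on a Pi
except ImportError:
    GPIO = None  # dry run: track state without touching hardware

RELAY_PIN = 17  # hypothetical BCM pin wired to the PowerSwitch Tail II input
state = {"on": False}

PAGE = """<!doctype html>
<title>Speakers</title>
<p>Speakers are {status}.</p>
<p><a href="/on">on</a> | <a href="/off">off</a></p>
"""


def set_power(on):
    """Drive the relay pin high (on) or low (off) and remember the state."""
    if GPIO is not None:
        GPIO.output(RELAY_PIN, GPIO.HIGH if on else GPIO.LOW)
    state["on"] = on


def app(environ, start_response):
    """Tiny WSGI app: /on and /off switch the relay, / just shows the page."""
    path = environ.get("PATH_INFO", "/")
    if path == "/on":
        set_power(True)
    elif path == "/off":
        set_power(False)
    elif path != "/":
        start_response("404 Not Found",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [b"not found"]
    body = PAGE.format(status="on" if state["on"] else "off")
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body.encode("utf-8")]


def main():
    """Configure the pin (when on a Pi) and serve until interrupted."""
    if GPIO is not None:
        GPIO.setmode(GPIO.BCM)
        GPIO.setup(RELAY_PIN, GPIO.OUT, initial=GPIO.LOW)
    make_server("0.0.0.0", 8080, app).serve_forever()
```

Calling `main()` on the Pi starts the server on port 8080, and visiting `/on` or `/off` from a phone on the same network flips the pin. A Flask version would have the same shape, with one route per button.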

Now, enter Alexa. The next question was: could I build some kind of service in which I could speak to Alexa and have it invoke that same REST call I was making via the browser on my phone? Alexa has the notion of “skills”. As of this writing, an Alexa skill can be built using Amazon’s “Smart Home Skill API” or as a “Custom Skill”. While the “Smart Home Skill API” sounded right up my alley, after spending a day or two trying to make it work, I found that it really wasn’t what I wanted to use. I had something built using AWS Lambda (which was pretty cool, actually) and was able to control my monitors through the Raspberry Pi, but then found out I needed to create an OAuth2 authentication service. What I really wanted was a service that was only for my use. So, I dumped that approach and started looking into custom skills.

Custom skills looked much more like what I wanted to do: write a web service, accept certain specially formatted requests, and take some action. Amazon only provides a Java implementation as an example, but it appears that the community has developed a number of bindings in other languages. Since I was already in a Python and Flask mode, I was overjoyed to find the flask-ask library. Within an hour, I was speaking to Alexa and controlling the power of the monitors from the comfort of my couch.
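
Under the hood, a custom skill is just JSON in and JSON out, which is the part flask-ask wraps up for you. The sketch below shows that exchange using only the standard library. It is an illustration rather than flask-ask's API or the code from this project, and the `PowerIntent` intent and `state` slot names are hypothetical.

```python
import json


def build_response(text):
    """Wrap plain text in the Alexa custom-skill response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


def handle_request(body):
    """Map an Alexa request (already parsed from JSON) to a spoken reply."""
    request = body.get("request", {})
    if request.get("type") == "LaunchRequest":
        return build_response("Hal is ready.")
    if request.get("type") == "IntentRequest":
        intent = request.get("intent", {})
        if intent.get("name") == "PowerIntent":  # hypothetical intent name
            slots = intent.get("slots", {})
            # A slot Alexa could not fill arrives without a usable value.
            state = (slots.get("state", {}).get("value") or "").lower()
            if state in ("on", "off"):
                # This is where the real skill would go flip the power.
                return build_response(f"Turning the speakers {state}.")
    return build_response("Sorry, I can't help with that.")


def handle_raw(raw_body):
    """Parse the POSTed JSON body and return the JSON reply as a string."""
    return json.dumps(handle_request(json.loads(raw_body)))
```

A real endpoint also has to verify that requests genuinely come from Alexa (signature and timestamp checks), which this sketch leaves out.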

One point that I’ve glossed over is security. Obviously, I don’t want some troll on the internet flipping the power to the speakers on and off every second, as that will eventually break something. Thankfully, I’ve still been maintaining my Linode server since college (shameless referral link), which is public-facing. I have a collection of services running there sitting behind nginx, secured with certificates from Let’s Encrypt (who are also awesome). So, I can easily deploy my flask-ask application to my Linode server and expose it via HTTPS (as per Amazon’s requirement), but how can I forward those requests back into my apartment to the Raspberry Pi?

After doing a little bit of reading, I realized it would be extremely simple. One of the other things I already had set up on this Linode server was an SSL VPN. Once I started working remotely, I set this up so that, if I was ever on a sketchy network while traveling, I could be a bit more secure. For the purposes of exposing this Raspberry Pi to my Linode server, I realized the VPN would be a secure way for the two nodes to see each other and be networked (most importantly, without opening up holes in the firewall to my home network).

From a very high level, this is what the final architecture looks like. For context, “Hal” is the name I chose to identify my service (thanks, Keith!). Sadly, having to address the skill by a name like this appears to be a hard requirement of the custom skill API.

  1. “Alexa, ask Hal to turn on the speakers”
  2. A request is sent to my Alexa custom skill running on my Linode server
  3. nginx accepts the HTTPS request and forwards it to the local flask-ask server
  4. The flask-ask server sends an HTTP request to the flask app on the Raspberry Pi over the VPN
  5. The flask app on the Raspberry Pi toggles the PowerSwitch Tail II
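
Step 4 above boils down to a plain HTTP call across the VPN. As a rough sketch (again, not the project's actual code), the forwarding could look like the following with the standard library's `urllib`. The VPN address `10.8.0.2`, port `8080`, and the `/on` and `/off` paths are placeholders.

```python
import urllib.request

# Hypothetical address of the Raspberry Pi as seen from the VPN.
PI_BASE_URL = "http://10.8.0.2:8080"


def pi_url(state):
    """Build the Pi URL for a state, refusing anything but on/off."""
    if state not in ("on", "off"):
        raise ValueError(f"unsupported state: {state!r}")
    return f"{PI_BASE_URL}/{state}"


def forward_to_pi(state, opener=urllib.request.urlopen, timeout=5):
    """Send the on/off request to the Pi; return True on an HTTP 200."""
    try:
        with opener(pi_url(state), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and urllib's HTTP errors.
        return False
```

The `opener` parameter only exists so the function can be exercised without a Pi on the other end. Restricting `state` to the two known values keeps the public-facing server from forwarding arbitrary paths into the home network.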

All things considered, I’m actually really happy with how this turned out (aside from the requirement of having to say “ask Hal”). The flask-ask library is awesome. I’d highly recommend it to anyone who wants to try this out on their own. Best as I can tell, it does everything as it should and is just dirt-simple to implement. The setup I already had with nginx, Let’s Encrypt, and the SSL VPN made this entire adventure so very easy as well.

I’ve done my best to publish all of the code I wrote, in addition to the configuration files, to my GitHub account. Feedback is welcome, and I’d enjoy hearing what everyone else has done with their Alexa :)