MegaMind: A Platform for Security & Privacy Extensions for Voice Assistants
Authors: Seyed Mohammadjavad Seyed Talebi and Ardalan Amiri Sani, UC Irvine; Stefan Saroiu and Alec Wolman, Microsoft
MegaMind’s demo
MegaMind
MegaMind is a platform that enables users to deploy useful extensions on the Alexa virtual assistant. In the following document, we show how to build MegaMind from its source and deploy it on an x86 desktop. Please refer to our paper for technical details: MobiSys paper
This tutorial has been tested on Ubuntu16.04 on a VMWare virtual machine with the following properties: 20GB of storage, 2GB of RAM, and 4 CPUs.
Download MegaMind source
Let assume you want to set up MegaMind in a directory called $WD
.
First you need to make a few directories in $WD
cd $WD
mkdir MegaMind
cd MegaMind
mkdir MegaMind_Alexa_SDK MegaMind_engine
We clone a modified version of Amazon Alexa SDK into the MegaMind_Alexa_SDK
and all other source codes for MegaMind in MegaMind_engine.
First, lets download and set up MegaMind_Alexa_SDK
.
Download and setup MegaMind_Alexa_SDK
cd $WD/MegaMind/MegaMind_Alexa_SDK
mkdir application-necessities build third-party
We need to install a few dependencies first.
sudo apt-get install -y \
git gcc cmake openssl clang-format libgstreamer1.0-0 gstreamer1.0-plugins-base \
gstreamer1.0-plugins-good gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-doc gstreamer1.0-tools \
pulseaudio doxygen libsqlite3-dev repo libasound2-dev
We need to install more gstreamer packages as dependencies
sudo apt-get install -y \
libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev \
gstreamer1.0-plugins-base gstreamer1.0-plugins-good \
gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly \
gstreamer1.0-libav libgstrtspserver-1.0-dev
sudo apt-get -y install build-essential nghttp2 libnghttp2-dev libssl-dev
Then we need to download a few third-party binaries.
cd $WD/MegaMind/MegaMind_Alexa_SDK/third-party
wget https://curl.haxx.se/download/curl-7.63.0.tar.gz
tar xzf curl-7.63.0.tar.gz
cd curl-7.63.0
./configure --with-nghttp2 --prefix=/usr/local --with-ssl
make && sudo make install
sudo ldconfig
To verify curl installation please run:
curl -I https://nghttp2.org/
the successful response should look like this:
HTTP/2 200
date: Fri, 15 Dec 2017 18:13:26 GMT
content-type: text/html
last-modified: Sat, 25 Nov 2017 14:02:51 GMT
etag: "5a19780b-19e1"
accept-ranges: bytes
content-length: 6625
x-backend-header-rtt: 0.001021
strict-transport-security: max-age=31536000
server: nghttpx
via: 2 nghttpx
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
Next, we install PortAudio
cd $WD/MegaMind/MegaMind_Alexa_SDK/third-party
wget -c http://www.portaudio.com/archives/pa_stable_v190600_20161030.tgz
tar xf pa_stable_v190600_20161030.tgz
cd portaudio
./configure --without-jack && make
Next, we install the Kitt-AI wake-word detector
cd $WD/MegaMind/MegaMind_Alexa_SDK/third-party
git clone https://github.com/Kitt-AI/snowboy.git
Kitt-AI requires some packages
sudo apt-get install -y libblas-dev liblapack-dev
We need to copy one file:
cd $WD/MegaMind/MegaMind_Alexa_SDK/third-party/snowboy/resources
cp alexa/alexa-avs-sample-app/alexa.umdl .
Now that we have all the packages let’s clone the source code for MegaMind_device_SDK
:
cd $WD/MegaMind/MegaMind_Alexa_SDK/
git clone https://github.com/trusslab/megamind_alexa_sdk.git source
In order to build the SDK:
cd $WD/MegaMind/MegaMind_Alexa_SDK/build/
sudo apt install vim
vim build.sh
And insert the following lines to build.sh:
SDK_FOLDER=$WD/MegaMind/MegaMind_Alexa_SDK
cmake $SDK_FOLDER/source \
-DCMAKE_BUILD_TYPE=DEBUG \
-DSENSORY_KEY_WORD_DETECTION=OFF \
-DKITTAI_KEY_WORD_DETECTOR=ON \
-DKITTAI_KEY_WORD_DETECTOR_LIB_PATH=$SDK_FOLDER/third-party/snowboy/lib/ubuntu64/libsnowboy-detect.a \
-DKITTAI_KEY_WORD_DETECTOR_INCLUDE_DIR=$SDK_FOLDER/third-party/snowboy/include \
-DGSTREAMER_MEDIA_PLAYER=ON \
-DPORTAUDIO=ON \
-DPORTAUDIO_LIB_PATH=$SDK_FOLDER/third-party/portaudio/lib/.libs/libportaudio.a \
-DPORTAUDIO_INCLUDE_DIR=$SDK_FOLDER/third-party/portaudio/include \
-DACSDK_EMIT_SENSITIVE_LOGS=ON && make
to make MegaMind Alexa SDK
cd $WD/MegaMind/MegaMind_Alexa_SDK/build/
source build.sh
Download and install MegaMind’s components
MegaMind uses several third-party components such as speech-to-text and text-to-speech engines. In this step we install these components on the system.
Install FireJail
cd $WD/MegaMind
git clone https://github.com/netblue30/firejail.git
cd firejail
sudo apt install gawk
./configure && make && sudo make install
sudo cp /lib/x86_64-linux-gnu/libssl.so.1.0.0 /usr/lib/
Install pico2wav text-to-speech engine
cd $WD/MegaMind
sudo apt install libttspico-utils
sudo apt install sox
#
Install python3.7 and its pip
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.7
cd $WD/MegaMind
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.7 get-pip.py
sudo apt install python3.7-dev
Install required python packages
sudo python3.7 -m pip install --ignore-installed --target=/usr/lib/python3.7/ cryptography
python3.7 -m pip install halo
python3.7 -m pip install numpy
python3.7 -m pip install scipy
sudo apt install portaudio19-dev
python3.7 -m pip install pyaudio
python3.7 -m pip install deepspeech
python3.7 -m pip install webrtcvad
python3.7 -m pip install nltk
python3.7 -m pip install spacy==2.2.4
python3.7 -m pip install spacy_wordnet
python3.7 -m spacy download en
open python3.7 by typing
python3.7
in python3.7 shell please type the following commands:
>>> import nltk
>>> nltk.download( 'wordnet')
Install tmuxp
sudo apt install tmux
python3.7 -m pip install tmuxp
Download and setup MegaMind engine
cd $WD/MegaMind
git clone https://github.com/trusslab/megamind.git MegaMind_engine/
Download DeepSpeech’s pre-trained english models
cd $WD/MegaMind/MegaMind_engine/deep_speech_models
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm
curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer
Run MegaMind
First, we need to provide the Client ID for each SDK for authentication purposes.
cd $WD/MegaMind/MegaMind_Alexa_SDK/build/Integration
vim AlexaClientSDKConfig_backup.json
and copy the following text into the file:
{
"deviceInfo":{
"deviceSerialNumber":"123456",
"clientId":"XXXX CLIENT_ID XXXX",
"productId":"MegaMind_device1"
},
"cblAuthDelegate":{
"databaseFilePath":"XXWDXX/MegaMind/MegaMind_Alexa_SDK/application-necessities/cblAuthDelegate.db"
},
"miscDatabase":{
"databaseFilePath":"XXWDXX/MegaMind/MegaMind_Alexa_SDK/application-necessities/miscDatabase.db"
},
"alertsCapabilityAgent":{
"databaseFilePath":"XXWDXX/MegaMind/MegaMind_Alexa_SDK/application-necessities/alerts.db"
},
"settings":{
"databaseFilePath":"XXWDXX/MegaMind/MegaMind_Alexa_SDK/application-necessities/settings.db",
"defaultAVSClientSettings":{
"locale":"en-US"
}
},
"certifiedSender":{
"databaseFilePath":"XXWDXX/MegaMind/MegaMind_Alexa_SDK/application-necessities/certifiedSender.db"
},
"notifications":{
"databaseFilePath":"XXWDXX/MegaMind/MegaMind_Alexa_SDK/application-necessities/notifications.db"
}
}
!!! note: please (1) replace XXWDXX with absolute path of $WD. (2) please replace XXXX CLIENT_ID XXXX with your Amazon Alexa Device client ID. (please see Register AVS devices in Amazon Voice Service, for documentation on how to register for Amazon Alexa Device client ID.)
Then, we generate a run script to run the Device SDK using the above file.
cd $WD/MegaMind/MegaMind_Alexa_SDK/build
vim run.sh
we insert the following code into the run.sh
rm Integration/AlexaClientSDKConfig.json
cp Integration/AlexaClientSDKConfig_backup.json Integration/AlexaClientSDKConfig.json
./SampleApp/src/SampleApp Integration/AlexaClientSDKConfig.json ../third-party/snowboy/resources/ NONE
now to run MegaMind:
cd $WD/MegaMind/MegaMind_engine
tmuxp load mega.json
The first time you run the Alexa device SDK on a new machine, it asks you to authorize your device with an Amazon account, showing the following messages.
##############################
# NOT YET AUTHORIZED #
##############################
###############################
To Authorize, browse to https://amazon.com/us/code and enter the code:xxxx
##############################
You need to follow the instructions and authorize the device.
After authorization, you need to close all windows and run MegaMind again. To do that, you can press ‘ctrl+b’ followed by ‘:’ and type kill-session. Or you can press ‘ctrl+c’ and ‘ctrl+d’ multiple times. To run MegaMind again
cd $WD/MegaMind/MegaMind_engine
tmuxp load mega.json
Use MegaMind
After running MegaMind, a tmux session opens with three panes. The upper-left pane, is the modified Alexa SDK. The upper-right pane is MegaMind engine logs, and the lower pane is the MegaMind API ( which can be a text based API -if you use mega.json- or voice based API -if you use megaVoice.json’ to run MegaMind).
After running MegaMind, please wait until you see
##################################
# Alexa is currently idle #
##################################
in upper left pane. After seeing this you can use the lower pane to insert your commands. You first need to press ‘s’ followed by Enter key. Then you can type your command followed by Enter key. For example you can type “what time is it” to check if the Alexa device is up and running. Please note that for multi-turn conversations, you should not press ‘s’ before each of your commands, when the Alexa (upper left pane) goes to “listening” state) the MegaMind text API, automatically asks you to insert your next command.
Test MegaMind’s extensios
Before testing MegaMind extensions you need to deploy two Skills on your Alexa account. Please follow this document on how to deploy MegaMind’s Skills on your Alexa account.
To show how MegaMind’s extensions work in action, we enabled 3 simple extensions by default. A discarder that discards purchase-related utterances. A sanitizer that redacts first name of family members ( in this example, alex, steve and julia), and a companion extension which enables secure conversation with its companion skill. We also enabled two beta Alexa skills in our Alexa account to facilitate testing these extensions. The first one is ‘repeat conversation’ which simply repeats whatever you say after the keyword ‘repeat’. The second one is ‘confidential conversation’ which echos whatever you say, but uses MegaMind enabled secure channel (It sends and recieves encrypted messages hidden from AVS).
You can try the following commands to test some features of MegaMind. ( -your cmds, +skill’s responses)
- open repeat conversation
+ Welcome to mega mirror, what do you want me to repeat
- repeat it is a very nice day
+ you said it is a very nice day
- repeat please call steve
+ you said please call ---- (a random name)
- repeat buy me a car
+ your message is discarded by parental control extension
- open confidential conversation
+ Welcome to the confidential conversation skill, do you want to start a secret conversation?
- yes
+ A secure connection has established, tell me something
- how much money do i have in my bank account
+ you said how much money do i have in my bank account
- my name is julia
+ you said my name is ----- ( a random name)
- stop
+ Goodbye
You can see at the upper-left pane that Alexa is not aware of any of the above conversations and it only sees ciphertext.
If at any stage you see an error or any undefined behavior, please kill the tmux session, and run MegaMind again.
Test MegaMind’s voice API
You can test MegaMind’s voice API by running MegaMind using following commands:
cd $WD/MegaMind/MegaMind_engine
tmuxp load megaVoice.json
This time you do not need to press ‘s’ to start a session. You can simply say ‘Alexa’ to start a session, and then you can say your commands. To get a good result from DeepSpeech speech to text please use headphones in a quiet environment.
Acknowledgments
The work was supported in part by NSF Award #1846230.