AAASTRONAUT/Nao-Telepresence

NAO Speech pipe

1. How to run

  1. Make sure Naoqi (the official package, which can be installed from here) is installed and set up on your PC.
  2. Configure the .bash_profile file present in your home directory.
  3. Make sure your NAO robot is connected to the same network as your PC.
  4. Run the speech_convert.sh file in the speech_pipe directory, passing NAO’s IP address. You can get the IP address by pressing the button on the robot’s chest.
  5. Your NAO robot is now ready to listen to you whenever it is touched on its head.
2. General Details

1. Naoqi

Naoqi is a Python library provided by Aldebaran. It exposes functions to control the robot’s sensors and actuators.

2. Python version mismatch

Naoqi is configured for python2.7, but all the latest speech models are compatible with python3, which created a major hurdle. It is solved by doing all the computation in python3 files, while every function that interacts directly with the robot is defined in a separate python2.7 file.

python2.7 files: recording.py, touchreact_record.py and speak_nao.py

python3 files: Speech_pipe.py
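
As a rough illustration of this split, a python3 process can hand robot-facing work to a python2.7 script by starting it as a separate process. The sketch below is not the repository's own mechanism (here the scripts are called from shell scripts); the helper names, the `python2.7` interpreter name on the PATH, and the idea that speak_nao.py accepts the robot's IP as an argument are all assumptions.

```python
import subprocess


def build_py2_command(script, args=(), interpreter="python2.7"):
    """Return the argument list used to run a Python 2.7 script.

    `interpreter` is assumed to be the name of a Python 2.7 binary on the PATH.
    """
    return [interpreter, script] + [str(a) for a in args]


def run_py2_script(script, args=(), interpreter="python2.7", timeout=60):
    """Run the script in its own Python 2.7 process and return its exit code."""
    cmd = build_py2_command(script, args, interpreter)
    completed = subprocess.run(cmd, timeout=timeout)
    return completed.returncode
```

For example, `run_py2_script("speak_nao.py", ["<robot-ip>"])` would start the robot-facing script from python3 code, provided the script reads the IP from its arguments.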

3. Latency

The current latency of the model is ~4 seconds. The latency depends on internet speed, the speech model and the local processing capabilities of the user. Of these three, internet speed is the major factor: it can increase or decrease the latency of the whole system by a factor of 2. Various speech models were tested, and of these Google’s Gemini and OpenAI’s ChatGPT 4 show the lowest latency.
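
To see which stage dominates the latency on a given setup, each step can be wrapped in a small timer. This is a minimal sketch that is not part of the repository; the `timed` and `slowest_stage` helpers and the stage names are hypothetical.

```python
import time
from contextlib import contextmanager

# Maps a stage name to the seconds it took on its last run.
timings = {}


@contextmanager
def timed(stage, store=timings):
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[stage] = time.perf_counter() - start


def slowest_stage(store):
    """Return the name of the stage with the largest recorded duration."""
    return max(store, key=store.get)
```

Wrapping the transcription call in `with timed("transcribe"):` and the model call in `with timed("model"):` and then calling `slowest_stage(timings)` gives a first hint of whether the network-bound steps or local processing account for most of the delay.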

3. Installing Naoqi (for Unix-based systems)

  1. The Naoqi package can be installed from here.
  2. After installing the Naoqi package, rename the folder to “pynaoqi” and add the commands below to the .bash_profile file in your home directory.

export PYTHONPATH=${PYTHONPATH}:path/to/pynaoqi/lib/python2.7/site-packages
export DYLD_LIBRARY_PATH=${DYLD_LIBRARY_PATH}:path/to/pynaoqi/lib
export QI_SDK_PREFIX=path/to/pynaoqi

  3. Now restart the terminal and run “source .bash_profile” from your home directory.
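
A quick way to confirm that the three variables took effect is to inspect the environment from Python. The sketch below is a hypothetical helper, not part of the repository, and it only checks the shape of the values (it does not verify that the paths exist).

```python
import os


def check_naoqi_env(env=None):
    """Return a list of problems found in the Naoqi-related variables."""
    env = os.environ if env is None else env
    problems = []

    if not env.get("QI_SDK_PREFIX"):
        problems.append("QI_SDK_PREFIX is not set")

    # PYTHONPATH should contain the python2.7 site-packages folder of pynaoqi.
    py_entries = [p for p in env.get("PYTHONPATH", "").split(":") if p]
    if not any(p.rstrip("/").endswith("lib/python2.7/site-packages") for p in py_entries):
        problems.append("PYTHONPATH has no lib/python2.7/site-packages entry")

    # DYLD_LIBRARY_PATH should contain the lib folder of pynaoqi.
    lib_entries = [p for p in env.get("DYLD_LIBRARY_PATH", "").split(":") if p]
    if not any(p.rstrip("/").endswith("lib") for p in lib_entries):
        problems.append("DYLD_LIBRARY_PATH has no lib entry")

    return problems
```

Running `print(check_naoqi_env())` in a fresh terminal prints an empty list when all three variables look as expected.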

4. Description of code files inside the speech_pipe directory

  1. speech_convert.sh: This script triggers the touchreact_record.py file and the monitor_nao.sh script, and watches for changes in the uploads folder, which stores the user’s voice input that is later passed to ChatGPT.
  2. monitor_nao.sh: This script is responsible for separately calling the Speech_pipe.py and speak_nao.py files.
  3. touchreact_record.py: This file checks whether the robot is touched on its head. As soon as the user touches the head of the robot, this file calls the Recording_thread function defined in the Recording.py file.
  4. Recording.py: This file defines the Recording_thread function, which stores the user’s recording in the robot’s local home directory and then transfers the .wav audio file to the uploads folder over an ssh connection.
  5. Speech_pipe.py:
    1. Speech_2_txt function: This converts the audio file into a text file using the audio transcription provided by OpenAI and moves the audio file from the uploads folder to the old_file folder.
    2. model function: This function takes the transcribed text file as input and stores the response of ChatGPT in the ans.txt file.
  6. speak_nao.py: This file calls the ALTextToSpeech functionality provided by Naoqi and makes the NAO robot speak the response stored in ans.txt.
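
The "pick up a recording, process it, move it to old_file" flow described above can be sketched in python3 as follows. This is an illustrative sketch only, not the repository's code: the function names are hypothetical, the `handle` callback stands in for the transcription and model steps, and the folder paths are passed in rather than taken from the project.

```python
import os
import shutil


def new_recordings(upload_dir):
    """Return the .wav files in upload_dir, oldest first."""
    paths = [
        os.path.join(upload_dir, name)
        for name in os.listdir(upload_dir)
        if name.lower().endswith(".wav")
    ]
    return sorted(paths, key=os.path.getmtime)


def archive(path, archive_dir):
    """Move a processed recording into archive_dir and return its new path."""
    os.makedirs(archive_dir, exist_ok=True)
    dest = os.path.join(archive_dir, os.path.basename(path))
    shutil.move(path, dest)
    return dest


def process_pending(upload_dir, archive_dir, handle):
    """Call handle(path) for each pending recording, then archive it."""
    done = []
    for path in new_recordings(upload_dir):
        handle(path)  # placeholder for transcription and the model call
        done.append(archive(path, archive_dir))
    return done
```

Calling `process_pending("uploads", "old_file", handle)` in a loop with a short sleep between iterations would approximate the folder watching that speech_convert.sh performs.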
