How to add voice commands to a webpage in JavaScript
By FoxLearn 2/19/2025 8:27:13 AM
As a developer, you can build a website that responds to voice commands tailored to your needs. The HTML5 Speech Recognition API (the SpeechRecognition interface of the Web Speech API) lets JavaScript capture audio from the user's microphone and convert it into text. Artyom.js, a library for managing voice commands, makes this task simple.
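Note: webkitSpeechRecognition is not yet supported consistently across browsers. Google Chrome has the most complete implementation, and support in other browsers is partial or missing. While we hope it will eventually become a standard everywhere, for now Chrome is the safest choice for trying Artyom.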
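Since support varies, it can help to keep the availability check in one place. The following is a minimal sketch, not part of Artyom or the Web Speech API: the helper name getSpeechRecognition is hypothetical, and it takes the global object as a parameter so it can also be run outside a browser.

```javascript
// Hypothetical helper (an assumption for illustration):
// returns the SpeechRecognition constructor exposed by the given global
// object, or null when neither the standard nor the prefixed name exists.
function getSpeechRecognition(globalObject) {
  if (!globalObject) {
    return null;
  }
  return globalObject.SpeechRecognition ||
    globalObject.webkitSpeechRecognition ||
    null;
}

// In a browser, pass `window`. Outside a browser there is no `window`,
// so the helper reports that recognition is unavailable.
const RecognitionCtor = getSpeechRecognition(
  typeof window !== "undefined" ? window : null
);

if (RecognitionCtor) {
  console.log("Speech recognition is available.");
} else {
  console.log("Speech recognition is not available in this environment.");
}
```

Checking both names means the same code keeps working if a browser later exposes the unprefixed constructor.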
Basic Setup Using SpeechRecognition
To start, add Artyom.js to your document within the <head> tag. You can get the library from the official GitHub repository:
    <!DOCTYPE html>
    <html>
    <head>
      <title>Cooking with Artyom.js</title>
      <!-- Important: Load Artyom in the head tag for voice resources to load properly -->
      <script type="text/javascript" src="path/to/artyom.min.js"></script>
      <script>
        // Create a globally accessible instance of Artyom
        window.artyom = new Artyom();
      </script>
    </head>
    <body>
      <script>
        // Artyom is now available!
      </script>
    </body>
    </html>
It's important to read the documentation to understand how commands work. Artyom lets you add both simple and "smart" commands.
Normal commands: Triggered by recognized speech that matches any word in the indexes array.
    artyom.addCommands({
      indexes: ["Hello", "Hey", "Hurra"],
      action: function (i) {
        // i = index of the matched word
        console.log("Something matches!");
      }
    });
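To make the role of the i argument concrete, here is a minimal sketch of how spoken text could be matched against an indexes array. The function name findMatchingIndex and the case-insensitive substring rule are assumptions for illustration; this is not Artyom's internal code, and its actual matching logic may differ.

```javascript
// Illustrative only: a simplified stand-in for the matching step.
// Returns the position of the first entry found in the spoken text,
// or -1 when nothing matches.
function findMatchingIndex(indexes, spokenText) {
  const spoken = spokenText.toLowerCase();
  return indexes.findIndex(function (phrase) {
    return spoken.includes(phrase.toLowerCase());
  });
}

const greetingIndexes = ["Hello", "Hey", "Hurra"];
const matchedIndex = findMatchingIndex(greetingIndexes, "hey there");

if (matchedIndex !== -1) {
  // Prints: Matched index 1: Hey
  console.log("Matched index " + matchedIndex + ": " + greetingIndexes[matchedIndex]);
}
```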
Smart commands: Allow you to capture parts of the spoken text, such as a variable name, for more dynamic functionality.
    artyom.addCommands({
      smart: true, // Mark this command as "smart"
      indexes: ["How many people live in *"], // '*' represents dynamic spoken text
      action: function (i, wildcard) {
        switch (wildcard) {
          case "Berlin":
            alert("Why should I know something like this?");
            break;
          case "Paris":
            alert("I don't know.");
            break;
          default:
            alert("I don't know the city " + wildcard + ". Add more cases!");
            break;
        }
      }
    });
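The idea behind the '*' wildcard can be sketched with a regular expression. The helper below, captureWildcard, is hypothetical and only illustrates the concept under the assumption of a single '*' per pattern; it is not how Artyom is implemented.

```javascript
// Illustrative only: turns a pattern such as "How many people live in *"
// into a regular expression and returns the text captured by '*',
// or null when the spoken text does not match the pattern.
function captureWildcard(pattern, spokenText) {
  if (!pattern.includes("*")) {
    return null; // nothing to capture
  }
  // Escape regex metacharacters in the literal parts of the pattern.
  const escapedParts = pattern.split("*").map(function (part) {
    return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  });
  const regex = new RegExp("^" + escapedParts.join("(.+)") + "$", "i");
  const match = regex.exec(spokenText.trim());
  return match ? match[1].trim() : null;
}

const city = captureWildcard(
  "How many people live in *",
  "how many people live in Paris"
);
console.log(city); // Paris
```

The captured value plays the same role as the wildcard argument passed to the action function of a smart command.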
You can use artyom.simulateInstruction() to test how a voice command will behave when triggered. This allows you to verify your commands without speaking.
artyom.simulateInstruction("How many people live in Paris"); // Alert: "I don't know."
To start Artyom, use the initialize function. Here are the basic settings you'll need to configure:
- lang: Language code for the supported Artyom language (see the documentation for available languages).
- continuous: Set to true for HTTPS connections to allow continuous listening; otherwise set to false for one-time listening.
- listen: Set to true to enable Artyom's listening mode.
- debug: Set to true to log recognized speech and other information in the console.
    artyom.initialize({
      lang: "en-GB",     // Language code (English - Great Britain)
      continuous: false, // Use continuous mode if you have HTTPS
      debug: true,       // Show debug info in the console
      listen: true       // Start listening for commands
    });
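Because the continuous setting depends on whether the page is served over HTTPS, one option is to derive it from the page protocol instead of hard-coding it. The sketch below assumes a hypothetical helper, buildArtyomSettings, and uses only the four settings described above.

```javascript
// Hypothetical helper: builds the settings object for artyom.initialize().
// `protocol` is expected to look like location.protocol ("https:" or "http:").
function buildArtyomSettings(protocol, lang) {
  return {
    lang: lang || "en-GB",
    continuous: protocol === "https:", // continuous listening only over HTTPS
    debug: true,
    listen: true
  };
}

// Outside a browser there is no `location`, so fall back to "http:".
const pageProtocol =
  typeof location !== "undefined" ? location.protocol : "http:";
const artyomSettings = buildArtyomSettings(pageProtocol);
console.log(artyomSettings);

// In the page, with Artyom loaded:
// artyom.initialize(artyomSettings);
```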
Once initialized, Artyom will be ready to process voice commands.
If you want to stop Artyom, use the fatality function. This halts the Artyom instance immediately.
artyom.fatality();
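The initialize and fatality calls can be paired to build a simple on/off switch, for example behind a button. This is a minimal sketch: createListeningToggle is a hypothetical helper, and fakeArtyom is a stand-in object used only so the example runs without the library.

```javascript
// Hypothetical helper: returns a function that alternates between
// starting Artyom (initialize) and stopping it (fatality).
function createListeningToggle(artyomInstance, settings) {
  let listening = false;
  return function toggle() {
    if (listening) {
      artyomInstance.fatality();
    } else {
      artyomInstance.initialize(settings);
    }
    listening = !listening;
    return listening; // true when Artyom has just been started
  };
}

// Stand-in object for demonstration; in a page, pass window.artyom instead.
const fakeArtyom = {
  initialize: function (settings) { console.log("initialize", settings.lang); },
  fatality: function () { console.log("fatality"); }
};

const toggleListening = createListeningToggle(fakeArtyom, {
  lang: "en-GB",
  continuous: false,
  debug: true,
  listen: true
});

toggleListening(); // starts listening
toggleListening(); // stops listening
```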
Here's a basic structure of an HTML file that sets up voice recognition with the native SpeechRecognition API, without Artyom:
    <!DOCTYPE html>
    <html lang="en">
    <head>
      <meta charset="UTF-8">
      <meta name="viewport" content="width=device-width, initial-scale=1.0">
      <title>Voice Commands</title>
    </head>
    <body>
      <h1>Voice Command Example</h1>
      <button id="start">Start Voice Command</button>
      <script>
        // Check for browser support
        const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
        if (SpeechRecognition) {
          const recognition = new SpeechRecognition();
          recognition.lang = 'en-US';
          recognition.interimResults = true; // Show intermediate results
          recognition.maxAlternatives = 1;   // Limit the recognition alternatives

          document.getElementById("start").onclick = function () {
            recognition.start(); // Start voice recognition
          };

          recognition.onstart = function () {
            console.log('Voice recognition started');
          };

          recognition.onresult = function (event) {
            const transcript = event.results[0][0].transcript;
            console.log("You said: ", transcript);

            // With interimResults enabled, this handler also fires for partial
            // results, so act on commands only once the result is final
            if (!event.results[0].isFinal) {
              return;
            }

            // Here, you can check for specific voice commands and trigger actions
            if (transcript.toLowerCase().includes("hello")) {
              alert("Hello there!");
            } else if (transcript.toLowerCase().includes("goodbye")) {
              alert("Goodbye!");
            }
          };

          recognition.onerror = function (event) {
            console.error("Error occurred in recognition: ", event.error);
          };
        } else {
          console.log("Speech Recognition not supported in this browser.");
        }
      </script>
    </body>
    </html>
To make the voice command system more flexible, you can add more conditions inside the onresult handler.
For example, you can check if the recognized text matches specific commands and trigger different actions, like controlling a light, playing music, or even navigating to different parts of your webpage.
    recognition.onresult = function (event) {
      const transcript = event.results[0][0].transcript;
      console.log("You said: ", transcript);

      // Ignore partial results when interimResults is enabled
      if (!event.results[0].isFinal) {
        return;
      }

      if (transcript.toLowerCase().includes("hello")) {
        alert("Hello, how can I assist you?");
      } else if (transcript.toLowerCase().includes("play music")) {
        alert("Playing music now!");
      } else if (transcript.toLowerCase().includes("open google")) {
        window.location.href = 'https://www.google.com';
      }
    };
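As the list of commands grows, a chain of else if branches becomes hard to maintain. One possible alternative is a command table, sketched below. The createDispatcher helper and the shape of the table entries (phrase, action) are assumptions for illustration, and console.log stands in for the alerts so the example also runs outside a browser.

```javascript
// Hypothetical helper: builds a function that looks up the first command
// whose phrase occurs in the transcript and runs its action.
function createDispatcher(commands) {
  return function dispatch(transcript) {
    const spoken = transcript.toLowerCase();
    const command = commands.find(function (c) {
      return spoken.includes(c.phrase.toLowerCase());
    });
    if (!command) {
      return false; // no command recognized
    }
    command.action(transcript);
    return true;
  };
}

const dispatchCommand = createDispatcher([
  { phrase: "hello", action: function () { console.log("Hello, how can I assist you?"); } },
  { phrase: "play music", action: function () { console.log("Playing music now!"); } }
]);

dispatchCommand("Please PLAY MUSIC"); // logs: Playing music now!

// In the page, the handler could then shrink to something like:
// recognition.onresult = function (event) {
//   dispatchCommand(event.results[0][0].transcript);
// };
```

Adding a command then means adding one entry to the table rather than another branch in the handler.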
For a more advanced solution, you can use Artyom.js as described above. The library simplifies voice command handling and adds features such as voice output (text-to-speech) and more intelligent command parsing.