Windows Speech Recognition (WSR) is speech recognition developed by Microsoft for Windows Vista that enables voice commands to control the desktop user interface, dictate text in electronic documents and email, navigate websites, perform keyboard shortcuts, and operate the mouse cursor. It supports custom macros to perform additional or supplementary tasks. Hotkeys for this action is Windows logo key+Ctrl+S (windows 10, windows 11)
WSR is a locally processed speech recognition platform; it does not rely on cloud computing for accuracy, dictation, or recognition, but adapts based on contexts, grammars, speech samples, training sessions, and vocabularies. It provides a personal dictionary that allows users to include or exclude words or expressions from dictation and to record pronunciations to increase recognition accuracy. Custom language models are also supported.
With Windows Vista, WSR was developed to be part of Windows, as speech recognition was previously exclusive to applications such as Windows Media Player. It is present in Windows 7, Windows 8, Windows 8.1, Windows RT, Windows 10, and Windows 11.
Microsoft was involved in speech recognition and speech synthesis research for many years before WSR. In 1993, Microsoft hired Xuedong Huang from Carnegie Mellon University to lead its speech development efforts; the company's research led to the development of the Speech API (SAPI) introduced in 1994. Speech recognition had also been used in previous Microsoft products. Office XP and Office 2003 provided speech recognition capabilities among Internet Explorer and Microsoft Office applications; it also enabled limited speech functionality in Windows 98, Windows Me, Windows NT 4.0, and Windows 2000. Windows XP Tablet PC Edition 2002 included speech recognition capabilities with the Tablet PC Input Panel, and Microsoft Plus! for Windows XP enabled voice commands for Windows Media Player. However, these all required installation of speech recognition as a separate component; before Windows Vista, Windows did not include integrated or extensive speech recognition.