IPC Classification

Section G - Physics, subclass G10L - Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding

Class code   Description
G10L 11/00 Determination or detection of speech or audio characteristics not restricted to a single one of groups G10L 15/00-G10L 21/00
G10L 11/02 Detection of presence or absence of speech signals
G10L 11/04 Pitch determination of speech signals
G10L 11/06 Discriminating between voiced and unvoiced parts of speech signals (G10L 11/04 takes precedence)
G10L 13/00 Speech synthesis; Text to speech systems
G10L 13/02 Methods for producing synthetic speech; Speech synthesisers
G10L 13/04 Methods for producing synthetic speech; Speech synthesisers - Details of speech synthesis systems, e.g. synthesiser structure or memory management
G10L 13/06 Elementary speech units used in speech synthesisers; Concatenation rules
G10L 13/07 Concatenation rules
G10L 13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 13/10 Prosody rules derived from text; Stress or intonation
G10L 13/027 Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
G10L 13/033 Voice editing, e.g. manipulating the voice of the synthesiser
G10L 13/047 Architecture of speech synthesisers
G10L 15/00 Speech recognition
G10L 15/01 Assessment or evaluation of speech recognition systems
G10L 15/02 Feature extraction for speech recognition; Selection of recognition unit
G10L 15/04 Segmentation; Word boundary detection
G10L 15/05 Word boundary detection
G10L 15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/07 Adaptation to the speaker
G10L 15/08 Speech classification or search
G10L 15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
G10L 15/12 Speech classification or search using dynamic programming techniques, e.g. dynamic time warping [DTW]
G10L 15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMM]
G10L 15/16 Speech classification or search using artificial neural networks
G10L 15/18 Speech classification or search using natural language modelling
G10L 15/19 Grammatical context, e.g. disambiguation of recognition hypotheses based on word sequence rules
G10L 15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialog
G10L 15/24 Speech recognition using non-acoustical features
G10L 15/25 Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
G10L 15/26 Speech to text systems
G10L 15/28 Constructional details of speech recognition systems
G10L 15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
G10L 15/32 Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
G10L 15/34 Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing
G10L 15/065 Adaptation
G10L 15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
G10L 15/187 Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams
G10L 15/193 Formal grammars, e.g. finite state automata, context free grammars or word networks
G10L 15/197 Probabilistic grammars, e.g. word n-grams
G10L 17/00 Speaker identification or verification
G10L 17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L 17/04 Training, enrolment or model building
G10L 17/06 Decision making techniques; Pattern matching strategies
G10L 17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
G10L 17/10 Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
G10L 17/12 Score normalisation
G10L 17/14 Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
G10L 17/16 Hidden Markov models [HMM]
G10L 17/18 Artificial neural networks; Connectionist approaches
G10L 17/20 Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions
G10L 17/22 Interactive procedures; Man-machine interfaces
G10L 17/24 Interactive procedures; Man-machine interfaces, the user being prompted to utter a password or a predefined phrase
G10L 17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
G10L 19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
G10L 19/002 Dynamic bit allocation
G10L 19/03 Spectral prediction for preventing pre-echo; Temporary noise shaping [TNS], e.g. in MPEG2 or MPEG4
G10L 19/04 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
G10L 19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm
G10L 19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
G10L 19/07 Line spectrum pair [LSP] vocoders
G10L 19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
G10L 19/09 Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor
G10L 19/10 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters, the excitation function being a multipulse excitation
G10L 19/012 Comfort noise or silence coding
G10L 19/13 Residual excited linear prediction [RELP]
G10L 19/14 Details not provided for in groups G10L 19/06-G10L 19/12, e.g. gain coding, post filtering design or vocoder structure
G10L 19/16 Vocoder architecture
G10L 19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal
G10L 19/20 Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
G10L 19/022 Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring
G10L 19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
G10L 19/025 Detection of transients or attacks for time/frequency resolution switching
G10L 19/26 Pre-filtering or post-filtering
G10L 19/028 Noise substitution, e.g. substituting non-tonal spectral components by noisy source
G10L 19/032 Quantisation or dequantisation of spectral components
G10L 19/035 Scalar quantisation
G10L 19/038 Vector quantisation, e.g. TwinVQ audio
G10L 19/083 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters, the excitation function being an excitation gain
G10L 19/087 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC
G10L 19/093 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using sinusoidal excitation models
G10L 19/097 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters using prototype waveform decomposition or prototype waveform interpolative [PWI] coders
G10L 19/107 Sparse pulse excitation, e.g. by using algebraic codebook
G10L 19/113 Regular pulse excitation
G10L 19/125 Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP] 
G10L 19/135 Vector sum excited linear prediction [VSELP]
G10L 21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
G10L 21/01 Correction of time axis
G10L 21/02 Speech enhancement, e.g. noise reduction or echo cancellation
G10L 21/003 Changing voice quality, e.g. pitch or formants
G10L 21/04 Time compression or expansion
G10L 21/06 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
G10L 21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
G10L 21/10 Transforming into visible information
G10L 21/12 Transforming into visible information by displaying time domain information
G10L 21/013 Adapting to target pitch
G10L 21/14 Transforming into visible information by displaying frequency domain information
G10L 21/16 Transforming into a non-visible representation
G10L 21/18 Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids - Details of the transformation process
G10L 21/028 Voice signal separating using properties of sound source
G10L 21/034 Automatic adjustment
G10L 21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques
G10L 21/043 Time compression or expansion by changing speed
G10L 21/045 Time compression or expansion by changing speed using thinning out or insertion of a waveform
G10L 21/047 Time compression or expansion by changing speed using thinning out or insertion of a waveform characterised by the type of waveform to be thinned out or inserted
G10L 21/049 Time compression or expansion by changing speed using thinning out or insertion of a waveform characterised by the interconnection of waveforms
G10L 21/055 Time compression or expansion for synchronising with other signals, e.g. video signals
G10L 21/057 Time compression or expansion for improving intelligibility
G10L 21/0208 Noise filtering
G10L 21/0216 Noise filtering characterised by the method used for estimating noise
G10L 21/0224 Processing in the time domain
G10L 21/0232 Processing in the frequency domain
G10L 21/0264 Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
G10L 21/0272 Voice signal separating
G10L 21/0308 Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
G10L 21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
G10L 21/0324 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude - Details of processing therefor
G10L 21/0332 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude - Details of processing therefor involving modification of waveforms
G10L 21/0356 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for synchronising with other signals, e.g. video signals
G10L 21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
G10L 21/0388 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques - Details of processing therefor
G10L 23/00 Speech analysis not provided for in other groups of this subclass
G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups
G10L 25/03 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters
G10L 25/06 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being correlation coefficients
G10L 25/09 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being zero crossing rates
G10L 25/12 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being prediction coefficients
G10L 25/15 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being formant information
G10L 25/18 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/21 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being power information
G10L 25/24 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being the cepstrum
G10L 25/27 Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique
G10L 25/30 Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 25/33 Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using fuzzy logic
G10L 25/36 Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using chaos theory
G10L 25/39 Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using genetic algorithms
G10L 25/45 Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of analysis window
G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use
G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 25/54 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for retrieval
G10L 25/57 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for processing of video signals
G10L 25/60 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
G10L 25/63 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
G10L 25/66 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
G10L 25/69 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for evaluating synthetic or decoded voice signals
G10L 25/72 Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for transmitting results of analysis
G10L 25/75 Speech or voice analysis techniques not restricted to a single one of groups for modelling vocal tract parameters
G10L 25/78 Detection of presence or absence of voice signals
G10L 25/81 Detection of presence or absence of voice signals for discriminating voice from music
G10L 25/84 Detection of presence or absence of voice signals for discriminating voice from noise
G10L 25/87 Detection of discrete points within a voice signal
G10L 25/90 Pitch determination of speech signals
G10L 25/93 Discriminating between voiced and unvoiced parts of speech signals
G10L 99/00 Subject matter not provided for in other groups of this subclass
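
The subgroup hierarchy above is carried only by the symbols themselves (for example, G10L 15/05 narrows G10L 15/04, which in turn sits under G10L 15/00). As a purely illustrative sketch, and not part of any official IPC tooling, a flat code-to-title mapping plus a prefix check is enough to resolve a symbol from this table and list the subgroups of a main group; the dictionary contents below copy a few entries from the listing, and the helper names are hypothetical.

```python
# Illustrative only: a few G10L entries copied from the table above, keyed by symbol.
# The dictionary and helper names are hypothetical, not an official IPC API.
G10L_TITLES = {
    "G10L 15/00": "Speech recognition",
    "G10L 15/04": "Segmentation; Word boundary detection",
    "G10L 15/05": "Word boundary detection",
    "G10L 15/08": "Speech classification or search",
    "G10L 15/14": "Speech classification or search using statistical models, "
                  "e.g. Hidden Markov Models [HMM]",
}

def title_of(symbol: str) -> str:
    """Return the title of an IPC symbol, if it appears in this excerpt."""
    return G10L_TITLES.get(symbol, "<not in this excerpt>")

def subgroups(main_group: str) -> list[str]:
    """Return the subgroup symbols listed under a main group, e.g. 'G10L 15'."""
    return [s for s in G10L_TITLES
            if s.startswith(main_group + "/") and not s.endswith("/00")]

if __name__ == "__main__":
    print(title_of("G10L 15/14"))   # HMM-based classification or search
    print(subgroups("G10L 15"))     # subgroups of G10L 15/00 present in the excerpt
```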