Feel free to dive into any section that interests you :-)
Table of contents
- What is the difference between sound power and sound intensity?
- How do we convert sound pressure between dB SPL and pascals (Pa)?
- What’s the difference between dB SPL and dB(A)?
- How do density and elasticity of a medium affect sound speed?
- What is the Doppler effect, and how does it work?
- What is room impulse response (RIR), and how is it measured?
- What are the effects of reverberation in room acoustics?
- How is reverberation measured (RT60)?
- How can we simulate reverberation digitally?
- What methods are used to analyze time delay in audio signals?
- What should you consider when choosing a microphone?
- How do you calibrate a microphone?
- What is an Anti-Aliasing filter?
- What are typical sampling rates and bit depths for audio?
- What are the common interfaces used in digital audio systems?
- How do FIR and IIR filters differ?
- What does the filtfilt function do?
- How does a preamplifier work in a microphone setup?
- How is zero-phase filtering done, and what are its benefits?
- How can we test the stability of digital filters?
- What is signal energy, and how do we calculate it?
- What are the uses of ZCR and FFT in audio analysis?
- How can we estimate the pitch of speech?
- What are common audio features, and how do we extract them?
- How can we test the similarity between two audio signals?
- What is STFT, and how is it done?
- What are the key considerations when implementing STFT?
- What is MFCC used for in audio processing?
- How does the number of quantizer levels change the dynamic range?
- How does ADPCM work?
- What is LPC, and how does it represent speech?
- How is mu-law quantization different from linear quantization?
- How does spectral subtraction work?
- What is the Wiener filtering method?
- When is wavelet-based denoising useful?
- What is Speech Presence Probability (SPP), and how is it used?
- How is adaptive filtering used for noise reduction and echo cancellation?
- What are the challenges in sound classification?
- How is deep learning used in sound classification?
- What metrics evaluate classification models?
- What deep networks are common for speech enhancement?
- How is phase handled in speech enhancement?
- Why might MSE not be the best loss function?
- What metrics evaluate speech enhancement models?
- What’s the difference between diarization, identification, and verification?
- What networks are used for speaker recognition?
- What are speaker embeddings, and how are they used?
- How are x-vectors different from i-vectors?
- What methods are used for speech recognition?
- How is audio prepared for speech recognition?
- How are speech recognition models evaluated?
- How does Whisper use weak supervision?
- What is the Whisper model architecture?
- What are the key features and differences between Wav2Vec models?
- How does CTC decoding help Wav2Vec?
- What’s the role of Beam Search in Wav2Vec?