Publications | Jianbo Ma

Publications by category in reverse chronological order. For more details, please visit my Google Scholar profile.

2025

Data-Driven White Noise Gain Constrained Robust Superdirective Beamformer for Speech Enhancement

Hanchen Pei, Gongping Huang, Jilu Jin, and 4 more authors

In ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025

@inproceedings{pei2025data,
  title = {Data-Driven White Noise Gain Constrained Robust Superdirective Beamformer for Speech Enhancement},
  author = {Pei, Hanchen and Huang, Gongping and Jin, Jilu and Ma, Jianbo and Wu, Zhizheng and Chen, Jingdong and Benesty, Jacob},
  booktitle = {ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages = {1--5},
  year = {2025},
  organization = {IEEE},
}

Rethinking mamba in speech processing by self-supervised models

Xiangyu Zhang, Jianbo Ma, Mostafa Shahin, and 2 more authors

In ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025

@inproceedings{zhang2025rethinking,
  title = {Rethinking mamba in speech processing by self-supervised models},
  author = {Zhang, Xiangyu and Ma, Jianbo and Shahin, Mostafa and Ahmed, Beena and Epps, Julien},
  booktitle = {ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages = {1--5},
  year = {2025},
  organization = {IEEE},
}

Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models

Saksham Singh Kushwaha, Jianbo Ma, Mark R. P. Thomas, and 2 more authors

In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025

@inproceedings{10888882,
  author = {Kushwaha, Saksham Singh and Ma, Jianbo and Thomas, Mark R. P. and Tian, Yapeng and Bruni, Avery},
  booktitle = {ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title = {Diff-SAGe: End-to-End Spatial Audio Generation Using Diffusion Models},
  year = {2025},
  pages = {1--5},
  keywords = {Measurement;Scalability;Spatial audio;Semantics;Phase estimation;Noise;Diffusion models;MONOS devices;Speech processing;Spectrogram;Spatial audio generation;Ambisonics},
  doi = {10.1109/ICASSP49660.2025.10888882},
}

2024

Gotta hear them all: Sound source aware vision to audio generation

Wei Guo, Heng Wang, Jianbo Ma, and 1 more author

arXiv preprint arXiv:2411.15447, 2024

@article{guo2024gotta,
  title = {Gotta hear them all: Sound source aware vision to audio generation},
  author = {Guo, Wei and Wang, Heng and Ma, Jianbo and Cai, Weidong},
  journal = {arXiv preprint arXiv:2411.15447},
  year = {2024},
}

A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model

Dongdi Zhao, Jianbo Ma, Lu Lu, and 6 more authors

arXiv preprint arXiv:2401.02673, 2024

@article{zhao2024unified,
  title = {A unified multichannel far-field speech recognition system: combining neural beamforming with attention based end-to-end model},
  author = {Zhao, Dongdi and Ma, Jianbo and Lu, Lu and Li, Jinke and Ji, Xuan and Zhu, Lei and Fang, Fuming and Liu, Ming and Jiang, Feijun},
  journal = {arXiv preprint arXiv:2401.02673},
  url = {https://arxiv.org/pdf/2401.02673},
  year = {2024},
}

V2a-mapper: A lightweight solution for vision-to-audio generation by connecting foundation models

Heng Wang, Jianbo Ma, Santiago Pascual, and 2 more authors

In Proceedings of the AAAI Conference on Artificial Intelligence, 2024

@inproceedings{wang2024v2a,
  title = {V2a-mapper: A lightweight solution for vision-to-audio generation by connecting foundation models},
  author = {Wang, Heng and Ma, Jianbo and Pascual, Santiago and Cartwright, Richard and Cai, Weidong},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {38},
  number = {14},
  pages = {15492--15501},
  year = {2024},
}

A low latency attention module for streaming self-supervised speech representation learning (second version of ’low latency attention’)

Jianbo Ma, Siqi Pan, Deepak Chandran, and 2 more authors

arXiv preprint arXiv:2302.13451, 2024

@article{ma2023low,
  title = {A low latency attention module for streaming self-supervised speech representation learning (second version of 'low latency attention')},
  author = {Ma, Jianbo and Pan, Siqi and Chandran, Deepak and Fanelli, Andrea and Cartwright, Richard},
  journal = {arXiv preprint arXiv:2302.13451},
  year = {2024},
  url = {https://arxiv.org/abs/2302.13451},
}

2023

Low latency transformers for speech processing

Jianbo Ma, Siqi Pan, Deepak Chandran, and 2 more authors

arXiv preprint arXiv:2302.13451, 2023

@article{ma2023lox,
  title = {Low latency transformers for speech processing},
  author = {Ma, Jianbo and Pan, Siqi and Chandran, Deepak and Fanelli, Andrea and Cartwright, Richard},
  journal = {arXiv preprint arXiv:2302.13451},
  year = {2023},
  url = {https://arxiv.org/abs/2302.13451},
}

2022

Hidden Markov Models and Connectionist Temporal Classification

Jianbo Ma

2022

2021

ASR technical report

Jianbo Ma

2021
