Tokyo City University

Researchers Information System

IWANO Koji

Books etc  
No. Title Author Type Publisher Publication date Range ISBN
1DSP for In-Vehicle and Mobile Systems Contributor Springer-Verlag Jan. 2005 Chapter 9, pp.139-152, Noise Robust Speech Recognition Using Prosodic Information 0387229787
2Spoken Multimodal Human-Computer Dialogue in Mobile Environments Contributor Springer-Verlag Jan. 2005 Chapter 3, pp.37-53, A Robust Multimodal Speech Recognition Method Using Optical Flow Analysis 1402030738
3Text to Speech Synthesis - New Paradigms and Advances - Contributor Prentice Hall PTR Jul. 2004 Chapter 8, pp.155-173, Prosody Control for HMM-Based Japanese TTS 013145661X

 

Published Papers  
No. Title Journal Vol No Start Page End Page Publication date DOI Referee
1Multimodal Speech Recognition Using Mouth Images from Depth Camera Proc. APSIPA pp. 1233 1236 Dec. 2017 https://doi.org/10.1109/APSIPA.2017.8282227 Refereed
2Error correction using long context match for smartphone speech recognition IEICE Transactions on Information and Systems E98D 11 1932 1942 Nov. 1, 2015 https://doi.org/10.1587/transinf.2015EDP7179 Refereed
3Error Correction Using Long Context Match for Smartphone Speech Recognition IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E98D 11 1932 1942 Nov. 2015 https://doi.org/10.1587/transinf.2015EDP7179 Refereed
4AN EFFICIENT ERROR CORRECTION INTERFACE FOR SPEECH RECOGNITION ON MOBILE TOUCHSCREEN DEVICES 2014 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY SLT 2014 454 459 2014  Refereed 
5Simple Gesture-based Error Correction Interface for Smartphone Speech Recognition 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4 1194 1198 2014  Refereed 
6Feature normalization based on non-extensive statistics for speech recognition Speech Communication 55 587 599 Jun. 2013 https://doi.org/10.1016/j.specom.2013.02.004 Refereed
7A noise-robust speech recognition approach incorporating normalized speech/non-speech likelihood into hypothesis scores SPEECH COMMUNICATION 55 377 386 Feb. 2013 https://doi.org/10.1016/j.specom.2012.10.001 Refereed
8Detection of overlapped speech using lapel microphones in meeting Speech Communication 55 10 941 949 2013 https://doi.org/10.1016/j.specom.2013.06.013 Refereed
9Spectral subtraction based on non-extensive statistics for speech recognition IEICE Transactions on Information and Systems E96-D 1774 1782 2013 https://doi.org/10.1587/transinf.E96.D.1774 Refereed
10Overlapped Speech Detection in Meeting Using Cross-Channel Spectral Subtraction and Spectrum Similarity 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 1498 1501 2012  Refereed 
11Q-Gaussian based spectral subtraction for robust speech recognition 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3 1254 1257 2012  Refereed 
12VAD-measure-embedded Decoder with Online Model Adaptation 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4 3122 2010  Refereed 
13Optimization of On-the-Fly Composition for WFST-Based Speech Recognition Decoders The IEICE Transactions on Information and Systems Vol.J92-D No.7 1026 1035 2009  Not refereed 
14Robust Speech Recognition Using VAD-measure-embedded Decoder INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5 2203 2009  Refereed 
15GENERALIZATION OF SPECIALIZED ON-THE-FLY COMPOSITION 2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1- 8, PROCEEDINGS 4317 4320 2009  Refereed 
16Differences between acoustic characteristics of spontaneous and read speech and their effects on speech recognition performance COMPUTER SPEECH AND LANGUAGE 22 171 184 Apr. 2008 https://doi.org/10.1016/j.csl.2007.07.003 Refereed
17Evaluation of a noise-robust multi-stream speaker verification method using F(0) information IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E91D 549 557 Mar. 2008 https://doi.org/10.1093/ietisy/e91-d.3.549 Refereed
18Language Model Adaptation Using Machine-Translated Text for Resource-Deficient Languages EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING Vol.2008 Article ID 573832 7 pages 2008 https://doi.org/10.1155/2008/573832 Refereed
19Thai Broadcast News Corpus Construction and Evaluation SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008 1249 1254 2008  Refereed 
20Implementation and Evaluation of Fast On-the-fly WFST Composition Algorithms INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5 2110 2113 2008  Refereed 
21Dynamic Language Model Adaptation Using Presentation Slides for Lecture Speech Recognition INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 89 92 2007  Refereed 
22Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING Vol.2007 Article ID 64506 9 pages 2007 https://doi.org/10.1155/2007/64506 Refereed
23The effect of spectral space reduction in spontaneous speech on recognition performances 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 473 2007  Refereed 
24Combining Gaussian mixture model with Global Variance term to improve the quality of an HMM-based polyglot speech synthesizer 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 1241 2007  Refereed 
25Presentation-Content Retrieval Integrated with the Speech Information IEICE Transactions on Information and Systems Vol.J90-D No.2 209 222 2007  Not refereed 
26New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer SPEECH COMMUNICATION 48 10 1227 1242 Oct. 2006 https://doi.org/10.1016/j.specom.2006.05.003 Refereed
27Sentence-extractive automatic speech summarization and evaluation techniques SPEECH COMMUNICATION 48 1151 1161 Sep. 2006 https://doi.org/10.1016/j.specom.2006.04.005 Refereed
28A Weight Estimation Method Using LDA for Multi-Band Speech Recognition INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5 2534 2537 2006  Refereed 
29A stream-weight and threshold estimation method using Adaboost for multi-stream speaker verification 2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS 1081 2006  Refereed 
30A stream-weight and threshold estimation method using adaboost for multi-stream speaker verification 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13 5939 5942 2006  Refereed 
31Analysis and recognition of spontaneous speech using Corpus of Spontaneous Japanese SPEECH COMMUNICATION 47 1-2 208 219 Sep. 2005 https://doi.org/10.1016/j.specom.2005.02.010 Refereed
32Sentence extraction-based presentation summarization techniques and evaluation metrics 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5 1065 1068 2005  Refereed 
33A ROBUST MULTIMODAL SPEECH RECOGNITION METHOD USING OPTICAL FLOW ANALYSIS SPOKEN MULTIMODAL HUMAN-COMPUTER DIALOGUE IN MOBILE ENVIRONMENTS 28 37 53 2005  Refereed 
34Why is the recognition of spontaneous speech so hard? TEXT, SPEECH AND DIALOGUE, PROCEEDINGS 3658 22 2005  Refereed 
35A stream-weight optimization method for multi-stream HMMS based on likelihood value normalization 2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5 469 472 2005  Refereed 
36Polyglot synthesis using a mixture of monolingual corpora ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings I1 I4 2005 https://doi.org/10.1109/ICASSP.2005.1415035 Refereed
37Noise robust speech recognition using F-0 contour information IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS E87D 1102 1109 May. 2004  Refereed 
38Multi-modal speech recognition using optical-flow analysis for lip images JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY 36 2-3 117 124 Feb. 2004  Refereed 
39A stream-weight optimization method for audio-visual speech recognition using multi-stream HMMS 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS 857 860 2004  Refereed 
40Unsupervised class-based language model adaptation for spontaneous speech recognition 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS 236 239 2003  Refereed 
41Parallel computing-based architecture for mixed-initiative spoken dialogue FOURTH IEEE INTERNATIONAL CONFERENCE ON MULTIMODAL INTERFACES, PROCEEDINGS 53 58 2002  Refereed 
42Ubiquitous speech processing 2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS 13 16 2001  Refereed 
43Detection of prosodic word boundaries by statistical modeling of mora transitions of fundamental frequency contours and its use for continuous speech recognition ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings 1763 1766 2000 https://doi.org/10.1109/ICASSP.2000.862094 Refereed
44Integration of Prosodic Word Boundary Detection to Unlimited-Vocabulary Speech Recognition IEICE Transactions on Information and Systems Vol.J83-D-II No.10 1977 1985 2000  Not refereed 
45A Statistical Modeling of Fundamental Frequency Contours in Moraic Unit and Its Use for the Detection of Prosodic Word Boundaries IPSJ Journal Vol.40 No.4 1356 1364 1999  Not refereed 

 

MISC  
No. Title Journal Vol No Start Page End Page Publication date
1Analysis of effects of voice mimicry on speaker verification and acoustic features of the imitated voices IEICE technical report. Speech 114 411 43 48 Jan. 22, 2015 
2Error Correction Using Long Context Match for Smartphone Speech Recognition IEICE technical report. Speech 114 365 117 122 Dec. 15, 2014 
3Error Correction Using Long Context Match for Smartphone Speech Recognition IPSJ SIG Notes 2014 22 Dec. 8, 2014 
4Detecting Overlapped Speech in Meeting Recorded by Lapel Microphones 2012 Jul. 12, 2012 
5Two-pass Approach for Recognizing Code-Switching Speech Technical report of IEICE. PRMU 111 430 225 229 Feb. 2, 2012 
6Nonlinear Normalization Using q-Logarithm for Robust Speech Recognition IEICE technical report 111 153 45 50 Jul. 14, 2011 
7Noise-robust speech recognition decoder using speech/non-speech confidence measures IEICE technical report 110 81 49 54 Jun. 10, 2010 
8A Prosody Adaptation Method for HMM-based Speech Synthesis Achieving High Naturalness and Individuality 2010 12 Feb. 5, 2010
9A mean F_0 speaker adaptation method for regression model-based F_0 contour generation IEICE technical report 109 99 87 92 Jun. 17, 2009 
10A study on prosody control for spontaneous speech synthesis 2009 23 May. 14, 2009 
11Speeding up fundamental frequency information extraction by Hough transform for noise-robust speech recognition IEICE technical report 108 422 19 24 Jan. 22, 2009 
12Improvements and evaluations of on-the-fly WFST composition in speech recognition IPSJ SIG Notes 2008 102 29 34 Oct. 17, 2008 
13Accent analysis for Mandarin large vocabulary continuous speech recognition 38 123 127 Mar. 20, 2008 
14Accent Analysis for Mandarin Large Vocabulary Continuous Speech Recognition IEICE technical report 107 551 87 91 Mar. 13, 2008 
15Initial Evaluation of the Drivers' Japanese Speech Corpus in a Car Environment IEICE technical report 107 551 93 98 Mar. 13, 2008 
16Speaker verification using multi-stream HMMs with dimensionally weighted feature vectors IEICE technical report. Speech 107 406 43 47 Dec. 13, 2007 
17A Study on Multimodal Speech Recognition for Spoken Dialogue Systems IEICE technical report 107 77 19 24 May. 24, 2007 
18A Study on the Statistical Models for HMM-Based Spontaneous Speech Synthesis IEICE technical report 107 77 13 18 May. 24, 2007 
19Using presentation slide information for lecture speech recognition IPSJ SIG Notes 2006 136 221 226 Dec. 22, 2006 
20Using presentation slide information for lecture speech recognition 106 442 43 48 Dec. 15, 2006 
21The Analysis of Acoustic and Linguistic Characteristics in Spontaneous Japanese IEICE technical report 106 78 19 24 May. 19, 2006 
22An LDA-based Weight Estimation Method for Multi-Band Speech Recognition IEICE technical report 106 78 13 18 May. 19, 2006 
23Spoken dialogue system robust against speech variations based on massively parallel computing IEICE technical report 105 494 Dec. 22, 2005 
24HMM-based speaker adaptable polyglot synthesizer : Development and evaluation IPSJ SIG Notes 2005 127 217 222 Dec. 22, 2005 
25Spoken dialogue system robust against speech variations based on massively parallel computing IPSJ SIG Notes 2005 127 91 96 Dec. 22, 2005 
26HMM-based speaker adaptable polyglot synthesizer : Development and evaluation IEICE technical report 105 494 127 132 Dec. 22, 2005 
27A threshold optimization method based on Adaboost for multi-stream speaker verification IEICE technical report 105 495 Dec. 21, 2005 
28A threshold optimization method based on Adaboost for multi-stream speaker verification IPSJ SIG Notes 2005 127 Dec. 21, 2005 
29Sentence Extraction-Based Speech Summarization Methods and Objective Evaluation Techniques IEICE technical report. Speech 105 132 Jun. 16, 2005 
30Language Model Adaptation for ASR Using Machine-Translated Data IEICE technical report. Speech 105 132 19 23 Jun. 16, 2005 
31Toward realization of HMM-based spontaneous speech synthesis IEICE technical report. Speech 105 98 25 30 May. 20, 2005 
32Analysis of cepstral features of Japanese spontaneous speech using Mahalanobis distance 2005 231 232 Mar. 8, 2005 
33Evaluation of speech summarization techniques using objective metrics 2005 Mar. 8, 2005 
34Addition of new languages to a polyglot HMM-based synthesizer 2005 197 198 Mar. 8, 2005 
35A study on automatic lecture segmentation for indexing purposes 2005 Mar. 8, 2005 
36A stream-weight optimization method for audio-visual speech recognition in real environments IPSJ SIG Notes 2005 12 29 34 Feb. 4, 2005 
37A stream-weight optimization method based on boosting for multi-stream speaker verification IEICE technical report. Natural language understanding and models of communication 104 539 85 90 Dec. 21, 2004 
38A stream-weight optimization method based on boosting for multi-stream speaker verification IPSJ SIG Notes 2004 131 175 180 Dec. 21, 2004
39Analysis of acoustic characteristics in spontaneous speech using Corpus of Spontaneous Japanese IPSJ SIG Notes 2004 103 12 Oct. 22, 2004
40Use of F0 information for noise-robust speaker verification IPSJ SIG Notes 2004 57 31 36 May. 28, 2004
41Use of F_0 information for noise-robust speaker verification IEICE technical report. Speech 104 87 May. 21, 2004 
42Noise-robust speech recognition using band-dependent weighted likelihood IPSJ SIG Notes 2003 124 19 24 Dec. 18, 2003
43Investigation of a stream-weight optimization method for multi-modal speech recognition IPSJ SIG Notes 2003 124 241 246 Dec. 18, 2003
44Investigation of a stream-weight optimization method of multi-modal speech recognition IEICE technical report. Natural language understanding and models of communication 103 517 241 246 Dec. 11, 2003 
45Noise-robust speech recognition using band-dependent weighted likelihood IEICE technical report. Natural language understanding and models of communication 103 517 19 24 Dec. 11, 2003 
46Multi-Modal Person Authentication Using Speech and Ear Images IEICE technical report. Speech 103 94 25 30 May. 30, 2003 
47A Multi-Modal Speech Recognition Using Side-Face Images IPSJ SIG Notes 2003 58 61 66 May. 27, 2003
48Use of Prosodic Information for Noise-Robust Speech Recognition IPSJ SIG Notes 2003 58 55 60 May. 27, 2003
49A Rapid Listening System for Presentations Using Automatic Speech Summarization Techniques IPSJ SIG Notes 2003 57 83 88 May. 26, 2003 
50Multi-modal speaker verification using speech and face images 2003 107 108 Mar. 18, 2003 
51Improvement of visual features for multi-modal speech recognition 2003 195 196 Mar. 18, 2003 
52Improving naturalness using residual excitation for HMM-based speech synthesis 2003 241 242 Mar. 18, 2003
53Unsupervised batch-type topic adaptation for language models 2003 129 130 Mar. 18, 2003 
54Multi-modal speaker verification using speech and ear images 2003 109 110 Mar. 18, 2003 
55Unsupervised batch-type adaptation method for language models IPSJ SIG Notes 2002 121 183 188 Dec. 16, 2002
56Unsupervised batch-type adaptation method for language models IEICE technical report. Natural language understanding and models of communication 102 528 19 24 Dec. 13, 2002 
57Robust F_0 Extraction for Noisy Environments and Its Use for Speech Recognition Technical report of IEICE. EA 102 33 37 42 Apr. 19, 2002 
58Evaluation of multi-modal speech recognition in real environments 2002 151 152 Mar. 18, 2002 
59Parallel computing-based meeting speech recognition system with incremental on-line speaker adaptation 2002 105 106 Mar. 18, 2002 
60A study on multi-modal speech recognition using optical-flow analysis 2002 10 33 38 Feb. 1, 2002
61Robust Pitch Extraction for Noisy Environments Using Hough Transformation IPSJ SIG Notes 2001 100 14 Oct. 19, 2001 
62A Study on F0 Contour Generation Factors Using Categorical Multiple Regression IPSJ SIG Notes 2001 100 15 20 Oct. 19, 2001 
63Pitch extraction using Hough transformation under noisy environments 2001 209 210 Oct. 1, 2001 
64A study on pitch contour generation factors using categorical multiple regression. 2001 221 222 Oct. 1, 2001 
65Multimodal speech recognition using optical-flow analysis 2001 27 28 Oct. 1, 2001 
66Meeting speech recognition system using parallel computing. 2001 113 114 Oct. 1, 2001 
67Development and Evaluation of a Spoken Dialog System Using Spontaneous Speech IPSJ SIG Notes 2001 55 79 86 Jun. 1, 2001 
68Use of Prosodic Word Boundary Information for Unlimited-Vocabulary Speech Recognition IEICE technical report. Natural language understanding and models of communication 99 524 73 78 Dec. 21, 1999 
69Use of Prosodic Word Boundary Information for Unlimited-Vocabulary Speech Recognition IPSJ SIG Notes 1999 108 205 210 Dec. 20, 1999
70Recognition of Family and Given Names of Unlimited Vocabulary Based on Prosodic Word Boundary Detection 1999 151 152 Mar. 1, 1999 
71Recognizing Accent Types and Detecting Prosodic Word Boundaries Using Statistical Models of Moraic Transition IEICE technical report. Speech 98 106 Jun. 12, 1998 
72Expression of Accent Phrases by Statistical Models of Moraic Transition 1998 153 154 Mar. 1, 1998 
73Improvements in Syntactic Boundary Detection by Statistical Models of Moraic Transition 1997 133 134 Sep. 1, 1997 
74Detecting Syntactic Boundaries Using Statistical Models of Moraic Transition IEICE technical report. Speech 97 114 33 40 Jun. 19, 1997 

 

Conference Activities & Talks  
No. Title Conference Publication date Promoter Venue
1Multimodal Speech Recognition and Analysis of Spontaneous Speech - Memories of Research in Furui Laboratory - Jan. 26, 2023 
2Analysis of Effects of Voice Mimicry Attack by Professional/Non-Professional Impersonators on Deep Learning Based Speaker Verification The 12th Symposium on Biometrics, Recognition and Authentication Nov. 16, 2022 
3Noise-Tolerant Time-Domain Speech Separation with Noise Bases Asia-Pacific Signal and Information Processing Association Annual Summit and Conference Dec. 16, 2021 
4Noise-robust time-domain speech separation with basis signals for noise Mar. 3, 2021 
5Team Takoyaki submission for VoxCeleb Speaker Recognition Challenge 2020 the VoxSRC Workshop 2020 Oct. 2020 
6A Kinect-based Multimodal Person Authentication System with User Existence Confirmation IEICE technical report Mar. 11, 2018
7Multimodal speech recognition using mouth images from depth camera Proceedings - 9th Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2017 Feb. 5, 2018 
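8Neural network-based estimation of degree of feeling that natural objects appear in photographic images IEICE technical report Sep. 28, 2017
9Multimodal Speech Recognition Based on a Deep Autoencoder Using Lip Depth Images Proceedings of the Acoustical Society of Japan Meeting (CD-ROM) Sep. 11, 2017
10Analysis of the Effects of Voice Mimicry by Professional Impersonators on Speaker Verification and of Its Acoustic Features IEICE technical report Aug. 23, 2017
11Multimodal Speech Recognition with a Deep Autoencoder Using Depth Images of the Lips IPSJ SIG Technical Reports (Web) Jul. 20, 2017
12Automatic Alignment for Analyzing the Relationship between Melody and Lyric Accent in Japanese Songs Proceedings of the IPSJ National Convention Mar. 16, 2017
13Analysis of the Effects of Voice Mimicry Attacks by Professional Impersonators on Speaker Verification Proceedings of the IPSJ National Convention Mar. 16, 2017
14Multimodal Person Identification in Video Using Speaker Recognition and Face Image Recognition Proceedings of the Acoustical Society of Japan Meeting (CD-ROM) Mar. 1, 2017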
15Analysis of Voice Imitation by Professional/Non-Professional Impersonators Based on Kullback–Leibler Divergence between Acoustic Models Joint Meeting of Acoustical Society of America and Acoustical Society of Japan Nov. 2016
16Improving Dialogue Group Detection and Speaker Determination for Conversational Speech Recorded with Multiple Smartphones IEICE technical report Aug. 17, 2016
17Music retrieval based on time structure information of musical instruments and musical instrument activity detection using Deep Neural Network IPSJ SIG Technical Reports (Web) May. 14, 2016
18TokyoTech at MediaEval 2016 Multimodal Person Discovery in Broadcast TV task CEUR Workshop Proceedings Jan. 1, 2016 
19An efficient error correction interface for speech recognition on mobile touchscreen devices 2014 IEEE Workshop on Spoken Language Technology, SLT 2014 - Proceedings Apr. 1, 2014 
20Simple gesture-based error correction interface for smartphone speech recognition Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH Jan. 1, 2014 
21Q-Gaussian based spectral subtraction for robust speech recognition 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 Dec. 1, 2012 
22Overlapped Speech Detection in Meeting Using Cross-Channel Spectral Subtraction and Spectrum Similarity 2012 
23An Efficient Prosody Adaptation Method and Its Application to HMM-based Speech Synthesis 2010 
24Development of a WFST based Speech Recognition System for a Resource Deficient Language Using Machine Translation 2009 
25Recent Development of WFST-Based Speech Recognition Decoder 2009 
26Robust Speech Recognition Using VAD-Measure-Embedded Decoder 2009 
27Noise Robust Speech Recognition Using Spectral Subtraction and F0 Information Extracted by Hough Transform 2009 
28Generalization of Specialized On-the-fly Composition 2009 
29Thai Broadcast News Corpus Construction and Evaluation 2008 
30Development of a Speech Recognition System for Icelandic Using Machine Translated Text 2008 
31Accent Analysis for Mandarin Large Vocabulary Continuous Speech Recognition 2008 
32Initial Evaluation of the Drivers' Japanese Speech Corpus in a Car Environment 2008 
33The Effect of Spectral Space Reduction in Spontaneous Speech on Recognition Performances 2007 
34Development of a Speech Recognition System Using a Sparse Training Corpus 2007 
35Acoustic and Linguistic Characterization of Spontaneous Speech 2007 
36Dynamic Language Model Adaptation Using Presentation Slides for Lecture Speech Recognition 2007 
37Combining Gaussian Mixture Model with Global Variance Term to Improve the Quality of an HMM-Based Polyglot Speech Synthesizer 2007 
38A Weight Estimation Method Using LDA for Multi-Band Speech Recognition 2006 
39Progress on a Speaker Adaptable Polyglot Synthesizer 2006 
40Acoustic and Linguistic Characterization of Spontaneous Speech 2006 
41A Stream-Weight and Threshold Estimation Method Using Adaboost for Multi-Stream Speaker Verification 2006 
42A Large Vocabulary Continuous Speech Recognition System for Indonesian Language 2006 
43New Approach to Polyglot Synthesis: How to Speak Any Language with Anyone's Voice 2006 
44Why is Automatic Recognition of Spontaneous Speech So Difficult? 2006 
45Multimodal Speaker Verification Using Ear Image Features Extracted by PCA and ICA 2005 
46Language Model Adaptation for Resource Deficient Language Using Translated Data 2005 
47Cross-Language Synthesis with a Polyglot Synthesizer 2005 
48Stream-Weight Optimization by LDA and Adaboost for Multi-Stream Speaker Verification 2005 
49Cluster-Based Modeling for Ubiquitous Speech Recognition 2005 
50Analysis of Spectral Space Reduction in Spontaneous Speech and Its Effects on Speech Recognition Performance 2005 
51Why Is the Recognition of Spontaneous Speech so Hard? 2005 
52Sentence Extraction-Based Presentation Summarization Techniques and Evaluation Metrics 2005 
53Sentence Extraction-Based Automatic Speech Summarization and Evaluation Techniques 2005 
54Toward Robust Multimodal Speech Recognition 2005 
55Speaker Adaptable Multilingual Synthesis 2005 
56Polyglot Synthesis Using a Mixture of Monolingual Corpora 2005 
57A Stream-Weight Optimization Method for Multi-Stream HMMs Based on Likelihood Value Normalization 2005 
58Improvement of Audio-Visual Speech Recognition in Cars 2004 
59A Stream-Weight Optimization Method for Audio-Visual Speech Recognition Using Multi-Stream HMMs 2004 
60Audio-Visual Speech Recognition Using New Lip Features Extracted from Side-Face Images 2004 
61Noise-Robust Speaker Verification Using F0 Features 2004 
62Unsupervised Class-Based Language Model Adaptation for Spontaneous Speech Recognition 2003 
63Unsupervised Language Model Adaptation Using Word Classes for Spontaneous Speech Recognition 2003 
64Noise Robust Speech Recognition Using Prosodic Information 2003 
65Audio-Visual Speech Recognition Using Lip Movement Extracted from Side-Face Images 2003 
66Audio-Visual Person Authentication Using Speech and Ear Images 2003 
67A Robust Multi-Modal Speech Recognition Method Using Optical-Flow Analysis 2002 
68Noise Robust Speech Recognition Using F0 Contour Extracted by Hough Transform 2002 
69Speech-Rate-Variable HMM-Based Japanese TTS System 2002 
70Parallel Computing-Based Architecture for Mixed-Initiative Spoken Dialogue 2002 
71Bimodal Speech Recognition Using Lip Movement Measured by Optical-Flow Analysis 2001 
72Ubiquitous Speech Processing 2001 
73Continuous Speech Recognition of Japanese Using Prosodic Word Boundaries Detected by Mora Transition Modeling of Fundamental Frequency Contours 2001 
74Detection of Prosodic Word Boundaries by Statistical Modeling of Mora Transitions of Fundamental Frequency Contours and Its Use for Continuous Speech Recognition 2000 
75Modeling and Generation of Accentual Phrase F0 Contours Based on Discrete HMMs Synchronized at Mora-unit Transitions 2000 
76Prosodic Word Boundary Detection Using Mora Transition Modeling of Fundamental Frequency Contours -Speaker Independent Experiments- 1999 
77Speaker-Independent Detection of Prosodic Word Boundary Using Mora Transition Modeling of Fundamental Frequency Contours 1999 
78Prosodic Word Boundary Detection Using Statistical Modeling of Moraic Fundamental Frequency Contours and Its Use for Continuous Speech Recognition 1999 
79Accent Type Recognition and Syntactic Boundary Detection of Japanese Using Statistical Modeling of Moraic Transitions of Fundamental Frequency Contours 1998 
80Representing Prosodic Words Using Statistical Models of Moraic Transition of Fundamental Frequency Contours of Japanese 1998 
81Detecting Phrase Boundaries by Low-Pass Filtering of Fundamental Frequency Contours 1997 
82A Method of Representing Fundamental Frequency Contours of Japanese Using Statistical Models of Moraic Transition 1997 
83Use of Prosodic Features in Speech Recognition 1996 

 

Awards & Honors  
No. Publication date Association Prize Subtitle
1Nov. 2022 The 12th Symposium on Biometrics, Recognition and Authentication Best Presentation Award Analysis of Effects of Voice Mimicry Attack by Professional/Non-Professional Impersonators on Deep Learning Based Speaker Verification 
2Nov. 2018 The 8th Symposium on Biometrics, Recognition and Authentication Best Presentation Award Analysis of Effects of Voice Mimicry Attack by Professional/Non-Professional Impersonators on Speaker Verification 

 

Research Grants & Projects  
No. Offer organization System name Title Fund classification Date
1A Study on High-Performance Speaker Verification in Wearable Computing Environments Competitive research funding 2009 - present
2A Study on High-Performance Speech Recognition Decoder Based on Weighted Finite-State Transducers Competitive research funding 2008 - present
3Japan Society for the Promotion of Science Grants-in-Aid for Scientific Research Speech Recognition for Computer-Supported Conference Systems under Ubiquitous/Wearable Computing Environment  2000 - 2002