Resume
Education
2017 - Present
Korea University
Department of Electrical Engineering
Integrated Ph.D. program
Seoul, Korea
2011 - 2017
Hallym University
Department of Electrical Engineering
Bachelor's Degree
Chuncheon, Korea
Research Interests
Speech Processing
✓ Emotion recognition
✓ Speech recognition
✓ Acoustic event recognition
✓ Machine learning
Multimodal
✓ Emotion recognition
✓ Multimodal fusion using text, audio, and images
Professional Skillset
Real-time speech recognition (English and Korean versions) using the Kaldi toolkit
Sound source localization using GCC-PHAT
GCC-NMF speech separation
DNN-based emotion recognition using ensemble techniques
Rule-based chatbot
Analysis of LSTM-based emotion recognition data using influence functions
Acoustic event recognition using transfer learning
Emotion recognition via speech-image fusion using an attention model
Research Experiences
• Emotion recognition (2017 ~ )
- Non-speech emotion recognition (e.g., laughing, sighing, screaming, yawning, crying)
- Speech emotion recognition (e.g., anger, boredom, disgust, fear, happiness, neutral, sadness)
- Extraction and classification of continuous acoustic features, including pitch-related features, formants, energy-related features, and timing features
- Transfer learning to an emotion audio set from a massive YouTube audio set using a VGG model
- Models used: CRNN (CNN-RNN), BiLSTM, and self-attention
- Overfitting prevention: speaker-independent cross-validation, batch normalization, early stopping, L2 regularization loss, and dropout
- Audio feature sets extracted with the openSMILE toolkit: eGeMAPS, IS09, IS10, IS11, IS13, MFCC, and log-Mel features
- Emotion recognition via speech-image fusion using an attention model
- Analysis of LSTM-based emotion recognition data using influence functions
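As an illustration of the self-attention idea listed above, here is a minimal sketch of attention pooling over frame-level features; the shapes, the random scoring vector, and the 40-bin log-Mel assumption are all invented for the demo, and this is not the project's actual model code:

```python
import numpy as np

def attention_pool(frames, w):
    """Collapse (T, D) frame features into one (D,) utterance vector.

    frames: (T, D) frame-level features (e.g., log-Mel frames)
    w:      (D,) scoring vector (learned in practice; random here)
    """
    scores = frames @ w                            # one score per frame
    scores = scores - scores.max()                 # numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax attention weights
    return alpha @ frames                          # weighted sum over time

rng = np.random.default_rng(0)
feats = rng.normal(size=(120, 40))  # 120 frames, 40 bins (assumed shapes)
w = rng.normal(size=40)
utt_vec = attention_pool(feats, w)
print(utt_vec.shape)  # (40,)
```

The utterance-level vector would then feed a classifier over the emotion categories.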
• Speech recognition (2017 ~ )
- Implemented real-time speech recognition (English and Korean versions) using the Kaldi toolkit
- Developed in an Ubuntu environment using shell scripts and Python
- The English version used the WSJ (Wall Street Journal) corpus and the LDC93S6B dataset
- The Korean version used approximately 600 hours of data (ETRI 2002, ETRI 2003, ETRI 2011 SMART MOBILE, SITEC CAR, PBW452, data augmentation, self-collected data, etc.)
- Acoustic model: MFCC (Mel-frequency cepstral coefficient) speech features are extracted
- Language model: tri-gram
- Generated the HCLG.fst decoding graph by composing the language and acoustic models using WFSTs
- Trained with a deep neural network
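A toy sketch of the tri-gram counting that underlies the language model above; real Kaldi recipes build ARPA-format LMs with dedicated tools, and the sentence here is invented:

```python
from collections import Counter

def trigram_counts(tokens):
    # pad so sentence-initial words get context symbols
    padded = ["<s>", "<s>"] + tokens + ["</s>"]
    return Counter(zip(padded, padded[1:], padded[2:]))

counts = trigram_counts("the cat sat on the mat".split())
# maximum-likelihood estimate: P(w3 | w1 w2) = c(w1 w2 w3) / c(w1 w2),
# smoothed in practice before composition into the decoding graph
print(counts[("the", "cat", "sat")])  # 1
```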
• Source localization (2017 ~ )
- Implemented sound source localization using GCC-PHAT
- Used a horizontal array of four microphones
- Estimated direction from the cross-correlations of the signals arriving at the four microphones
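The GCC-PHAT time-delay estimation above can be sketched as follows. This is a hedged toy version: the noise source, sampling rate, and 5-sample delay are assumptions, and a real system maps the pairwise delays from the four-microphone array to an arrival angle:

```python
import numpy as np

def gcc_phat(x, y, fs):
    """Return the delay (seconds) of x relative to y via GCC-PHAT."""
    n = len(x) + len(y)                     # zero-pad to avoid wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    R = X * np.conj(Y)
    R /= np.abs(R) + 1e-12                  # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

fs = 16000
rng = np.random.default_rng(0)
src = rng.normal(size=fs)                   # 1 s synthetic noise source
delay = 5                                   # inter-microphone delay (samples)
mic2 = src
mic1 = np.concatenate((np.zeros(delay), src[:-delay]))  # delayed copy
tau = gcc_phat(mic1, mic2, fs)
print(round(tau * fs))  # 5
```

The phase-transform weighting whitens the cross-spectrum, sharpening the correlation peak in reverberant conditions.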
• Speech Separation with GCC-NMF (2017)
- Source separation using GCC-NMF, which combines unsupervised dictionary learning via non-negative matrix factorization (NMF) with spatial localization via the generalized cross-correlation (GCC) method
- Used publicly available data from the SiSEC signal separation evaluation campaign
- Implemented in Python
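A minimal sketch of the NMF half of GCC-NMF, using multiplicative updates for the Euclidean cost; the random "spectrogram", rank, and iteration count are assumptions, and the spatial GCC masking step is omitted:

```python
import numpy as np

def nmf(V, k, iters=200, seed=0):
    """Factor a nonnegative matrix V (freq x time) as V ~ W @ H."""
    rng = np.random.default_rng(seed)
    f, t = V.shape
    W = rng.uniform(0.1, 1.0, size=(f, k))   # dictionary atoms
    H = rng.uniform(0.1, 1.0, size=(k, t))   # per-frame activations
    for _ in range(iters):
        # multiplicative updates preserve nonnegativity
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

V = np.abs(np.random.default_rng(1).normal(size=(64, 100)))  # toy magnitudes
W, H = nmf(V, k=8)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape)  # (64, 8) (8, 100)
```

In GCC-NMF, each learned atom is then assigned to a source according to the GCC-PHAT localization evidence.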
• Rule-based Chatbot (2017)
- Used the RiveScript artificial intelligence scripting language in Python
- Organized Korean-language scenarios for a hamburger shop, weather queries, and daily conversation
- Built a server and client in Python and C communicating over network sockets
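In the same spirit as the RiveScript rules (though not RiveScript itself), a rule-based matcher can be sketched in plain Python; the burger-shop patterns and replies below are invented examples:

```python
import re

# ordered list of (pattern, reply template) rules; first match wins
RULES = [
    (re.compile(r"\bhello\b", re.I), "Hi! Welcome to the burger shop."),
    (re.compile(r"\border (a |an )?(?P<item>\w+)", re.I),
     "One {item}, coming right up!"),
    (re.compile(r"\bweather\b", re.I), "It looks sunny today."),
]

def reply(message):
    for pattern, template in RULES:
        m = pattern.search(message)
        if m:
            # fill captured slots (e.g., the ordered item) into the template
            return template.format(**m.groupdict())
    return "Sorry, I did not understand that."

print(reply("I'd like to order a cheeseburger"))
# One cheeseburger, coming right up!
```

RiveScript expresses the same trigger-and-reply pattern in a dedicated script syntax rather than Python data structures.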
Projects
Korea University Robot (speech recognition, source localization, sockets, chatbot)
Development of natural language understanding and emotional dialogue technology for robots
Domestic Conferences
1. 이상현, 문성규, 이영로, 고한석, "Analysis of RNN-based Emotion Recognition Data Using Influence Functions," Proceedings of the 35th Conference on Speech Communication and Signal Processing, p. 20, August 2018.
2. 이상현, 신민규, 고한석, "Emotion Recognition Using a DNN-based Ensemble Technique," Proceedings of the Acoustical Society of Korea Fall Conference, p. 11, November 2017.
3. 김재동, 이상현, 고한석, "Analysis of Deep Neural Network Training Data Using Influence Functions," IEIE Conference, no. 11, 2018.
4. 김재동, 이상현, 고한석, "Optimization of Convolutional Neural Networks (CNN) for Emotion Recognition," IEIE Conference, no. 6.2, 2019.
5. 이상현, 김재동, 고한석, "Attention Model-based Audio-Video Fusion for Emotion Recognition," no. 6, 2019.
Domestic Journal
1. [KCI Excellent Accredited]
이상현, 김재동, 고한석, "End-to-end CRNN-GLU-ATT Model for Robust Emotion Features," Journal of the Institute of Electronics and Information Engineers, vol. 57, no. 10, pp. 45-55, October 2020
International Journal
1. [SCI-E] Sanghyun Lee, David K. Han and Hanseok Ko, "Fusion-ConBERT: Parallel Convolution and BERT Fusion for Speech Emotion Recognition," Sensors, vol. 20, November 2020 [IF=3.275]