A Black-Box Analysis of the Extent of Time-Scale Modification Introduced by WEBRTC Adaptive Jitter Buffer and its Impact on Listening Speech Quality

doi:10.26552/com.C.2016.1.17-22

Communications - Scientific Letters of the University of Zilina 2016, 18(1):17-22 | DOI: 10.26552/com.C.2016.1.17-22

Yusuf Cinar¹, Hugh Melvin¹, Peter Pocta²: ¹ Discipline of Information Technology, College of Engineering & Informatics, National University of Ireland, Galway, Ireland; ² Department of Telecommunications and Multimedia, Faculty of Electrotechnical Engineering, University of Zilina, Slov

WebRTC is an open-source platform for real-time communications over the web that has seen widespread adoption in recent years. WebRTC clients apply time scaling to packets to cope with the impact of network jitter and/or clock skew. The black-box study presented in this paper focuses on two aspects: time-scale modification behaviour under different packet arrival intervals, and its impact on the listening quality perceived by the end user. Specifically, we examine the MOS values predicted by the POLQA speech quality prediction model. Our tests involve both iSAC and Opus, two of the widely used WebRTC codecs. In the experiment, a speech file played from one client is directed through a network simulation before reaching the receiving client. Surprisingly, our results show that the extent of time scaling is consistently higher for Opus, resulting in shorter speech files. Regarding the consequent impact on quality, we also find many cases where POLQA reports MOS predictions that contradict expert listener assessments.
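As an illustration of what "extent of time scaling" can mean in a black-box setting, the sketch below compares the duration of the reference speech file with that of the file recorded at the receiving client. This is only a minimal sketch under stated assumptions: the function name, its parameters, and the percentage-based metric are hypothetical and are not taken from the paper, which may quantify time scaling differently.

```python
def time_scale_extent(reference_samples: int, received_samples: int, sample_rate_hz: int) -> dict:
    """Compare reference and received speech durations (hypothetical metric).

    A negative change means the received file is shorter than the reference,
    i.e. the playout was compressed overall; a positive change means it was
    stretched overall.
    """
    if reference_samples <= 0 or received_samples < 0 or sample_rate_hz <= 0:
        raise ValueError("sample counts and sample rate must be positive")

    reference_s = reference_samples / sample_rate_hz
    received_s = received_samples / sample_rate_hz

    # Relative duration change in percent, with respect to the reference.
    change_percent = 100.0 * (received_s - reference_s) / reference_s

    return {
        "reference_s": reference_s,
        "received_s": received_s,
        "change_percent": change_percent,
    }


# Example with made-up numbers: a 10 s reference at 16 kHz received as 9.5 s.
result = time_scale_extent(160_000, 152_000, 16_000)
print(result)
```

With these made-up numbers the received file is 5% shorter than the reference. A net duration difference only captures the overall effect; compressions and expansions that cancel out within a file would not show up in it.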

Keywords: WebRTC; jitter; adaptive playout; time-scale modification

Published: February 29, 2016

Cinar, Y., Melvin, H., & Pocta, P. (2016). A Black-Box Analysis of the Extent of Time-Scale Modification Introduced by WEBRTC Adaptive Jitter Buffer and its Impact on Listening Speech Quality. Communications - Scientific Letters of the University of Zilina, 18(1), 17-22. doi: 10.26552/com.C.2016.1.17-22

This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, distribution, and reproduction in any medium, provided the original publication is properly cited. No use, distribution or reproduction is permitted which does not comply with these terms.