In this thesis, a novel approach is proposed for multi-speaker tracking by integrating audio and visual data in a particle filtering (PF) framework. This approach is further improved by adaptively estimating two critical parameters of the PF, namely the number of particles and the noise variance, based on the tracking error and the area occupied by the particles in the image. Here, it is assumed that the number of speakers is known and constant during tracking. To relax this assumption, random finite set (RFS) theory is used, owing to its ability to deal with the problem of tracking a variable number of speakers. However, its computational complexity increases exponentially with the number of speakers, so the probability hypothesis density (PHD) filter, a first-order approximation of the RFS, is applied with a sequential Monte Carlo (SMC), i.e. particle filter, implementation, whose computational complexity increases only linearly with the number of speakers. The SMC-PHD filter in visual tracking uses three types of particles (i.e. surviving, spawned and born particles) to model the states of the speakers and to estimate their number. We propose to use audio data in the distribution of these particles to improve the visual SMC-PHD filter in terms of estimation accuracy and computational efficiency. The tracking accuracy of the proposed algorithm is further improved by using a modified mean-shift algorithm, and the extra computational complexity introduced by mean-shift is controlled with a sparse sampling technique. For quantitative evaluation, both audio and video sequences are required, together with the calibration information of the cameras and microphone arrays (circular arrays). To this end, the AV16.3 dataset is used to demonstrate the performance of the proposed methods in a variety of scenarios, such as occlusion and rapid movements of the speakers.
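As background, the predict–update–resample loop of a particle filter, together with the two parameters the thesis adapts online (the number of particles and the noise variance), can be sketched as follows. This is a minimal 1-D bootstrap (SIR) filter under illustrative assumptions — a random-walk state model and a Gaussian likelihood — not the thesis's audio-visual formulation:

```python
import numpy as np

def particle_filter(observations, n_particles=500, noise_var=0.5, seed=0):
    """Minimal 1-D bootstrap (SIR) particle filter.

    n_particles and noise_var stand in for the two PF parameters the
    thesis estimates adaptively; here they are simply fixed inputs.
    """
    rng = np.random.default_rng(seed)
    # Initialize particles around the first observation (illustrative prior).
    particles = rng.normal(observations[0], 1.0, n_particles)
    estimates = []
    for z in observations:
        # Predict: propagate each particle through the random-walk model
        # by adding zero-mean Gaussian process noise.
        particles = particles + rng.normal(0.0, np.sqrt(noise_var), n_particles)
        # Update: weight each particle by the Gaussian likelihood of the
        # current observation, then normalize the weights.
        weights = np.exp(-0.5 * (z - particles) ** 2 / noise_var)
        weights /= weights.sum()
        # State estimate: weighted mean of the particle set.
        estimates.append(float(np.dot(weights, particles)))
        # Resample according to the weights to avoid weight degeneracy.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return estimates
```

In the thesis, this scalar state would be replaced by each speaker's image-plane state, and the SMC-PHD extension additionally maintains surviving, spawned and born particle sets so that the number of speakers itself can vary.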