Feature extraction and clustering techniques for digital image forensics
[Thesis]
Alfraih, Areej S.
Ho, Anthony T. S.
University of Surrey
2015
Thesis (Ph.D.)
This thesis proposes an adaptive algorithm that applies feature extraction and clustering techniques to cloning detection and localization in digital images. Multiple contributions are made to test the performance of different feature detectors for forensic use. The first contribution improves a previously published algorithm by Wang et al. by localizing tampered regions using the grey-level co-occurrence matrix (GLCM) to extract texture features from a chromatic component of the image (Cb or Cr). The main trade-off is that detection accuracy diminishes as the region size decreases. The second contribution extracts Maximally Stable Extremal Regions (MSER) features for cloning detection, followed by k-means clustering for cloning localization. For comparison, we implement the same approach using Speeded Up Robust Features (SURF) and Scale-Invariant Feature Transform (SIFT). Experimental results show that we can detect and localize cloning in tampered images with an accuracy reaching 97% using MSER features. The usability and efficacy of our approach are verified by comparison with recent state-of-the-art approaches. For the third contribution, we propose a flexible methodology for detecting cloning in images, based on the use of feature detectors. We determine whether a particular match is the result of a cloning event by clustering the matches with k-means and using a Support Vector Machine (SVM) to classify the clusters. This descriptor-agnostic approach allows us to combine the results of multiple feature descriptors, increasing the potential number of keypoints in the cloned region. Results using MSER, SURF and SIFT outperform the state of the art: when different descriptors are combined, the highest true positive rate reaches approximately 99.60% and the false positive rate is 1.6%.
A statistical filtering step, based on computing the median value of the dissimilarity matrix, is also proposed. Moreover, our algorithm uses an adaptive technique to select the optimal k value for each image independently, allowing our method to detect multiple cloned regions. Finally, we propose an adaptive technique that chooses feature detectors based on the type of image being tested. Some detectors are robust at detecting features in textured images, while others are robust at detecting features in smooth images. Combining the detectors makes them complementary to each other and can generate optimal results. The highest area under the ROC curve reaches approximately 98.87%. We also test the performance of agglomerative hierarchical clustering for cloning localization. Hierarchical and k-means clustering perform similarly for cloning localization: the True Positive Rate (TPR) for match-level localization reaches approximately 97.59% for k-means and 96.43% for hierarchical clustering. The robustness of our technique is analyzed against additive white Gaussian noise and JPEG compression. Our technique remains reliable even at a low signal-to-noise ratio (SNR = 20 dB) or a low JPEG compression quality factor (QF = 50).
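As an illustration of the statistical filtering step mentioned in the abstract, the following is a minimal sketch and not the thesis's actual implementation. The abstract states only that the filter is based on the median of the dissimilarity matrix. The Euclidean distance, the rule "keep pairs below a fraction of the median", the function names and the `ratio` parameter are all assumptions made for this example.

```python
import math
from statistics import median


def dissimilarity_matrix(descriptors):
    """Pairwise Euclidean distances between descriptor vectors (assumed metric)."""
    n = len(descriptors)
    return [[math.dist(descriptors[i], descriptors[j]) for j in range(n)]
            for i in range(n)]


def median_filtered_matches(descriptors, ratio=0.5):
    """Return index pairs (i, j), i < j, whose distance is below ratio * median.

    `ratio` is a hypothetical tuning parameter, not a value from the thesis.
    """
    d = dissimilarity_matrix(descriptors)
    n = len(descriptors)
    # Use only the upper triangle so each pair counts once and the zero
    # diagonal does not bias the median.
    upper = [d[i][j] for i in range(n) for j in range(i + 1, n)]
    if not upper:
        return []
    threshold = ratio * median(upper)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if d[i][j] < threshold]


# Toy 2-D "descriptors": entries 0 and 3 are near-duplicates, as the
# descriptors of a cloned keypoint pair might be.
toy = [(0.0, 0.0), (5.0, 1.0), (9.0, 7.0), (0.1, 0.0), (3.0, 8.0)]
matches = median_filtered_matches(toy)
```

In this toy case only the near-duplicate pair falls well below the median distance, so it is the only candidate match retained. In a pipeline of the kind the abstract describes, such retained matches would then be passed to the clustering stage.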