Retina-inspired random forest for semantic image labelling

Lak, Kameran Majeed Mohammed
2015/2016

Abstract

One of the most challenging problems in the computer vision community is semantic image labelling, which requires assigning a semantic class to each pixel in an image. In the literature, this problem has been effectively addressed with Random Forest, a popular classification algorithm that delivers a prediction by averaging the outcomes of an ensemble of random decision trees. In this thesis we propose a novel algorithm based on the Random Forest framework. Our main contribution is a new family of decision functions (also known as split functions), which are the building blocks of the decision trees in the random forest. Our decision functions resemble the way the human retina works: they mimic the increase in receptive field size towards the periphery of the retina. This yields better visual acuity close to the center of view (the fovea), which gradually degrades as we move away from the center.

The solution we propose improves the quality of semantic image labelling while preserving the low computational cost of classical Random Forest approaches in both the training and inference phases. We conducted quantitative experiments on two standard datasets, namely the eTRIMS Image Database and the MSRCv2 Database, and the results we obtained are extremely encouraging.
2015-03-12
Files in this item:
File: 835524-1165404.pdf
Access: open access
Type: Other attached material
Size: 2.88 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14247/21661