Retina-inspired random forest for semantic image labelling
Lak, Kameran Majeed Mohammed
2015/2016
Abstract
One of the most challenging problems in the computer vision community is semantic image labelling, which requires assigning a semantic class to each pixel of an image. In the literature, this problem has been effectively addressed with Random Forest, a popular classification algorithm that delivers a prediction by averaging the outcomes of an ensemble of random decision trees. In this thesis we propose a novel algorithm based on the Random Forest framework. Our main contribution is the introduction of a new family of decision functions (also known as split functions), which are the building blocks of the decision trees in the random forest. Our decision functions resemble the way the human retina works, mimicking the increase in receptive field size towards the periphery of the retina. This results in better visual acuity near the center of view (the fovea), which gradually degrades with distance from the center.

The solution we propose improves the quality of semantic image labelling while preserving the low computational cost of classical Random Forest approaches in both the training and inference phases. We conducted quantitative experiments on two standard datasets, the eTRIMS Image Database and the MSRCv2 Database, and the results we obtained are extremely encouraging.

File | Size | Format |
---|---|---|
835524-1165404.pdf (open access; type: other attached material) | 2.88 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14247/21661