Inflated 3D ConvNet context analysis for violence detection

Freire-Obregon, D.; Barra, P.; Castrillon-Santana, M.; De Marsico, M.

doi:10.1007/s00138-021-01264-9

According to the Wall Street Journal, one billion surveillance cameras will be deployed around the world by 2021. Humans can hardly manage this amount of information. Using an Inflated 3D ConvNet as backbone, this paper introduces a novel automatic violence detection approach that outperforms existing state-of-the-art proposals. Most of those proposals include a pre-processing step that focuses only on regions of interest in the scene, i.e., those actually containing a human subject. In this regard, this paper also reports the results of an extensive analysis of whether and how context affects the performance of the adopted classifier. The experiments show that context-free footage yields a substantial deterioration of classifier performance (2% to 5%) on publicly available datasets. However, they also demonstrate that performance stabilizes in context-free settings, regardless of the level of context restriction applied. Finally, a cross-dataset experiment investigates how well results obtained in a single-collection experiment (same dataset used for training and testing) generalize to cross-collection settings (different datasets used for training and testing).

Inflated 3D ConvNet context analysis for violence detection / Freire-Obregon, D., Barra, P., Castrillon-Santana, M., De Marsico, M. - In: MACHINE VISION AND APPLICATIONS. - ISSN 0932-8092. - 33:1(2022). [10.1007/s00138-021-01264-9]

Freire-Obregon D.; Barra P. (Member of the Collaboration Group); Castrillon-Santana M.; De Marsico M. (Member of the Collaboration Group)

Year of publication: 2022

Keywords: Context analysis; I3D model; People tracking; Transfer learning; Violence detection

Type: 01 Journal publication::01a Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1679402
