Transfer learning with generative models for object detection on limited datasets

Year: 2024

Authors: Paiano M., Martina S., Giannelli C., Caruso F.

Authors Affiliation: Univ Florence, Dept Math & Comp Sci, Viale Morgagni 67-A, I-50134 Florence, Italy; Univ Florence, Dept Phys & Astron, Via Sansone 1, I-50019 Sesto Fiorentino, Italy; Univ Florence, European Lab Nonlinear Spect LENS, Via Nello Carrara 1, I-50019 Sesto Fiorentino, Italy; Consiglio Nazl Ric CNR INO, Ist Nazl Ott, I-50019 Sesto Fiorentino, Italy.

Abstract: The availability of data is limited in some fields, especially for object detection tasks, where it is necessary to have correctly labeled bounding boxes around each object. A notable example of such data scarcity is found in the domain of marine biology, where it is useful to develop methods to automatically detect submarine species for environmental monitoring. To address this data limitation, state-of-the-art machine learning strategies employ two main approaches. The first involves pretraining models on existing datasets before generalizing to the specific domain of interest. The second is to create synthetic datasets specifically tailored to the target domain using methods like copy-paste techniques or ad hoc simulators. The first strategy often faces a significant domain shift, while the second demands custom solutions crafted for the specific task. In response to these challenges, we propose a transfer learning framework that is valid for a generic scenario. In this framework, generated images help to improve the performance of an object detector in a regime with few real data. This is achieved through a diffusion-based generative model that was pretrained on large generic datasets. In contrast to the state of the art, we find that it is not necessary to fine-tune the generative model on the specific domain of interest. We believe that this is an important advance because it mitigates the labor-intensive task of manually labeling the images in object detection tasks. We validate our approach on fish in an underwater environment and on the more common domain of cars in an urban setting. Our method achieves detection performance comparable to models trained on thousands of images, using only a few hundred input images. Our results pave the way for new generative AI-based protocols for machine learning applications in various domains, ranging from geophysics to biology and medicine.

Journal/Review: MACHINE LEARNING-SCIENCE AND TECHNOLOGY

Volume: 5 (3)  Pages from: 35041-1 to: 35041-18

More Information: M P acknowledges the contribution of the MAREA project funded by the Tuscany Region. S M acknowledges financial support from PNRR MUR Project PE0000023-NQSTI. C G acknowledges the contribution of the National Recovery and Resilience Plan, Mission 4 Component 2 – Investment 1.4 – CN_00000013 ‘CENTRO NAZIONALE HPC, BIG DATA E QUANTUM COMPUTING’, spoke 6. C G and M P are members of the INdAM research group GNCS. The INdAM-GNCS support is gratefully acknowledged. F C acknowledges financial support by the European Commission’s Horizon Europe Framework Programme under the Research and Innovation Action GA n. 101070546-MUQUABIS, by the European Union’s Horizon 2020 research and innovation programme under FET-OPEN GA n. 828946-PATHOS, by the European Defence Agency under the project Q-LAMPS Contract No. B PRJ-RT-989, and by the MUR Progetti di Ricerca di Rilevante Interesse Nazionale (PRIN) Bando 2022 – project n. 20227HSE83 (ThAI-MIA) funded by the European Union-Next Generation EU. DAS: The data that support the findings of this study are openly available at the following URL/DOI: https://doi.org/10.5281/zenodo.13121950.
KeyWords: object detection; transfer learning; generative AI; diffusion models; deep learning
DOI: 10.1088/2632-2153/ad65b5
