| Download | View final version: Dataset selection is critical for effective pre-training of fish detection models for underwater video (PDF, 2.1 MiB) |
|---|
| DOI | https://doi.org/10.1093/icesjms/fsaf039 |
|---|
| Author | Ayyagari, Devi (ORCID: https://orcid.org/0000-0001-9702-5316); Alavi, Talukder Wasi; Singh, Navlika; Barnes, Joshua [1] (ORCID: https://orcid.org/0000-0002-3371-1082); Morris, Corey; Whidden, Christopher |
|---|
| Editor | Browman, Howard |
|---|
| Affiliation | National Research Council of Canada. Ocean, Coastal and River Engineering |
|---|
| Funder | National Research Council of Canada; Natural Sciences and Engineering Research Council of Canada |
|---|
| Format | Text, Article |
|---|
| Subject | machine learning; transfer learning; public marine image datasets; fisheries; stock assessment; autonomous monitoring; remote video monitoring |
|---|
| Abstract | Underwater digital monitoring systems using acoustics and video have the potential to transform marine monitoring and fisheries stock assessment but generate significant amounts of data, shifting the burden from data collection to data analysis. Machine learning (ML) is a potential solution but remains underutilized for marine monitoring, partly due to the time and cost of annotating new training datasets for each marine class and habitat. This raises the pivotal question: “How can we train marine machine learning models with limited annotated data?” We catalog publicly available marine datasets annotated for detection and classification, investigating the feasibility of leveraging a fish detector trained on three existing datasets to detect fish in a new small underwater marine dataset. We compare the accuracy and training time of pre-trained models to those without pre-training. We find pre-training with OzFish yields faster convergence and comparable performance with smaller training datasets. However, pre-training with some datasets reduced performance and increased training time. We expect our catalog of publicly available marine datasets will assist in the selection of pre-training datasets. Our results underscore the need for diverse, large, publicly available marine datasets with varied habitat and class distributions to develop and integrate ML models into automated systems for monitoring marine ecosystems. |
|---|
| Publication date | 2025-04-04 |
|---|
| Publisher | Oxford Academic |
|---|
| Language | English |
|---|
| Peer reviewed | Yes |
|---|
| Record identifier | 06d1d66f-79b7-4708-bec0-e7c2752e277b |
|---|
| Record created | 2025-05-09 |
|---|
| Record modified | 2025-11-03 |
|---|