Digital Library

Title:      TOWARDS USING VISION LANGUAGE MODELS FOR URBAN TREE ANALYSIS
Author(s):      Danilo Jodas, Giovani Candido, João Manesco, Gabriel Garcia, Luiz Marques Junior and João Papa
ISBN:      978-989-8704-71-9
Editors:      Paula Miranda and Pedro Isaías
Year:      2025
Edition:      Single
Keywords:      Generative Models, Large Language Models, Urban Tree, Bode, GemBode
Type:      Full Paper
First Page:      123
Last Page:      130
Language:      English
Paper Abstract:      In an era of sustainable cities and carbon dioxide emission reduction, the efficient and rapid collection of data on urban trees has drawn the attention of municipalities and forestry managers. The standard approach relies on fieldwork campaigns to obtain dendrometric information, including tree height and species, which supports an initial assessment and subsequent cataloging. However, field analysis is labor-intensive and time-consuming because trees are widely dispersed across urban areas. There is therefore a significant need for computational strategies that enable quick assessment and recording of key urban tree information. This paper introduces an approach based on a vision language model that summarizes the main aspects of a tree from a single street-view image. In addition, a new question-and-answer dataset containing summaries of tree information was created using a 7B-parameter Portuguese language model. The dataset is then used, together with images of urban trees, to fine-tune the PaliGemma vision language model. The results show that the proposed approach can accurately summarize key tree attributes, such as height and species, using language models with significantly fewer parameters.
   
