2026-05-22

[Academic Highlight: Top-Tier Journal Paper] SMART: Structured-Text Multimodal Agricultural Retrieval-Augmented Transformer for Plant Disease Diagnosis

Intelligent Service: Large-scale Agricultural AI Models
[Department of Civil Engineering / Ming-Der Yang (楊明德) / Tenured Distinguished Professor]

Paper title	SMART: Structured multimodal agricultural retrieval-augmented transformer for plant disease diagnosis
Journal	Computers and Electronics in Agriculture (on the indicator journal list)
Year, volume, pages	2026, 250, no. 111882
Authors	Yuan-Chia Chan, Yao-Chung Fan (范耀中), Hung-Chung Li (李宏中), Ming-Der Yang (楊明德)*
DOI	10.1016/j.compag.2026.111882
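Abstract (translated from Chinese)	This study proposes SMART (Structured Multimodal Agricultural Retrieval-augmented Transformer), a structured multimodal retrieval-augmented Transformer architecture for plant disease diagnosis. Conventional image classification models for plant disease rely mostly on a single visual signal and are easily affected by background noise in field images, similar-looking symptoms, long-tailed class distributions, and incomplete user input, which limits diagnostic stability and interpretability in practice. To address these problems, the study integrates disease images, structured semantic annotations, symptom descriptions, and similar-case retrieval into a multimodal diagnostic pipeline that combines visual features with textual semantics. SMART strengthens symptom representations through structured disease labels and semantic anchors, and introduces a retrieval-augmentation mechanism that lets the model consult similar images and related agricultural knowledge at inference time, improving its ability to distinguish fine-grained differences between diseases. Compared with pure image classification or general text-generation approaches, the study emphasizes the integration of visual recognition, semantic alignment, case retrieval, and interpretable diagnosis, which gives plant disease judgments a more contextualized basis. The results show that SMART effectively improves multimodal plant disease diagnosis performance and has potential for smart agriculture, plant health decision support, and field disease management.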
English abstract	Real-world plant disease diagnosis requires robustness to long-tailed class distributions and variable input quality, yet prevailing multimodal approaches do not explicitly address these challenges. This study presents SMART, a framework that integrates structured linguistic guidance with retrieval-augmented inference. Evaluation of 37,586 images across 48 disease categories yields three findings. First, structured captions outperform LLM-generated narratives, with the vision-only baseline exceeding all LLM approaches on the 20-class pilot, demonstrating that semantic precision, not descriptive complexity, drives effective vision-language alignment. Second, we distinguish between training-time and inference-time challenges: long-tailed degradation is mitigated through asymmetric loss design, while input quality variability is addressed through a dual-pathway mechanism in which semantic anchoring constrains predictions under related errors, and the retrieval-based safety net ensures bounded degradation under noisy inputs. Third, frequency-stratified analysis reveals a division of labor: semantic anchoring benefits high-frequency classes, while the retrieval-based safety net preferentially compensates rare classes. These complementary mechanisms support a human-in-the-loop workflow that transforms single-pass diagnosis into collaborative verification. SMART achieves F1 = 0.9767, surpassing fine-tuned CNNs (best F1 = 0.9118) and 22 multimodal foundation models (best F1 = 0.8904).
The primary contributions of this work are: (i) a systematic benchmark of caption form (structured versus LLM-generated) as an experimental variable in agricultural vision-language alignment; (ii) a factorial decomposition that quantitatively isolates the independent and complementary roles of semantic anchoring and retrieval-based compensation across frequency strata; and (iii) a statistically validated, human-in-the-loop diagnostic framework applicable to large-scale, imbalanced disease taxonomies.
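Relevance of the published results to the AI project's research themes	This study proposes the SMART plant disease diagnosis system: through structured multimodal learning and similar-case retrieval, it strengthens the model's semantic alignment between plant disease images and textual descriptions, and improves the interpretability and practical value of the diagnostic results.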
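
The abstracts describe a retrieval-based safety net that consults similar cases at inference time. The sketch below is only an illustration of that general idea, not the paper's implementation: the function names (`cosine`, `retrieve_labels`, `diagnose`), the confidence threshold of 0.6, the choice of k = 3, the majority vote, and the toy two-dimensional embeddings are all assumptions made for this example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_labels(query, gallery, k=3):
    """Return the labels of the k gallery cases most similar to the query embedding."""
    ranked = sorted(gallery, key=lambda case: cosine(query, case[0]), reverse=True)
    return [label for _, label in ranked[:k]]


def diagnose(class_probs, query, gallery, threshold=0.6, k=3):
    """Trust the classifier when it is confident; otherwise fall back to a
    majority vote over retrieved similar cases (hypothetical fallback rule)."""
    top_label = max(class_probs, key=class_probs.get)
    if class_probs[top_label] >= threshold:
        return top_label, "classifier"
    votes = Counter(retrieve_labels(query, gallery, k))
    return votes.most_common(1)[0][0], "retrieval"


# Toy gallery of (embedding, label) pairs; real embeddings would come from a vision model.
gallery = [
    ([1.0, 0.0], "rust"),
    ([0.9, 0.1], "rust"),
    ([0.0, 1.0], "powdery_mildew"),
    ([0.1, 0.9], "powdery_mildew"),
]

confident = diagnose({"rust": 0.9, "powdery_mildew": 0.1}, [0.2, 0.8], gallery)
uncertain = diagnose({"rust": 0.55, "powdery_mildew": 0.45}, [0.2, 0.8], gallery)
print(confident)  # ('rust', 'classifier')
print(uncertain)  # ('powdery_mildew', 'retrieval')
```

In the second call the classifier's top probability falls below the assumed threshold, so the answer comes from the nearest stored cases instead. Returning the source of the decision alongside the label is one way to let a human reviewer see which pathway produced the diagnosis.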

Posted on 2026-05-17
