Request Correction

Use this form to request corrections to the paper metadata. Select the fields that need correction and provide the correct information.

Correction Guidelines

Click the edit button next to a field to report a correction.
Fill in the suggested correction value for each field you want to correct.
Provide your name and email so we can contact you if needed.

Paper Information

lrec2026-ws-lt4hala-34

Multi-Task Learning Trade-offs in Vision–Language Models for Ancient Chinese OCR: An Empirical Analysis of Parameter-Efficient Adaptation

Paper Fields

Title

Multi-Task Learning Trade-offs in Vision–Language Models for Ancient Chinese OCR: An Empirical Analysis of Parameter-Efficient Adaptation

Abstract

This study evaluates the efficacy of multi-task adaptation in large-scale vision–language models (VLMs), specifically Qwen2.5-VL, for the simultaneous recognition and structural parsing of historical Chinese documents within the EvaHan2026 benchmark. Utilizing a parameter-efficient fine-tuning (PEFT) strategy via LoRA (rank 64), our framework demonstrates superior performance in layout analysis (Task B), achieving an mAP of 0.2802—a 39.6% improvement over the competitive baseline—and a Macro F1 of 0.3609. Conversely, a pronounced performance-utility trade-off is observed in printed OCR (Task A), where the character error rate (CER) escalates from 0.0618 to 0.1100 (+78% relative). This divergence highlights a critical catastrophic forgetting effect induced by gradient interference during multi-task optimization. While handwritten OCR (Task C) remains relatively stable (CER of 0.0963), our findings suggest that although unified VLM architectures excel at high-level structural detection, they encounter significant parameter capacity bottlenecks when concurrently optimizing fine-grained character-level transcription. This analysis highlights the optimization challenges when balancing spatial detection and character recognition in a unified framework.
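When reviewing this field for corrections, the percentages quoted in the abstract can be cross-checked against the raw scores. The following is a minimal sketch, not part of the paper: the helper name `relative_change` is hypothetical, and the baseline mAP is not stated in the abstract, so the value computed here is only what the quoted 39.6% improvement would imply.

```python
def relative_change(old: float, new: float) -> float:
    """Return the relative change (new - old) / old as a fraction."""
    if old == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (new - old) / old


# Task A CER values as quoted in the abstract.
cer_increase = relative_change(0.0618, 0.1100)
print(f"Task A CER change: {cer_increase:+.1%}")

# Assumption: the abstract does not give the baseline mAP for Task B.
# If 0.2802 is a 39.6% relative improvement, the implied baseline is:
implied_baseline_map = 0.2802 / (1 + 0.396)
print(f"Implied Task B baseline mAP: {implied_baseline_map:.4f}")
```

If the implied baseline differs from the baseline reported in the paper itself, that may indicate a figure in the abstract worth flagging for correction.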

Authors

Expand an author to correct their information. Use the remove button to request author removal, or add a new author.

PDF Attachment

You may attach a PDF as a corrected version of the paper. Max file size: 10MB. Only PDF files are accepted.

Your Information

Name

Comment

Author Declaration *

I declare that I have notified all co-authors of the proposed corrections and obtained their consent, and that all modifications adhere to research ethics standards and the LREC correction policy.
