Hello, I want to use Docling for a custom agentic pipeline covering document classification, extraction, and external API tool calling. Does Docling have built-in mechanisms for this task? Alternatively, could I replace Qwen (which I assume is the backend model Docling uses) with a finetuned Nemotron-nano-4b, and if so, how?
Replies: 1 comment
Hi @Romaiz! I'm Dosu and I'm helping the docling team.
Hey @Romaiz, great question! Docling does support integrating custom/finetuned local models, and you have several options for using something like Nemotron-nano-4b.
Quick clarification: Docling's default model is actually Granite-Docling-258M (IBM's lightweight model), not Qwen, though Qwen2.5-VL-3B is available as a preset [1].
Integrating a Custom Model
Option 1: Via OpenAI-compatible API (recommended for local serving)
Run your finetuned Nemotron-nano-4b with vLLM or Ollama, then point Docling to it:
    from docling.datamodel.pipeline_options import VlmPipelineOptions, VlmConvertOptions, ApiVlmEngineOptions, VlmEngineType

    vlm_options = VlmConvertOptions.from_preset(
        "granite_docling",
        engine_options=ApiVlmEngineOptions(
            runtime_type=VlmEngineType.API,
            # OpenAI-compatible chat completions endpoint exposed by vLLM or Ollama
            url="http://localhost:8000/v1/chat/completions",
            # "model" is the name under which the finetuned model is served
            params={"model": "nemotron-nano-4b", "temperature": 0.0, "max_tokens": 8192},
            timeout=90,
        ),
    )
    # enable_remote_services=True lets the pipeline call the HTTP endpoint
    pipeline_options = VlmPipelineOptions(vlm_options=vlm_options, enable_remote_services=True)
Option 2: Direct HuggingFace integration
If your model is on HuggingFace, you can load it directly via Transformers:
    from docling.datamodel.pipeline_options_vlm_model import InlineVlmOptions, InferenceFramework

    vlm_options = InlineVlmOptions(
        repo_id="your-org/nemotron-nano-4b-finetuned",
        inference_framework=InferenceFramework.TRANSFORMERS,
    )
Note: the CLI
Built-in Capabilities
Regarding your agentic pipeline goals:
However, Docling doesn't have a built-in document-level classification system or tool-calling/agentic orchestration framework; you'd need to build that layer yourself around Docling's extraction and conversion capabilities.
Key Consideration
Docling requires
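Since the reply says the classification and tool-calling layer has to be built around Docling, here is a minimal sketch of what that layer could look like. It assumes Docling's `DocumentConverter` is used for conversion to Markdown. Everything else is hypothetical and only for illustration: the keyword rules in `LABEL_KEYWORDS`, the label names, and the tool handlers (`send_to_billing`, `send_to_legal`, `hold_for_review`) stand in for whatever classifier (for example, your finetuned model) and external APIs you actually use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


def convert_to_markdown(source: str) -> str:
    """Convert a file path or URL to Markdown with Docling's default pipeline."""
    # Imported lazily so the rest of the sketch runs without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return result.document.export_to_markdown()


# Hypothetical keyword rules; a real pipeline would likely ask an LLM instead.
LABEL_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "invoice": ("invoice", "amount due", "bill to"),
    "contract": ("agreement", "hereinafter", "party"),
}


def classify(markdown: str) -> str:
    """Return the label whose keywords occur most often, or 'unknown'."""
    text = markdown.lower()
    scores = {
        label: sum(text.count(keyword) for keyword in keywords)
        for label, keywords in LABEL_KEYWORDS.items()
    }
    best = max(scores, key=lambda label: scores[label])
    return best if scores[best] > 0 else "unknown"


@dataclass
class ToolResult:
    tool: str
    payload: dict


# Hypothetical tool handlers; each would wrap a real external API call.
def send_to_billing(markdown: str) -> ToolResult:
    return ToolResult("billing_api", {"chars": len(markdown)})


def send_to_legal(markdown: str) -> ToolResult:
    return ToolResult("legal_api", {"chars": len(markdown)})


def hold_for_review(markdown: str) -> ToolResult:
    return ToolResult("manual_review", {"chars": len(markdown)})


TOOLS: Dict[str, Callable[[str], ToolResult]] = {
    "invoice": send_to_billing,
    "contract": send_to_legal,
}


def route(markdown: str) -> ToolResult:
    """Classify the converted document and dispatch it to the matching tool."""
    handler = TOOLS.get(classify(markdown), hold_for_review)
    return handler(markdown)
```

In use, `route(convert_to_markdown("report.pdf"))` would run the whole chain, where `report.pdf` is a placeholder path. Keeping conversion, classification, and dispatch in separate functions means the keyword classifier can later be swapped for a model call without touching the Docling or tool code.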