LandingAI has released a new set of open-source agent skills for its Agentic Document Extraction (ADE) platform, enabling coding assistants to autonomously process complex documents. According to the official GitHub repository updated in June 2026, the vision-first models achieve 99.16% accuracy on the DocVQA benchmark and can parse over 20 file formats without requiring templates.
Key Points:
- The skills teach AI agents like Claude and Cursor to write Python scripts for document parsing, extraction, and classification.
- ADE's vision-first models understand document layout, not just text, to handle complex tables and scanned forms.
- Every extracted value is traceable with bounding boxes, page coordinates, and confidence scores for full auditability.
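
To make the traceability point concrete, here is a minimal sketch of how an audit step might use that metadata. The `GroundedValue` record, its field names, the normalized box convention, and the 0.9 threshold are all assumptions made for illustration, not the actual ADE response format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroundedValue:
    """One extracted field plus the evidence needed to audit it (hypothetical shape)."""
    field: str
    value: str
    page: int                               # zero-based page index (assumed)
    box: tuple[float, float, float, float]  # (left, top, right, bottom), assumed normalized to 0..1
    confidence: float                       # assumed to lie in 0..1


def needs_review(values, threshold=0.9):
    """Return the values whose confidence falls below the threshold."""
    return [v for v in values if v.confidence < threshold]


# Made-up sample values, used only to exercise the filter.
extracted = [
    GroundedValue("invoice_number", "INV-0042", 0, (0.62, 0.08, 0.88, 0.11), 0.99),
    GroundedValue("total", "1,284.50", 1, (0.70, 0.81, 0.90, 0.84), 0.72),
]

for item in needs_review(extracted):
    print(f"review {item.field!r} on page {item.page} at {item.box}")
```

Because each value carries its page and box, a reviewer can be pointed at the exact region to check instead of rereading the whole document.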
With these pre-built skills, LandingAI aims to address the so-called "document problem," which an Accounting Today analysis identified as a primary blocker for enterprise AI adoption. For readers new to the area, an introductory guide to building AI agents covers the foundational concepts.
What Are the Core Document Skills?
The core skills focus on the fundamental operations of parsing, extracting, and classifying document content. They give an agent the ability to convert entire documents into structured Markdown or hierarchical JSON, pull specific fields using Pydantic models, and split large batches of documents by type, such as separating invoices from receipts.

These skills leverage a proprietary vision-first model that interprets a document's visual layout. This allows it to succeed where traditional text-based tools fail. The system can process large files asynchronously, handling documents up to 1 GB or 6,000 pages. For precise verification, it also provides visual grounding, tying every piece of extracted data back to its exact location on the original page.
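
The schema-driven field extraction described in this section can be illustrated with a small sketch. The skills use Pydantic models for the schema. To stay dependency-free, the example below uses a standard-library dataclass as a stand-in, and the `InvoiceFields` schema, the converters, and the raw payload are all hypothetical.

```python
from dataclasses import dataclass, fields
from datetime import date
from decimal import Decimal


@dataclass
class InvoiceFields:
    """Hypothetical target schema; a Pydantic model would play this role in practice."""
    invoice_number: str
    issue_date: date
    total: Decimal


# Converters from raw extracted strings to each field's declared type.
CONVERTERS = {
    str: str.strip,
    date: date.fromisoformat,
    Decimal: lambda raw: Decimal(raw.replace(",", "")),
}


def to_record(raw: dict) -> InvoiceFields:
    """Coerce a raw field dict into the schema, failing loudly on missing fields."""
    kwargs = {}
    for f in fields(InvoiceFields):
        if f.name not in raw:
            raise ValueError(f"missing field: {f.name}")
        kwargs[f.name] = CONVERTERS[f.type](raw[f.name])
    return InvoiceFields(**kwargs)


# Made-up raw output, standing in for whatever an extraction call returns.
raw = {"invoice_number": " INV-0042 ", "issue_date": "2026-06-01", "total": "1,284.50"}
record = to_record(raw)
print(record)
```

Declaring the target fields up front means a missing or malformed value surfaces as an error at extraction time instead of propagating silently into downstream systems.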
From Extraction to Automated Workflows
Beyond discrete tasks, the ADE skills provide patterns for composing end-to-end production workflows. This allows developers to instruct agents to build sophisticated pipelines, such as batch processing hundreds of documents in parallel or creating multi-step, classify-then-extract processes for handling mixed document types in a single folder.

A key application is preparing data for Retrieval-Augmented Generation (RAG) systems. The skills can automate semantic chunking of documents and ingest the resulting data into vector databases like ChromaDB or FAISS. The agent can also be tasked with exporting structured results directly into DataFrames, CSV files, or a Snowflake data warehouse.
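
As a rough sketch of the classify-then-extract pattern with parallel batch processing and CSV export, consider the following. The `classify` and `extract` functions are hypothetical stubs that work on plain text. A real pipeline would call the corresponding ADE skills instead, and the field names are invented for the example.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def classify(text: str) -> str:
    """Hypothetical classifier stub; a real pipeline would call a classification skill."""
    return "invoice" if "invoice" in text.lower() else "receipt"


def extract(doc_type: str, text: str) -> dict:
    """Hypothetical extractor stub that picks a per-type list of wanted fields."""
    wanted = {"invoice": ["vendor", "total"], "receipt": ["merchant", "total"]}[doc_type]
    pairs = dict(line.split(": ", 1) for line in text.splitlines() if ": " in line)
    return {key: pairs.get(key, "") for key in wanted}


def process(item: tuple[str, str]) -> dict:
    """Classify one document, then extract with the schema for its type."""
    name, text = item
    doc_type = classify(text)
    return {"file": name, "type": doc_type, **extract(doc_type, text)}


def run_batch(docs: dict[str, str], workers: int = 4) -> list[dict]:
    """Process documents in parallel; map() preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, docs.items()))


def to_csv(rows: list[dict]) -> str:
    """Write rows with differing keys into one CSV, leaving absent columns blank."""
    columns = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up documents standing in for a mixed folder of files.
docs = {
    "a.txt": "Invoice\nvendor: Acme\ntotal: 120.00",
    "b.txt": "Thanks for shopping\nmerchant: Corner Shop\ntotal: 8.50",
}
rows = run_batch(docs)
print(to_csv(rows))
```

A thread pool suits this shape of work because each document is likely to spend most of its time waiting on a remote call. Classifying before extracting lets each document type use its own schema while still landing in one shared table.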
How Does ADE Compare to Traditional OCR?
LandingAI's ADE platform represents a significant evolution from traditional Optical Character Recognition (OCR) tools. While OCR focuses on digitizing text from an image, ADE is an agentic, vision-first system. It interprets the document's structure and context, enabling it to understand complex layouts autonomously without pre-defined templates.

This distinction is critical for accuracy and automation. An agent using ADE can adapt its extraction plan based on the unique layout of each document it encounters.

The ecosystem of agent skills is growing, and so are the security risks. One report detailed how a malicious skill reached over 26,000 users, highlighting the importance of using vetted skills from trusted sources like LandingAI.
| Feature | Traditional OCR | LandingAI ADE |
|---|---|---|
| Core Function | Converts image text to string data | Parses layout, structure, and text |
| Data Output | Plain text | Structured JSON/Markdown with coordinates |
| Adaptability | Often requires templates for forms | Agentic; adapts to each document autonomously |
| Traceability | None or limited | Full traceability with bounding boxes & scores |
| File Support | Primarily images (PNG, JPEG) | Over 20 formats including PDF, DOCX, PPTX |







