Parse local PDFs into agent-ready text, JSON, and screenshots with LiteParse
Run LiteParse locally to extract PDF text, spatial JSON, OCR-backed output, and page screenshots before sending documents into an agent workflow.
npx skills add agentskillexchange/skills --skill parse-local-pdfs-into-agent-ready-text-json-and-screenshots-with-liteparse
Use LiteParse when an agent needs a local, repeatable document-ingestion step before summarization, retrieval indexing, evidence review, or visual page inspection. The operator installs the LiteParse CLI, parses PDFs into text or JSON with bounding boxes, limits work to specific page ranges when needed, and generates page screenshots for cases where layout or visual evidence matters. The scope is limited to document pre-processing and agent handoff: this is not a generic LlamaIndex SDK listing or a cloud document service card.
What this skill actually does
Inputs and prerequisites: Node.js, npm or Homebrew, LiteParse CLI (`lit`), optional OCR server.
Setup notes: Install with `npm i -g @llamaindex/liteparse` or `brew tap run-llama/liteparse && brew install llamaindex-liteparse`, then run `lit parse document.pdf --format json` or `lit screenshot document.pdf -o ./screenshots`.
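The two commands in the setup notes can be wrapped in small shell functions so an agent step calls them the same way every time. This is a minimal sketch, not part of LiteParse: the function names (`require_lit`, `parse_pdf`, `screenshot_pdf`) and the `LIT` override variable are hypothetical, and only the `parse`, `--format json`, `screenshot`, and `-o` arguments come from the setup notes. Check any other flags against the upstream documentation before adding them.

```shell
# Hypothetical override: point LIT at another binary, or at `echo` for a dry run.
# Defaults to the `lit` CLI installed in the setup notes.
LIT="${LIT:-lit}"

# Fail early with a readable message if the CLI is not on PATH.
require_lit() {
  command -v "$LIT" >/dev/null 2>&1 || {
    echo "LiteParse CLI '$LIT' not found; install it first" >&2
    return 1
  }
}

# $1: path to the PDF. Runs the same JSON parse command as the setup notes.
parse_pdf() {
  "$LIT" parse "$1" --format json
}

# $1: path to the PDF, $2: directory that should receive the page screenshots.
screenshot_pdf() {
  "$LIT" screenshot "$1" -o "$2"
}
```

With the functions loaded, `require_lit && parse_pdf document.pdf` runs the parse step, and `screenshot_pdf document.pdf ./screenshots` produces the page images. Setting `LIT=echo` prints the arguments instead of invoking the CLI, which gives a reviewable record of exactly what would be run.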
Source and verification boundary: use https://developers.llamaindex.ai/liteparse/ as the canonical reference before running the workflow; keep commands, API calls, CLI usage, and generated outputs reviewable against that upstream source.
Framework fit: publish this as a Multi-Framework workflow only when the operator can invoke the documented toolchain directly, rather than treating the upstream project as a generic product listing.