The Cheapest OCR API for PDFs, Images & Text

Affordable LLM/VLM-powered document extraction. Extract structured data from PDFs, images, and text files with nested schemas. Get the best price-to-performance ratio for OCR. Define your schema, get nested JSON back instantly.

3 File Types
Unlimited Nested Levels
1,000 Max PDF Pages

Cheapest LLM/VLM OCR with Premium Features

Affordable pricing meets enterprise-grade document extraction

💰

Most Affordable Pricing

Get the best price-to-performance ratio for OCR. Pay only for what you use with transparent, competitive pricing. No hidden fees or minimum commitments.

📄

Multi-Format Support

Extract from PDFs (up to 1,000 pages), images (JPEG, PNG, WebP, etc.), and text files. One API for all your document processing needs.

🏗️

Nested Schema Support

Define complex nested objects and arrays with unlimited depth. Extract structured data matching your exact data model - no post-processing needed.

🚀

Lightning Fast Processing

Advanced AI models for rapid document processing. Get results in seconds, not minutes. Optimized for speed and accuracy.

🔒

Enterprise Security

Your data stays protected with enterprise-grade security, geo-blocking, and prompt injection protection. Files processed in-memory only.

⚙️

Simple REST API

Easy integration with any language. Two simple endpoints - upload files or provide URLs. Start extracting data in minutes.

How It Works

Three simple steps to extract nested structured data

1

Define Nested Schema

Create a JSON schema with objects, arrays, and nested structures. Supported types are string, number, boolean, array, and object, with unlimited nesting.

2

Send Your Document

Upload PDFs, images, or text files directly, or provide a URL. Multi-page PDFs of up to 1,000 pages and files of up to 10MB are supported.

3

Get Nested JSON

Receive perfectly structured nested JSON matching your schema. Arrays of objects, deeply nested properties - exactly as you defined.
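
The first step can be sketched in Python. This is a minimal illustration, not an official client: the `field` and `schema` helpers are hypothetical names introduced here, and they simply assemble the key/description/type node shape used in the request examples on this page.

```python
import json
from typing import Any, Optional


def field(key: str, description: str, type_: str,
          properties: Optional[dict] = None,
          items: Optional[dict] = None) -> dict:
    """Build one schema node (hypothetical helper, not part of the API)."""
    node: dict[str, Any] = {"key": key, "description": description, "type": type_}
    if type_ == "object":
        # Objects list their children under "properties", indexed by child key.
        node["properties"] = properties or {}
    elif type_ == "array":
        if items is None:
            raise ValueError("array fields need an 'items' node")
        # Arrays describe the shape of one element under "items".
        node["items"] = items
    return node


def schema(*fields: dict) -> dict:
    """Index a group of nodes by their key."""
    return {f["key"]: f for f in fields}


invoice_schema = schema(
    field("invoice_number", "Invoice number", "string"),
    field("customer", "Customer info", "object", properties=schema(
        field("name", "Customer name", "string"),
        field("email", "Email address", "string"),
    )),
    field("items", "Line items", "array", items=field(
        "item", "Item details", "object", properties=schema(
            field("name", "Item name", "string"),
            field("price", "Item price", "number"),
        ))),
)

print(json.dumps(invoice_schema, indent=2))
```

The printed JSON can be used as the `returnSchema` value in a request.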

📚

See Real Examples in Action

Explore comprehensive tutorials with live examples using real recipe images and PDF documents. Learn how to extract structured data from invoices, receipts, recipes, and more with step-by-step guides.

🍳 Recipe extraction with nested arrays
📄 Multi-page PDF processing
🧾 Invoice & receipt digitization
💡 Best practices & tips

Simple API Endpoints with Nested Schema Support

Extract complex nested data structures from PDFs, images, and text files. Both endpoints use the same schema format and return identical JSON structures.

POST

/json

Extract nested data from PDFs, images, or text files via URL

Nested Schema Example bash
curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/invoice.pdf",
    "returnSchema": {
      "invoice_number": {
        "key": "invoice_number",
        "description": "Invoice number",
        "type": "string"
      },
      "customer": {
        "key": "customer",
        "description": "Customer info",
        "type": "object",
        "properties": {
          "name": {
            "key": "name",
            "description": "Customer name",
            "type": "string"
          },
          "email": {
            "key": "email",
            "description": "Email address",
            "type": "string"
          }
        }
      },
      "items": {
        "key": "items",
        "description": "Line items",
        "type": "array",
        "items": {
          "key": "item",
          "description": "Item details",
          "type": "object",
          "properties": {
            "name": {
              "key": "name",
              "description": "Item name",
              "type": "string"
            },
            "price": {
              "key": "price",
              "description": "Item price",
              "type": "number"
            }
          }
        }
      }
    }
  }'
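
For readers working in Python, the same request might be assembled with the standard library as sketched below. `BASE_URL` is a hypothetical stand-in for the `rapidAPIURL` placeholder in the curl example, and `build_json_request` is a name introduced here for illustration. The request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical values: replace with the real host and your own key.
BASE_URL = "https://rapidapi-host.example"
API_KEY = "your-api-key"


def build_json_request(document_url: str, return_schema: dict) -> urllib.request.Request:
    """Assemble a POST to /v1/json with the body shape shown in the curl example."""
    body = json.dumps({"url": document_url, "returnSchema": return_schema})
    return urllib.request.Request(
        f"{BASE_URL}/v1/json",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-RapidAPI-Key": API_KEY,
        },
    )


return_schema = {
    "invoice_number": {
        "key": "invoice_number",
        "description": "Invoice number",
        "type": "string",
    }
}
req = build_json_request("https://example.com/invoice.pdf", return_schema)

# To actually send it (requires a real host and key):
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
```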
POST

/form-data

Upload PDFs, images, or text files directly with nested schemas

File Upload Example bash
curl -X POST rapidAPIURL/v1/form-data \
  -H "X-RapidAPI-Key: your-api-key" \
  -F "[email protected]" \
  -F 'returnSchema={
    "invoice_number": {
      "key": "invoice_number",
      "description": "Invoice number",
      "type": "string"
    },
    "customer": {
      "key": "customer",
      "description": "Customer info",
      "type": "object",
      "properties": {
        "name": {
          "key": "name",
          "description": "Customer name",
          "type": "string"
        },
        "email": {
          "key": "email",
          "description": "Email address",
          "type": "string"
        }
      }
    },
    "items": {
      "key": "items",
      "description": "Line items",
      "type": "array",
      "items": {
        "key": "item",
        "description": "Item details",
        "type": "object",
        "properties": {
          "name": {
            "key": "name",
            "description": "Item name",
            "type": "string"
          },
          "price": {
            "key": "price",
            "description": "Item price",
            "type": "number"
          }
        }
      }
    }
  }'

Example Response (Same for Both Endpoints)

Structured Output JSON
{
  "message": "Success",
  "returnSchema": {
    "invoice_number": "INV-2025-001",
    "customer": {
      "name": "Acme Corp",
      "email": "[email protected]"
    },
    "items": [
      {
        "name": "Web Development",
        "price": 5000
      },
      {
        "name": "Consulting",
        "price": 2500
      }
    ]
  }
}
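
A client consuming this response might unwrap it as in the sketch below. The response body is an abridged copy of the example above. Treating any `message` other than "Success" as a failure is an assumption, since this page does not document error responses, and `unwrap` is a hypothetical helper name.

```python
import json

# Abridged copy of the example response shown above.
raw_response = """
{
  "message": "Success",
  "returnSchema": {
    "invoice_number": "INV-2025-001",
    "customer": {"name": "Acme Corp"},
    "items": [
      {"name": "Web Development", "price": 5000},
      {"name": "Consulting", "price": 2500}
    ]
  }
}
"""


def unwrap(raw: str) -> dict:
    """Return the extracted data, or raise if the call did not succeed."""
    body = json.loads(raw)
    # Assumption: anything other than "Success" signals a failed extraction.
    if body.get("message") != "Success":
        raise RuntimeError(f"extraction failed: {body.get('message')!r}")
    return body["returnSchema"]


invoice = unwrap(raw_response)
item_names = [item["name"] for item in invoice["items"]]
print(invoice["invoice_number"], item_names)
```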

Technical Specifications

📄 File Types

Images, PDFs (1,000 pages), Text Files

🖼️ Image Formats

JPEG, PNG, GIF, WebP, AVIF, BMP

📏 File Size

Maximum 10MB per file

🏗️ Schema Types

String, Number, Boolean, Array, Object

🔄 Nesting

Unlimited depth for objects & arrays

💰 Pricing

Cheapest LLM/VLM OCR - Pay per use
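
The limits above can be checked on the client before a file is sent. The sketch below rests on several assumptions: that "10MB" means 10 × 1024 × 1024 bytes, that the listed image formats map to the usual file extensions, and that text files use `.txt`. `preflight` is a hypothetical helper, and the API itself may apply different rules.

```python
from pathlib import Path

# Assumption: "10MB" is read as 10 MiB; the page does not say which unit applies.
MAX_BYTES = 10 * 1024 * 1024

# Assumed extensions for the formats listed in the specifications.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".avif", ".bmp"}
OTHER_EXTS = {".pdf", ".txt"}


def preflight(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in IMAGE_EXTS | OTHER_EXTS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_BYTES}")
    return problems


print(preflight("invoice.pdf", 250_000))
print(preflight("scan.tiff", 250_000))
print(preflight("poster.png", MAX_BYTES + 1))
```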

The Power of Nested Schemas

Extract complex hierarchical data in a single API call - no post-processing needed

🎯

Arrays of Objects

Extract repeating structures like invoice line items, recipe ingredients, or transaction lists - each with multiple properties and nested data.

items: [{ name: "...", price: 100, metadata: {...} }]
🏗️

Unlimited Depth

Nest objects within objects to any level. Perfect for addresses, organizational hierarchies, product categories, or complex document structures.

customer → address → shipping → coordinates
🔄

Mixed Types

Combine strings, numbers, booleans, arrays, and objects in any structure. Get strongly-typed data matching your exact application schema.

{ name: string, count: number, active: boolean, tags: [] }
⚡

No Post-Processing

Get data in the exact format your application needs. Skip parsing, transforming, and reshaping - use the response directly in your database or application.

API Response → Direct to Database
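
As an illustration of storing a response without reshaping it, the sketch below writes the extracted invoice into an in-memory SQLite database. The table and column names are hypothetical and were chosen to mirror the schema keys; the payload is an abridged copy of the `returnSchema` object from the example response.

```python
import sqlite3

# Abridged "returnSchema" payload from the example response on this page.
payload = {
    "invoice_number": "INV-2025-001",
    "customer": {"name": "Acme Corp"},
    "items": [
        {"name": "Web Development", "price": 5000},
        {"name": "Consulting", "price": 2500},
    ],
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (invoice_number TEXT PRIMARY KEY, customer_name TEXT)")
conn.execute("CREATE TABLE line_items (invoice_number TEXT, name TEXT, price REAL)")

# One row for the invoice itself.
conn.execute(
    "INSERT INTO invoices VALUES (?, ?)",
    (payload["invoice_number"], payload["customer"]["name"]),
)

# One row per array element; each item dict already has the column names.
conn.executemany(
    "INSERT INTO line_items VALUES (:invoice_number, :name, :price)",
    [{"invoice_number": payload["invoice_number"], **item} for item in payload["items"]],
)

total = conn.execute("SELECT SUM(price) FROM line_items").fetchone()[0]
print(total)
```

Because the array elements share the same keys, they map onto table rows without an intermediate transformation step.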

❌ Traditional OCR APIs

  • Flat key-value pairs only
  • Manual parsing of arrays
  • Complex post-processing code
  • Error-prone transformations
  • Multiple API calls for related data

✅ Ordliy Nested Schemas

  • Define your exact data structure
  • Get arrays of objects automatically
  • Zero post-processing needed
  • Type-safe extraction
  • Everything in one API call

Real-World Use Cases with Nested Extraction

See complex data extraction in action

📄

Multi-Page PDF Invoice Extraction

Extract complex nested data from multi-page PDF invoices. Get customer details, line items as arrays of objects, and deeply nested address information - all in one API call.

PDF with Nested Schema bash
curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/invoice.pdf",
    "returnSchema": {
      "invoice_number": {
        "key": "invoice_number",
        "description": "Invoice ID",
        "type": "string"
      },
      "customer": {
        "key": "customer",
        "description": "Customer info",
        "type": "object",
        "properties": {
          "name": {"key": "name", "description": "Customer name", "type": "string"},
          "address": {
            "key": "address",
            "description": "Address",
            "type": "object",
            "properties": {
              "street": {"key": "street", "description": "Street", "type": "string"},
              "city": {"key": "city", "description": "City", "type": "string"}
            }
          }
        }
      },
      "items": {
        "key": "items",
        "description": "Line items",
        "type": "array",
        "items": {
          "key": "item",
          "description": "Item",
          "type": "object",
          "properties": {
            "description": {"key": "description", "description": "Item name", "type": "string"},
            "quantity": {"key": "quantity", "description": "Qty", "type": "number"},
            "price": {"key": "price", "description": "Unit price", "type": "number"}
          }
        }
      },
      "total": {"key": "total", "description": "Total amount", "type": "number"}
    }
  }'
Nested JSON Response JSON
{
  "message": "Success",
  "returnSchema": {
    "invoice_number": "INV-2025-1234",
    "customer": {
      "name": "Acme Corporation",
      "address": {
        "street": "123 Business Ave",
        "city": "San Francisco"
      }
    },
    "items": [
      {"description": "Web Development Services", "quantity": 40, "price": 150},
      {"description": "Cloud Hosting (Annual)", "quantity": 1, "price": 1200},
      {"description": "SSL Certificate", "quantity": 1, "price": 99}
    ],
    "total": 7299
  }
}
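
Extracted numbers can be cross-checked against each other. In the sample response above, quantity × unit price summed over the line items equals the reported total: 40 × 150 + 1 × 1200 + 1 × 99 = 7299. A minimal sketch of that check follows; `line_item_sum` is a hypothetical helper name.

```python
# "returnSchema" values copied from the sample response above.
invoice = {
    "items": [
        {"description": "Web Development Services", "quantity": 40, "price": 150},
        {"description": "Cloud Hosting (Annual)", "quantity": 1, "price": 1200},
        {"description": "SSL Certificate", "quantity": 1, "price": 99},
    ],
    "total": 7299,
}


def line_item_sum(items: list) -> int:
    """Sum quantity x unit price over all line items."""
    return sum(item["quantity"] * item["price"] for item in items)


computed = line_item_sum(invoice["items"])
if computed != invoice["total"]:
    print(f"mismatch: items sum to {computed}, total says {invoice['total']}")
else:
    print(f"total confirmed: {computed}")
```

A mismatch does not necessarily mean the extraction is wrong, since real invoices often add tax or discounts, but it is a cheap signal that a document deserves a second look.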
🧾

Receipt Processing

Digitize receipts with nested item arrays. Extract store info, line items with prices and quantities, and payment details - all structured perfectly.

Nested Schema Request bash
curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/receipt-grocery.jpg",
    "returnSchema": {
      "store_name": {
        "key": "store_name",
        "description": "Name of the store",
        "type": "string"
      },
      "date": {
        "key": "date",
        "description": "Purchase date",
        "type": "string"
      },
      "items": {
        "key": "items",
        "description": "Purchased items",
        "type": "array",
        "items": {
          "key": "item",
          "description": "Item details",
          "type": "object",
          "properties": {
            "name": {
              "key": "name",
              "description": "Item name",
              "type": "string"
            },
            "quantity": {
              "key": "quantity",
              "description": "Quantity",
              "type": "number"
            },
            "price": {
              "key": "price",
              "description": "Item price",
              "type": "number"
            }
          }
        }
      },
      "total": {
        "key": "total",
        "description": "Total amount",
        "type": "number"
      }
    }
  }'
Structured Response JSON
{
  "message": "Success",
  "returnSchema": {
    "store_name": "Whole Foods Market",
    "date": "10/28/2025",
    "items": [
      {"name": "Organic Bananas", "quantity": 2, "price": 3.98},
      {"name": "Almond Milk", "quantity": 1, "price": 4.99},
      {"name": "Spinach", "quantity": 1, "price": 3.49},
      {"name": "Greek Yogurt", "quantity": 3, "price": 12.00},
      {"name": "Chicken Breast", "quantity": 1.5, "price": 23.36}
    ],
    "total": 47.82
  }
}
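
In this sample the item prices add up to the total without being multiplied by quantity (3.98 + 4.99 + 3.49 + 12.00 + 23.36 = 47.82), so each price appears to be a line total rather than a unit price. A check on currency values is safer with `Decimal` than with floats. The sketch below assumes that line-total reading, and `receipt_sum` is a hypothetical helper name.

```python
from decimal import Decimal

# "returnSchema" values copied from the sample receipt response above.
receipt = {
    "items": [
        {"name": "Organic Bananas", "quantity": 2, "price": 3.98},
        {"name": "Almond Milk", "quantity": 1, "price": 4.99},
        {"name": "Spinach", "quantity": 1, "price": 3.49},
        {"name": "Greek Yogurt", "quantity": 3, "price": 12.00},
        {"name": "Chicken Breast", "quantity": 1.5, "price": 23.36},
    ],
    "total": 47.82,
}


def receipt_sum(items: list) -> Decimal:
    """Add up prices as exact decimals, treating each price as a line total."""
    # Going through str() avoids carrying binary float error into Decimal.
    return sum((Decimal(str(item["price"])) for item in items), Decimal("0"))


computed = receipt_sum(receipt["items"])
matches = computed == Decimal(str(receipt["total"]))
print(computed, matches)
```

If a receipt lists unit prices instead, saying so in the field description (for example "Line total for this item" versus "Unit price") removes the ambiguity.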
👨‍🍳

Recipe Extraction

Extract recipes with nested ingredients and instructions arrays. Get each ingredient with quantity and each step as a separate object - perfect for recipe apps.

Nested Recipe Schema bash
curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/recipe-card.jpg",
    "returnSchema": {
      "recipe_name": {
        "key": "recipe_name",
        "description": "Recipe title",
        "type": "string"
      },
      "servings": {
        "key": "servings",
        "description": "Number of servings",
        "type": "number"
      },
      "prep_time": {
        "key": "prep_time",
        "description": "Prep time in minutes",
        "type": "number"
      },
      "ingredients": {
        "key": "ingredients",
        "description": "Recipe ingredients",
        "type": "array",
        "items": {
          "key": "ingredient",
          "description": "Ingredient details",
          "type": "object",
          "properties": {
            "item": {
              "key": "item",
              "description": "Ingredient name",
              "type": "string"
            },
            "amount": {
              "key": "amount",
              "description": "Amount needed",
              "type": "string"
            }
          }
        }
      },
      "instructions": {
        "key": "instructions",
        "description": "Cooking steps",
        "type": "array",
        "items": {
          "key": "step",
          "description": "Cooking step",
          "type": "string"
        }
      }
    }
  }'
Structured Recipe Output JSON
{
  "message": "Success",
  "returnSchema": {
    "recipe_name": "Homemade Margherita Pizza",
    "servings": 4,
    "prep_time": 120,
    "ingredients": [
      {"item": "Bread flour", "amount": "500g"},
      {"item": "Warm water", "amount": "325ml"},
      {"item": "Active dry yeast", "amount": "7g"},
      {"item": "Salt", "amount": "2 tsp"},
      {"item": "Olive oil", "amount": "1 tbsp"},
      {"item": "Crushed tomatoes", "amount": "400g"},
      {"item": "Fresh mozzarella", "amount": "300g"},
      {"item": "Fresh basil leaves", "amount": "handful"}
    ],
    "instructions": [
      "Mix warm water and yeast, let sit 5 minutes",
      "Combine flour and salt, add yeast mixture and olive oil",
      "Knead for 10 minutes until smooth",
      "Let rise 1.5 hours",
      "Divide into 4 balls, roll out thin",
      "Top with tomato sauce, mozzarella, and basil",
      "Bake at 475°F for 12-15 minutes until crust is golden"
    ]
  }
}
🛍️

Product Analysis

Extract detailed product info with nested specs object and features array. Perfect for e-commerce catalogs, price comparison, and inventory systems.

Nested Product Schema bash
curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/product-laptop.jpg",
    "returnSchema": {
      "product_name": {
        "key": "product_name",
        "description": "Full product name",
        "type": "string"
      },
      "brand": {
        "key": "brand",
        "description": "Brand name",
        "type": "string"
      },
      "price": {
        "key": "price",
        "description": "Price in dollars",
        "type": "number"
      },
      "specs": {
        "key": "specs",
        "description": "Technical specifications",
        "type": "object",
        "properties": {
          "processor": {"key": "processor", "description": "CPU", "type": "string"},
          "memory": {"key": "memory", "description": "RAM", "type": "string"},
          "storage": {"key": "storage", "description": "Storage", "type": "string"},
          "display": {"key": "display", "description": "Screen", "type": "string"}
        }
      },
      "features": {
        "key": "features",
        "description": "Key features",
        "type": "array",
        "items": {
          "key": "feature",
          "description": "Feature",
          "type": "string"
        }
      }
    }
  }'
Structured Product Data JSON
{
  "message": "Success",
  "returnSchema": {
    "product_name": "MacBook Pro 16-inch M3 Max",
    "brand": "Apple",
    "price": 3499.00,
    "specs": {
      "processor": "M3 Max chip",
      "memory": "36GB unified memory",
      "storage": "1TB SSD",
      "display": "16.2-inch Liquid Retina XDR"
    },
    "features": [
      "Up to 22 hours battery life",
      "Three Thunderbolt 4 ports",
      "HDMI port and SDXC card slot",
      "MagSafe 3 charging",
      "1080p FaceTime HD camera",
      "Six-speaker sound system"
    ]
  }
}
📸

Photo Analysis

Analyze photos with nested scene details and detected objects arrays. Extract context, count items, and identify people or objects systematically.

Photo Analysis Schema bash
curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/event-photo.jpg",
    "returnSchema": {
      "scene": {
        "key": "scene",
        "description": "Scene details",
        "type": "object",
        "properties": {
          "setting": {"key": "setting", "description": "Location type", "type": "string"},
          "time_of_day": {"key": "time_of_day", "description": "Time period", "type": "string"},
          "mood": {"key": "mood", "description": "Atmosphere", "type": "string"}
        }
      },
      "people_count": {
        "key": "people_count",
        "description": "Number of people",
        "type": "number"
      },
      "activities": {
        "key": "activities",
        "description": "What people are doing",
        "type": "array",
        "items": {
          "key": "activity",
          "description": "An activity",
          "type": "string"
        }
      }
    }
  }'
Structured Analysis JSON
{
  "message": "Success",
  "returnSchema": {
    "scene": {
      "setting": "Outdoor garden party venue",
      "time_of_day": "Evening golden hour",
      "mood": "Joyful and festive"
    },
    "people_count": 28,
    "activities": [
      "Socializing and mingling",
      "Dancing",
      "Dining at decorated tables",
      "Taking photos"
    ]
  }
}

Tips & Tricks for Better Extraction

Get the most out of your API calls with these proven best practices

📝

Write Clear Descriptions

The more specific your field descriptions, the better the results. Instead of "date", use "Invoice date in MM/DD/YYYY format" or "Payment due date".

Good: "Customer's primary billing address including street, city, and ZIP"
Avoid: "Address"
🎯

Use Appropriate Types

Choose the right data type for each field. Use "number" for quantities and prices, "boolean" for yes/no values, and "string" for text. This ensures proper data formatting.

Good: price: number, in_stock: boolean
Avoid: Everything as string
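
Declared types can also be verified on the client. The sketch below walks a schema in the key/description/type shape used on this page and reports values whose Python type does not match. `check_types` is a hypothetical helper, and how the API represents fields it could not extract (for example as null or by omitting them) is not documented here, so missing keys are simply reported.

```python
def check_node(node: dict, value, where: str) -> list[str]:
    """Compare one value against one schema node; return mismatch messages."""
    declared = node["type"]
    if declared == "object":
        if not isinstance(value, dict):
            return [f"{where}: expected object"]
        return check_types(node.get("properties", {}), value, where)
    if declared == "array":
        if not isinstance(value, list):
            return [f"{where}: expected array"]
        errors = []
        for i, element in enumerate(value):
            errors.extend(check_node(node["items"], element, f"{where}[{i}]"))
        return errors
    if declared == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        ok = isinstance(value, (int, float)) and not isinstance(value, bool)
    elif declared == "boolean":
        ok = isinstance(value, bool)
    elif declared == "string":
        ok = isinstance(value, str)
    else:
        return [f"{where}: unknown type {declared!r}"]
    return [] if ok else [f"{where}: expected {declared}, got {type(value).__name__}"]


def check_types(schema: dict, data: dict, path: str = "") -> list[str]:
    """Check every field of a schema (or of an object's properties)."""
    errors = []
    for key, node in schema.items():
        where = f"{path}.{key}" if path else key
        if key not in data:
            errors.append(f"{where}: missing")
        else:
            errors.extend(check_node(node, data[key], where))
    return errors


product_schema = {
    "price": {"key": "price", "description": "Price in dollars", "type": "number"},
    "in_stock": {"key": "in_stock", "description": "Whether it is in stock", "type": "boolean"},
}
print(check_types(product_schema, {"price": 19.99, "in_stock": True}))
print(check_types(product_schema, {"price": "19.99", "in_stock": True}))
```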
🏗️

Structure Nested Data Wisely

Group related fields into nested objects. For repeating data like line items or ingredients, use arrays of objects with consistent properties.

Good: customer → address → { street, city, zip }
Avoid: Flat fields like customer_street, customer_city
📄

Optimize PDF Quality

For best results with PDFs, ensure they're not password-protected and have clear text (not blurry scans). Native PDFs work better than scanned images.

Best: Native digital PDFs with selectable text
OK: High-quality scanned PDFs (300+ DPI)
🔢

Handle Multiple Items

When extracting lists (invoice items, ingredients, etc.), always use an array type with object items. This ensures each entry has all its properties properly structured.

Good: items: array of { name, qty, price }
Avoid: Single string with comma-separated values
💡

Test & Iterate

Start with a simple schema and gradually add complexity. Test with sample documents, then refine your descriptions and structure based on results.

Process: Simple schema → Test → Analyze results → Refine → Retest
🎨

Image Quality Matters

Use clear, well-lit images with readable text. Higher-resolution images (kept under 10MB) produce better results. Avoid heavily compressed or distorted images.

Ideal: 1080p+ resolution, good lighting, sharp focus
Avoid: Blurry, dark, or low-resolution images
⚡

Batch Similar Documents

Create reusable schemas for document types you process frequently. The same schema works for all invoices, all receipts, and so on. Save your schemas to get consistent results.

Efficient: One schema template per document type
Benefit: Saves time and keeps extractions consistent
💎
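
One way to keep a template per document type is to store each schema as a JSON file and load it when needed. The sketch below uses a temporary directory so that it runs anywhere. The `<type>.schema.json` naming and both helper functions are hypothetical conventions, not something the API requires.

```python
import json
import tempfile
from pathlib import Path


def save_template(directory: Path, doc_type: str, schema: dict) -> Path:
    """Write a schema to <directory>/<doc_type>.schema.json."""
    path = directory / f"{doc_type}.schema.json"
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return path


def load_template(directory: Path, doc_type: str) -> dict:
    """Read a previously saved schema back into a dict."""
    path = directory / f"{doc_type}.schema.json"
    return json.loads(path.read_text(encoding="utf-8"))


receipt_schema = {
    "store_name": {"key": "store_name", "description": "Name of the store", "type": "string"},
    "total": {"key": "total", "description": "Total amount", "type": "number"},
}

with tempfile.TemporaryDirectory() as tmp:
    templates = Path(tmp)
    saved_path = save_template(templates, "receipt", receipt_schema)
    reloaded = load_template(templates, "receipt")

print(saved_path.name, reloaded == receipt_schema)
```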

Pro Tip: Cost Optimization

Only request the fields you actually need. Smaller, more focused schemas process faster and use fewer tokens, reducing your costs while maintaining accuracy.
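
Trimming a saved schema down to the fields a particular job needs can be a one-line dictionary filter. The sketch below is a minimal illustration: `pick_fields` is a hypothetical helper, and it only selects top-level fields.

```python
def pick_fields(schema: dict, wanted: list[str]) -> dict:
    """Return a copy of the schema containing only the requested top-level fields."""
    missing = [key for key in wanted if key not in schema]
    if missing:
        raise KeyError(f"not in schema: {missing}")
    return {key: schema[key] for key in wanted}


# Top-level fields of the receipt schema used earlier on this page (items abridged).
full_schema = {
    "store_name": {"key": "store_name", "description": "Name of the store", "type": "string"},
    "date": {"key": "date", "description": "Purchase date", "type": "string"},
    "items": {
        "key": "items",
        "description": "Purchased items",
        "type": "array",
        "items": {"key": "item", "description": "Item name", "type": "string"},
    },
    "total": {"key": "total", "description": "Total amount", "type": "number"},
}

# A bookkeeping job that only needs the store and the amount.
slim_schema = pick_fields(full_schema, ["store_name", "total"])
print(list(slim_schema))
```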

Ready for the Cheapest OCR API?

Start extracting nested data from PDFs, images, and text files. Unbeatable pricing with pay-per-use billing.