Simple API Endpoints with Nested Schema Support

Extract complex nested data structures from PDFs, images, and text files. Both endpoints use the same schema format and return identical JSON structures.

POST /json

Extract nested data from PDFs, images, or text files via URL

Nested Schema Example (bash)

curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/invoice.pdf",
    "returnSchema": {
      "invoice_number": {
        "key": "invoice_number",
        "description": "Invoice number",
        "type": "string"
      },
      "customer": {
        "key": "customer",
        "description": "Customer info",
        "type": "object",
        "properties": {
          "name": {
            "key": "name",
            "description": "Customer name",
            "type": "string"
          },
          "email": {
            "key": "email",
            "description": "Email address",
            "type": "string"
          }
        }
      },
      "items": {
        "key": "items",
        "description": "Line items",
        "type": "array",
        "items": {
          "key": "item",
          "description": "Item details",
          "type": "object",
          "properties": {
            "name": {
              "key": "name",
              "description": "Item name",
              "type": "string"
            },
            "price": {
              "key": "price",
              "description": "Item price",
              "type": "number"
            }
          }
        }
      }
    }
  }'
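
Every field in the schema above repeats the same `key`, `description`, and `type` entries, so it can be convenient to generate the schema in code. The following is a minimal Python sketch, not part of the API: the helper names `field`, `obj`, `array`, and `schema` are hypothetical, and only the resulting layout is taken from the curl example.

```python
import json

def field(key, description, type_):
    """Leaf field such as a string or number."""
    return {"key": key, "description": description, "type": type_}

def obj(key, description, properties):
    """Object field; child fields are indexed by their own key."""
    return {
        "key": key,
        "description": description,
        "type": "object",
        "properties": {p["key"]: p for p in properties},
    }

def array(key, description, items):
    """Array field; `items` describes a single element."""
    return {"key": key, "description": description, "type": "array", "items": items}

def schema(*fields):
    """Top-level schema, indexed by each field's key."""
    return {f["key"]: f for f in fields}

# Same structure as the curl example above.
return_schema = schema(
    field("invoice_number", "Invoice number", "string"),
    obj("customer", "Customer info", [
        field("name", "Customer name", "string"),
        field("email", "Email address", "string"),
    ]),
    array("items", "Line items", obj("item", "Item details", [
        field("name", "Item name", "string"),
        field("price", "Item price", "number"),
    ])),
)

# JSON body to send to the /json endpoint.
payload = json.dumps({
    "url": "https://example.com/invoice.pdf",
    "returnSchema": return_schema,
})
```

The resulting `payload` string is what the `-d` argument carries in the curl command.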

POST /form-data

Upload PDFs, images, or text files directly with nested schemas

File Upload Example (bash)

curl -X POST rapidAPIURL/v1/form-data \
  -H "X-RapidAPI-Key: your-api-key" \
  -F "file=@invoice.pdf" \
  -F 'returnSchema={
    "invoice_number": {
      "key": "invoice_number",
      "description": "Invoice number",
      "type": "string"
    },
    "customer": {
      "key": "customer",
      "description": "Customer info",
      "type": "object",
      "properties": {
        "name": {
          "key": "name",
          "description": "Customer name",
          "type": "string"
        },
        "email": {
          "key": "email",
          "description": "Email address",
          "type": "string"
        }
      }
    },
    "items": {
      "key": "items",
      "description": "Line items",
      "type": "array",
      "items": {
        "key": "item",
        "description": "Item details",
        "type": "object",
        "properties": {
          "name": {
            "key": "name",
            "description": "Item name",
            "type": "string"
          },
          "price": {
            "key": "price",
            "description": "Item price",
            "type": "number"
          }
        }
      }
    }
  }'
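
In the upload variant the schema travels as a JSON string in the `returnSchema` form field, next to the file part. The sketch below shows roughly what curl's `-F` flags assemble, using only the Python standard library. It rests on assumptions: the function name `encode_multipart` is hypothetical, and `file` as the file field name is a guess that should be checked against the API reference.

```python
import json
import uuid

def encode_multipart(return_schema, file_bytes, filename,
                     file_field="file", content_type="application/pdf"):
    """Build a multipart/form-data body with a JSON schema part and a file part.

    `file_field` is an assumed form field name, not confirmed by the source.
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = [
        # Part 1: the schema, serialized as a JSON string.
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="returnSchema"',
        b"",
        json.dumps(return_schema).encode("utf-8"),
        # Part 2: the raw file bytes.
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",
        file_bytes,
        # Closing boundary.
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"

schema = {
    "invoice_number": {
        "key": "invoice_number",
        "description": "Invoice number",
        "type": "string",
    }
}
body, content_type_header = encode_multipart(
    schema, b"%PDF-1.4 placeholder bytes", "invoice.pdf"
)
```

The returned header value goes into the request's `Content-Type` header, alongside `X-RapidAPI-Key`.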

Example Response (Same for Both Endpoints)

Structured Output (JSON)

{
  "message": "Success",
  "returnSchema": {
    "invoice_number": "INV-2025-001",
    "customer": {
      "name": "Acme Corp",
      "email": "billing@example.com"
    },
    },
    "items": [
      {
        "name": "Web Development",
        "price": 5000
      },
      {
        "name": "Consulting",
        "price": 2500
      }
    ]
  }
}
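
Because the extracted data sits under `returnSchema` in the response, client code usually unwraps it before use. A minimal sketch follows, assuming that any `message` other than "Success" indicates a failed extraction (the source only shows the success case); `extract_result` is a hypothetical helper name.

```python
import json

def extract_result(response_text):
    """Parse a response body and return the extracted data."""
    body = json.loads(response_text)
    # Assumption: anything other than "Success" means the extraction failed.
    if body.get("message") != "Success":
        raise ValueError(f"extraction did not succeed: {body.get('message')!r}")
    return body["returnSchema"]

sample = """
{
  "message": "Success",
  "returnSchema": {
    "invoice_number": "INV-2025-001",
    "items": [
      {"name": "Web Development", "price": 5000},
      {"name": "Consulting", "price": 2500}
    ]
  }
}
"""

result = extract_result(sample)
# Arrays of objects arrive as ordinary lists of dicts.
item_names = [item["name"] for item in result["items"]]
```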

Real-World Use Cases with Nested Extraction

See complex data extraction in action

📄

Multi-Page PDF Invoice Extraction

Extract complex nested data from multi-page PDF invoices. Get customer details, line items as arrays of objects, and deeply nested address information - all in one API call.

PDF with Nested Schema (bash)

curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/invoice.pdf",
    "returnSchema": {
      "invoice_number": {
        "key": "invoice_number",
        "description": "Invoice ID",
        "type": "string"
      },
      "customer": {
        "key": "customer",
        "description": "Customer info",
        "type": "object",
        "properties": {
          "name": {"key": "name", "description": "Customer name", "type": "string"},
          "address": {
            "key": "address",
            "description": "Address",
            "type": "object",
            "properties": {
              "street": {"key": "street", "description": "Street", "type": "string"},
              "city": {"key": "city", "description": "City", "type": "string"}
            }
          }
        }
      },
      "items": {
        "key": "items",
        "description": "Line items",
        "type": "array",
        "items": {
          "key": "item",
          "description": "Item",
          "type": "object",
          "properties": {
            "description": {"key": "description", "description": "Item name", "type": "string"},
            "quantity": {"key": "quantity", "description": "Qty", "type": "number"},
            "price": {"key": "price", "description": "Unit price", "type": "number"}
          }
        }
      },
      "total": {"key": "total", "description": "Total amount", "type": "number"}
    }
  }'

Nested JSON Response (JSON)

{
  "message": "Success",
  "returnSchema": {
    "invoice_number": "INV-2025-1234",
    "customer": {
      "name": "Acme Corporation",
      "address": {
        "street": "123 Business Ave",
        "city": "San Francisco"
      }
    },
    "items": [
      {"description": "Web Development Services", "quantity": 40, "price": 150},
      {"description": "Cloud Hosting (Annual)", "quantity": 1, "price": 1200},
      {"description": "SSL Certificate", "quantity": 1, "price": 99}
    ],
    "total": 7299
  }
}
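
With line items returned as numbers, the extracted total can be cross-checked against the items. In the sample response, 40 × 150 + 1 × 1200 + 1 × 99 = 7299, which matches `total`. A minimal sketch of that check (the helper name `line_items_total` is hypothetical):

```python
from decimal import Decimal

def line_items_total(items):
    """Sum quantity * unit price, using Decimal to avoid float rounding."""
    return sum(
        (Decimal(str(i["quantity"])) * Decimal(str(i["price"])) for i in items),
        Decimal("0"),
    )

extracted = {
    "items": [
        {"description": "Web Development Services", "quantity": 40, "price": 150},
        {"description": "Cloud Hosting (Annual)", "quantity": 1, "price": 1200},
        {"description": "SSL Certificate", "quantity": 1, "price": 99},
    ],
    "total": 7299,
}

# True when the line items add up to the extracted total.
is_consistent = line_items_total(extracted["items"]) == Decimal(str(extracted["total"]))
```

A mismatch here is a cheap signal that a line item or the total was misread and the document deserves a second look.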

🧾

Receipt Processing

Digitize receipts with nested item arrays. Extract the store name, purchase date, line items with prices and quantities, and the total as structured JSON.

Nested Schema Request (bash)

curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/receipt-grocery.jpg",
    "returnSchema": {
      "store_name": {
        "key": "store_name",
        "description": "Name of the store",
        "type": "string"
      },
      "date": {
        "key": "date",
        "description": "Purchase date",
        "type": "string"
      },
      "items": {
        "key": "items",
        "description": "Purchased items",
        "type": "array",
        "items": {
          "key": "item",
          "description": "Item details",
          "type": "object",
          "properties": {
            "name": {
              "key": "name",
              "description": "Item name",
              "type": "string"
            },
            "quantity": {
              "key": "quantity",
              "description": "Quantity",
              "type": "number"
            },
            "price": {
              "key": "price",
              "description": "Item price",
              "type": "number"
            }
          }
        }
      },
      "total": {
        "key": "total",
        "description": "Total amount",
        "type": "number"
      }
    }
  }'

Structured Response (JSON)

{
  "message": "Success",
  "returnSchema": {
    "store_name": "Whole Foods Market",
    "date": "10/28/2025",
    "items": [
      {"name": "Organic Bananas", "quantity": 2, "price": 3.98},
      {"name": "Almond Milk", "quantity": 1, "price": 4.99},
      {"name": "Spinach", "quantity": 1, "price": 3.49},
      {"name": "Greek Yogurt", "quantity": 3, "price": 12.00},
      {"name": "Chicken Breast", "quantity": 1.5, "price": 23.36}
    ],
    "total": 47.82
  }
}
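
In this sample the `date` comes back as a string and each `price` is the line total, so the prices alone add up to `total` (3.98 + 4.99 + 3.49 + 12.00 + 23.36 = 47.82). The sketch below normalizes such a receipt. It assumes the MM/DD/YYYY date format seen in the sample, which a real receipt may not follow; `normalize_receipt` is a hypothetical helper.

```python
from datetime import datetime
from decimal import Decimal

def normalize_receipt(receipt):
    """Return a copy with an ISO 8601 date and a total-consistency flag."""
    # Assumption: dates arrive as MM/DD/YYYY, as in the sample response.
    purchase_date = datetime.strptime(receipt["date"], "%m/%d/%Y").date()
    # Prices are treated as line totals, so they are summed without quantity.
    line_sum = sum(
        (Decimal(str(i["price"])) for i in receipt["items"]),
        Decimal("0"),
    )
    return {
        **receipt,
        "date": purchase_date.isoformat(),
        "total_matches": line_sum == Decimal(str(receipt["total"])),
    }

receipt = {
    "store_name": "Whole Foods Market",
    "date": "10/28/2025",
    "items": [
        {"name": "Organic Bananas", "quantity": 2, "price": 3.98},
        {"name": "Almond Milk", "quantity": 1, "price": 4.99},
        {"name": "Spinach", "quantity": 1, "price": 3.49},
        {"name": "Greek Yogurt", "quantity": 3, "price": 12.00},
        {"name": "Chicken Breast", "quantity": 1.5, "price": 23.36},
    ],
    "total": 47.82,
}

normalized = normalize_receipt(receipt)
```

Stating the expected format in the field description (for example "Purchase date in YYYY-MM-DD format") may make the parsing step unnecessary.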

👨‍🍳

Recipe Extraction

Extract recipes with nested ingredient and instruction arrays. Each ingredient comes back as an object with its name and amount, and each step as a separate string, ready for use in recipe apps.

Nested Recipe Schema (bash)

curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/recipe-card.jpg",
    "returnSchema": {
      "recipe_name": {
        "key": "recipe_name",
        "description": "Recipe title",
        "type": "string"
      },
      "servings": {
        "key": "servings",
        "description": "Number of servings",
        "type": "number"
      },
      "prep_time": {
        "key": "prep_time",
        "description": "Prep time in minutes",
        "type": "number"
      },
      "ingredients": {
        "key": "ingredients",
        "description": "Recipe ingredients",
        "type": "array",
        "items": {
          "key": "ingredient",
          "description": "Ingredient details",
          "type": "object",
          "properties": {
            "item": {
              "key": "item",
              "description": "Ingredient name",
              "type": "string"
            },
            "amount": {
              "key": "amount",
              "description": "Amount needed",
              "type": "string"
            }
          }
        }
      },
      "instructions": {
        "key": "instructions",
        "description": "Cooking steps",
        "type": "array",
        "items": {
          "key": "step",
          "description": "Cooking step",
          "type": "string"
        }
      }
    }
  }'

Structured Recipe Output (JSON)

{
  "message": "Success",
  "returnSchema": {
    "recipe_name": "Homemade Margherita Pizza",
    "servings": 4,
    "prep_time": 120,
    "ingredients": [
      {"item": "Bread flour", "amount": "500g"},
      {"item": "Warm water", "amount": "325ml"},
      {"item": "Active dry yeast", "amount": "7g"},
      {"item": "Salt", "amount": "2 tsp"},
      {"item": "Olive oil", "amount": "1 tbsp"},
      {"item": "Crushed tomatoes", "amount": "400g"},
      {"item": "Fresh mozzarella", "amount": "300g"},
      {"item": "Fresh basil leaves", "amount": "handful"}
    ],
    "instructions": [
      "Mix warm water and yeast, let sit 5 minutes",
      "Combine flour and salt, add yeast mixture and olive oil",
      "Knead for 10 minutes until smooth",
      "Let rise 1.5 hours",
      "Divide into 4 balls, roll out thin",
      "Top with tomato sauce, mozzarella, and basil",
      "Bake at 475°F for 12-15 minutes until crust is golden"
    ]
  }
}

🛍️

Product Analysis

Extract detailed product info with nested specs object and features array. Perfect for e-commerce catalogs, price comparison, and inventory systems.

Nested Product Schema (bash)

curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/product-laptop.jpg",
    "returnSchema": {
      "product_name": {
        "key": "product_name",
        "description": "Full product name",
        "type": "string"
      },
      "brand": {
        "key": "brand",
        "description": "Brand name",
        "type": "string"
      },
      "price": {
        "key": "price",
        "description": "Price in dollars",
        "type": "number"
      },
      "specs": {
        "key": "specs",
        "description": "Technical specifications",
        "type": "object",
        "properties": {
          "processor": {"key": "processor", "description": "CPU", "type": "string"},
          "memory": {"key": "memory", "description": "RAM", "type": "string"},
          "storage": {"key": "storage", "description": "Storage", "type": "string"},
          "display": {"key": "display", "description": "Screen", "type": "string"}
        }
      },
      "features": {
        "key": "features",
        "description": "Key features",
        "type": "array",
        "items": {
          "key": "feature",
          "description": "Feature",
          "type": "string"
        }
      }
    }
  }'

Structured Product Data (JSON)

{
  "message": "Success",
  "returnSchema": {
    "product_name": "MacBook Pro 16-inch M3 Max",
    "brand": "Apple",
    "price": 3499.00,
    "specs": {
      "processor": "M3 Max chip",
      "memory": "36GB unified memory",
      "storage": "1TB SSD",
      "display": "16.2-inch Liquid Retina XDR"
    },
    "features": [
      "Up to 22 hours battery life",
      "Three Thunderbolt 4 ports",
      "HDMI port and SDXC card slot",
      "MagSafe 3 charging",
      "1080p FaceTime HD camera",
      "Six-speaker sound system"
    ]
  }
}

📸

Photo Analysis

Analyze photos with a nested scene object and an array of observed activities. Extract context, count people, and list what they are doing.

Photo Analysis Schema (bash)

curl -X POST rapidAPIURL/v1/json \
  -H "Content-Type: application/json" \
  -H "X-RapidAPI-Key: your-api-key" \
  -d '{
    "url": "https://example.com/event-photo.jpg",
    "returnSchema": {
      "scene": {
        "key": "scene",
        "description": "Scene details",
        "type": "object",
        "properties": {
          "setting": {"key": "setting", "description": "Location type", "type": "string"},
          "time_of_day": {"key": "time_of_day", "description": "Time period", "type": "string"},
          "mood": {"key": "mood", "description": "Atmosphere", "type": "string"}
        }
      },
      "people_count": {
        "key": "people_count",
        "description": "Number of people",
        "type": "number"
      },
      "activities": {
        "key": "activities",
        "description": "What people are doing",
        "type": "array",
        "items": {
          "key": "activity",
          "description": "An activity",
          "type": "string"
        }
      }
    }
  }'

Structured Analysis (JSON)

{
  "message": "Success",
  "returnSchema": {
    "scene": {
      "setting": "Outdoor garden party venue",
      "time_of_day": "Evening golden hour",
      "mood": "Joyful and festive"
    },
    "people_count": 28,
    "activities": [
      "Socializing and mingling",
      "Dancing",
      "Dining at decorated tables",
      "Taking photos"
    ]
  }
}

Tips & Tricks for Better Extraction

Get the most out of your API calls with these proven best practices

📝

Write Clear Descriptions

The more specific your field descriptions, the better the results. Instead of "date", use "Invoice date in MM/DD/YYYY format" or "Payment due date".

Good: "Customer's primary billing address including street, city, and ZIP"
Avoid: "Address"
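
One rough way to catch terse descriptions before sending a request is to walk the schema and flag any description with very few words. This is only a heuristic sketch: the word-count threshold and the function name `vague_descriptions` are assumptions, and a short description is not always a bad one.

```python
def vague_descriptions(schema, min_words=3, path=""):
    """Return dotted paths of fields whose description has fewer than `min_words` words."""
    flagged = []
    for name, spec in schema.items():
        here = f"{path}.{name}" if path else name
        if len(spec["description"].split()) < min_words:
            flagged.append(here)
        # Recurse into nested objects and into arrays of objects.
        if spec["type"] == "object":
            flagged += vague_descriptions(spec["properties"], min_words, here)
        elif spec["type"] == "array" and spec["items"]["type"] == "object":
            flagged += vague_descriptions(
                spec["items"]["properties"], min_words, here + "[]"
            )
    return flagged

schema = {
    "date": {"key": "date", "description": "date", "type": "string"},
    "billing_address": {
        "key": "billing_address",
        "description": "Customer's primary billing address including street, city, and ZIP",
        "type": "string",
    },
}

# Only the one-word description is reported.
flagged = vague_descriptions(schema)
```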

🎯

Use Appropriate Types

Choose the right data type for each field. Use "number" for quantities and prices, "boolean" for yes/no values, and "string" for text. This ensures proper data formatting.

Good: price: number, in_stock: boolean
Avoid: Everything as string
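
Declared types also give client code something to verify. The sketch below checks an extracted result against its schema, recursing into objects and arrays. It assumes every declared field is present in the response, which the source does not guarantee; `matches_type` and `type_mismatches` are hypothetical helpers.

```python
def matches_type(spec, value):
    """Check one extracted value against its schema entry."""
    declared = spec["type"]
    if declared == "string":
        return isinstance(value, str)
    if declared == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if declared == "boolean":
        return isinstance(value, bool)
    if declared == "object":
        return isinstance(value, dict) and all(
            name in value and matches_type(child, value[name])
            for name, child in spec["properties"].items()
        )
    if declared == "array":
        return isinstance(value, list) and all(
            matches_type(spec["items"], v) for v in value
        )
    return False

def type_mismatches(schema, result):
    """Return top-level field names that are missing or have the wrong type."""
    return [
        name for name, spec in schema.items()
        if name not in result or not matches_type(spec, result[name])
    ]

schema = {
    "price": {"key": "price", "description": "Price in dollars", "type": "number"},
    "in_stock": {
        "key": "in_stock",
        "description": "Whether the item is in stock",
        "type": "boolean",
    },
}
```

For example, `type_mismatches(schema, {"price": "3499", "in_stock": True})` reports `price`, because the value arrived as a string.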

🏗️

Structure Nested Data Wisely

Group related fields into nested objects. For repeating data like line items or ingredients, use arrays of objects with consistent properties.

Good: customer → address → { street, city, zip }
Avoid: Flat fields like customer_street, customer_city

📄

Optimize PDF Quality

For best results with PDFs, ensure they're not password-protected and have clear text (not blurry scans). Native PDFs work better than scanned images.

Best: Native digital PDFs with selectable text
OK: High-quality scanned PDFs (300+ DPI)

🔢

Handle Multiple Items

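When extracting lists whose entries have several properties (invoice items, ingredients, etc.), use an array type with object items so that each entry carries all of its properties. For lists of plain values, such as cooking steps or feature names, an array of strings is enough.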

Good: items: array of { name, qty, price }
Avoid: Single string with comma-separated values

💡

Test & Iterate

Start with a simple schema and gradually add complexity. Test with sample documents, then refine your descriptions and structure based on results.

Process: Simple schema → Test → Analyze results → Refine → Retest
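
The "analyze results" step is easier with a hand-checked expected result for each sample document. A minimal sketch of a recursive comparison that lists the paths where an extraction differs from the expectation (the `diff` function is hypothetical, not an API feature):

```python
def diff(expected, actual, path=""):
    """List paths where `actual` differs from `expected`."""
    if isinstance(expected, dict) and isinstance(actual, dict):
        out = []
        for key in sorted(expected.keys() | actual.keys()):
            sub = f"{path}.{key}" if path else key
            if key not in expected or key not in actual:
                out.append(sub)  # field missing on one side
            else:
                out += diff(expected[key], actual[key], sub)
        return out
    if (isinstance(expected, list) and isinstance(actual, list)
            and len(expected) == len(actual)):
        out = []
        for i, (e, a) in enumerate(zip(expected, actual)):
            out += diff(e, a, f"{path}[{i}]")
        return out
    # Scalars, mismatched types, or lists of different length.
    return [] if expected == actual else [path]

expected = {
    "invoice_number": "INV-2025-001",
    "items": [{"name": "Web Development", "price": 5000}],
}
actual = {
    "invoice_number": "INV-2025-001",
    "items": [{"name": "Web Development", "price": 500}],
}

# Points at the field whose description may need refining.
problems = diff(expected, actual)
```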

🎨

Image Quality Matters

Use clear, well-lit images with readable text. Higher-resolution images (kept under 10MB) produce better results. Avoid heavily compressed or distorted images.

Ideal: 1080p+ resolution, good lighting, sharp focus
Avoid: Blurry, dark, or low-resolution images

⚡

Batch Similar Documents

Create reusable schemas for the document types you process frequently. The same schema works for every document of a given type, such as all invoices or all receipts, so save your schemas and reuse them.

Efficient: One schema template per document type
Benefit: Saves time and keeps extractions consistent
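
One way to keep a template per document type is a small registry keyed by type name. The sketch below is illustrative: `SCHEMA_TEMPLATES` and `build_request` are hypothetical names, and the request body follows the `/json` examples earlier on this page.

```python
import copy

# One schema template per document type (trimmed for brevity).
SCHEMA_TEMPLATES = {
    "receipt": {
        "store_name": {
            "key": "store_name",
            "description": "Name of the store",
            "type": "string",
        },
        "total": {
            "key": "total",
            "description": "Total amount",
            "type": "number",
        },
    },
}

def build_request(doc_type, url):
    """Build a /json request body from the saved template for `doc_type`."""
    # Deep copy so a caller's edits never leak back into the template.
    return {"url": url, "returnSchema": copy.deepcopy(SCHEMA_TEMPLATES[doc_type])}

first = build_request("receipt", "https://example.com/receipt-1.jpg")
second = build_request("receipt", "https://example.com/receipt-2.jpg")
```

Both requests carry the same schema, so their results share one structure.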

💎

Pro Tip: Cost Optimization

Only request the fields you actually need. Smaller, more focused schemas process faster and use fewer tokens, reducing your costs while maintaining accuracy.
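
If you keep a full schema per document type, a request that needs only a few fields can send a trimmed copy. A minimal sketch, with `pick_fields` as a hypothetical helper that works on top-level fields only:

```python
def pick_fields(schema, wanted):
    """Return a schema containing only the top-level fields named in `wanted`."""
    missing = [name for name in wanted if name not in schema]
    if missing:
        raise KeyError(f"unknown fields: {missing}")
    return {name: schema[name] for name in wanted}

full_schema = {
    "invoice_number": {"key": "invoice_number", "description": "Invoice ID", "type": "string"},
    "customer_name": {"key": "customer_name", "description": "Customer name", "type": "string"},
    "total": {"key": "total", "description": "Total amount", "type": "number"},
}

# Request only what this job needs.
focused_schema = pick_fields(full_schema, ["invoice_number", "total"])
```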
