POST /api/v1/pdf-to-excel
PDF tables to Excel.
Actually typed.
Most PDF extractors give you strings in cells. We give you numbers as numbers, dates as dates, currencies formatted correctly, and percentages you can sum. One request to submit. Production-ready Excel back.
How it works
Send a PDF
Upload via multipart or send base64 — your call.
AI extracts tables
Gemini reads every page, detects tables, types each column.
Get an Excel file
One sheet per table. Numbers as numbers. Dates as dates. Ready to use.
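The commands that follow use the multipart route. For the base64 route, the request shape is not spelled out here, so the sketch below is only a guess at how it might look. The JSON field name `file`, the `Content-Type: application/json` header, and the helper name `pdf_to_json_body` are all assumptions for illustration, not confirmed parts of the API.

```shell
#!/usr/bin/env bash
# Sketch only: wrap a PDF as base64 inside a small JSON body.
# ASSUMPTION: the API accepts JSON with the PDF under a field named "file".
# That field name is hypothetical; check the real request schema before use.

pdf_to_json_body() {
  # base64 may wrap its output across lines, so strip newlines before embedding.
  printf '{"file":"%s"}' "$(base64 < "$1" | tr -d '\n')"
}

# Usage (hypothetical JSON variant of the same endpoint):
#   pdf_to_json_body report.pdf | curl -X POST https://api.contexa.works/api/v1/pdf-to-excel \
#     -H "x-rapidapi-key: YOUR_KEY" \
#     -H "Content-Type: application/json" \
#     --data-binary @-
```

Piping the body into `curl --data-binary @-` keeps the encoded PDF off the command line, which matters once files grow past a few hundred kilobytes.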
# Submit a PDF
curl -X POST https://api.contexa.works/api/v1/pdf-to-excel \
-H "x-rapidapi-key: YOUR_KEY" \
-F "file=@report.pdf"
# Response: { "jobId": "abc-123", "status": "processing" }
# Poll for result
curl https://api.contexa.works/api/v1/jobs/abc-123
# Download when complete
curl -o report.xlsx \
https://api.contexa.works/api/v1/jobs/abc-123/result
What sets it apart
Typed columns
Numbers, currency, percentages, dates — detected automatically, not just strings.
Multi-page tables
Tables that span pages get merged back together. No missing rows.
Unit detection
Strips suffixes like pence, bps, or % and stores them separately so your formulas work.
Scanned PDFs
Works on scanned documents and images, not just text-based PDFs.
Custom instructions
Tell the AI what to focus on: "only the first table", "convert to USD", "ignore headers".
Async processing
Submit, get a job ID, poll or get a webhook callback. Built for production pipelines.
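For the polling side of async processing, a minimal sketch of a wait loop might look like the following. It rests on several assumptions not confirmed above: the poll response is JSON with a `status` field, the finished state is named `complete`, the job endpoints accept the same `x-rapidapi-key` header as the submit call, and the key is in an `API_KEY` environment variable. The helper names `job_status` and `wait_for_job` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: poll a job until it leaves "processing".
# ASSUMPTIONS: JSON response with a "status" field; terminal success state
# is "complete"; API_KEY is set in the environment.

API_BASE="https://api.contexa.works/api/v1"

# Extract the value of "status" from a small JSON body without needing jq.
job_status() {
  printf '%s' "$1" | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# wait_for_job JOB_ID [DELAY_SECONDS] [MAX_TRIES]
# Returns 0 once the job reports "complete", 1 on error, unknown state, or timeout.
wait_for_job() {
  local job_id delay max_tries body status i
  job_id="$1"
  delay="${2:-2}"
  max_tries="${3:-30}"
  i=0
  while [ "$i" -lt "$max_tries" ]; do
    body="$(curl -fsS -H "x-rapidapi-key: ${API_KEY}" "${API_BASE}/jobs/${job_id}")" || return 1
    status="$(job_status "$body")"
    case "$status" in
      processing) sleep "$delay" ;;
      complete)   return 0 ;;
      *)          echo "job ${job_id} ended with status: ${status:-unknown}" >&2; return 1 ;;
    esac
    i=$((i + 1))
  done
  echo "timed out waiting for job ${job_id}" >&2
  return 1
}

# Usage:
#   wait_for_job abc-123 && curl -o report.xlsx "${API_BASE}/jobs/abc-123/result"
```

The fixed delay and attempt cap keep a stuck job from looping forever. A webhook callback, where available, avoids polling altogether.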
Other extractors vs. Contexa
| | Typical extractor | Contexa |
|---|---|---|
| Cell types | Everything is a string | Numbers, dates, currency, % |
| Scanned PDFs | Text-based only | Images and scans via AI vision |
| Multi-page tables | Split across sheets | Merged automatically |
| Custom prompts | None | Natural language instructions |
| Units (pence, bps) | Left in cell as text | Stripped, stored separately |
Stop cleaning spreadsheets by hand
Upload a PDF in the playground and see the result in seconds. No signup required.
Try it now