POST /api/v1/pdf-to-excel

PDF tables to Excel.
Actually typed.

Most PDF extractors give you strings in cells. We give you numbers as numbers, dates as dates, currencies formatted correctly, and percentages you can sum. One API call. Production-ready Excel.

How it works

01

Send a PDF

Upload via multipart or send base64 — your call.

02

AI extracts tables

Gemini reads every page, detects tables, types each column.

03

Get an Excel file

One sheet per table. Numbers as numbers. Dates as dates. Ready to use.

# Submit a PDF
curl -X POST https://api.contexa.works/api/v1/pdf-to-excel \
  -H "x-rapidapi-key: YOUR_KEY" \
  -F "file=@report.pdf"

# Response: { "jobId": "abc-123", "status": "processing" }

# Poll for result
curl https://api.contexa.works/api/v1/jobs/abc-123

# Download when complete
curl -o report.xlsx \
  https://api.contexa.works/api/v1/jobs/abc-123/result
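
The polling step can be scripted. Here is a minimal Python sketch, assuming the job endpoint returns JSON with a "status" field and that any value other than "processing" means the job has finished. This page only shows "processing", so check the real responses before relying on that. The helper names fetch_job and wait_for_job are illustrative and not part of the API.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.contexa.works/api/v1"


def fetch_job(job_id):
    """GET /jobs/{job_id} and return the parsed JSON body.

    Mirrors the curl call above, which sends no key header on this request;
    add one here if your account requires it.
    """
    with urllib.request.urlopen(f"{BASE_URL}/jobs/{job_id}", timeout=30) as response:
        return json.load(response)


def wait_for_job(job_id, fetch, interval=2.0, max_wait=300.0, sleep=time.sleep):
    """Poll until the job leaves "processing" or max_wait seconds of sleeping pass.

    Assumption: every status other than "processing" is terminal.
    """
    waited = 0.0
    while True:
        job = fetch(job_id)
        if job.get("status") != "processing":
            return job
        if waited >= max_wait:
            raise TimeoutError(f"job {job_id} still processing after {max_wait}s")
        sleep(interval)
        waited += interval
```

Usage: call wait_for_job("abc-123", fetch_job), then download /jobs/abc-123/result as in the curl example. Passing the fetch function as an argument keeps the loop easy to test without touching the network.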

What sets it apart

Typed columns

Numbers, currency, percentages, dates — detected automatically, not just strings.

Multi-page tables

Tables that span pages get merged back together. No missing rows.

Unit detection

Strips suffixes like pence, bps, or % and stores them separately so your formulas work.

Scanned PDFs

Works on scanned documents and images, not just text-based PDFs.

Custom instructions

Tell the AI what to focus on: "only the first table", "convert to USD", "ignore headers".

Async processing

Submit, get a job ID, poll or get a webhook callback. Built for production pipelines.
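
To make unit detection concrete, here is a minimal Python sketch of the idea. It is not Contexa's implementation. It splits a cell such as "12.5 pence" into a numeric value and a unit label, so the number can feed a formula. The function name split_unit and the suffix list (taken from the examples above) are illustrative assumptions.

```python
import re

# Illustrative suffix list based on the examples in the text, not the service's real list.
UNIT_PATTERN = re.compile(
    r"^\s*(-?\d[\d,]*(?:\.\d+)?)\s*(pence|bps|%)\s*$",
    re.IGNORECASE,
)


def split_unit(cell):
    """Return (number, unit) for cells like '25bps', or (cell, None) if no unit is found."""
    match = UNIT_PATTERN.match(cell)
    if match is None:
        return cell, None
    # Drop thousands separators before converting to a float.
    number = float(match.group(1).replace(",", ""))
    return number, match.group(2).lower()
```

With the value and the unit held apart, "1,250 pence" becomes the number 1250.0 plus the label "pence", and a column of such cells can be summed directly.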

Other extractors vs. Contexa

Feature              Typical extractor         Contexa
Cell types           Everything is a string    Numbers, dates, currency, %
Scanned PDFs         Text-based only           Images and scans via AI vision
Multi-page tables    Split across sheets       Merged automatically
Custom prompts       None                      Natural language instructions
Units (pence, bps)   Left in cell as text      Stripped, stored separately

Stop cleaning spreadsheets by hand

Upload a PDF in the playground and see the result in seconds. No signup required.

Try it now