Frequently Asked Questions

Can I convert a PDF with multiple tables to CSV?

Yes. Online converters like Convertio process all pages and extract every table into a single CSV. In Python, pdfplumber lets you iterate over each page and extract tables individually, giving you full control over which tables to include and how to merge them.
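
As a rough sketch of that control, the snippet below keeps only tables that meet a minimum size before merging them into one CSV. The helper names `select_tables` and `pdf_tables_to_csv` and the size thresholds are illustrative assumptions, not part of pdfplumber; the only library calls used are `pdfplumber.open`, `pdf.pages`, and `page.extract_tables()`.

```python
import csv


def select_tables(tables, min_rows=2, min_cols=2):
    """Keep tables with at least min_rows rows and min_cols columns (hypothetical filter)."""
    kept = []
    for table in tables:
        widest = max((len(row) for row in table), default=0)
        if len(table) >= min_rows and widest >= min_cols:
            kept.append(table)
    return kept


def pdf_tables_to_csv(pdf_path, csv_path):
    """Merge every sufficiently large table in pdf_path into one CSV; return the row count."""
    import pdfplumber  # third-party; imported here so select_tables works without it

    rows = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in select_tables(page.extract_tables()):
                rows.extend(table)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

Raising `min_rows` or `min_cols` is one simple way to drop small layout fragments that get detected as tables; the right thresholds depend on your documents.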

How do I convert a scanned PDF to CSV?

Scanned PDFs contain images, not text, so you need OCR (Optical Character Recognition) first. Convertio has built-in OCR: select your language before converting. In Python, use pdf2image to render each page as an image and pytesseract (a wrapper around Tesseract) to extract the text, then parse the table structure yourself. tabula-py reads only PDFs that have a text layer, so it helps only after OCR has produced a searchable PDF.
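
A minimal sketch of the Python route, assuming Poppler and Tesseract are installed on the system alongside the pdf2image and pytesseract packages. The function names are hypothetical, and splitting columns on runs of two or more spaces is a heuristic assumption that will not suit every layout.

```python
import re


def split_columns(line):
    """Split one OCR text line into cells on runs of two or more whitespace characters."""
    return [cell.strip() for cell in re.split(r"\s{2,}", line.strip()) if cell.strip()]


def ocr_pdf_to_rows(pdf_path, lang="eng", dpi=300):
    """OCR each page of a scanned PDF and return a list of rows (lists of cells)."""
    # Third-party packages; they wrap the Poppler and Tesseract binaries.
    from pdf2image import convert_from_path
    import pytesseract

    rows = []
    for image in convert_from_path(pdf_path, dpi=dpi):
        # preserve_interword_spaces asks Tesseract to keep the gaps between columns
        text = pytesseract.image_to_string(
            image, lang=lang, config="--psm 6 -c preserve_interword_spaces=1"
        )
        rows.extend(split_columns(line) for line in text.splitlines() if line.strip())
    return rows
```

The resulting rows can be passed to `csv.writer(...).writerows(rows)`. Expect to review the output by hand, since OCR errors and uneven spacing can shift cells.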

Why does my PDF-to-CSV output have misaligned columns?

Column misalignment usually happens when the PDF separates columns with spacing rather than drawn table borders. Try a different extraction tool: pdfplumber handles borderless tables better than most. You can also define column boundaries yourself in pdfplumber by passing the 'explicit_vertical_lines' table setting, a list of x-coordinates.
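
The sketch below shows one way to pass explicit column boundaries. The helper names and any coordinates you supply are assumptions for illustration; `vertical_strategy`, `explicit_vertical_lines`, and `horizontal_strategy` are pdfplumber table settings, and the x-values are measured in PDF points from the left edge of the page.

```python
def explicit_column_settings(x_positions):
    """Build pdfplumber table settings from column boundary x-coordinates."""
    if len(x_positions) < 2:
        raise ValueError("need at least two boundaries (left and right edge)")
    return {
        "vertical_strategy": "explicit",
        "explicit_vertical_lines": sorted(x_positions),
        "horizontal_strategy": "text",
    }


def extract_with_columns(pdf_path, x_positions, page_index=0):
    """Extract one table from a page using hand-picked column boundaries."""
    import pdfplumber  # third-party; imported lazily

    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_index]
        # extract_table() returns None when nothing is found
        return page.extract_table(explicit_column_settings(x_positions)) or []
```

A call such as `extract_with_columns("report.pdf", [40, 180, 320, 460, 560])` would treat those five x-positions as the edges of four columns; the numbers here are placeholders you would replace after inspecting your own page.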

Is it free to convert PDF to CSV online?

Yes. Convertio offers free PDF to CSV conversion with no registration, no watermarks, and no email required. Files are encrypted via 256-bit SSL and auto-deleted within 2 hours. The maximum file size is 100 MB.

How to Convert PDF to CSV: 4 Methods (Online, Python, Excel, Sheets)

Tables vs Plain Text: Why It Matters

Before choosing a method, check what kind of data your PDF contains. The approach depends entirely on the PDF structure:

PDF Type	What It Contains	Best Method
Native tables	Text-based PDF with visible table borders and grid lines	Any method — Convertio is fastest
Borderless tables	Columns aligned by spacing, no visible grid	Python (pdfplumber) for precision
Scanned PDF	Image of a printed page (no selectable text)	Convertio with OCR enabled
Mixed content	Tables + paragraphs + headers on the same page	Python for selective extraction

Quick test: open your PDF and try selecting text with your mouse. If you can highlight individual words, it's a native (text-based) PDF. If the entire page selects as one block, it's a scanned image — you'll need OCR.
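
The same check can be approximated in code. This is a sketch under assumptions: the function names and the 20-character threshold are made up for illustration, and it relies only on pdfplumber's `page.extract_text()`, which yields little or no text for image-only pages.

```python
def has_text_layer(page_texts, min_chars=20):
    """Return True if the extracted page texts hold at least min_chars non-space characters."""
    total = sum(len("".join((text or "").split())) for text in page_texts)
    return total >= min_chars


def pdf_has_text_layer(pdf_path, max_pages=3):
    """Sample the first few pages of a PDF and guess whether it is text-based."""
    import pdfplumber  # third-party; imported lazily

    with pdfplumber.open(pdf_path) as pdf:
        texts = [page.extract_text() for page in pdf.pages[:max_pages]]
    return has_text_layer(texts)
```

If `pdf_has_text_layer("file.pdf")` returns False, the file is probably a scan and needs OCR before any table extraction.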

Method 1: Convert Online with Convertio

Easy • No software • Works on any device • OCR support

The fastest option for most users. Convertio handles native PDFs, borderless tables, and even scanned documents with OCR. No installation, no account required.

1. Go to convertio.com/pdf-to-csv
2. Upload your PDF: drag and drop, or click "Choose PDF File". Max 100 MB.
3. For scanned PDFs: select your OCR language from the dropdown before converting.
4. Click "Convert to CSV". Conversion takes a few seconds for most files.
5. Download the CSV and open it in Excel, Google Sheets, or import it into your database.

Convertio processes all pages of your PDF and combines extracted data into a single CSV file. Files are encrypted during transfer and auto-deleted within 2 hours.

Method 2: Python with pdfplumber

Advanced • Full control • Batch processing • Handles borderless tables

pdfplumber is one of the most capable Python libraries for extracting tables from PDFs. It handles both bordered and borderless tables, gives you coordinates for every character, and lets you fine-tune extraction parameters.

Install pdfplumber

Terminal

pip install pdfplumber

Basic table extraction

This script extracts all tables from every page of a PDF and writes them to a CSV file:

Python

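import csv

import pdfplumber

all_rows = []
with pdfplumber.open("invoice.pdf") as pdf:
    for page in pdf.pages:
        # extract_tables() returns every table on the page;
        # extract_table() would return only the largest one
        for table in page.extract_tables():
            all_rows.extend(table)

# newline="" lets the csv module control line endings
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(all_rows)

print(f"Extracted {len(all_rows)} rows to output.csv")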
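
Extracted cells can come back as None or contain line breaks from wrapped text, which makes for a messy CSV. The helper below is a small sketch of a cleanup pass to run before writing; `clean_rows` is a hypothetical name, and collapsing all internal whitespace to single spaces is a design assumption you may not want for every column.

```python
def clean_rows(rows):
    """Normalize cells to single-line strings and drop rows that are entirely empty."""
    cleaned = []
    for row in rows:
        # None becomes "", and any run of whitespace (including newlines) becomes one space
        cells = [" ".join((cell or "").split()) for cell in row]
        if any(cells):
            cleaned.append(cells)
    return cleaned
```

In a script like the one above, this would mean calling `writer.writerows(clean_rows(all_rows))` in place of writing the raw rows.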

Handling borderless tables

When tables don't have visible borders, pdfplumber can still detect columns using character positions. Use extract_table() with custom settings:

Python

import pdfplumber

# For PDFs with no visible table borders: infer cell edges from text positions
table_settings = {
    "vertical_strategy": "text",
    "horizontal_strategy": "text",
    "snap_y_tolerance": 5,
    "intersection_x_tolerance": 15,
}

with pdfplumber.open("report.pdf") as pdf:
    page = pdf.pages[0]
    # extract_table() returns None when no table is detected
    table = page.extract_table(table_settings) or []
    for row in table:
        print(row)

Batch convert multiple PDFs

Python

import csv
from pathlib import Path

import pdfplumber

# Write one CSV next to each PDF in ./invoices
for pdf_file in Path("./invoices").glob("*.pdf"):
    csv_path = pdf_file.with_suffix(".csv")
    rows = []
    with pdfplumber.open(pdf_file) as pdf:
        for page in pdf.pages:
            # Collect every table on the page, not only the largest
            for table in page.extract_tables():
                rows.extend(table)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    print(f"{pdf_file.name} -> {csv_path.name} ({len(rows)} rows)")

Method 3: Microsoft Excel (Get Data)

Medium • Desktop only • Microsoft 365 (Excel for 365) • Manual steps

Microsoft 365 (Excel for 365) can import PDF files directly using the Power Query / Get Data feature. This option is not available in standalone Excel 2016 or 2019 — it requires an active Microsoft 365 subscription. It works well for simple, well-structured tables.

1. Open Excel and create a new blank workbook.
2. Go to Data → Get Data → From File → From PDF.
3. Select your PDF from the file browser.
4. Choose the table(s) you want to import from the Navigator panel. Excel will show a preview of each detected table.
5. Click "Load" to import the data into your worksheet.
6. Save as CSV: File → Save As → choose "CSV (Comma delimited) (*.csv)" as the format.

Limitation: Excel's PDF import works best with simple, bordered tables. It struggles with multi-column layouts, merged cells, and borderless tables. For complex PDFs, use Convertio or Python instead.

Method 4: Google Sheets

Easy • Free • Browser-based • Requires Google account

Google Sheets doesn't import PDFs directly, but you can use Google Drive's built-in OCR to extract the text first, then copy it into Sheets.

1. Upload the PDF to Google Drive.
2. Right-click the PDF → Open with → Google Docs. Google will OCR the file and convert it to an editable document.
3. Select the table data in the Google Doc and copy it (Ctrl+C / Cmd+C).
4. Open a new Google Sheet and paste (Ctrl+V / Cmd+V). The data will fill into cells.
5. Clean up the data: adjust column widths, remove extra rows, fix any OCR errors.
6. Download as CSV: File → Download → Comma Separated Values (.csv).

Tip: Google's OCR works surprisingly well for scanned PDFs. But the table structure may not survive the copy-paste step intact. For better results with tabular data, use Convertio's direct PDF to CSV converter.

Method Comparison

Feature	Convertio	Python	Excel	Google Sheets
Difficulty	Easy	Advanced	Medium	Easy
Installation	None (browser)	Python + pip	Microsoft 365	None (browser)
Bordered tables	Excellent	Excellent	Good	Fair
Borderless tables	Good	Excellent	Poor	Poor
Scanned PDFs (OCR)	Built-in	With pytesseract	Not supported	Via Google Drive
Batch processing	One file at a time	Unlimited	One file at a time	One file at a time
Best for	Quick one-off conversions	Automation & complex PDFs	Excel users with simple tables	Quick extraction with OCR

Tips for Clean CSV Output

Check the header row. Some PDFs have multi-line headers that get split into separate CSV rows. After conversion, verify that your column headers are on a single row.
Watch for merged cells. PDF tables often merge cells for group headings. These usually become empty cells in CSV. Fill them manually or with a script after extraction.
Handle special characters. Commas, quotes, and line breaks inside cell values can break CSV parsing. Good converters handle escaping automatically, as does Python's csv module, which is what writes the file in the pdfplumber scripts. If your tool doesn't, wrap such values in double quotes and double any quote characters inside them.
Encoding matters. Use UTF-8 encoding when saving CSV to preserve accented characters, currency symbols, and non-Latin text. In Python: open("out.csv", "w", newline="", encoding="utf-8-sig") (the -sig adds a BOM that helps Excel detect UTF-8).
Multi-page tables. When a table spans multiple PDF pages, some tools extract each page as a separate table. In Python, skip the header row on subsequent pages to avoid duplicates.
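
Two of these tips lend themselves to a short script. The sketch below assumes each page's table is a list of rows, as pdfplumber returns them. `merge_pages` and `fill_merged_cells` are hypothetical helper names. Treating the first row of the first page as the header is also an assumption.

```python
def merge_pages(page_tables):
    """Concatenate per-page tables, dropping a first row that repeats the header."""
    merged = []
    header = None
    for table in page_tables:
        if not table:
            continue
        if header is None:
            header = table[0]          # first row of the first table is taken as the header
            merged.extend(table)
        elif table[0] == header:
            merged.extend(table[1:])   # skip the repeated header on later pages
        else:
            merged.extend(table)
    return merged


def fill_merged_cells(rows, column=0):
    """Copy the last non-empty value down a column left blank by merged cells."""
    filled = []
    last = ""
    for row in rows:
        row = list(row)
        if column < len(row):
            if row[column]:
                last = row[column]
            else:
                row[column] = last
        filled.append(row)
    return filled
```

Running `fill_merged_cells(merge_pages(tables))` gives one continuous table with the group label repeated on every row, which is usually easier to filter and sort.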

Common Issues and Fixes

Problem	Cause	Solution
Empty CSV output	Scanned PDF (image-based)	Enable OCR in Convertio or use pytesseract
All data in one column	Excel opened CSV with wrong delimiter	Use Data → Text to Columns → Delimited → Comma
Misaligned columns	Borderless table with uneven spacing	Use pdfplumber with `vertical_strategy: "text"`
Garbled characters	Wrong encoding (usually Latin-1 vs UTF-8)	Open in text editor, save as UTF-8
Duplicate headers	Multi-page table with repeated headers	In Python, skip row 0 on pages after the first
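
For the garbled-characters case, re-saving can also be scripted. This is a sketch only: `reencode_to_utf8` is a hypothetical helper, and falling back to Windows-1252 (`cp1252`) when the file is not valid UTF-8 is an assumption that fits many Western European files but not all.

```python
def reencode_to_utf8(src_path, dst_path, fallback="cp1252"):
    """Rewrite a CSV as UTF-8 with BOM; return the encoding the source was read with."""
    with open(src_path, "rb") as f:
        raw = f.read()
    try:
        text = raw.decode("utf-8-sig")   # already UTF-8 (with or without BOM)
        used = "utf-8"
    except UnicodeDecodeError:
        text = raw.decode(fallback, errors="replace")
        used = fallback
    # newline="" writes the original line endings back unchanged
    with open(dst_path, "w", newline="", encoding="utf-8-sig") as f:
        f.write(text)
    return used
```

If the output still looks wrong, the source was probably in a different legacy encoding, and the `fallback` argument needs to match it.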
