PDF Connector
Index PDF documents into RetainDB when your content lives in files rather than a web-native source.
Use the PDF connector when the source material is a document, handbook, report, or exported file and you want it searchable in a RetainDB project.
This is a good connector for stable documents. It is not the best fit for content that changes constantly or is better represented by a live source.
Create the source
curl -X POST "https://api.retaindb.com/v1/projects/proj_123/sources" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Employee Handbook",
    "connector_type": "pdf",
    "config": {
      "file_name": "handbook.pdf",
      "file_url": "https://example.com/handbook.pdf"
    }
  }'
Start sync and check status
curl -X POST "https://api.retaindb.com/v1/sources/src_123/sync" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
curl "https://api.retaindb.com/v1/sources/src_123/status" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
What makes a PDF import go well
PDF ingestion works best when:
- the document has extractable text
- the PDF is reasonably clean and not image-only
- the document is logically one source you want to query together
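One way to check the first two points before creating a source is to extract the text locally and see how much comes out. The sketch below assumes the `pdftotext` tool from Poppler is installed. The function names and the 200-byte threshold are illustrative choices, not part of RetainDB, and local extraction is only a rough proxy for what the connector will see.

```shell
# Count non-whitespace bytes read from stdin.
count_text_chars() {
  tr -d '[:space:]' | wc -c | tr -d ' '
}

# Succeeds if the PDF yields at least MIN_CHARS bytes of extractable text.
# ASSUMPTION: pdftotext (Poppler) is on PATH; "-" sends the text to stdout.
# The default of 200 is an arbitrary cutoff chosen for illustration.
has_text_layer() {
  local pdf="$1" min_chars="${2:-200}" n
  n=$(pdftotext "$pdf" - 2>/dev/null | count_text_chars)
  [ "${n:-0}" -ge "$min_chars" ]
}
```

For example, `has_text_layer handbook.pdf && echo "text layer looks usable"` prints the message only when enough text was extracted. An image-only scan typically yields little or no text and fails the check.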
Common mistakes
Using a scanned or low-quality PDF
If the text layer is weak or missing, little usable text can be extracted, and the indexed content may be worse than you expect.
Uploading giant mixed-purpose PDFs first
A single massive document is harder to validate than one focused handbook or report. Start with a document where you already know what a good result looks like.
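A rough size check can help pick a focused first document. The sketch below assumes the `pdfinfo` tool from Poppler is installed. The function names and the 50-page cutoff are hypothetical and only illustrate the idea. RetainDB does not define such a limit on this page.

```shell
# Print the page count from `pdfinfo` output read on stdin.
page_count_from_pdfinfo() {
  awk '/^Pages:/ { print $2 }'
}

# Succeeds if the PDF has at most MAX_PAGES pages.
# ASSUMPTION: pdfinfo (Poppler) is on PATH. The default of 50 pages is an
# arbitrary illustrative cutoff, not a documented RetainDB limit.
is_small_enough() {
  local pdf="$1" max_pages="${2:-50}" pages
  pages=$(pdfinfo "$pdf" 2>/dev/null | page_count_from_pdfinfo)
  [ -n "$pages" ] && [ "$pages" -le "$max_pages" ]
}
```

Running `is_small_enough handbook.pdf 80` succeeds for a PDF of 80 pages or fewer and fails otherwise, including when the file cannot be read.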
Treating PDF as a website replacement
If the same content already exists as structured HTML docs, a web connector may produce a cleaner retrieval experience.
A good first test
Pick a PDF with one memorable sentence, sync it, then query for that sentence or concept from the same project.
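This test can be scripted roughly as follows: wait for the sync to finish, then check whether the text returned by your query contains the sentence. This is a sketch under several assumptions. The state names `completed` and `failed`, the top-level `status` field read by `jq`, and all function names are placeholders, since this page does not describe the status response format. The query call itself is left to you, so the check only reads text from standard input.

```shell
# Print the current sync state as one word.
# ASSUMPTION: the status endpoint returns JSON with a top-level "status"
# field, and jq is installed. Adjust the filter to the real response shape.
fetch_state() {
  curl -s "https://api.retaindb.com/v1/sources/src_123/status" \
    -H "Authorization: Bearer $RETAINDB_API_KEY" | jq -r '.status'
}

# Poll a state command until it reports a terminal state.
# $1: command that prints the state, $2: max attempts, $3: seconds between
# attempts. Returns 0 on success, 1 on failure, 2 on timeout.
# ASSUMPTION: "completed" and "failed" are placeholder state names.
wait_for_sync() {
  local fetch="$1" max_attempts="${2:-30}" delay="${3:-5}" state i=1
  while [ "$i" -le "$max_attempts" ]; do
    state=$("$fetch")
    case "$state" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    sleep "$delay"
    i=$((i + 1))
  done
  return 2
}

# Succeeds if the text on stdin contains the sentence in $1,
# ignoring case and differences in whitespace.
contains_sentence() {
  local needle haystack
  needle=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -s '[:space:]' ' ')
  haystack=$(tr '[:upper:]' '[:lower:]' | tr -s '[:space:]' ' ')
  case "$haystack" in
    *"$needle"*) return 0 ;;
    *)           return 1 ;;
  esac
}
```

A run might look like `wait_for_sync fetch_state && your_query_command | contains_sentence "the memorable sentence"`, where `your_query_command` stands for whatever you use to query the project. Passing the state command as an argument keeps the polling logic independent of the exact response format.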
Next step
If your content is already plain text, use the text connector. If you want the dashboard path for uploading and syncing, see the Sources: add, sync, and troubleshoot page.