Connectors

Updated 2026-03-18

Sitemap Connector

Ingest a site from its sitemap when you want a cleaner, more deterministic web import than an open-ended crawl.

Use the sitemap connector when the target site already exposes a useful sitemap.xml and you want RetainDB to follow that explicit inventory instead of crawling links dynamically.

This is usually the best web connector for documentation sites with a maintained sitemap.

Use this connector when

  • the site has a valid sitemap
  • you want more control than a crawler gives you
  • you want to avoid crawling navigation dead ends or irrelevant linked pages

Create the source

bash
curl -X POST "https://api.retaindb.com/v1/projects/proj_123/sources" \
  -H "Authorization: Bearer $RETAINDB_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Acme Sitemap",
    "connector_type": "sitemap",
    "config": {
      "sitemap_url": "https://acme.com/sitemap.xml",
      "limit": 2000
    }
  }'

Why teams choose sitemap over crawl

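A sitemap is more predictable than a crawl because the page inventory is declared up front instead of being discovered by following links.

It helps when you want: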

  • a known page inventory
  • fewer accidental pages
  • simpler debugging if expected content is missing

Start sync and check status

bash
# Start a sync for the source
curl -X POST "https://api.retaindb.com/v1/sources/src_123/sync" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

bash
# Check the status of the source
curl "https://api.retaindb.com/v1/sources/src_123/status" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"
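
If you script the sync, a small polling loop can wrap the status call. This is only a minimal sketch: the shape of the status response is not described on this page, so the match pattern in the usage comment is a placeholder to replace with whatever your status response actually reports. The poll_until helper is hypothetical and not part of RetainDB.

```shell
# Run a command repeatedly until its output matches a pattern.
# $1 = extended regex to wait for
# $2 = maximum number of attempts
# $3 = seconds to wait between attempts
# remaining arguments = the command to run
# Returns 0 on a match (and prints the matching output),
# 1 if attempts run out, 2 if the command itself fails.
poll_until() {
  pu_pattern=$1
  pu_attempts=$2
  pu_delay=$3
  shift 3
  pu_i=0
  while [ "$pu_i" -lt "$pu_attempts" ]; do
    pu_out=$("$@") || return 2
    if printf '%s' "$pu_out" | grep -Eq "$pu_pattern"; then
      printf '%s\n' "$pu_out"
      return 0
    fi
    pu_i=$((pu_i + 1))
    # Only wait if another attempt is coming.
    if [ "$pu_i" -lt "$pu_attempts" ]; then
      sleep "$pu_delay"
    fi
  done
  return 1
}

# Example usage (the "completed" pattern is an assumption, not a documented value):
# poll_until 'completed' 30 10 \
#   curl -fsS "https://api.retaindb.com/v1/sources/src_123/status" \
#   -H "Authorization: Bearer $RETAINDB_API_KEY"
```

Keeping the pattern and the command as arguments means the loop does not hard-code any response field, so it can be adjusted once you have seen a real status response.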

Common mistakes

Bad sitemap assumptions

Some sites expose a sitemap that does not actually contain the pages you care about. Open the sitemap and confirm those pages are listed before troubleshooting the connector.
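
One way to check the sitemap by hand is to list its URL entries and search for a page you expect to find. The sketch below assumes a standard sitemap in the sitemaps.org XML format, where each URL sits inside a `<loc>` element. The sitemap_urls helper name and the example page path are illustrative, not part of RetainDB.

```shell
# Print every URL listed in a sitemap read from stdin, one per line.
# Assumes standard sitemap XML with URLs inside <loc>...</loc>.
# XML entities such as &amp; are left as-is, not decoded.
sitemap_urls() {
  # Join all lines first so a <loc> element split across lines still matches.
  tr -d '\n\r' \
    | grep -o '<loc>[^<]*</loc>' \
    | sed -e 's/<loc>[[:space:]]*//' -e 's/[[:space:]]*<\/loc>//'
}

# Example usage (hypothetical page path):
# curl -fsS "https://acme.com/sitemap.xml" | sitemap_urls | grep -F "/docs/getting-started"
```

If the grep prints nothing, the page is not in the sitemap, and the sitemap itself is the first thing to check. Note that in a sitemap index file the `<loc>` entries point to child sitemaps rather than pages.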

Oversized imports

If the sitemap is huge, start with a smaller limit in the source config and validate the quality of the imported pages before raising it.
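
To judge whether a sitemap is large before importing it, you can count its entries and then build a trial request body with a small limit. This is a sketch under assumptions: it expects a standard sitemap with `<loc>` elements, the helper names are hypothetical, and the body builder does no JSON escaping, so it only suits names and URLs without quotes or backslashes. The field names are the ones used in the create request on this page.

```shell
# Count the <loc> entries in a sitemap read from stdin.
sitemap_url_count() {
  grep -o '<loc>' | wc -l | tr -d ' '
}

# Build the JSON body for a sitemap source.
# $1 = source name, $2 = sitemap URL, $3 = page limit (integer)
# No JSON escaping is done: avoid quotes and backslashes in $1 and $2.
sitemap_source_body() {
  printf '{"name": "%s", "connector_type": "sitemap", "config": {"sitemap_url": "%s", "limit": %d}}\n' \
    "$1" "$2" "$3"
}

# Example usage (the trial limit of 100 is an arbitrary choice):
# curl -fsS "https://acme.com/sitemap.xml" | sitemap_url_count
# curl -X POST "https://api.retaindb.com/v1/projects/proj_123/sources" \
#   -H "Authorization: Bearer $RETAINDB_API_KEY" \
#   -H "Content-Type: application/json" \
#   -d "$(sitemap_source_body "Acme Sitemap" "https://acme.com/sitemap.xml" 100)"
```

If the count is far above the limit you plan to use, a small trial import gives you a quick quality check before committing to the full set.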

Using sitemap for one page

If you only need one document, the URL connector is simpler.

Next step

If the site does not have a good sitemap, use the web crawler connector instead. If you want to validate the project and source lifecycle first, review projects and sources.
