How Polling Works

Folder monitoring uses a polling architecture rather than webhooks. This approach:

  • ✅ Works reliably across all deployment environments
  • ✅ Requires no domain verification or callback URLs
  • ✅ Handles OAuth token refresh automatically
  • ✅ Provides consistent behavior across providers

Polling Schedule

Aspect                     Value
Frequency                  Every 5 minutes
Max monitors per run       50 monitors
Max docs synced per run    5 pending documents
Timeout                    60 seconds per cron run
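
The limits in the table translate into simple capacity figures. The sketch below works through that arithmetic. It assumes every cron run finishes within its timeout and uses its full quota, and that monitors are checked in strict rotation. The function and constant names are illustrative, not part of the product.

```python
import math

# Values taken from the Polling Schedule table.
POLL_INTERVAL_MINUTES = 5
MAX_MONITORS_PER_RUN = 50
MAX_DOCS_SYNCED_PER_RUN = 5


def check_interval_minutes(active_monitors: int) -> int:
    """Minutes between two checks of the same monitor under strict rotation."""
    if active_monitors <= 0:
        return 0
    runs_needed = math.ceil(active_monitors / MAX_MONITORS_PER_RUN)
    return runs_needed * POLL_INTERVAL_MINUTES


def max_docs_synced_per_hour() -> int:
    """Upper bound on documents synced per hour."""
    runs_per_hour = 60 // POLL_INTERVAL_MINUTES
    return runs_per_hour * MAX_DOCS_SYNCED_PER_RUN


def minutes_to_drain(pending_docs: int) -> int:
    """Minutes needed to sync a backlog, assuming no new files arrive."""
    runs_needed = math.ceil(pending_docs / MAX_DOCS_SYNCED_PER_RUN)
    return runs_needed * POLL_INTERVAL_MINUTES
```

Under these assumptions, 120 active monitors are each checked about every 15 minutes, at most 60 documents are synced per hour, and a folder that receives 23 new files takes about 25 minutes to sync fully.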

Polling Process

  1. Every 5 minutes, the cron job runs
  2. Fetches active monitors ordered by least-recently-checked
  3. For each monitor:
    • Lists files in the monitored folder
    • Compares against already-synced documents
    • Creates pending document records for new files
  4. After polling, syncs up to 5 pending documents to the vector store
  5. Updates monitor stats (last checked, file count, etc.)
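
The steps above can be sketched as a single function. This is a simplified in-memory model, not the actual implementation: `Monitor`, `Store`, `run_poll`, `list_files`, and `sync_document` are hypothetical names, the provider listing and vector-store sync are passed in as callables, and the 60-second timeout is not modeled.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, List, Optional, Set

MAX_MONITORS_PER_RUN = 50
MAX_DOCS_SYNCED_PER_RUN = 5


@dataclass
class Monitor:
    id: str
    folder: str
    active: bool = True
    last_checked: Optional[datetime] = None
    file_count: int = 0


@dataclass
class Store:
    monitors: List[Monitor]
    synced_ids: Set[str] = field(default_factory=set)  # external IDs already synced
    pending: List[str] = field(default_factory=list)   # external IDs waiting to sync


def run_poll(
    store: Store,
    list_files: Callable[[str], List[str]],   # folder -> external file IDs
    sync_document: Callable[[str], None],     # pushes one document to the vector store
    now: Optional[datetime] = None,
) -> int:
    """Run one polling cycle and return the number of documents synced."""
    now = now or datetime.now(timezone.utc)

    # Step 2: active monitors, least-recently-checked first (never-checked first).
    active = [m for m in store.monitors if m.active]
    active.sort(key=lambda m: (m.last_checked is not None, m.last_checked or now))
    due = active[:MAX_MONITORS_PER_RUN]

    # Step 3: list each folder and queue files that are not yet known.
    checked = []
    for monitor in due:
        files = list_files(monitor.folder)
        for external_id in files:
            if external_id not in store.synced_ids and external_id not in store.pending:
                store.pending.append(external_id)
        checked.append((monitor, len(files)))

    # Step 4: sync a bounded batch of pending documents.
    batch = store.pending[:MAX_DOCS_SYNCED_PER_RUN]
    store.pending = store.pending[MAX_DOCS_SYNCED_PER_RUN:]
    for external_id in batch:
        sync_document(external_id)
        store.synced_ids.add(external_id)

    # Step 5: update monitor stats.
    for monitor, count in checked:
        monitor.last_checked = now
        monitor.file_count = count

    return len(batch)
```

In this model, a folder with seven new files syncs five of them on the first run and the remaining two on the next run, five minutes later.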

What "New File" Means

A file is considered "new" if:

  • It exists in the monitored folder
  • It has a supported file type:
    • Google Drive: PDFs, Google Docs, Google Sheets, Google Slides
    • S3/Supabase Storage: PDF, TXT, MD, JSON, CSV, DOCX
  • Its external ID is not already in synced_documents for this source + project
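
The three conditions can be expressed as a small predicate. This is a sketch under assumptions: the provider keys, the `RemoteFile` type, and the choice to match Google Drive files by MIME type and storage files by extension are illustrative, and `synced_external_ids` stands in for the external IDs recorded in synced_documents for the same source and project. The first condition is implicit, since only files returned by the folder listing reach the check.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Set

# Standard MIME types for PDFs and native Google Workspace files.
GOOGLE_DRIVE_MIME_TYPES = {
    "application/pdf",
    "application/vnd.google-apps.document",      # Google Docs
    "application/vnd.google-apps.spreadsheet",   # Google Sheets
    "application/vnd.google-apps.presentation",  # Google Slides
}

# Extensions listed for S3 / Supabase Storage.
STORAGE_EXTENSIONS = {".pdf", ".txt", ".md", ".json", ".csv", ".docx"}


@dataclass(frozen=True)
class RemoteFile:
    external_id: str
    name: str
    mime_type: str = ""


def is_supported(provider: str, file: RemoteFile) -> bool:
    """Check the file type against the provider's supported list."""
    if provider == "google_drive":
        return file.mime_type in GOOGLE_DRIVE_MIME_TYPES
    if provider in ("s3", "supabase_storage"):
        return PurePosixPath(file.name).suffix.lower() in STORAGE_EXTENSIONS
    return False


def is_new_file(provider: str, file: RemoteFile, synced_external_ids: Set[str]) -> bool:
    """A listed file is new if its type is supported and its ID is not yet synced."""
    return is_supported(provider, file) and file.external_id not in synced_external_ids
```

Because the check keys on the external ID alone, a file whose contents change after its first sync still counts as already synced.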

Warning: Files modified after initial sync are not automatically re-synced. Use manual re-sync for updated documents.