How Polling Works
Folder monitoring uses a polling architecture rather than webhooks. This approach:
- ✅ Works reliably across all deployment environments
- ✅ Requires no domain verification or callback URLs
- ✅ Handles OAuth token refresh automatically
- ✅ Provides consistent behavior across providers
Polling Schedule
| Aspect | Value |
|---|---|
| Frequency | Every 5 minutes |
| Max monitors per run | 50 monitors |
| Max docs synced per run | 5 pending documents |
| Timeout | 60 seconds per cron run |
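The limits in the table imply some simple capacity arithmetic. The sketch below works it through. The constant and function names are illustrative, not taken from the actual codebase, and it assumes monitors are visited round-robin (least-recently-checked first) and that every run syncs its full per-run limit.

```typescript
// Values from the schedule table; the names are hypothetical.
const POLL_INTERVAL_MINUTES = 5;
const MAX_MONITORS_PER_RUN = 50;
const MAX_DOCS_SYNCED_PER_RUN = 5;

// Approximate minutes between two checks of the same monitor, assuming
// monitors are processed round-robin, at most 50 per run.
function minutesBetweenChecks(activeMonitors: number): number {
  const runsPerCycle = Math.max(1, Math.ceil(activeMonitors / MAX_MONITORS_PER_RUN));
  return runsPerCycle * POLL_INTERVAL_MINUTES;
}

// Approximate minutes needed to drain a backlog of pending documents,
// assuming no new documents arrive in the meantime.
function minutesToDrainBacklog(pendingDocs: number): number {
  const runsNeeded = Math.ceil(pendingDocs / MAX_DOCS_SYNCED_PER_RUN);
  return runsNeeded * POLL_INTERVAL_MINUTES;
}
```

Under these assumptions, 120 active monitors would each be checked roughly every 15 minutes, and a backlog of 23 pending documents would take about 25 minutes to sync.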
Polling Process
- Every 5 minutes, the cron job runs
- Fetches active monitors ordered by least-recently-checked
- For each monitor:
  - Lists files in the monitored folder
  - Compares against already-synced documents
  - Creates pending document records for new files
- After polling, syncs up to 5 pending documents to the vector store
- Updates monitor stats (last checked, file count, etc.)
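The steps above can be sketched as a single function. This is a minimal in-memory illustration, not the actual implementation: the types, `runPollingCycle`, and the injected `listFiles` and `syncToVectorStore` callbacks are all hypothetical names. It is kept synchronous for brevity (real provider calls would be asynchronous), it does not model the 60-second timeout, and skipping files that are already pending is an added assumption to avoid duplicate records.

```typescript
interface Monitor {
  id: string;
  lastCheckedAt: number; // epoch milliseconds; 0 if never checked
  fileCount: number;
}

interface PendingDocument {
  monitorId: string;
  externalId: string;
}

interface PollState {
  monitors: Monitor[];
  syncedIds: Set<string>;     // external IDs already synced
  pending: PendingDocument[]; // discovered but not yet synced
}

const MAX_MONITORS_PER_RUN = 50;
const MAX_DOCS_SYNCED_PER_RUN = 5;

function runPollingCycle(
  state: PollState,
  listFiles: (monitor: Monitor) => string[], // external IDs found in the folder
  syncToVectorStore: (doc: PendingDocument) => void,
  now: number,
): void {
  // Fetch monitors ordered by least-recently-checked, capped per run.
  const batch = state.monitors
    .slice()
    .sort((a, b) => a.lastCheckedAt - b.lastCheckedAt)
    .slice(0, MAX_MONITORS_PER_RUN);

  const checked: Array<{ monitor: Monitor; fileCount: number }> = [];
  for (const monitor of batch) {
    const files = listFiles(monitor);
    for (const externalId of files) {
      // Assumption: a file that is already pending is not queued twice.
      const alreadyKnown =
        state.syncedIds.has(externalId) ||
        state.pending.some((p) => p.externalId === externalId);
      if (!alreadyKnown) {
        state.pending.push({ monitorId: monitor.id, externalId });
      }
    }
    checked.push({ monitor, fileCount: files.length });
  }

  // Sync at most 5 pending documents, oldest first.
  for (const doc of state.pending.splice(0, MAX_DOCS_SYNCED_PER_RUN)) {
    syncToVectorStore(doc);
    state.syncedIds.add(doc.externalId);
  }

  // Update stats for every monitor visited in this run.
  for (const { monitor, fileCount } of checked) {
    monitor.lastCheckedAt = now;
    monitor.fileCount = fileCount;
  }
}
```

Because only 5 documents are synced per run, a folder with many new files is drained over several consecutive runs rather than all at once.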
What "New File" Means
A file is considered "new" if:
- It exists in the monitored folder
- It has a supported file type:
  - Google Drive: PDFs, Google Docs, Google Sheets, Google Slides
  - S3/Supabase Storage: PDF, TXT, MD, JSON, CSV, DOCX
- Its external ID is not already in `synced_documents` for this source + project
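The three conditions can be expressed as a single predicate. The sketch below is illustrative only: the function names, the provider identifiers, and the in-memory `Set` standing in for the `synced_documents` lookup are hypothetical. It also assumes Google Drive files are matched by MIME type and S3/Supabase Storage files by file extension, which the source does not specify. The first condition (the file exists in the monitored folder) is implied by the file appearing in the folder listing.

```typescript
type Provider = "google_drive" | "s3" | "supabase_storage";

interface RemoteFile {
  externalId: string;
  name: string;
  mimeType?: string; // expected for Google Drive files
}

// Standard MIME types for PDFs and Google Workspace documents.
const DRIVE_MIME_TYPES = new Set<string>([
  "application/pdf",
  "application/vnd.google-apps.document",     // Google Docs
  "application/vnd.google-apps.spreadsheet",  // Google Sheets
  "application/vnd.google-apps.presentation", // Google Slides
]);

const STORAGE_EXTENSIONS = new Set<string>(["pdf", "txt", "md", "json", "csv", "docx"]);

// Lowercased extension without the dot, or "" if the name has none.
function extensionOf(name: string): string {
  const dot = name.lastIndexOf(".");
  return dot < 0 ? "" : name.slice(dot + 1).toLowerCase();
}

function isSupportedType(provider: Provider, file: RemoteFile): boolean {
  if (provider === "google_drive") {
    return file.mimeType !== undefined && DRIVE_MIME_TYPES.has(file.mimeType);
  }
  return STORAGE_EXTENSIONS.has(extensionOf(file.name));
}

// Scopes an external ID to its source and project, mirroring the
// "for this source + project" condition.
function syncedKey(sourceId: string, projectId: string, externalId: string): string {
  return [sourceId, projectId, externalId].join("::");
}

function isNewFile(
  provider: Provider,
  sourceId: string,
  projectId: string,
  file: RemoteFile,
  syncedKeys: Set<string>, // stand-in for a synced_documents lookup
): boolean {
  return (
    isSupportedType(provider, file) &&
    !syncedKeys.has(syncedKey(sourceId, projectId, file.externalId))
  );
}
```

Because the lookup is scoped to source and project, the same external ID can count as new in one project even after it has been synced in another.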
Warning: Files modified after initial sync are not automatically re-synced. Use manual re-sync for updated documents.