# MLS by HansonXyz Plugin

WordPress plugin for syncing MLS Grid API data (NorthStar MLS) into a local database.

## Development Rules

1. **No emojis** - nowhere in code, commits, docs, or conversation
2. **PHP 7.4+** compatible code
3. **WordPress Coding Standards**
4. Follow patterns from existing HomeProz theme

## Quick Reference

### Database Tables

All tables use `{$wpdb->prefix}mls_` prefix:

| Table | Purpose |
|-------|---------|
| `mls_properties` | Listing data |
| `mls_media` | Media files with download queue |
| `mls_media_log` | Media download attempt history |
| `mls_sync_state` | Sync progress tracking |
| `mls_rate_limits` | API usage tracking |
| `mls_sync_log` | Debug logging |

### API Configuration

Credentials in `wp-config.php`:
```php
define('MLSGRID_API_URL', 'https://api.mlsgrid.com/v2');
define('MLSGRID_ACCESS_TOKEN', 'your-token-here');
```

### MLS Grid API Rate Limits

MUST comply with these limits:
- 2 requests/second (500ms minimum between requests)
- 7,200 requests/hour
- 40,000 requests/day
- 4GB data/hour

Media downloads use a 700ms delay between requests, 200ms above the 500ms minimum (about 1.4 requests/second).

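The limits above interact: a delay that satisfies the per-second limit does not automatically satisfy the daily one. The following is a rough arithmetic sketch, assuming each media download counts as one request against these limits (not confirmed here); the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: request counts implied by a fixed delay between requests.
# Integer division rounds down.

requests_per_hour() { echo $(( 3600 * 1000 / $1 )); }   # $1 = delay in ms
requests_per_day()  { echo $(( 86400 * 1000 / $1 )); }  # $1 = delay in ms

# Seconds of nonstop downloading before a daily cap is reached.
seconds_to_daily_limit() { echo $(( $2 * $1 / 1000 )); } # $1 = delay in ms, $2 = daily cap

delay_ms=700
echo "per hour at ${delay_ms}ms: $(requests_per_hour "$delay_ms") (limit 7200)"
echo "per day if run nonstop: $(requests_per_day "$delay_ms") (limit 40000)"
echo "seconds until daily cap: $(seconds_to_daily_limit "$delay_ms" 40000)"
```

At 700ms the hourly figure stays under 7,200, but nonstop downloading would pass 40,000 requests in under eight hours, so batch size and schedule matter as much as the delay.
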
### Key Files

| File | Purpose |
|------|---------|
| `includes/class-mls-api-client.php` | API communication, auth, gzip |
| `includes/class-mls-sync-engine.php` | Sync orchestration |
| `includes/class-mls-media-handler.php` | Media queue and download |
| `includes/class-mls-query.php` | Public query API |
| `includes/class-mls-rate-limiter.php` | Rate limit compliance |
| `cli/class-mls-cli.php` | WP-CLI commands |

### WP-CLI Commands

```bash
# Test connectivity
wp mls test connection
wp mls test auth

# Show status
wp mls status
wp mls status rate-limits

# Run property sync (queues media, does not download)
wp mls sync full [--dry-run] [--limit=N] [--verbose]
wp mls sync incremental [--dry-run] [--verbose]
wp mls sync resume --id=<sync_id>

# Media download queue (separate from property sync)
wp mls media status                  # Show queue stats
wp mls media process                 # Download queued media (rate limited)
wp mls media process --limit=50 --verbose
wp mls media reset                   # Reset failed downloads for retry
wp mls media logs                    # View download history
wp mls media logs --clear --days=7

# Statistics
wp mls stats

# Cache management
wp mls cache clear --confirm
wp mls cache cleanup
wp mls cache missing                 # View failed media downloads
wp mls cache missing --limit=20      # View first 20 entries
wp mls cache missing --clear         # Clear the log

# Recovery commands
wp mls recovery list                 # Show resumable syncs
wp mls recovery auto                 # Auto-resume most recent failed sync
wp mls recovery cleanup              # Mark stale (>1hr) syncs as failed
```

### Media Queue System

Media downloads are queue-based and separate from property sync:

1. **Property sync** (`wp mls sync full/incremental`) queues media records
2. **Media process** (`wp mls media process`) downloads queued media with rate limiting
3. Downloads wait 700ms between requests, staying under the 2/sec limit
4. Rate-limited (HTTP 429) downloads get a 3-hour backoff before retry
5. After 5 attempts, items are marked failed and logged

**Queue states:**
- `pending` - Ready for download
- `completed` - Successfully downloaded
- `failed` - Max attempts reached

**Media table columns:**
- `download_status` - pending/completed/failed
- `retry_after` - Next retry time (3hr backoff on rate limit)
- `queued_at` - When item was queued
- `download_attempts` - Attempt count (max 5)

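The queue rules above can be read as a small state transition. The sketch below is one possible interpretation, not the plugin's actual logic: it assumes the 3-hour backoff applies to HTTP 429 responses, that other errors become eligible again immediately, and that the attempt count includes the current attempt. `queue_outcome` and its output format are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the download_status / retry_after decision.

MAX_ATTEMPTS=5
BACKOFF_SECONDS=$(( 3 * 60 * 60 ))   # 3-hour backoff

# Usage: queue_outcome <attempts_before> <http_status> <now_epoch>
# Prints: "<download_status> <retry_after_epoch>" (0 = no retry scheduled)
queue_outcome() {
  local attempts_before=$1 http_status=$2 now=$3
  local attempts=$(( attempts_before + 1 ))

  if [ "$http_status" -eq 200 ]; then
    echo "completed 0"
  elif [ "$attempts" -ge "$MAX_ATTEMPTS" ]; then
    echo "failed 0"                               # max attempts reached
  elif [ "$http_status" -eq 429 ]; then
    echo "pending $(( now + BACKOFF_SECONDS ))"   # rate limited: back off
  else
    echo "pending $now"                           # assumption: retry on next run
  fi
}

queue_outcome 0 429 1000
```
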
### Progress Output

Property sync (compact mode):
- `.` = new property created
- `#` = property updated
- `x` = property deleted
- `-` = skipped (dry-run)
- `q` = media queued
- `p` = media skipped (already downloaded)
- `|` = page complete

Media process (compact mode):
- `P` = downloaded
- `B` = backoff (retry later)
- `E` = error

With `--verbose`: full timestamped output.

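Compact output can be tallied after the fact by counting symbols. A minimal sketch, assuming the symbols are written on a single line with no other characters mixed in; the sample string and `count_symbol` are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count occurrences of one progress symbol in captured output.
count_symbol() {
  printf '%s\n' "$1" | awk -v sym="$2" '
    { for (i = 1; i <= length($0); i++) if (substr($0, i, 1) == sym) n++ }
    END { print n + 0 }'
}

sample='..#.q|.x-|'   # made-up compact output from a property sync
echo "created: $(count_symbol "$sample" ".")"
echo "updated: $(count_symbol "$sample" "#")"
echo "pages:   $(count_symbol "$sample" "|")"
```
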
### Missing Media Log

Permanently failed media downloads are logged to: `wp-content/uploads/mls-missing-media.log`

Format: `[timestamp] listing_key | media_key | error | url`

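A log line in this format can be split into its fields with `awk`. This is a sketch under assumptions: the timestamp layout and the sample values are invented, and an error message that itself contains ` | ` would break the split. `parse_missing_media_line` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical parser for one line of mls-missing-media.log.
parse_missing_media_line() {
  printf '%s\n' "$1" | awk -F ' [|] ' '{
    ts = $1;  sub(/^\[/, "", ts);  sub(/\].*$/, "", ts)   # text inside [ ]
    key = $1; sub(/^\[[^]]*\] /, "", key)                 # text after "] "
    printf "timestamp=%s\nlisting_key=%s\nmedia_key=%s\nerror=%s\nurl=%s\n", ts, key, $2, $3, $4
  }'
}

# Made-up sample line; the real timestamp format may differ.
sample='[2025-01-15 03:12:45] NST123456 | MEDIA789 | HTTP 404 | https://example.com/photo.jpg'
parse_missing_media_line "$sample"
```
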
### Sync Recovery

The sync engine saves progress after each page:

1. **Automatic state tracking**: `last_next_link` saved after each API page
2. **Stale sync detection**: Syncs running >1 hour marked as failed
3. **Resume commands**:
   - `wp mls sync resume --id=<ID>` - Resume specific sync
   - `wp mls recovery auto` - Auto-resume most recent failed sync
   - `wp mls recovery list` - View all resumable syncs

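The stale-sync rule reduces to a timestamp comparison. A minimal sketch, assuming times are Unix epoch seconds and that age is measured from when the sync started (it may instead be measured from the last saved page); `is_stale_sync` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical check: succeeds (exit 0) when a sync has run longer than the threshold.
# Usage: is_stale_sync <started_at_epoch> <now_epoch> [threshold_seconds]
is_stale_sync() {
  local started_at=$1 now=$2 threshold=${3:-3600}   # default: 1 hour
  [ $(( now - started_at )) -gt "$threshold" ]
}

if is_stale_sync 1700000000 1700004000; then
  echo "stale: would be marked as failed"
fi
```
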
### Recommended Cron Setup

```bash
# Property sync every 30 minutes
*/30 * * * * cd /var/www/html && wp mls recovery auto --quiet --allow-root && wp mls sync incremental --allow-root >> /var/log/mls-sync.log 2>&1

# Media downloads every 5 minutes (processes up to 50 items per run)
*/5 * * * * cd /var/www/html && wp mls media process --limit=50 --quiet --allow-root >> /var/log/mls-media.log 2>&1

# Full sync weekly (Sunday 3am)
0 3 * * 0 cd /var/www/html && wp mls sync full --allow-root >> /var/log/mls-sync.log 2>&1
```

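The media schedule above implies a daily download volume that can be checked against the 40,000 requests/day limit. A rough sketch, assuming every run processes its full `--limit` and each item is one request; property sync requests come on top of this. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for sizing the media cron job.

runs_per_day()    { echo $(( 24 * 60 / $1 )); }                 # $1 = interval in minutes
media_per_day()   { echo $(( $(runs_per_day "$1") * $2 )); }    # $2 = --limit per run
min_run_seconds() { echo $(( $1 * $2 / 1000 )); }               # $1 = items, $2 = delay in ms

echo "runs per day:           $(runs_per_day 5)"
echo "media requests per day: $(media_per_day 5 50) (limit 40000)"
echo "minimum seconds/run:    $(min_run_seconds 50 700)"
```

With a 5-minute interval and `--limit=50`, the delay alone accounts for about 35 seconds per run, which leaves room before the next run starts.
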
### Public API Functions

Available for themes/plugins:

```php
// Get properties with filters
$properties = mls_get_properties([
    'status'    => 'Active',
    'city'      => 'Albert Lea',
    'min_price' => 100000,
    'limit'     => 20,
]);

// Get single property
$property = mls_get_property('NST123456');

// Get media
$media = mls_get_property_media('NST123456');
$image_url = mls_get_property_image('NST123456');

// Get distinct values
$cities = mls_get_cities('Active');

// Check data availability
if (mls_is_available()) { /* ... */ }
```

### Sync Strategy

1. **Property Sync**: Full/incremental sync downloads property data and queues media
2. **Media Queue**: Separate process downloads media with rate limiting
3. **Delete Handling**: `MlgCanView=false` triggers local deletion
4. **Media Storage**: Downloads to `wp-content/uploads/mls-listings/`
5. **Recovery**: Stores `last_next_link` for resume on failure

### Testing After Changes

```bash
wp mls test connection
wp mls test auth
wp mls sync full --dry-run --limit=10
wp mls media status
wp mls stats
```

### Property Data Mapping

Key fields from API to database:

| API Field | DB Column |
|-----------|-----------|
| ListingKey | listing_key |
| ListingId | listing_id |
| ListPrice | list_price |
| StandardStatus | standard_status |
| BedroomsTotal | bedrooms_total |
| BathroomsTotalInteger | bathrooms_total |
| LivingArea | living_area |
| City | city |
| ModificationTimestamp | modification_timestamp |
| PhotosChangeTimestamp | photos_change_timestamp |
| MlgCanView | mlg_can_view |

The full API response is stored in the `raw_data` column as JSON.

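Most rows in the mapping follow a PascalCase to snake_case pattern, with `BathroomsTotalInteger` as the one exception listed. The sketch below only illustrates that naming pattern; the plugin may well use an explicit map instead, and `api_field_to_column` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the DB column name from an API field name.
api_field_to_column() {
  case "$1" in
    BathroomsTotalInteger)
      echo "bathrooms_total" ;;   # exception from the mapping table
    *)
      # Insert "_" before each capital that follows a lowercase letter or digit,
      # then lowercase the result.
      printf '%s\n' "$1" \
        | sed -E 's/([a-z0-9])([A-Z])/\1_\2/g' \
        | tr '[:upper:]' '[:lower:]' ;;
  esac
}

for field in ListingKey MlgCanView BathroomsTotalInteger; do
  echo "$field -> $(api_field_to_column "$field")"
done
```
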