homeproz/wp-content/plugins/mls-by-hansonxyz/docs/CLAUDE.md
Hanson.xyz Dev b9cddd2f64 Refactor MLS sync to Active/Pending only with on-demand media
Major changes to sync strategy following MLS Grid best practices:

- Initial sync now fetches only Active/Pending properties (~30K vs 1.3M)
- Replication (incremental) fetches all changes, deletes non-Active/Pending
- On-demand media fetching replaces background queue (avoids rate limits)
- Media downloaded and cached when first viewed, not during sync
- Updated CLI commands: wp mls media status/fetch/clear
- Comprehensive documentation with troubleshooting guide

This fixes the "Value out of range" API error caused by high $skip values.

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-15 08:25:37 -06:00


MLS by HansonXyz Plugin

WordPress plugin that syncs MLS Grid API data (NorthStar MLS) into a local database.

Development Rules

  1. No emojis anywhere: not in code, commits, docs, or conversation
  2. PHP 7.4+ compatible code
  3. WordPress Coding Standards
  4. Follow patterns from existing HomeProz theme

Quick Reference

Database Tables

All tables use the {$wpdb->prefix}mls_ prefix:

Table            Purpose
mls_properties   Listing data (Active/Pending only)
mls_media        Media metadata and cache status
mls_sync_state   Sync progress tracking
mls_rate_limits  API usage tracking
mls_sync_log     Debug logging

API Configuration

Credentials in wp-config.php:

define('MLSGRID_API_URL', 'https://api.mlsgrid.com/v2');
define('MLSGRID_ACCESS_TOKEN', 'your-token-here');
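
As a rough illustration of how these two constants might be consumed, the sketch below turns a base URL and token into a request URL and headers. The function name mls_example_request is hypothetical, and the real client in includes/class-mls-api-client.php may be structured differently. The Bearer token and gzip headers follow the "auth, gzip" responsibilities listed for that file.

```php
<?php
// Hypothetical helper, not part of the plugin: shows one way the two
// wp-config.php constants could be turned into request arguments.
function mls_example_request(string $resource, string $api_url, string $token): array {
    return [
        // Join base URL and resource name with exactly one slash.
        'url'     => rtrim($api_url, '/') . '/' . ltrim($resource, '/'),
        'headers' => [
            'Authorization'   => 'Bearer ' . $token,
            'Accept-Encoding' => 'gzip',
            'Accept'          => 'application/json',
        ],
    ];
}

// In the plugin these values would come from MLSGRID_API_URL and
// MLSGRID_ACCESS_TOKEN; literals are used here to keep the sketch standalone.
$request = mls_example_request('Property', 'https://api.mlsgrid.com/v2', 'your-token-here');
```

Inside WordPress, the result could be passed on as wp_remote_get($request['url'], ['headers' => $request['headers']]).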

MLS Grid API Rate Limits

MUST comply with these limits:

  • 2 requests/second (500ms minimum between requests)
  • 7,200 requests/hour
  • 40,000 requests/day
  • 4GB data/hour

Important: The API rejects $skip values over ~80,000. Always use @odata.nextLink for pagination, never manual $skip.
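
The request-count limits above can be checked with a small sliding-window tracker. The following is a minimal in-memory sketch under stated assumptions: the class name Mls_Example_Rate_Limiter and its methods are hypothetical, it is not the code in includes/class-mls-rate-limiter.php, and it does not track the 4GB/hour data limit.

```php
<?php
// Hypothetical sketch, not the plugin's rate limiter.
// Keeps request timestamps in memory and reports how long to wait.
class Mls_Example_Rate_Limiter {
    // Window length in seconds => maximum requests allowed in that window.
    private const LIMITS = [
        1     => 2,
        3600  => 7200,
        86400 => 40000,
    ];
    private const MIN_GAP = 0.5; // 500ms between consecutive requests.

    /** @var float[] Timestamps of past requests, oldest first. */
    private $history = [];

    // Call once for every request actually sent.
    public function record(float $now): void {
        $this->history[] = $now;
        // Drop entries that have left the longest (daily) window.
        while ($this->history && $this->history[0] <= $now - 86400) {
            array_shift($this->history);
        }
    }

    // Seconds to sleep before the next request is allowed (0.0 if none).
    public function seconds_to_wait(float $now): float {
        $wait = 0.0;
        $count = count($this->history);
        if ($count > 0) {
            $wait = max($wait, $this->history[$count - 1] + self::MIN_GAP - $now);
        }
        foreach (self::LIMITS as $window => $max) {
            if ($count >= $max) {
                // The request $max positions back must leave the window first.
                $oldest = $this->history[$count - $max];
                $wait = max($wait, $oldest + $window - $now);
            }
        }
        return max(0.0, $wait);
    }
}
```

A caller would sleep for seconds_to_wait(microtime(true)) before each request and call record() right after sending it. A persistent version would presumably store counts in mls_rate_limits rather than in memory.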

Key Files

File                                  Purpose
includes/class-mls-api-client.php     API communication, auth, gzip
includes/class-mls-sync-engine.php    Sync orchestration
includes/class-mls-media-handler.php  On-demand media fetch and cache
includes/class-mls-query.php          Public query API
includes/class-mls-rate-limiter.php   Rate limit compliance
cli/class-mls-cli.php                 WP-CLI commands

WP-CLI Commands

# Test connectivity
wp mls test connection
wp mls test auth

# Show status
wp mls status
wp mls status rate-limits

# Run property sync
wp mls sync full [--dry-run] [--limit=N] [--verbose]    # Initial: Active/Pending only
wp mls sync incremental [--dry-run] [--verbose]          # Replication: all changes
wp mls sync resume --id=<sync_id>

# Media cache (images fetched on-demand when viewed)
wp mls media status                        # Show cache statistics
wp mls media fetch --listing=<key>         # Pre-cache images for a listing
wp mls media fetch --listing=<key> --limit=10  # Fetch up to 10 images
wp mls media clear --listing=<key>         # Clear cached images for re-fetch

# Statistics
wp mls stats

# Cache management
wp mls cache clear --confirm
wp mls cache cleanup

# Recovery commands
wp mls recovery list              # Show resumable syncs
wp mls recovery auto              # Auto-resume most recent failed sync
wp mls recovery cleanup           # Mark stale (>1hr) syncs as failed

Sync Strategy (IMPORTANT)

The sync follows MLS Grid best practices for replication:

Initial Import (wp mls sync full)

  • Fetches ONLY Active and Pending properties
  • Filter: MlgCanView eq true and (StandardStatus eq 'Active' or StandardStatus eq 'Pending')
  • Uses @odata.nextLink for pagination (NOT $skip)
  • Stores media metadata but does NOT download images
  • ~30,000 records for NorthStar MLS (vs 1.3M total including Closed)
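
The pagination rule above can be sketched as a loop that only ever requests the URL the server hands back. This is an illustration, not the sync engine's code: mls_example_fetch_all is a hypothetical name, and $fetch_page is an assumed callable that takes a URL and returns the decoded JSON body as an array.

```php
<?php
// Hypothetical sketch: walk every page by following @odata.nextLink.
// $fetch_page stands in for the real API client.
function mls_example_fetch_all(string $first_url, callable $fetch_page, int $max_pages = 10000): array {
    $records = [];
    $url = $first_url;
    // $max_pages is a safety stop in case the server keeps returning links.
    for ($page = 0; $url !== null && $page < $max_pages; $page++) {
        $body = $fetch_page($url);
        foreach ($body['value'] ?? [] as $record) {
            $records[] = $record;
        }
        // The server supplies the next URL; never build one with $skip.
        $url = $body['@odata.nextLink'] ?? null;
    }
    return $records;
}

// Initial-import filter from this document, URL-encoded per RFC 3986.
$filter = "MlgCanView eq true and (StandardStatus eq 'Active' or StandardStatus eq 'Pending')";
$first_url = 'https://api.mlsgrid.com/v2/Property?'
    . http_build_query(['$filter' => $filter], '', '&', PHP_QUERY_RFC3986);
```

Collecting every record in one array keeps the sketch short. A real import of ~30,000 records would more likely process and store each page as it arrives.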

Replication (wp mls sync incremental)

  • Fetches ALL properties modified since last sync
  • NO filter on MlgCanView or StandardStatus - we need to see changes
  • For each record received:
    • If MlgCanView = false -> DELETE from local DB
    • If StandardStatus not in (Active, Pending) -> DELETE from local DB
    • Otherwise -> INSERT or UPDATE
  • This handles new listings, price changes, status changes (e.g. Active -> Closed), and removals
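
The per-record rules above reduce to a small decision function. The sketch below is illustrative only: the function name is hypothetical, and treating a record with no MlgCanView field as not viewable is an assumption.

```php
<?php
// Hypothetical sketch of the per-record decision during replication.
// Returns 'delete' or 'upsert' for one decoded API record.
function mls_example_replication_action(array $record): string {
    // Assumption: a missing MlgCanView is handled like MlgCanView = false.
    $can_view = $record['MlgCanView'] ?? false;
    $status = $record['StandardStatus'] ?? '';
    if ($can_view !== true) {
        return 'delete';
    }
    if (!in_array($status, ['Active', 'Pending'], true)) {
        return 'delete';
    }
    return 'upsert';
}
```

Deleting a record that was never stored locally is harmless, so the caller does not need to check for existence first.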

Why This Approach?

  1. MLS Grid API limits $skip to ~80,000 - bulk scanning all 1.3M records fails
  2. We only care about available properties - no need to store Closed/Sold
  3. Replication is efficient - only fetches changed records
  4. Proper deletion handling - when a property sells, we remove it

Data Flow

Initial Import:
  API (Active/Pending + MlgCanView=true) -> Local DB

Replication (every 15 min):
  API (ModificationTimestamp > last_sync) -> Check each record:
    - MlgCanView=false OR Status!=Active/Pending -> DELETE locally
    - Otherwise -> UPSERT locally
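
For the replication request, the ModificationTimestamp comparison becomes an OData $filter using the gt operator. A minimal sketch follows, assuming the last-sync time is available as a DateTimeImmutable and that the API accepts UTC timestamps with millisecond precision. The function name is hypothetical.

```php
<?php
// Hypothetical sketch: build the replication $filter from the stored
// last-sync time. Only the ModificationTimestamp clause comes from this
// document; the exact timestamp format is an assumption.
function mls_example_replication_filter(DateTimeImmutable $last_sync): string {
    // Normalise to UTC so the trailing Z is accurate.
    $utc = $last_sync->setTimezone(new DateTimeZone('UTC'));
    return 'ModificationTimestamp gt ' . $utc->format('Y-m-d\TH:i:s.v\Z');
}

$filter = mls_example_replication_filter(new DateTimeImmutable('2025-12-15T08:25:37-06:00'));
// $filter is "ModificationTimestamp gt 2025-12-15T14:25:37.000Z"
```

Converting to UTC before formatting avoids silently skipping changes when the server's local timezone differs from the API's.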

Media System (On-Demand Fetching)

Per MLS Grid rules, media URLs must NOT be used directly on websites. Images must be downloaded and served from our own server.

How it works:

  1. Property sync stores media metadata (URLs, keys, order) but does NOT download images
  2. On-demand fetch: When mls_get_property_image() is called, the image is fetched and cached locally
  3. Subsequent requests serve from local cache
  4. Pre-caching: Use wp mls media fetch --listing=<key> to pre-cache specific listings

Benefits:

  • No rate limit issues from bulk downloading
  • Images cached only when needed (saves bandwidth/storage)
  • Automatic re-fetch if cache is cleared
  • Works with MLS Grid's image URL expiration

Cache location: wp-content/uploads/mls-listings/{prefix}/{listing_key}/
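
The fetch-on-first-view behaviour described above can be sketched as a cache lookup with a download fallback. This is not the code in includes/class-mls-media-handler.php: the function name, its parameters, and the $download callable (which returns image bytes for a remote URL) are all assumptions made for illustration.

```php
<?php
// Hypothetical sketch of fetch-on-first-view caching.
// Returns the local file path, or null if the image is unavailable.
function mls_example_cached_image(
    string $cache_dir,
    string $file_name,
    string $remote_url,
    callable $download,
    bool $fetch_if_missing = true
): ?string {
    // basename() prevents a crafted name from escaping the cache directory.
    $path = rtrim($cache_dir, '/') . '/' . basename($file_name);
    if (is_file($path)) {
        return $path; // Cache hit: no API request needed.
    }
    if (!$fetch_if_missing) {
        return null;
    }
    if (!is_dir($cache_dir) && !mkdir($cache_dir, 0755, true) && !is_dir($cache_dir)) {
        return null; // Cache directory could not be created.
    }
    $bytes = $download($remote_url);
    if (!is_string($bytes) || $bytes === '') {
        return null; // Download failed or returned nothing.
    }
    // Write to a temp file first so readers never see a partial image.
    $tmp = $path . '.tmp';
    if (file_put_contents($tmp, $bytes) === false || !rename($tmp, $path)) {
        return null;
    }
    return $path;
}
```

The $fetch_if_missing flag mirrors the second argument of mls_get_property_image(), which returns null instead of fetching when set to false.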

Progress Output

Property sync (compact mode):

  • . = new property created
  • # = property updated
  • x = property deleted
  • - = skipped (dry-run)
  • | = page complete

With --verbose: Full timestamped output.
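
To make the compact legend concrete, the sketch below maps sync events to their symbols and renders one page of output. The event names and the '?' fallback for unknown events are assumptions; only the symbols themselves come from the list above.

```php
<?php
// Hypothetical sketch mapping sync events to the compact progress symbols.
function mls_example_progress_symbol(string $event): string {
    $symbols = [
        'created'       => '.',
        'updated'       => '#',
        'deleted'       => 'x',
        'skipped'       => '-',
        'page_complete' => '|',
    ];
    // Assumption: unknown events are shown as '?' rather than dropped.
    return $symbols[$event] ?? '?';
}

$events = ['created', 'created', 'updated', 'deleted', 'page_complete'];
$line = implode('', array_map('mls_example_progress_symbol', $events));
// $line is "..#x|"
```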

Sync Recovery

The sync engine saves progress after each page:

  1. Automatic state tracking: last_next_link saved after each API page
  2. Stale sync detection: Syncs running >1 hour marked as failed
  3. Resume commands:
    • wp mls sync resume --id=<ID> - Resume specific sync
    • wp mls recovery auto - Auto-resume most recent failed sync
    • wp mls recovery list - View all resumable syncs
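
The recovery rules above can be sketched as two small checks on a saved sync state. Everything except last_next_link is an assumption here: the function names, the status and updated_at keys, and the choice to measure the one-hour limit from the last saved progress rather than from the start of the sync.

```php
<?php
// Hypothetical sketch; array keys other than last_next_link are assumed.

// A running sync with no recorded progress for over an hour counts as stale.
function mls_example_is_stale(array $state, int $now, int $stale_after = 3600): bool {
    return ($state['status'] ?? '') === 'running'
        && ($now - (int) ($state['updated_at'] ?? 0)) > $stale_after;
}

// Resume from the saved link when there is one, otherwise start over.
function mls_example_resume_url(array $state, string $initial_url): string {
    $link = $state['last_next_link'] ?? '';
    return $link !== '' ? $link : $initial_url;
}
```

Because the link is saved after each page, resuming repeats at most one page of work, and the upsert logic makes that repetition harmless.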

Cron Setup

# Replication sync every 15 minutes (MLS Grid recommended)
*/15 * * * * cd /var/www/html && wp mls sync incremental --allow-root >> /var/log/mls-sync.log 2>&1

# Full re-sync weekly (Sunday 3am) - rebuilds from scratch
0 3 * * 0 cd /var/www/html && (wp mls cache clear --confirm --allow-root && wp mls sync full --allow-root) >> /var/log/mls-sync.log 2>&1

Note: No separate media cron needed - images are fetched on-demand when properties are viewed.

Public API Functions

Available for themes/plugins:

// Get properties with filters
$properties = mls_get_properties([
    'status' => 'Active',
    'city' => 'Albert Lea',
    'min_price' => 100000,
    'limit' => 20,
]);

// Get single property
$property = mls_get_property('NST123456');

// Get media (on-demand fetching)
$image_url = mls_get_property_image('NST123456');  // Fetches if not cached
$image_url = mls_get_property_image('NST123456', false);  // Return null if not cached

// Get all images (fetches first N on demand)
$images = mls_get_property_images('NST123456');  // Fetches the first image if uncached
$images = mls_get_property_images('NST123456', 5);  // Fetches first 5 if uncached

// Get media metadata (no fetch)
$media = mls_get_property_media('NST123456');

// Get cache statistics
$stats = mls_get_cache_stats();  // Returns total_media, cached, uncached counts

// Get distinct values
$cities = mls_get_cities('Active');

// Check data availability
if (mls_is_available()) {
    // Safe to query listings here
}

Testing After Changes

wp mls test connection
wp mls test auth
wp mls sync full --dry-run --limit=10 --verbose
wp mls media status
wp mls stats

Property Data Mapping

Key fields from API to database:

API Field              DB Column
ListingKey             listing_key
ListingId              listing_id
ListPrice              list_price
StandardStatus         standard_status
BedroomsTotal          bedrooms_total
BathroomsTotalInteger  bathrooms_total
LivingArea             living_area
City                   city
ModificationTimestamp  modification_timestamp
PhotosChangeTimestamp  photos_change_timestamp
MlgCanView             mlg_can_view

Full API response stored in raw_data column as JSON.
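
The field mapping can be expressed as a lookup table applied to each decoded API record. This is a sketch only: the function name is hypothetical, and storing null for missing fields is an assumption about how the plugin handles sparse records.

```php
<?php
// Hypothetical sketch: map one decoded API record to a database row
// using the field mapping listed in this section.
function mls_example_map_property(array $record): array {
    $map = [
        'ListingKey'            => 'listing_key',
        'ListingId'             => 'listing_id',
        'ListPrice'             => 'list_price',
        'StandardStatus'        => 'standard_status',
        'BedroomsTotal'         => 'bedrooms_total',
        'BathroomsTotalInteger' => 'bathrooms_total',
        'LivingArea'            => 'living_area',
        'City'                  => 'city',
        'ModificationTimestamp' => 'modification_timestamp',
        'PhotosChangeTimestamp' => 'photos_change_timestamp',
        'MlgCanView'            => 'mlg_can_view',
    ];
    $row = [];
    foreach ($map as $api_field => $column) {
        // Assumption: fields absent from the response are stored as null.
        $row[$column] = $record[$api_field] ?? null;
    }
    // Keep the complete API response alongside the mapped columns.
    $row['raw_data'] = json_encode($record);
    return $row;
}
```

Keeping raw_data means a field left out of the mapping can be recovered later without re-fetching the listing.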

Troubleshooting

"Value out of range" error

The API is rejecting a high $skip value, which means pagination used manual $skip offsets instead of following @odata.nextLink. Clear the data and re-run the initial sync:

wp mls cache clear --confirm --allow-root
wp mls sync full --allow-root

All properties showing as "Sold"

The initial sync was run without the Active/Pending filter. Clear and re-sync:

wp mls cache clear --confirm --allow-root
wp mls sync full --allow-root

Media not loading

Images are fetched on-demand. Check:

  1. wp mls media status - see cache stats
  2. wp mls media fetch --listing=<key> - manually fetch for a listing
  3. Check wp-content/uploads/mls-listings/ directory permissions

Sync taking too long

Initial sync of ~30K Active/Pending properties takes about 30-45 minutes. Use --verbose to see progress.