# MLS by HansonXyz

WordPress plugin for syncing MLS Grid API data (NorthStar MLS) into a local database with WP-CLI tools and a public API for themes and plugins.

## Table of Contents

- [Features](#features)
- [Requirements](#requirements)
- [Installation](#installation)
- [Configuration](#configuration)
- [Running Sync](#running-sync)
- [WP-CLI Commands](#wp-cli-commands)
- [Cron Setup](#cron-setup)
- [Public API](#public-api)
- [Database Schema](#database-schema)
- [Media Handling](#media-handling)
- [Garbage Collection](#garbage-collection)
- [Sync Strategy](#sync-strategy)
- [Error Recovery](#error-recovery)
- [Troubleshooting](#troubleshooting)

## Features

- Syncs Active and Pending property listings from MLS Grid API
- Automatic incremental updates via replication
- On-demand image fetching and local caching
- Automatic WebP conversion for cached images
- Disk space garbage collection for image cache
- Self-healing sync with automatic error recovery
- Rate limit compliance (MLS Grid limits enforced)
- Resume capability for interrupted syncs
- WP-CLI commands for all operations
- Public PHP API for theme/plugin integration
- Optimized database indexes for search queries

## Requirements

- WordPress 5.0+
- PHP 7.4+
- MySQL 5.7+ or MariaDB 10.2+
- WP-CLI (for command-line operations)
- MLS Grid API access token

## Installation

1. Upload the `mls-by-hansonxyz` folder to `/wp-content/plugins/`
2. Activate the plugin through WordPress admin
3. Configure API credentials (see Configuration)
4. Run initial sync: `wp mls run`

## Configuration

### API Credentials

Add to your `wp-config.php`:

```php
define('MLSGRID_API_URL', 'https://api.mlsgrid.com/v2');
define('MLSGRID_ACCESS_TOKEN', 'your-access-token-here');
```

### Image Garbage Collection (Optional)

To enable automatic cleanup of old cached images when disk space is low, add to `wp-config.php`:

```php
// Enable garbage collection when free space drops below 5GB
define('MLS_GC_DISK_THRESHOLD', 5 * 1024 * 1024 * 1024); // 5GB in bytes
```

See [Garbage Collection](#garbage-collection) for details.

### WordPress Admin Settings

Navigate to **Settings > MLS Settings** to configure:

| Setting | Description | Default |
|---------|-------------|---------|
| Originating System | MLS identifier | `northstar` |
| Auto Sync | Enable WP-Cron sync | Disabled |
| Sync Interval | WP-Cron frequency | Hourly |

## Running Sync

### Smart Sync (Recommended)

The `wp mls run` command handles all scenarios automatically:

```bash
wp mls run              # Smart sync with progress
wp mls run --quiet      # Status messages only
wp mls run --verbose    # Full API details
wp mls run --silent     # For cron (exit code only)
```

**Automatic behavior:**

- If no data exists: runs full sync
- If data exists: runs incremental sync
- If previous sync failed: resumes from checkpoint
- If sync already running: safely aborts

### Manual Sync Commands

For more control over sync operations:

```bash
# Full sync (Active/Pending properties only)
wp mls sync full

# Incremental sync (changes since last sync)
wp mls sync incremental

# Resume a specific failed sync
wp mls sync resume --id=

# Dry run (no changes)
wp mls sync full --dry-run --limit=100
```

### Progress Indicators

During sync, progress characters indicate activity:

| Symbol | Meaning |
|--------|---------|
| `.` | Property created |
| `#` | Property updated |
| `x` | Property deleted |
| `!` | Error occurred |
| `\|` | Page complete |

Use `--verbose` for detailed timestamped output.
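
When the progress characters are captured to a file or variable, a quick tally can make a long run easier to scan. The following is a minimal sketch and not part of the plugin: `summarize_progress` is a hypothetical helper name, and it assumes its input contains only the progress symbols from the table above (status text containing `.` or `x` would be miscounted).

```bash
#!/usr/bin/env bash
# Hypothetical helper: count each progress symbol in a captured string.
# Symbol meanings follow the Progress Indicators table; the output
# format is illustrative only.
summarize_progress() {
    local line="$1"
    local created updated deleted errors pages
    # tr -cd KEEPS only the listed character; wc -c then counts it
    created=$(printf '%s' "$line" | tr -cd '.' | wc -c)
    updated=$(printf '%s' "$line" | tr -cd '#' | wc -c)
    deleted=$(printf '%s' "$line" | tr -cd 'x' | wc -c)
    errors=$(printf '%s' "$line" | tr -cd '!' | wc -c)
    pages=$(printf '%s' "$line" | tr -cd '|' | wc -c)
    printf 'created=%d updated=%d deleted=%d errors=%d pages=%d\n' \
        "$created" "$updated" "$deleted" "$errors" "$pages"
}
```

For example, `summarize_progress '..#x!|..|'` reports four created, one updated, one deleted, one error, and two completed pages.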
## WP-CLI Commands

### Testing

```bash
wp mls test connection    # Test API connectivity
wp mls test auth          # Verify authentication
```

### Status and Statistics

```bash
wp mls status                # Full status overview
wp mls status rate-limits    # Rate limit usage only
wp mls stats                 # Database statistics
```

### Sync Operations

```bash
# Smart sync (recommended)
wp mls run [--quiet] [--verbose] [--silent]

# Manual sync
wp mls sync full [--dry-run] [--limit=N] [--verbose]
wp mls sync incremental [--dry-run] [--verbose]
wp mls sync resume --id=
```

### Media Management

Images are fetched on-demand when properties are viewed. These commands manage the cache:

```bash
wp mls media status                        # Cache statistics
wp mls media fetch --listing=              # Pre-cache a listing's images
wp mls media fetch --listing= --limit=10
wp mls media clear --listing=              # Clear cached images
```

### Cache Management

```bash
wp mls cache clear --confirm    # Delete ALL synced data
wp mls cache cleanup            # Remove orphaned media files
wp mls cache missing            # View failed media downloads
wp mls cache missing --clear    # Clear the missing media log
```

### Recovery

```bash
wp mls recovery list       # Show resumable syncs
wp mls recovery auto       # Auto-resume most recent failed sync
wp mls recovery cleanup    # Mark stale syncs as failed
```

## Cron Setup

### Recommended Setup

Add to system crontab (`crontab -e`):

```bash
# Smart sync every 15 minutes (handles everything automatically)
*/15 * * * * cd /var/www/html && wp mls run --silent --allow-root >> /var/log/mls-sync.log 2>&1
```

This single entry handles:

- Initial full sync on first run
- Incremental updates on subsequent runs
- Automatic recovery from failures
- Safe concurrent execution (aborts if already running)

### Alternative: Manual Control

```bash
# Incremental sync every 15 minutes
*/15 * * * * cd /var/www/html && wp mls sync incremental --allow-root >> /var/log/mls-sync.log 2>&1

# Full rebuild weekly (Sunday 3am)
0 3 * * 0 cd /var/www/html && wp mls cache clear --confirm --allow-root && wp mls sync full --allow-root >> /var/log/mls-sync.log 2>&1
```

### Important Notes

- Use `--allow-root` when running as root
- MLS Grid requires refresh at least every 12 hours per IDX rules
- Rate limits are handled automatically (plugin waits when approaching limits)
- No separate media cron needed - images are fetched on-demand

## Public API

### Available Functions

```php
// Get properties with filters
$properties = mls_get_properties([
    'status'        => 'Active',
    'city'          => 'Albert Lea',
    'min_price'     => 100000,
    'max_price'     => 500000,
    'min_beds'      => 3,
    'property_type' => 'Residential',
    'limit'         => 20,
    'offset'        => 0,
    'orderby'       => 'list_price',
    'order'         => 'DESC',
]);

// Get single property by listing key or MLS ID
$property = mls_get_property('NST123456');

// Get primary image (fetches on-demand if not cached)
$image_url = mls_get_property_image('NST123456');
$image_url = mls_get_property_image('NST123456', false); // Don't fetch, return null if uncached

// Get all images for a listing
$images = mls_get_property_images('NST123456');     // Fetch first 1 if uncached
$images = mls_get_property_images('NST123456', 10); // Fetch first 10 if uncached
$images = mls_get_property_images('NST123456', 0);  // Don't fetch any

// Get media metadata (no fetching)
$media = mls_get_property_media('NST123456');

// Get distinct cities with listings
$cities = mls_get_cities();         // All cities
$cities = mls_get_cities('Active'); // Cities with active listings only

// Get property count
$count = mls_get_property_count(['status' => 'Active']);

// Check if data is available
if (mls_is_available()) {
    // Show property search
}

// Get cache statistics
$stats = mls_get_cache_stats();
// Returns: ['total_media' => 50000, 'cached' => 1200, 'uncached' => 48800]
```

### Query Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `status` | string | Active, Pending, Closed |
| `property_type` | string | Residential, Land, Commercial, etc. |
| `city` | string | City name |
| `county` | string | County name |
| `postal_code` | string | ZIP code |
| `min_price` | int | Minimum list price |
| `max_price` | int | Maximum list price |
| `min_beds` | int | Minimum bedrooms |
| `max_beds` | int | Maximum bedrooms |
| `min_baths` | int | Minimum bathrooms |
| `min_sqft` | int | Minimum living area |
| `max_sqft` | int | Maximum living area |
| `year_built_min` | int | Minimum year built |
| `year_built_max` | int | Maximum year built |
| `listing_key` | string | Specific listing key |
| `listing_id` | string | Specific MLS ID |
| `search` | string | Search address/remarks |
| `limit` | int | Results per page (default: 20) |
| `offset` | int | Pagination offset |
| `orderby` | string | Sort field |
| `order` | string | ASC or DESC |
| `include_media` | bool | Include media array |
| `fields` | array | Specific fields to return |

### Property Object Fields

```php
$property->listing_key          // Unique identifier
$property->listing_id           // MLS number
$property->standard_status      // Active, Pending, Closed
$property->list_price           // Current price
$property->original_list_price
$property->close_price

// Address
$property->street_number
$property->street_name
$property->street_suffix
$property->unit_number
$property->city
$property->state_or_province
$property->postal_code
$property->county
$property->latitude
$property->longitude

// Property details
$property->property_type
$property->property_sub_type
$property->bedrooms_total
$property->bathrooms_total
$property->bathrooms_full
$property->bathrooms_half
$property->living_area          // Square feet
$property->lot_size_area
$property->lot_size_units
$property->year_built
$property->garage_spaces

// Description
$property->public_remarks
$property->directions

// Listing info
$property->list_agent_key
$property->list_agent_mls_id
$property->list_agent_name
$property->list_office_key
$property->list_office_mls_id
$property->list_office_name

// Dates and timestamps
$property->photos_count
$property->modification_timestamp
$property->photos_change_timestamp
$property->listing_contract_date
$property->close_date
$property->days_on_market
$property->created_at
$property->updated_at
```

## Database Schema

### Tables

All tables use the WordPress prefix (e.g., `wp_mls_properties`).

#### mls_properties

Main property listing data. Only Active and Pending properties are stored.

| Column | Type | Description |
|--------|------|-------------|
| id | BIGINT | Auto-increment primary key |
| listing_key | VARCHAR(50) | Unique MLS Grid key |
| listing_id | VARCHAR(50) | MLS number |
| standard_status | VARCHAR(30) | Active, Pending |
| list_price | DECIMAL(15,2) | Current price |
| city | VARCHAR(100) | City name |
| latitude | DECIMAL(10,8) | GPS latitude |
| longitude | DECIMAL(11,8) | GPS longitude |
| ... | ... | See property fields above |
| raw_data | LONGTEXT | Full API response (JSON) |
| modification_timestamp | DATETIME | Last modified in MLS |
| created_at | DATETIME | Record creation |
| updated_at | DATETIME | Record update |

**Indexes:**

- `listing_key` (UNIQUE)
- `listing_id`
- `standard_status`
- `city`
- `property_type`
- `list_price`
- `modification_timestamp`
- `bedrooms_total`
- `county`
- `idx_latitude` - for geo queries
- `idx_longitude` - for geo queries
- `idx_status_city_price` - composite for search
- `idx_status_type` - composite for filtering

#### mls_media

Media metadata and cache status. Images are downloaded on-demand.

| Column | Type | Description |
|--------|------|-------------|
| id | BIGINT | Auto-increment primary key |
| listing_key | VARCHAR(50) | Property reference |
| media_key | VARCHAR(100) | Unique media identifier |
| media_type | VARCHAR(30) | Photo, Document, etc. |
| media_order | INT | Display order |
| media_url | VARCHAR(1000) | Original MLS Grid URL |
| local_path | VARCHAR(500) | Cached file path |
| local_url | VARCHAR(500) | Cached file URL |
| downloaded_at | DATETIME | When cached |

#### mls_sync_state

Sync progress tracking for resume capability.

| Column | Type | Description |
|--------|------|-------------|
| id | BIGINT | Sync operation ID |
| sync_type | VARCHAR(30) | full, incremental |
| status | VARCHAR(20) | pending, running, completed, failed |
| last_next_link | VARCHAR(2000) | Resume checkpoint |
| records_processed | INT | Total processed |
| records_created | INT | New records |
| records_updated | INT | Updated records |
| records_deleted | INT | Deleted records |

#### mls_rate_limits

API rate limit tracking.

#### mls_sync_log

Debug logging for sync operations.

#### mls_media_log

Media download audit trail.

## Media Handling

### On-Demand Fetching

Per MLS Grid rules, media URLs cannot be used directly on websites. Images must be downloaded and served from your own server.

**How it works:**

1. Property sync stores media metadata (URLs, keys, order) but does NOT download images
2. When `mls_get_property_image()` is called, the image is fetched and cached locally
3. Subsequent requests serve from local cache
4. Cache location: `wp-content/uploads/mls-listings/{prefix}/{listing_key}/`

**Benefits:**

- No rate limit issues from bulk downloading
- Images cached only when needed
- Automatic re-fetch if cache cleared
- Works with MLS Grid's URL expiration

### Pre-caching Images

To pre-cache images for specific listings:

```bash
wp mls media fetch --listing=NST123456 --limit=10
```

### Cache Statistics

```bash
wp mls media status
```

Shows total media records, cached count, and uncached count.

## Garbage Collection

The plugin includes automatic garbage collection to prevent disk space from filling up with cached MLS images.

### Enabling Garbage Collection

Add to `wp-config.php`:

```php
// Enable garbage collection when free space drops below 5GB
define('MLS_GC_DISK_THRESHOLD', 5 * 1024 * 1024 * 1024); // 5GB in bytes
```

If `MLS_GC_DISK_THRESHOLD` is not defined, garbage collection is disabled.

### How It Works

1. After each sync (`wp mls run`), the plugin checks free disk space on the volume hosting MLS images
2. If free space is below the threshold, cleanup begins
3. Directories older than 24 hours are deleted, oldest first
4. Cleanup stops when:
   - Free space reaches 5GB, OR
   - 2GB has been deleted in this run
5. Directories modified within the last 24 hours are never deleted (protects recently accessed images)

### Behavior Summary

| Setting | Value |
|---------|-------|
| Threshold trigger | Configurable via `MLS_GC_DISK_THRESHOLD` |
| Target free space | 5GB |
| Max delete per run | 2GB |
| Minimum directory age | 24 hours |
| Runs automatically | After every sync |

### CLI Output

During sync, garbage collection status is shown:

```
Garbage Collection:
Disk space OK: 12.45 GB free (threshold: 5.00 GB)
```

Or if cleanup occurs:

```
Garbage Collection:
Disk space low: 3.21 GB free (threshold: 5.00 GB). Starting cleanup...
Deleted: NST123456 (45.23 MB)
Deleted: NST789012 (38.91 MB)
...
Cleanup complete: Deleted 42 directories (1.89 GB). Free space now: 5.10 GB
```

### Recommended Threshold

For most installations, 5GB is a good threshold:

```php
define('MLS_GC_DISK_THRESHOLD', 5 * 1024 * 1024 * 1024);
```

For servers with limited disk space, you may want a higher threshold to trigger cleanup earlier:

```php
// Trigger cleanup when below 10GB
define('MLS_GC_DISK_THRESHOLD', 10 * 1024 * 1024 * 1024);
```

### Image Regeneration

When a deleted image is requested again, it is automatically re-fetched from MLS Grid and cached. This is the normal on-demand fetching behavior - garbage collection simply clears old cached files to free disk space.
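
The selection rules described in this section can be sketched as a small shell function. This illustrates the documented behavior only and is not the plugin's actual code: `plan_gc` and its input format (`age_hours size_bytes name`, one cache directory per line) are assumptions made for the example, and the function only prints the directories that would be removed.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: plan_gc FREE_BYTES THRESHOLD_BYTES
# Reads "age_hours size_bytes name" lines on stdin and prints, oldest
# first, the names of directories that would be deleted.
plan_gc() {
    local free="$1" threshold="$2"
    local target=$((5 * 1024 * 1024 * 1024))      # documented target free space: 5GB
    local max_delete=$((2 * 1024 * 1024 * 1024))  # documented cap per run: 2GB

    # Cleanup only begins when free space is below the threshold
    if [ "$free" -ge "$threshold" ]; then
        return 0
    fi

    # Sort by age, oldest first, then walk the list
    sort -k1,1nr | awk -v free="$free" -v target="$target" -v cap="$max_delete" '
        $1 < 24 { next }                           # never touch directories younger than 24 hours
        free >= target || deleted >= cap { exit }  # stop at the target or once the cap is reached
        { print $3; free += $2; deleted += $2 }
    '
}
```

With 3GB free and a 5GB threshold, three eligible 1GB directories yield only the two oldest: after those, free space reaches the 5GB target and 2GB has been deleted, so the walk stops. Names containing spaces are not handled by this sketch.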

## Sync Strategy

### Initial Import (Full Sync)

- Fetches ONLY Active and Pending properties
- Filter: `MlgCanView eq true and (StandardStatus eq 'Active' or StandardStatus eq 'Pending')`
- Uses `@odata.nextLink` for pagination (NOT `$skip`)
- Approximately 30,000 records for NorthStar MLS
- Takes 30-45 minutes on first run

### Replication (Incremental Sync)

- Fetches ALL properties modified since last sync
- No filter on status (need to detect changes)
- For each record:
  - If `MlgCanView = false`: DELETE from local DB
  - If `StandardStatus` not Active/Pending: DELETE from local DB
  - Otherwise: INSERT or UPDATE

### Why This Approach?

1. MLS Grid API limits `$skip` to ~80,000 - bulk scanning fails
2. Only Active/Pending properties needed for display
3. Replication is efficient - only fetches changes
4. Proper deletion handling when properties sell

## Error Recovery

### Automatic Recovery

The plugin saves progress after each API page. If a sync fails:

1. Progress is preserved in `mls_sync_state` table
2. Next `wp mls run` automatically resumes from checkpoint
3. Failed syncs older than 1 hour are marked for resume

### Manual Recovery

```bash
# View resumable syncs
wp mls recovery list

# Auto-resume most recent
wp mls recovery auto

# Resume specific sync
wp mls sync resume --id=

# Mark stale syncs as failed
wp mls recovery cleanup
```

## Troubleshooting

### Connection Failed

```bash
wp mls test connection
wp mls test auth
```

Check:

- API token in wp-config.php
- Network connectivity
- MLS Grid API status

### No Data After Sync

```bash
wp mls status
wp mls stats
```

Check:

- Rate limits (may need to wait)
- WordPress debug log for API errors
- Sync state for failures

### Media Not Loading

```bash
wp mls media status
```

Check:

- Upload directory permissions
- Disk space
- MLS Grid media URL validity

### Sync Taking Too Long

Initial sync of ~30K properties takes 30-45 minutes. Use `--verbose` to monitor progress.

### Rate Limit Exceeded

The plugin automatically waits when approaching limits. If persistent:

- Reduce sync frequency
- Check for other API consumers
- Contact MLS Grid support

### Clearing Data

To start fresh:

```bash
wp mls cache clear --confirm
wp mls run
```

### Database Issues

If indexes are missing, trigger recreation:

```bash
wp eval "MLS_DB::create_tables();"
```

## Rate Limits

MLS Grid enforces these limits:

| Limit | Value |
|-------|-------|
| Per second | 2 requests |
| Per hour | 7,200 requests |
| Per day | 40,000 requests |
| Data per hour | 4 GB |

The plugin automatically:

- Waits 500ms between requests
- Tracks hourly/daily usage
- Pauses when approaching limits
- Retries with exponential backoff on 429 errors

## File Structure

```
mls-by-hansonxyz/
├── mls-by-hansonxyz.php                  # Main plugin file, public API
├── uninstall.php                         # Cleanup on uninstall
├── README.md                             # This file
├── admin/
│   └── class-mls-admin.php               # WordPress admin interface
├── cli/
│   └── class-mls-cli.php                 # WP-CLI commands
├── includes/
│   ├── class-mls-activator.php           # Plugin activation
│   ├── class-mls-api-client.php          # MLS Grid API communication
│   ├── class-mls-db.php                  # Database operations
│   ├── class-mls-deactivator.php         # Plugin deactivation
│   ├── class-mls-garbage-collector.php   # Disk space management
│   ├── class-mls-logger.php              # Event logging
│   ├── class-mls-media-handler.php       # On-demand image caching
│   ├── class-mls-options.php             # Configuration management
│   ├── class-mls-query.php               # Public query API
│   ├── class-mls-rate-limiter.php        # Rate limit compliance
│   └── class-mls-sync-engine.php         # Sync orchestration
└── docs/
    ├── API.md                            # MLS Grid API reference
    ├── CLAUDE.md                         # AI assistant context
    └── USAGE.md                          # User documentation
```

## Support

- Plugin logs: Settings > MLS Settings in WordPress admin
- Debug log: `wp-content/debug.log` (if WP_DEBUG enabled)
- MLS Grid API: support@mlsgrid.com

## License

GPL-2.0+