homeproz/wp-content/plugins/mls-by-hansonxyz/README.md
Hanson.xyz Dev c2d5b2248d Add WebP conversion and garbage collection to MLS plugin
Image handling improvements:
- Convert PNG and images >500KB to WebP format
- Resize images wider than 1600px maintaining aspect ratio
- Check for .webp version before falling back to original
- WebP quality set to 80 (equivalent to JPEG 90%)

Garbage collection for disk space management:
- New MLS_Garbage_Collector class runs after each sync
- Only active when MLS_GC_DISK_THRESHOLD defined in wp-config
- Deletes image directories older than 24 hours, oldest first
- Stops when free space reaches 5GB or 2GB deleted per run
- Protects recently accessed images from deletion

Documentation:
- Added Garbage Collection section to README
- Updated Features list and File Structure

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 20:48:57 -06:00


MLS by HansonXyz

WordPress plugin for syncing MLS Grid API data (NorthStar MLS) into a local database with WP-CLI tools and a public API for themes and plugins.

Features

  • Syncs Active and Pending property listings from MLS Grid API
  • Automatic incremental updates via replication
  • On-demand image fetching and local caching
  • Automatic WebP conversion for cached images
  • Disk space garbage collection for image cache
  • Self-healing sync with automatic error recovery
  • Rate limit compliance (MLS Grid limits enforced)
  • Resume capability for interrupted syncs
  • WP-CLI commands for all operations
  • Public PHP API for theme/plugin integration
  • Optimized database indexes for search queries

Requirements

  • WordPress 5.0+
  • PHP 7.4+
  • MySQL 5.7+ or MariaDB 10.2+
  • WP-CLI (for command-line operations)
  • MLS Grid API access token

Installation

  1. Upload the mls-by-hansonxyz folder to /wp-content/plugins/
  2. Activate the plugin through WordPress admin
  3. Configure API credentials (see Configuration)
  4. Run initial sync: wp mls run

Configuration

API Credentials

Add to your wp-config.php:

define('MLSGRID_API_URL', 'https://api.mlsgrid.com/v2');
define('MLSGRID_ACCESS_TOKEN', 'your-access-token-here');

Image Garbage Collection (Optional)

To enable automatic cleanup of old cached images when disk space is low, add to wp-config.php:

// Enable garbage collection when free space drops below 5GB
define('MLS_GC_DISK_THRESHOLD', 5 * 1024 * 1024 * 1024); // 5GB in bytes

See Garbage Collection for details.

WordPress Admin Settings

Navigate to Settings > MLS Settings to configure:

| Setting | Description | Default |
|---|---|---|
| Originating System | MLS identifier | northstar |
| Auto Sync | Enable WP-Cron sync | Disabled |
| Sync Interval | WP-Cron frequency | Hourly |

Running Sync

The wp mls run command handles all scenarios automatically:

wp mls run                    # Smart sync with progress
wp mls run --quiet            # Status messages only
wp mls run --verbose          # Full API details
wp mls run --silent           # For cron (exit code only)

Automatic behavior:

  • If no data exists: runs full sync
  • If data exists: runs incremental sync
  • If previous sync failed: resumes from checkpoint
  • If sync already running: safely aborts
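
The decision list above can be read as a small state check. The following is a minimal sketch of that logic, not the plugin's code: the function name and parameters are hypothetical, and the order in which the conditions are tested is an assumption.

```php
<?php
/**
 * Hypothetical helper (not part of the plugin's API): pick the action a
 * smart sync would take for a given state.
 * The precedence of the checks is an assumption made for illustration.
 */
function example_choose_sync_action(bool $syncRunning, ?int $failedSyncId, bool $hasData): string
{
    if ($syncRunning) {
        return 'abort';        // another sync is already in progress
    }
    if ($failedSyncId !== null) {
        return 'resume';       // continue a failed sync from its checkpoint
    }
    if (!$hasData) {
        return 'full';         // empty database: initial import
    }
    return 'incremental';      // data present: fetch changes only
}
```

In this sketch a running sync always wins, so a cron entry that fires while a long initial import is still in progress does nothing.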

Manual Sync Commands

For more control over sync operations:

# Full sync (Active/Pending properties only)
wp mls sync full

# Incremental sync (changes since last sync)
wp mls sync incremental

# Resume a specific failed sync
wp mls sync resume --id=<sync_id>

# Dry run (no changes)
wp mls sync full --dry-run --limit=100

Progress Indicators

During sync, progress characters indicate activity:

| Symbol | Meaning |
|---|---|
| . | Property created |
| # | Property updated |
| x | Property deleted |
| ! | Error occurred |
| \| | Page complete |

Use --verbose for detailed timestamped output.

WP-CLI Commands

Testing

wp mls test connection        # Test API connectivity
wp mls test auth              # Verify authentication

Status and Statistics

wp mls status                 # Full status overview
wp mls status rate-limits     # Rate limit usage only
wp mls stats                  # Database statistics

Sync Operations

# Smart sync (recommended)
wp mls run [--quiet] [--verbose] [--silent]

# Manual sync
wp mls sync full [--dry-run] [--limit=N] [--verbose]
wp mls sync incremental [--dry-run] [--verbose]
wp mls sync resume --id=<sync_id>

Media Management

Images are fetched on-demand when properties are viewed. These commands manage the cache:

wp mls media status                        # Cache statistics
wp mls media fetch --listing=<key>         # Pre-cache a listing's images
wp mls media fetch --listing=<key> --limit=10
wp mls media clear --listing=<key>         # Clear cached images

Cache Management

wp mls cache clear --confirm   # Delete ALL synced data
wp mls cache cleanup           # Remove orphaned media files
wp mls cache missing           # View failed media downloads
wp mls cache missing --clear   # Clear the missing media log

Recovery

wp mls recovery list           # Show resumable syncs
wp mls recovery auto           # Auto-resume most recent failed sync
wp mls recovery cleanup        # Mark stale syncs as failed

Cron Setup

Add to system crontab (crontab -e):

# Smart sync every 15 minutes (handles everything automatically)
*/15 * * * * cd /var/www/html && wp mls run --silent --allow-root >> /var/log/mls-sync.log 2>&1

This single entry handles:

  • Initial full sync on first run
  • Incremental updates on subsequent runs
  • Automatic recovery from failures
  • Safe concurrent execution (aborts if already running)

Alternative: Manual Control

# Incremental sync every 15 minutes
*/15 * * * * cd /var/www/html && wp mls sync incremental --allow-root >> /var/log/mls-sync.log 2>&1

# Full rebuild weekly (Sunday 3am)
0 3 * * 0 cd /var/www/html && { wp mls cache clear --confirm --allow-root && wp mls sync full --allow-root; } >> /var/log/mls-sync.log 2>&1

Important Notes

  • Use --allow-root when running as root
  • MLS Grid requires refresh at least every 12 hours per IDX rules
  • Rate limits are handled automatically (plugin waits when approaching limits)
  • No separate media cron needed - images are fetched on-demand

Public API

Available Functions

// Get properties with filters
$properties = mls_get_properties([
    'status' => 'Active',
    'city' => 'Albert Lea',
    'min_price' => 100000,
    'max_price' => 500000,
    'min_beds' => 3,
    'property_type' => 'Residential',
    'limit' => 20,
    'offset' => 0,
    'orderby' => 'list_price',
    'order' => 'DESC',
]);

// Get single property by listing key or MLS ID
$property = mls_get_property('NST123456');

// Get primary image (fetches on-demand if not cached)
$image_url = mls_get_property_image('NST123456');
$image_url = mls_get_property_image('NST123456', false); // Don't fetch, return null if uncached

// Get all images for a listing
$images = mls_get_property_images('NST123456');      // Fetch first 1 if uncached
$images = mls_get_property_images('NST123456', 10);  // Fetch first 10 if uncached
$images = mls_get_property_images('NST123456', 0);   // Don't fetch any

// Get media metadata (no fetching)
$media = mls_get_property_media('NST123456');

// Get distinct cities with listings
$cities = mls_get_cities();           // All cities
$cities = mls_get_cities('Active');   // Cities with active listings only

// Get property count
$count = mls_get_property_count(['status' => 'Active']);

// Check if data is available
if (mls_is_available()) {
    // Show property search
}

// Get cache statistics
$stats = mls_get_cache_stats();
// Returns: ['total_media' => 50000, 'cached' => 1200, 'uncached' => 48800]

Query Parameters

| Parameter | Type | Description |
|---|---|---|
| status | string | Active, Pending, Closed |
| property_type | string | Residential, Land, Commercial, etc. |
| city | string | City name |
| county | string | County name |
| postal_code | string | ZIP code |
| min_price | int | Minimum list price |
| max_price | int | Maximum list price |
| min_beds | int | Minimum bedrooms |
| max_beds | int | Maximum bedrooms |
| min_baths | int | Minimum bathrooms |
| min_sqft | int | Minimum living area |
| max_sqft | int | Maximum living area |
| year_built_min | int | Minimum year built |
| year_built_max | int | Maximum year built |
| listing_key | string | Specific listing key |
| listing_id | string | Specific MLS ID |
| search | string | Search address/remarks |
| limit | int | Results per page (default: 20) |
| offset | int | Pagination offset |
| orderby | string | Sort field |
| order | string | ASC or DESC |
| include_media | bool | Include media array |
| fields | array | Specific fields to return |
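
Since limit and offset drive pagination, a theme usually has to turn a page number into those two values. Here is a minimal sketch of one way to do that. The helper example_page_args() is hypothetical and not provided by the plugin; only mls_get_properties() and the parameter names come from this README.

```php
<?php
/**
 * Hypothetical helper: merge search filters with limit/offset values
 * computed from a 1-based page number.
 */
function example_page_args(int $page, int $perPage = 20, array $filters = []): array
{
    $page    = max(1, $page);      // treat page 0 or negative as page 1
    $perPage = max(1, $perPage);

    return array_merge($filters, [
        'limit'  => $perPage,
        'offset' => ($page - 1) * $perPage,
    ]);
}

// Page 3 of active listings in one city, 20 per page.
$args = example_page_args(3, 20, ['status' => 'Active', 'city' => 'Albert Lea']);

// Guarded so the snippet also runs where the plugin is not loaded.
if (function_exists('mls_get_properties')) {
    $properties = mls_get_properties($args);
}
```

Passing the same filters to mls_get_property_count() would give the total needed to work out how many pages exist.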

Property Object Fields

$property->listing_key        // Unique identifier
$property->listing_id         // MLS number
$property->standard_status    // Active or Pending (only these statuses are stored locally)
$property->list_price         // Current price
$property->original_list_price
$property->close_price

// Address
$property->street_number
$property->street_name
$property->street_suffix
$property->unit_number
$property->city
$property->state_or_province
$property->postal_code
$property->county
$property->latitude
$property->longitude

// Property details
$property->property_type
$property->property_sub_type
$property->bedrooms_total
$property->bathrooms_total
$property->bathrooms_full
$property->bathrooms_half
$property->living_area        // Square feet
$property->lot_size_area
$property->lot_size_units
$property->year_built
$property->garage_spaces

// Description
$property->public_remarks
$property->directions

// Listing info
$property->list_agent_key
$property->list_agent_mls_id
$property->list_agent_name
$property->list_office_key
$property->list_office_mls_id
$property->list_office_name

// Photo count, dates, and timestamps
$property->photos_count
$property->modification_timestamp
$property->photos_change_timestamp
$property->listing_contract_date
$property->close_date
$property->days_on_market
$property->created_at
$property->updated_at
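
The address is stored as separate fields, so templates typically assemble a display string themselves. The sketch below shows one possible approach. The function name and the output format (including the "#" before a unit number) are assumptions for illustration, not plugin behavior.

```php
<?php
/**
 * Hypothetical helper: build a one-line address from the address fields
 * listed above. Missing or empty fields are skipped.
 */
function example_format_address(object $property): string
{
    $nonEmpty = static fn(string $part): bool => $part !== '';

    $street = implode(' ', array_filter([
        trim((string) ($property->street_number ?? '')),
        trim((string) ($property->street_name ?? '')),
        trim((string) ($property->street_suffix ?? '')),
    ], $nonEmpty));

    $unit = trim((string) ($property->unit_number ?? ''));
    if ($unit !== '') {
        $street = trim($street . ' #' . $unit);
    }

    $stateZip = implode(' ', array_filter([
        trim((string) ($property->state_or_province ?? '')),
        trim((string) ($property->postal_code ?? '')),
    ], $nonEmpty));

    return implode(', ', array_filter([
        $street,
        trim((string) ($property->city ?? '')),
        $stateZip,
    ], $nonEmpty));
}
```

In a WordPress template the result would normally be passed through esc_html() before output.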

Database Schema

Tables

All tables use the WordPress prefix (e.g., wp_mls_properties).

mls_properties

Main property listing data. Only Active and Pending properties are stored.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| listing_key | VARCHAR(50) | Unique MLS Grid key |
| listing_id | VARCHAR(50) | MLS number |
| standard_status | VARCHAR(30) | Active, Pending |
| list_price | DECIMAL(15,2) | Current price |
| city | VARCHAR(100) | City name |
| latitude | DECIMAL(10,8) | GPS latitude |
| longitude | DECIMAL(11,8) | GPS longitude |
| ... | ... | See property fields above |
| raw_data | LONGTEXT | Full API response (JSON) |
| modification_timestamp | DATETIME | Last modified in MLS |
| created_at | DATETIME | Record creation |
| updated_at | DATETIME | Record update |

Indexes:

  • listing_key (UNIQUE)
  • listing_id
  • standard_status
  • city
  • property_type
  • list_price
  • modification_timestamp
  • bedrooms_total
  • county
  • idx_latitude - for geo queries
  • idx_longitude - for geo queries
  • idx_status_city_price - composite for search
  • idx_status_type - composite for filtering

mls_media

Media metadata and cache status. Images are downloaded on-demand.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Auto-increment primary key |
| listing_key | VARCHAR(50) | Property reference |
| media_key | VARCHAR(100) | Unique media identifier |
| media_type | VARCHAR(30) | Photo, Document, etc. |
| media_order | INT | Display order |
| media_url | VARCHAR(1000) | Original MLS Grid URL |
| local_path | VARCHAR(500) | Cached file path |
| local_url | VARCHAR(500) | Cached file URL |
| downloaded_at | DATETIME | When cached |

mls_sync_state

Sync progress tracking for resume capability.

| Column | Type | Description |
|---|---|---|
| id | BIGINT | Sync operation ID |
| sync_type | VARCHAR(30) | full, incremental |
| status | VARCHAR(20) | pending, running, completed, failed |
| last_next_link | VARCHAR(2000) | Resume checkpoint |
| records_processed | INT | Total processed |
| records_created | INT | New records |
| records_updated | INT | Updated records |
| records_deleted | INT | Deleted records |

mls_rate_limits

API rate limit tracking.

mls_sync_log

Debug logging for sync operations.

mls_media_log

Media download audit trail.

Media Handling

On-Demand Fetching

Per MLS Grid rules, media URLs cannot be used directly on websites. Images must be downloaded and served from your own server.

How it works:

  1. Property sync stores media metadata (URLs, keys, order) but does NOT download images
  2. When mls_get_property_image() is called, the image is fetched and cached locally
  3. Subsequent requests serve from local cache
  4. Cache location: wp-content/uploads/mls-listings/{prefix}/{listing_key}/

Benefits:

  • No rate limit issues from bulk downloading
  • Images cached only when needed
  • Automatic re-fetch if cache cleared
  • Works with MLS Grid's URL expiration
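
The fetch-then-cache flow described above follows a common pattern: return the local file if it exists, otherwise download once and store the result. Below is a minimal, self-contained sketch of that pattern. The function name, the file naming, and the download callback are hypothetical; the plugin's media handler may work differently.

```php
<?php
/**
 * Hypothetical sketch of "serve from cache, else fetch and store".
 *
 * @param callable $download Returns the image bytes as a string, or null/false on failure.
 * @return string|null Path to the cached file, or null if it could not be produced.
 */
function example_get_cached_image(string $cacheDir, string $fileName, callable $download): ?string
{
    $path = rtrim($cacheDir, '/') . '/' . $fileName;

    if (is_file($path)) {
        return $path;                       // cache hit: no network request
    }

    // Create the cache directory on first use.
    if (!is_dir($cacheDir) && !mkdir($cacheDir, 0755, true) && !is_dir($cacheDir)) {
        return null;
    }

    $bytes = $download();
    if (!is_string($bytes) || $bytes === '') {
        return null;                        // download failed: nothing is cached
    }

    return file_put_contents($path, $bytes) !== false ? $path : null;
}
```

Because a failed download leaves no file behind, the next request simply tries again, which matches the automatic re-fetch behavior listed under Benefits.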

Pre-caching Images

To pre-cache images for specific listings:

wp mls media fetch --listing=NST123456 --limit=10

Cache Statistics

wp mls media status

Shows total media records, cached count, and uncached count.

Garbage Collection

The plugin can automatically delete old cached MLS images so that the image cache does not fill the disk.

Enabling Garbage Collection

Add to wp-config.php:

// Enable garbage collection when free space drops below 5GB
define('MLS_GC_DISK_THRESHOLD', 5 * 1024 * 1024 * 1024); // 5GB in bytes

If MLS_GC_DISK_THRESHOLD is not defined, garbage collection is disabled.

How It Works

  1. After each sync (wp mls run), the plugin checks free disk space on the volume hosting MLS images
  2. If free space is below the threshold, cleanup begins
  3. Directories older than 24 hours are deleted, oldest first
  4. Cleanup stops when:
    • Free space reaches 5GB, OR
    • 2GB has been deleted in this run
  5. Directories modified within the last 24 hours are never deleted (protects recently accessed images)
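
The steps above amount to a selection loop over cached directories. The sketch below illustrates that loop as a pure function, without touching the filesystem. It is not the MLS_Garbage_Collector implementation: the function name, the input array shape, and the parameter names are hypothetical, and the defaults simply mirror the numbers in the list (5GB target, 2GB per run, 24 hours). It assumes 64-bit PHP so the byte counts fit in an int.

```php
<?php
/**
 * Hypothetical sketch: choose which cache directories to delete.
 *
 * @param array<int, array{path: string, mtime: int, bytes: int}> $dirs
 * @return string[] Paths to delete, oldest first.
 */
function example_select_for_deletion(
    array $dirs,
    int $freeBytes,
    int $thresholdBytes,
    int $now,
    int $targetFreeBytes = 5 * 1024 ** 3,
    int $maxDeleteBytes = 2 * 1024 ** 3,
    int $minAgeSeconds = 86400
): array {
    if ($freeBytes >= $thresholdBytes) {
        return [];                                   // disk space OK
    }

    // Directories modified within the minimum age are never candidates.
    $candidates = array_filter(
        $dirs,
        static fn(array $d): bool => ($now - $d['mtime']) > $minAgeSeconds
    );

    // Oldest first.
    usort($candidates, static fn(array $a, array $b): int => $a['mtime'] <=> $b['mtime']);

    $selected = [];
    $deleted  = 0;
    foreach ($candidates as $dir) {
        if ($freeBytes + $deleted >= $targetFreeBytes || $deleted >= $maxDeleteBytes) {
            break;                                   // target reached or per-run cap hit
        }
        $selected[] = $dir['path'];
        $deleted   += $dir['bytes'];
    }

    return $selected;
}
```

In real use, the free-space figure could come from PHP's disk_free_space(), which returns a float and would need a cast to int for this signature.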

Behavior Summary

| Setting | Value |
|---|---|
| Threshold trigger | Configurable via MLS_GC_DISK_THRESHOLD |
| Target free space | 5GB |
| Max delete per run | 2GB |
| Minimum directory age | 24 hours |
| Runs automatically | After every sync |

CLI Output

During sync, garbage collection status is shown:

Garbage Collection:
  Disk space OK: 12.45 GB free (threshold: 5.00 GB)

Or if cleanup occurs:

Garbage Collection:
  Disk space low: 3.21 GB free (threshold: 5.00 GB). Starting cleanup...
  Deleted: NST123456 (45.23 MB)
  Deleted: NST789012 (38.91 MB)
  ...
  Cleanup complete: Deleted 42 directories (1.89 GB). Free space now: 5.10 GB

For most installations, 5GB is a good threshold:

define('MLS_GC_DISK_THRESHOLD', 5 * 1024 * 1024 * 1024);

For servers with limited disk space, you may want a higher threshold to trigger cleanup earlier:

// Trigger cleanup when below 10GB
define('MLS_GC_DISK_THRESHOLD', 10 * 1024 * 1024 * 1024);

Image Regeneration

When a deleted image is requested again, it is automatically re-fetched from MLS Grid and cached. This is the normal on-demand fetching behavior - garbage collection simply clears old cached files to free disk space.

Sync Strategy

Initial Import (Full Sync)

  • Fetches ONLY Active and Pending properties
  • Filter: MlgCanView eq true AND (StandardStatus eq 'Active' OR StandardStatus eq 'Pending')
  • Uses @odata.nextLink for pagination (NOT $skip)
  • Approximately 30,000 records for NorthStar MLS
  • Takes 30-45 minutes on first run

Replication (Incremental Sync)

  • Fetches ALL properties modified since last sync
  • No filter on status (need to detect changes)
  • For each record:
    • If MlgCanView = false: DELETE from local DB
    • If StandardStatus not Active/Pending: DELETE from local DB
    • Otherwise: INSERT or UPDATE
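
The per-record rules above can be written as a single decision function. This is a minimal sketch under stated assumptions: the function name is hypothetical, the record is taken to be an associative array decoded from the API response, and a missing MlgCanView is treated as not viewable.

```php
<?php
/**
 * Hypothetical helper: decide what an incremental sync does with one record.
 *
 * @param array<string, mixed> $record Decoded API record.
 * @return string 'delete' or 'upsert'
 */
function example_record_action(array $record): string
{
    $canView = $record['MlgCanView'] ?? false;      // assumption: missing means not viewable
    $status  = $record['StandardStatus'] ?? '';

    if ($canView !== true) {
        return 'delete';                            // no longer permitted for display
    }
    if (!in_array($status, ['Active', 'Pending'], true)) {
        return 'delete';                            // sold, expired, withdrawn, etc.
    }
    return 'upsert';                                // INSERT or UPDATE
}
```

A delete result for a listing key that was never stored locally is harmless, which is why the incremental sync can skip the status filter on the request.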

Why This Approach?

  1. MLS Grid API limits $skip to ~80,000 - bulk scanning fails
  2. Only Active/Pending properties needed for display
  3. Replication is efficient - only fetches changes
  4. Proper deletion handling when properties sell

Error Recovery

Automatic Recovery

The plugin saves progress after each API page. If a sync fails:

  1. Progress is preserved in mls_sync_state table
  2. Next wp mls run automatically resumes from checkpoint
  3. Failed syncs older than 1 hour are marked for resume

Manual Recovery

# View resumable syncs
wp mls recovery list

# Auto-resume most recent
wp mls recovery auto

# Resume specific sync
wp mls sync resume --id=<sync_id>

# Mark stale syncs as failed
wp mls recovery cleanup

Troubleshooting

Connection Failed

wp mls test connection
wp mls test auth

Check:

  • API token in wp-config.php
  • Network connectivity
  • MLS Grid API status

No Data After Sync

wp mls status
wp mls stats

Check:

  • Rate limits (may need to wait)
  • WordPress debug log for API errors
  • Sync state for failures

Media Not Loading

wp mls media status

Check:

  • Upload directory permissions
  • Disk space
  • MLS Grid media URL validity

Sync Taking Too Long

Initial sync of ~30K properties takes 30-45 minutes. Use --verbose to monitor progress.

Rate Limit Exceeded

The plugin automatically waits when approaching limits. If persistent:

  • Reduce sync frequency
  • Check for other API consumers
  • Contact MLS Grid support

Clearing Data

To start fresh:

wp mls cache clear --confirm
wp mls run

Database Issues

If indexes are missing, trigger recreation:

wp eval "MLS_DB::create_tables();"

Rate Limits

MLS Grid enforces these limits:

| Limit | Value |
|---|---|
| Per second | 2 requests |
| Per hour | 7,200 requests |
| Per day | 40,000 requests |
| Data per hour | 4 GB |

The plugin automatically:

  • Waits 500ms between requests
  • Tracks hourly/daily usage
  • Pauses when approaching limits
  • Retries with exponential backoff on 429 errors
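
To make the pause and backoff behavior concrete, here is a minimal sketch of the arithmetic involved. The function names are hypothetical, and the 90% safety margin, 1-second base delay, and 60-second cap are illustrative assumptions; only the hourly and daily limits come from the table above.

```php
<?php
const EXAMPLE_HOURLY_LIMIT = 7200;   // from the limits table
const EXAMPLE_DAILY_LIMIT  = 40000;  // from the limits table

/**
 * Hypothetical helper: should the client pause before the next request?
 * $marginPercent (assumed 90) is how much of each quota may be used.
 */
function example_should_pause(int $usedThisHour, int $usedToday, int $marginPercent = 90): bool
{
    return $usedThisHour * 100 >= EXAMPLE_HOURLY_LIMIT * $marginPercent
        || $usedToday * 100 >= EXAMPLE_DAILY_LIMIT * $marginPercent;
}

/**
 * Hypothetical helper: delay in milliseconds before retry number $attempt
 * after a 429 response. Doubles each attempt, up to $capMs.
 */
function example_backoff_ms(int $attempt, int $baseMs = 1000, int $capMs = 60000): int
{
    $attempt = max(1, $attempt);
    $delay   = $baseMs * (2 ** min($attempt - 1, 20));   // exponent bounded to avoid overflow

    return (int) min($capMs, $delay);
}
```

With these defaults the retry delays run 1s, 2s, 4s, 8s and so on until the cap. A caller would sleep with usleep($ms * 1000) between attempts.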

File Structure

mls-by-hansonxyz/
├── mls-by-hansonxyz.php       # Main plugin file, public API
├── uninstall.php              # Cleanup on uninstall
├── README.md                  # This file
├── admin/
│   └── class-mls-admin.php    # WordPress admin interface
├── cli/
│   └── class-mls-cli.php      # WP-CLI commands
├── includes/
│   ├── class-mls-activator.php        # Plugin activation
│   ├── class-mls-api-client.php       # MLS Grid API communication
│   ├── class-mls-db.php               # Database operations
│   ├── class-mls-deactivator.php      # Plugin deactivation
│   ├── class-mls-garbage-collector.php # Disk space management
│   ├── class-mls-logger.php           # Event logging
│   ├── class-mls-media-handler.php    # On-demand image caching
│   ├── class-mls-options.php          # Configuration management
│   ├── class-mls-query.php            # Public query API
│   ├── class-mls-rate-limiter.php     # Rate limit compliance
│   └── class-mls-sync-engine.php      # Sync orchestration
└── docs/
    ├── API.md                 # MLS Grid API reference
    ├── CLAUDE.md              # AI assistant context
    └── USAGE.md               # User documentation

Support

  • Plugin logs: Settings > MLS Settings in WordPress admin
  • Debug log: wp-content/debug.log (if WP_DEBUG enabled)
  • MLS Grid API: support@mlsgrid.com

License

GPL-2.0+