Data Collection Overview
Our rental market database aggregates public data from authoritative sources to provide accurate, unbiased investment analysis for short-term rental (STR) and long-term rental (LTR) properties. We do not accept sponsorships, paid placements, or advertising from real estate companies.
Primary Data Sources
1. Census & Official Statistics (Demographics & Housing)
Sources: National statistical agencies and census data
Update frequency: Annually (released December each year)
Data used:
- Median household income by area
- Population and density
- Housing units, occupancy rates, vacancy rates
- Median gross rent (long-term rentals)
- Median home value (owner-occupied)
- Property tax estimates (aggregate)
Why multi-year estimates: They are rolling averages over several years, which makes them more stable and more reliable than single-year data for smaller geographic areas.
Limitations: Lags real-time market by 1-2 years. Use as baseline, not current snapshot.
2. Inside Airbnb (Short-Term Rental Data)
Source: Inside Airbnb (independent project scraping public Airbnb listings)
Update frequency: Monthly (35+ markets)
Data used:
- Nightly rates by property type and bedroom count
- Occupancy estimates (reviews × average stay / availability)
- Listing counts and saturation
- Average cleaning fees
Occupancy model: We estimate occupancy using review-based analysis validated against actual booking data.
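The review-based formula listed above (reviews × average stay / availability) can be sketched as follows. This is a minimal illustration, not the production model: the function name, the parameter names, and the cap at 100% are assumptions added here.

```python
def estimate_occupancy(reviews: int, avg_stay_nights: float, available_nights: int) -> float:
    """Estimate occupancy as (reviews x average stay) / available nights."""
    if available_nights <= 0:
        # No availability recorded, so no meaningful ratio can be computed.
        return 0.0
    booked_nights = reviews * avg_stay_nights
    # Assumption: cap at 1.0, since booked nights cannot exceed available nights.
    return min(booked_nights / available_nights, 1.0)


# Example: 30 reviews, 3-night average stay, 180 available nights.
occupancy = estimate_occupancy(30, 3.0, 180)
```

Because not every guest leaves a review, a raw review count tends to understate bookings, which is one reason validating against actual booking data matters.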
Gap-filling methodology: For areas without direct listing data:
- Regional aggregates (if nearby areas have data)
- Calibrated estimation from long-term rent data
- Machine learning imputation using demographic and market features
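One way to combine the three gap-filling methods is a fallback chain that tries each in turn and records which one produced the value. The sketch below assumes the methods are tried in the order listed, which the text does not state. All names, the 0.08 calibration factor, and the constant returned by the imputation placeholder are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# An estimator takes an area ID and returns a nightly rate, or None if it has no answer.
Estimator = Callable[[str], Optional[float]]


def fill_nightly_rate(
    area_id: str, estimators: Sequence[Tuple[str, Estimator]]
) -> Tuple[Optional[float], str]:
    """Return the first available estimate and the name of the method that produced it."""
    for method, estimate in estimators:
        value = estimate(area_id)
        if value is not None:
            return value, method
    return None, "missing"


# Hypothetical stand-ins for the three methods in the list above.
regional_rates = {"area-1": 140.0}
monthly_rent = {"area-2": 1800.0}

chain = [
    ("regional_aggregate", regional_rates.get),
    # Hypothetical calibration: nightly rate as a fixed fraction of monthly rent.
    ("ltr_calibrated", lambda a: monthly_rent[a] * 0.08 if a in monthly_rent else None),
    # Placeholder for a trained imputation model; returns a constant here.
    ("ml_imputation", lambda a: 95.0),
]
```

Keeping the method name alongside each value makes it possible to tell imputed figures apart from directly observed ones later on.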
3. Property Valuation Data (Home Values)
Sources: Property valuation indices and transaction databases
Update frequency: Monthly to quarterly
Data used:
- Median home values by area
- Historical trends (1-year, 5-year appreciation)
Methodology: Uses both automated valuation models and actual transaction data for comprehensive coverage.
Fallback: If index data unavailable, we use census median home values.
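As an illustration of the appreciation figures and the census fallback, the sketch below computes cumulative appreciation between two values and picks the index value when it exists. It assumes appreciation is reported as a cumulative percentage change (the text does not say whether the 5-year figure is cumulative or annualized), and the function names are hypothetical.

```python
from typing import Optional, Tuple


def appreciation(current_value: Optional[float], past_value: Optional[float]) -> Optional[float]:
    """Cumulative appreciation as a fraction, e.g. 0.10 for a 10% increase."""
    if current_value is None or past_value is None or past_value <= 0:
        return None
    return current_value / past_value - 1.0


def median_home_value(
    index_value: Optional[float], census_value: Optional[float]
) -> Tuple[Optional[float], str]:
    """Prefer the valuation index; fall back to the census median if it is missing."""
    if index_value is not None:
        return index_value, "index"
    return census_value, "census"
```

For example, `appreciation(330_000, 300_000)` gives roughly 0.10, and `median_home_value(None, 280_000)` returns the census figure tagged with its source.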
4. Property Tax Data
Source: Official statistics and local tax authorities
Update frequency: Annually
Calculation:
Why this method: Captures real-world effective rate (including exemptions, caps, assessments).
Fallback: If area-level data unavailable, use regional average. If regional unavailable, use national average.
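The calculation itself is not spelled out above. A common definition of an effective property tax rate, assumed here, is taxes actually paid divided by home value, both taken as area-level aggregates. The sketch below pairs that with the area > regional > national fallback; all function and parameter names are hypothetical.

```python
from typing import Optional, Tuple


def effective_tax_rate(
    aggregate_taxes_paid: Optional[float], aggregate_home_value: Optional[float]
) -> Optional[float]:
    """Assumed definition: total property taxes paid / total home value."""
    if aggregate_taxes_paid is None or not aggregate_home_value or aggregate_home_value <= 0:
        return None
    return aggregate_taxes_paid / aggregate_home_value


def resolve_tax_rate(
    area_rate: Optional[float], regional_rate: Optional[float], national_rate: float
) -> Tuple[float, str]:
    """Use the most specific rate available: area, then regional, then national."""
    if area_rate is not None:
        return area_rate, "area"
    if regional_rate is not None:
        return regional_rate, "regional"
    return national_rate, "national"
```

Under this assumption, an area with 12 million in taxes paid on 1 billion of home value has an effective rate of about 1.2%, regardless of the nominal rate on the books.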
5. STR Lodging Tax Rates
Sources:
- Primary: AI-assisted research with web search (quarterly updates, 750+ jurisdictions verified)
- Secondary: Government tourism and revenue departments
- Tertiary: Hospitality industry tax databases
- Fallback: Regional averages from manual research
Priority: Local area > Region > National default (0%)
Update frequency: Quarterly via AI research layer
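The priority order above can be read as a simple lookup that falls through to a 0% national default. A minimal sketch, with hypothetical function and label names:

```python
from typing import Optional, Tuple


def resolve_lodging_tax(
    local_rate: Optional[float], region_rate: Optional[float]
) -> Tuple[float, str]:
    """Return the most specific lodging tax rate and the level it came from."""
    if local_rate is not None:
        return local_rate, "local"
    if region_rate is not None:
        return region_rate, "region"
    # National default from the priority rule above: 0%.
    return 0.0, "national_default"
```

Returning the level with the rate lets downstream checks distinguish a jurisdiction that genuinely charges 0% from one that simply has no data.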
6. STR Regulations
Source: Comprehensive AI research system (3-tier coverage)
Update frequency: Quarterly
Coverage:
- Tier 1: 215+ major cities (AI verification + multiple sources)
- Tier 2: 500-1000+ regional jurisdictions (AI + web search)
- Tier 3: National/regional baseline policies
Data extracted:
- Night caps (annual maximum)
- Permit requirements (yes/no, cost)
- Primary residence rules (yes/no)
- Host presence requirements (yes/no)
- Outright bans (yes/no)
- Source links (government URLs, e.g. .gov domains)
Priority: Local area > Region > National default (permissive)
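The extracted fields map naturally onto a small record type, with the permissive default expressed as a record that imposes no restrictions. This is a sketch under assumptions: the class, the field names, and the record-level (rather than field-level) fallback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StrRegulation:
    night_cap: Optional[int] = None          # annual maximum nights; None means no cap
    permit_required: bool = False
    permit_cost: Optional[float] = None
    primary_residence_required: bool = False
    host_presence_required: bool = False
    banned: bool = False
    source_url: Optional[str] = None
    level: str = "national_default"


# Permissive default: no cap, no permit, no residency rules, no ban.
PERMISSIVE_DEFAULT = StrRegulation()


def resolve_regulation(
    local: Optional[StrRegulation],
    region: Optional[StrRegulation],
    national: Optional[StrRegulation] = None,
) -> StrRegulation:
    """Pick the most specific record available, else the permissive default."""
    for record in (local, region, national):
        if record is not None:
            return record
    return PERMISSIVE_DEFAULT
```

A permissive default means a missing record looks the same as an unregulated market, so the `level` field is worth surfacing wherever the result is displayed.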
7. Insurance & Risk Data
Sources: Government risk indices and environmental agencies
Update frequency: Annually
Data used:
- Flood risk scores (high/medium/low)
- Storm and natural disaster exposure (where available by country)
- Bushfire/wildfire risk (where available by country)
Insurance estimates: Regional base rates + risk surcharges (derived from market averages).
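A base-plus-surcharge estimate might look like the sketch below. The surcharge percentages are placeholders rather than real market figures, and treating surcharges as fractions of the regional base rate is an assumption, since the text does not say whether they are percentages or fixed amounts.

```python
from typing import Optional

# Hypothetical surcharges, expressed as a fraction of the regional base premium.
RISK_SURCHARGE = {"low": 0.0, "medium": 0.10, "high": 0.25}


def estimate_annual_premium(
    regional_base: float,
    flood_risk: Optional[str] = None,
    storm_risk: Optional[str] = None,
    wildfire_risk: Optional[str] = None,
) -> float:
    """Regional base premium plus a surcharge for each available risk score."""
    surcharge = 0.0
    for risk in (flood_risk, storm_risk, wildfire_risk):
        # A risk score of None means no data for that country; add nothing.
        if risk is not None:
            surcharge += RISK_SURCHARGE[risk]
    return regional_base * (1.0 + surcharge)
```

With these placeholder values, a property with a 1,000 base premium and high flood risk would be estimated at 1,250.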
8. Utility Costs
Sources: Energy regulators and utility pricing databases
Update frequency: Monthly to quarterly
Data used:
- Electricity rates by region
- Natural gas rates by region
Calculation: Multiply rates by typical consumption for property size.
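The rate-times-consumption calculation can be sketched as follows. The consumption table is entirely hypothetical, and keying it by bedroom count is an assumption about what "property size" means here.

```python
# Hypothetical typical monthly consumption by bedroom count:
# (electricity in kWh, natural gas in billing units).
TYPICAL_CONSUMPTION = {
    1: (400.0, 20.0),
    2: (600.0, 40.0),
    3: (800.0, 60.0),
}


def monthly_utility_cost(
    electricity_rate: float, gas_rate: float, bedrooms: int
) -> float:
    """Multiply regional rates by typical consumption for the property size."""
    # Assumption: sizes outside the table are clamped to the nearest known size.
    size = min(max(bedrooms, min(TYPICAL_CONSUMPTION)), max(TYPICAL_CONSUMPTION))
    kwh, gas_units = TYPICAL_CONSUMPTION[size]
    return electricity_rate * kwh + gas_rate * gas_units
```

For a two-bedroom property at 0.25 per kWh and 1.50 per gas unit, the placeholder table gives 150 for electricity plus 60 for gas.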
Data Quality & Validation
Automated Realism Checks
We run multiple validation checks on every dataset update:
- Regional medians vs official benchmarks: Flags significant deviations
- Bedroom monotonicity: Rent/price should increase with bedrooms
- Property type consistency: House sale price ≥ apartment sale price
- Financial sanity: Rent-to-price ratios within reasonable ranges
- Anomalies: Unlikely combinations (e.g. very high rent + very low price)
- Regulations: Spot-check known jurisdictions
- Occupancy patterns: Tourist areas should generally exceed rural areas
- STR tax coverage: Checks that most areas have non-zero tax data
- Property tax outliers: Flags unusually high effective rates
- Data completeness: Core columns not null/zero
Current validation status: All automated checks passing
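Three of the checks listed above (bedroom monotonicity, rent-to-price sanity, and data completeness) are simple enough to sketch directly. The function names and the ratio bounds below are illustrative assumptions, not the thresholds actually used.

```python
from typing import Dict, List, Mapping, Sequence


def check_bedroom_monotonic(value_by_bedrooms: Dict[int, float]) -> List[str]:
    """Flag any bedroom count whose rent or price is below the next smaller one."""
    issues = []
    items = sorted(value_by_bedrooms.items())
    for (beds_a, val_a), (beds_b, val_b) in zip(items, items[1:]):
        if val_b < val_a:
            issues.append(f"{beds_b}BR ({val_b}) is below {beds_a}BR ({val_a})")
    return issues


def check_rent_to_price(
    monthly_rent: float, price: float, low: float = 0.002, high: float = 0.02
) -> bool:
    """True if monthly rent / price falls inside the (hypothetical) bounds."""
    if price <= 0:
        return False
    return low <= monthly_rent / price <= high


def check_complete(row: Mapping[str, object], core_columns: Sequence[str]) -> List[str]:
    """Return the core columns that are missing, null, or zero."""
    return [col for col in core_columns if row.get(col) in (None, 0)]
```

Each function returns its findings instead of raising, so a dataset update can collect every flagged row in one pass before anyone decides whether a deviation is an error or a real market quirk.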
Data Accuracy
What we aim for:
- Data sourced from authoritative government and industry sources where available
- Methodology disclosed and documented
- Automated validation checks on every update
- Government source links included for regulations where available
What we don't claim:
- Real-time accuracy (data lags by 1 month to 2 years depending on source)
- Neighborhood-level precision (we provide area-level aggregates)
- Future predictions (we show current/historical data only)
- Legal advice (regulations are informational, not legal guidance)
If you find data errors or have questions about methodology, please contact us.