1. What data we use
Community Pulse is built from a scrape of four subreddits where people who live in, or are thinking about moving to, Dubai actually talk about neighborhoods: r/dubai, r/UAE, r/expats, and r/digitalnomad.
We pulled roughly 16,000 raw posts and comments spanning 2021 through 2026. After dropping deleted or removed items, off-topic chatter, non-Dubai locations (Yas Island, Lebanon, etc.), advertiser-heavy subreddits (dubairealestate, dubaiclassifieds, dubairentals), and items flagged as listings, we're left with roughly 12,600 items that go to the AI scorer.
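The pre-scoring cleanup described above can be pictured as a simple keep-or-drop predicate. This is a minimal sketch, not the production filter: the `RawItem` fields, the `is_listing` and `is_dubai_area` flags, and the deleted-item markers are assumptions made for illustration, and the off-topic check is left out because the text does not say how it is done.

```python
from dataclasses import dataclass

# Advertiser-heavy subreddits named in the text.
EXCLUDED_SUBREDDITS = {"dubairealestate", "dubaiclassifieds", "dubairentals"}
# Assumed placeholder bodies for deleted or removed items.
DELETED_MARKERS = {"[deleted]", "[removed]", ""}


@dataclass
class RawItem:
    item_id: str
    subreddit: str
    body: str
    is_listing: bool = False     # hypothetical flag from an upstream listing detector
    is_dubai_area: bool = True   # hypothetical flag: the tagged location is in Dubai


def keep_for_scoring(item: RawItem) -> bool:
    """Return True if the item should be passed on to the scorer."""
    if item.body.strip() in DELETED_MARKERS:
        return False
    if item.subreddit.lower() in EXCLUDED_SUBREDDITS:
        return False
    if item.is_listing or not item.is_dubai_area:
        return False
    return True
```

Applied to the raw scrape, a predicate like this is what takes the sample from roughly 16,000 items down to the roughly 12,600 that get scored.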
After scoring and applying a confidence threshold (more on that below), 4,051 items remain, covering 72 Dubai areas. Each area page shows its own sample size and its trend over time.
A note on volume. The Reddit conversation about Dubai real estate has roughly tripled between 2024 and 2026. That means newer years have more data, which can make an area look like it's suddenly trending when it's really just being discussed more. We mitigate this by reporting mean sentiment (not raw counts) and suppressing any quarter with fewer than 5 on-topic mentions.
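To make the suppression rule concrete, here is a small sketch of per-quarter aggregation with the 5-mention floor. The quarter labels and the function name are assumptions, and it uses a plain unweighted mean for brevity rather than the weighting described in section 2.

```python
from collections import defaultdict
from statistics import mean

MIN_MENTIONS_PER_QUARTER = 5  # threshold stated in the text


def quarterly_mean_sentiment(items):
    """items: iterable of (quarter_label, sentiment) pairs for one area.

    Returns {quarter: mean sentiment, or None if the quarter is suppressed}.
    """
    by_quarter = defaultdict(list)
    for quarter, sentiment in items:
        by_quarter[quarter].append(sentiment)
    return {
        quarter: (mean(values) if len(values) >= MIN_MENTIONS_PER_QUARTER else None)
        for quarter, values in sorted(by_quarter.items())
    }
```

Because each quarter reports a mean, a quarter with 200 mentions and one with 20 sit on the same scale. Only quarters below the floor disappear from the chart.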
2. How we score
Every post and comment that survives the filtering above goes through Anthropic's Claude Haiku 4.5. We give the model the item text plus the area it was tagged with, and ask it to return a structured JSON object:
- on_topic — is this actually about this area, or noise?
- sentiment — -1 (very negative) to +1 (very positive)
- aspects — up to four tags from a fixed vocabulary (price, traffic, community, noise, schools, etc.)
- pros / cons — up to three short paraphrased claims each
- persona_signal — resident, considering a move, investor, tourist, professional commenter, or unknown
- confidence — 0 to 1, how sure the model is
We only use items where on_topic = true AND confidence ≥ 0.5. Everything else is discarded. Typically only about 30–40% of scored items survive this filter (4,051 of 12,644 in this version, roughly 32%). Area pages show both numbers so you can see the drop-off.
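As an illustration, the scored object and the keep rule could be modeled as below. The field names follow the list above. The class name, the defaults, and the `retention_rate` helper are assumptions for this sketch.

```python
from dataclasses import dataclass, field

CONFIDENCE_THRESHOLD = 0.5  # threshold stated in the text


@dataclass
class ScoredItem:
    on_topic: bool
    sentiment: float                              # -1.0 (very negative) to +1.0 (very positive)
    confidence: float                             # 0.0 to 1.0
    aspects: list = field(default_factory=list)   # up to four tags
    pros: list = field(default_factory=list)      # up to three paraphrased claims
    cons: list = field(default_factory=list)      # up to three paraphrased claims
    persona_signal: str = "unknown"


def is_usable(item: ScoredItem) -> bool:
    """Keep only on-topic items at or above the confidence threshold."""
    return item.on_topic and item.confidence >= CONFIDENCE_THRESHOLD


def retention_rate(kept: int, scored: int) -> float:
    """Share of scored items that survive the filter."""
    return kept / scored
```

With the figures reported for this version, `retention_rate(4051, 12644)` comes out at roughly 0.32.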
Sentiment for each area is the weighted mean of individual-item sentiment, weighted by confidence × log(1 + Reddit upvotes). High-confidence, well-upvoted comments carry more weight than low-confidence throwaway ones — but zero-score items still count.
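A sketch of that weighting follows, with two assumptions flagged loudly. Taken literally, confidence × log(1 + upvotes) gives a zero-upvote item a weight of exactly zero, which would sit awkwardly with "zero-score items still count". The sketch therefore adds a floor of 1 to the log term. It also clamps negative scores to zero so the logarithm is defined. Neither detail is confirmed by the text.

```python
import math


def item_weight(confidence: float, upvotes: int) -> float:
    # Assumption: clamp negative Reddit scores to 0 so the log is defined.
    votes = max(upvotes, 0)
    # Assumption: the +1.0 floor keeps zero-upvote items at a nonzero weight.
    return confidence * (1.0 + math.log1p(votes))


def weighted_sentiment(items):
    """items: iterable of (sentiment, confidence, upvotes) tuples for one area.

    Returns the weighted mean sentiment, or None if there is no weight at all.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for sentiment, confidence, upvotes in items:
        weight = item_weight(confidence, upvotes)
        weighted_sum += weight * sentiment
        total_weight += weight
    return weighted_sum / total_weight if total_weight > 0 else None
```

The logarithm is what keeps a single viral comment from drowning out everything else: going from 10 to 1,000 upvotes roughly triples the log term instead of multiplying it by 100.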
The 3-year sentiment trend is a linear-regression slope over the last 4 available quarters of rolling sentiment, expressed per year.
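The slope calculation can be sketched as an ordinary least-squares fit over evenly spaced quarters, multiplied by 4 to express it per year. The text calls this a 3-year trend but fits it over the last 4 available quarters. The sketch follows the 4-quarter wording and leaves the window as a parameter. Treating the "available" quarters as evenly spaced, even when a suppressed quarter falls between them, is an assumption.

```python
def trend_per_year(quarterly_values, window=4):
    """Least-squares slope over the last `window` quarters, scaled to per year.

    quarterly_values: sentiment values in chronological order, with
    suppressed quarters already removed. Returns None if fewer than two
    points are available.
    """
    ys = list(quarterly_values)[-window:]
    n = len(ys)
    if n < 2:
        return None
    xs = range(n)  # quarters indexed 0, 1, 2, ...
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope_per_quarter = sxy / sxx
    return slope_per_quarter * 4  # four quarters per year
```

A sentiment series that rises by 0.1 each quarter therefore reports a trend of about +0.4 per year.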
3. What this does NOT tell you
Community Pulse is useful because it's the lived-experience signal that listings sites never show. But it's also narrow in five ways you should keep in mind.
This is Reddit, not Dubai
Our sample is English-speaking and expat-heavy. Long-term Emirati residents, Arabic-speaking tenants, labour-camp residents, and most of the city's working population are under-represented or absent. Treat this as a slice, not a consensus.
Sample sizes vary — a lot
Dubai Marina has 800+ analyzed mentions. Al Barari has 40. Newer communities may have single-digit coverage. We suppress any quarterly data point with fewer than 5 items, and leaderboards require at least 30 on-topic mentions, but small-n noise is still possible in aspect breakdowns. Trust the direction more than the decimal.
AI scoring is imperfect
Claude gets most things right and often picks up sarcasm and implicit sentiment. It also makes mistakes. We paraphrase every claim rather than quoting the user verbatim, and every pull quote links to the original Reddit thread so you can judge for yourself. If a claim looks wrong, the evidence is one click away.
Generic area names over-match
Some Dubai areas have names that are also common English phrases — "City Walk", "Downtown", "The Valley", "Meadows", "Town Square". The raw scraper matches on the phrase, which can grab discussions that aren't actually about that community. Claude's on-topic filter catches most of these, but expect slightly higher noise on these areas.
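A toy version of phrase matching shows why this happens. The matcher below is an assumption about how a raw scraper might tag areas, not the real one. It matches whole words case-insensitively and has no way to know which city a sentence is about.

```python
import re

# Area names the text lists as prone to over-matching.
GENERIC_AREA_NAMES = ["City Walk", "Downtown", "The Valley", "Meadows", "Town Square"]


def candidate_areas(text, area_names=GENERIC_AREA_NAMES):
    """Return the area names whose phrase appears in the text as whole words."""
    hits = []
    for name in area_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(name)
    return hits
```

A comment such as "We walked through downtown Toronto last week" is tagged "Downtown" by this matcher, which is exactly the kind of item the on-topic filter has to reject afterwards.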
Astroturfing exists — we don't filter it
Real estate brokers are active on Reddit. Some of what looks like resident enthusiasm or resident complaint is professionally posted. v1 of this system has no bot filter. We label comments as "professional commenter" when the AI can tell, and the persona mix for each area shows you how much of the discussion looks industry-driven.
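The persona mix is just a share-of-total calculation over the persona_signal labels. A minimal sketch, assuming the labels are stored as plain strings (the exact spellings used here are illustrative):

```python
from collections import Counter


def persona_mix(personas):
    """personas: iterable of persona_signal labels for one area.

    Returns {label: share of items}, or an empty dict if there are no items.
    """
    counts = Counter(personas)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}
```

An area where the "professional commenter" share is unusually high is one where the sentiment figure deserves extra scepticism.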
4. How often this updates
The scrape-and-score pipeline runs on a periodic cadence (monthly for now; we may go weekly as costs and usefulness justify). Each area page shows its own "last updated" date near the bottom.
Re-scoring the same items is wasteful, so the pipeline is incremental: new items get scored, existing items keep their original score unless we intentionally rerun them.
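The incremental behaviour amounts to a cache keyed by item ID. The sketch below is illustrative only: the function name, the dict-based cache, and the `force_rescore` parameter are assumptions, and `score_fn` stands in for the call to the model.

```python
def score_incrementally(items, cache, score_fn, force_rescore=frozenset()):
    """Score only items that are new or explicitly marked for a rerun.

    items: dict of item_id -> text from the latest scrape.
    cache: dict of item_id -> stored score; updated in place.
    score_fn: callable that takes text and returns a score object.
    force_rescore: item IDs to rescore even if already cached.
    Returns the list of item IDs that were scored in this run.
    """
    newly_scored = []
    for item_id, text in items.items():
        if item_id in cache and item_id not in force_rescore:
            continue  # keep the original score
        cache[item_id] = score_fn(text)
        newly_scored.append(item_id)
    return newly_scored
```

An intentional rerun, for example after a prompt change, would pass the affected IDs in `force_rescore`.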
Methodology version: v1.0. If the scoring prompt, confidence threshold or weighting changes, the version increments and we'll note it here.
5. Report a problem
If a pull quote misrepresents the thread it links to, or an area page draws a conclusion that the underlying posts don't support, we want to know. Your report helps us sharpen the prompt and re-score the offending items.
Email hello@dubuy.ai with:
- the area page URL
- the specific claim or quote
- what you think it should say instead (or a pointer to the Reddit thread that contradicts it)
As of this version:
- Raw items considered: 12,644
- On-topic, high-confidence: 4,051
- Dubai areas covered: 72
- Methodology version: v1.0