What The Fact
AI Model Constraints
Accepted Trade-off
LLM analysis is bounded by the model's training cutoff
“LLM analysis is based on training data (cutoff 2024) — models do not have internet access or real-time fact databases.”
Why this exists
Local language models reason from their training corpus; they cannot look up events that post-date that corpus or query a live fact store. This is a property of how the models are run, not a configuration bug.
What we do about it
Facts are taken from the live article text, not the model's memory: summaries are extractive (drawn from the published body), and the LLM is used for framing, tone and bias interpretation rather than as a source of factual claims. A retrieval-augmented fact-check path (grounding on a live evidence store) is on the research roadmap.
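As an illustration of what "extractive" means here, the sketch below picks whole sentences out of the article body and returns them unchanged, so nothing in the summary can come from a model's memory. The source does not say how sentences are selected; the word-frequency scoring and the name `extractive_summary` are assumptions made for this example only.

```python
import re
from collections import Counter


def extractive_summary(body, max_sentences=2):
    """Return up to max_sentences sentences copied verbatim from body.

    Hypothetical scoring: a sentence scores the summed document frequency
    of its words longer than three letters. This is a stand-in, not the
    project's actual selection method.
    """
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", body) if s.strip()]
    words = re.findall(r"[a-z']+", body.lower())
    freq = Counter(w for w in words if len(w) > 3)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens if len(t) > 3)

    # Rank sentence indices by score, then restore original article order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]


body = (
    "The council approved the transit budget on Monday. "
    "Weather was mild. "
    "The transit budget adds two bus routes."
)
summary = extractive_summary(body, max_sentences=2)
```

Because every returned sentence is a substring of the published body, a reader can check each one against the article directly.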
Bias & Fairness
In Progress
Bias scores may not match human expert consensus
“Bias scoring reflects the LLM's interpretation of language patterns; it may not match human expert consensus in all cases.”
Why this exists
A single model's reading of loaded language is one signal among several. Reasonable analysts, and reasonable models, can disagree on where a given article sits on the political spectrum.
What we do about it
Bias is computed from multiple independent signals (LLM language analysis, outlet editorial-stance priors, and a separate DistilBERT NLI model) rather than one verdict, and the rationale is shown so readers can judge it themselves. Ongoing calibration measures the scorer against outlet editorial baselines.
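A minimal sketch of combining several signals into one score while keeping each signal's rationale visible. The three signal names mirror the ones listed above, but the score scale (-1.0 to +1.0), the weights, and the weighted-mean formula are assumptions for illustration; the source does not state how the signals are actually combined.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    name: str
    score: float      # assumed scale: -1.0 to +1.0
    weight: float     # hypothetical relative weight
    rationale: str    # short human-readable explanation


def combine_signals(signals):
    """Return (weighted mean score, per-signal rationale lines)."""
    total_weight = sum(s.weight for s in signals)
    if total_weight <= 0:
        raise ValueError("at least one signal must have positive weight")
    score = sum(s.score * s.weight for s in signals) / total_weight
    rationale = [f"{s.name}: {s.score:+.2f} ({s.rationale})" for s in signals]
    return score, rationale


# Illustrative values only, not measured output.
signals = [
    Signal("llm_language", -0.4, 0.5, "loaded wording in the lede"),
    Signal("outlet_prior", -0.2, 0.3, "editorial-stance prior"),
    Signal("distilbert_nli", 0.1, 0.2, "second-opinion NLI read"),
]
score, rationale = combine_signals(signals)
```

Returning the rationale alongside the number is the point of the design: the reader sees each contributing signal rather than a single unexplained verdict.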
Data Coverage
In Progress
Geolocation falls back to source province at low confidence
“Geolocation uses source province as fallback when AI confidence is low (<0.5).”
Why this exists
When the extractor cannot confidently locate a story, it attributes it to the publishing outlet's province rather than guessing a precise location it is not sure of.
What we do about it
Province-level fallbacks are labelled transparently rather than presented as precise geolocation. The L.O.C.A.L. agent's geo-extraction is being improved to raise the share of articles that clear the confidence threshold.
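The fallback rule can be sketched as follows. The 0.5 threshold comes from the disclosure above; the function name `resolve_location`, the `GeoResult` type, and the `precision` labels are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.5  # from the disclosure: fallback applies below 0.5


@dataclass
class GeoResult:
    location: str
    precision: str  # "extracted" or "province_fallback" (hypothetical labels)


def resolve_location(extracted: Optional[str], confidence: float,
                     source_province: str) -> GeoResult:
    """Use the extracted place only when confidence clears the threshold."""
    if extracted is not None and confidence >= CONFIDENCE_THRESHOLD:
        return GeoResult(extracted, "extracted")
    # Low confidence or no extraction: attribute to the outlet's province
    # and label the result so it is not mistaken for a precise location.
    return GeoResult(source_province, "province_fallback")


confident = resolve_location("Halifax", 0.82, "Nova Scotia")
uncertain = resolve_location("Halifax", 0.30, "Nova Scotia")
```

Carrying the `precision` label with the result is what lets the interface show a province-level fallback as such.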
Throughput & Coverage
In Progress
Not every article receives full LLM analysis
“Not all articles receive full AI analysis — only priority articles (top 20/cycle) get LLM treatment; all others receive heuristic + source-editorial bias scoring.”
Why this exists
Full LLM analysis is compute-bound. To keep the feed current, each ingest cycle prioritises the highest-signal articles for deep analysis and applies lighter heuristic + editorial scoring to the rest.
What we do about it
Analysis throughput is being scaled across the agent fleet, and a backlog queue works through lower-priority articles so that coverage approaches full over time. Until an article receives full LLM analysis, it is scored by transparent heuristic and outlet-editorial methods.
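A rough sketch of how one ingest cycle might split articles between the two paths. The limit of 20 comes from the disclosure above; the `priority_score` field and the function `plan_cycle` are hypothetical, since the source does not describe how priority is computed.

```python
import heapq

PRIORITY_SLOTS = 20  # from the disclosure: top 20 articles per cycle


def plan_cycle(articles, slots=PRIORITY_SLOTS):
    """Split (article_id, priority_score) pairs into two processing paths.

    Returns (full_llm_ids, heuristic_ids). Article ids are assumed unique.
    """
    top = heapq.nlargest(slots, articles, key=lambda a: a[1])
    top_ids = {article_id for article_id, _ in top}
    full_llm = [article_id for article_id, _ in top]
    # Everything else gets heuristic + outlet-editorial scoring for now
    # and stays eligible for the backlog.
    heuristic = [article_id for article_id, _ in articles if article_id not in top_ids]
    return full_llm, heuristic


# 25 illustrative articles with made-up priority scores 0..24.
articles = [(f"a{i}", float(i)) for i in range(25)]
full_llm, heuristic = plan_cycle(articles)
```

In this sketch the five lowest-scoring articles fall through to the heuristic path; a backlog worker would later pick them up for full analysis.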
Bias & Fairness
Accepted Trade-off
Outlet bias profiles are rolling averages, seeded for new outlets
“Outlet bias profiles are rolling averages; new outlets are seeded from editorial stance data and refined over time.”
Why this exists
An outlet's bias profile is a moving aggregate of its analysed coverage. A newly added outlet has little history, so it starts from a documented editorial-stance prior and converges as articles accumulate.
What we do about it
This is the intended design: profiles update as evidence accumulates rather than being fixed labels. Seed priors are sourced from published editorial-stance research and are superseded by measured behaviour as the sample grows.
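One way to picture a seeded profile that converges on measured behaviour is a running mean in which the seed prior counts as a fixed number of pseudo-observations. This is an assumption for illustration: the source says only that profiles are rolling averages seeded from editorial-stance data, and does not give the window, the weighting, or the class shown here.

```python
class OutletProfile:
    """Running bias average seeded with an editorial-stance prior.

    prior_weight is a hypothetical pseudo-count: how many analysed
    articles the seed prior is treated as being worth.
    """

    def __init__(self, seed_prior: float, prior_weight: float = 5.0):
        self.total = seed_prior * prior_weight
        self.weight = prior_weight

    @property
    def bias(self) -> float:
        return self.total / self.weight

    def update(self, article_score: float) -> float:
        """Fold one analysed article into the profile and return the new bias."""
        self.total += article_score
        self.weight += 1.0
        return self.bias


profile = OutletProfile(seed_prior=0.5)
initial_bias = profile.bias
# Illustrative data: 45 articles that all score 0.0.
for _ in range(45):
    profile.update(0.0)
converged_bias = profile.bias
```

With few articles the seed dominates; after 45 articles its share of the average has fallen to 5 out of 50, so the measured scores dominate instead.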
Data Coverage
Accepted Trade-off
Paywalled articles are analysed from title + summary only
“Paywalled articles (Postmedia, Torstar) have limited body text — AI analysis relies on title + Open Graph description only.”
Why this exists
Where a publisher gates the article body behind a paywall, the full text is not lawfully or technically available to ingest, so analysis is limited to the headline and the publicly served summary metadata.
What we do about it
Paywalled items are analysed only from the publicly available title and Open Graph description, and this reduced basis is disclosed rather than hidden. We do not bypass paywalls to obtain body text.
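A small sketch of choosing the analysis input and recording which basis was used, so the reduced basis can be disclosed alongside the result. The `Article` fields, the `analysis_input` function, and the basis labels are hypothetical names for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Article:
    title: str
    og_description: str   # Open Graph description served publicly
    body: Optional[str]   # None when no body text was ingested
    paywalled: bool


def analysis_input(article: Article) -> Tuple[str, str]:
    """Return (text to analyse, basis label) for one article."""
    if article.paywalled or article.body is None:
        # Use only what the publisher serves publicly; never the gated body.
        text = f"{article.title}\n{article.og_description}"
        return text, "title_and_og_description"
    return f"{article.title}\n{article.body}", "full_text"


gated = Article("Budget passes", "Council votes 7-2.", None, paywalled=True)
text, basis = analysis_input(gated)
```

Storing the basis label with each analysis is what allows the interface to flag that a score rests on the headline and summary alone.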
Bias & Fairness
In Progress
The DistilBERT signal is US-trained and used as a second opinion
“DistilBERT Signal 3 (US-trained NLI model) may not map cleanly to Canadian political framing; treated as a second opinion, not ground truth.”
Why this exists
The supplementary DistilBERT natural-language-inference model was trained primarily on US data, so its read of Canadian political framing can be imperfect.
What we do about it
Its output is weighted as one second-opinion signal, never as ground truth, and is cross-checked against the LLM and editorial signals. A Canadian-framing fine-tune is planned to improve its domain fit.