Analysis Modules

The three advanced analysis modules in src/analysis/ go beyond querying APIs: they run their own algorithms on the collected data.

Anomaly Detector

Detects whether an entity has unusually high mention activity (news spike, sudden escalation).

Algorithm

Implements Welford’s algorithm to calculate mean and variance incrementally without storing all historical data. When a new observation is more than N standard deviations above the mean, it is considered an anomaly.
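The incremental update described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the names RunningStats, z_score and is_anomaly, and the default threshold of 3.0 standard deviations, are assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean and variance via Welford's algorithm (no history stored)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # One-pass update: only count, mean and m2 are kept between calls.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; reported as 0.0 with fewer than two points.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def z_score(stats: RunningStats, x: float) -> float:
    """Z-score of x against the history seen so far (0.0 if std is zero)."""
    if stats.std == 0.0:
        return 0.0
    return (x - stats.mean) / stats.std


def is_anomaly(stats: RunningStats, x: float, threshold: float = 3.0) -> bool:
    # Hypothetical default threshold; only values ABOVE the mean count as spikes.
    return z_score(stats, x) > threshold


# Usage with made-up daily mention counts.
stats = RunningStats()
for daily_mentions in [10, 12, 9, 11, 10, 13, 11]:
    stats.update(daily_mentions)
print(is_anomaly(stats, 40))  # True: 40 is far above the running mean
print(is_anomaly(stats, 12))  # False: within normal variation
```

Checking the new observation before folding it into the running statistics keeps a spike from inflating its own baseline.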

from src.analysis.anomaly_detector import check_entity_anomaly
result = await check_entity_anomaly("Conti ransomware group")
# If there is a spike in mentions in GDELT/news:
# → {"anomaly": true, "z_score": 4.2, "reason": "Spike in mentions in last 24h"}

Use cases

  • Detect if a threat actor is more active than normal
  • Alert when a domain suddenly starts appearing heavily in news coverage
  • Monitor when a target company has unusual media activity

Market Correlation

Searches for statistical correlations between an IOC or event and market signals (cryptocurrencies, indices, commodities).

from src.analysis.market_correlation import correlate_with_market
result = await correlate_with_market(
    entity="LockBit ransomware",
    market="bitcoin",
    days=30
)

Use cases

  • Ransomware investigations: Is there a correlation between attacks by a group and Bitcoin movements?
  • Geopolitical analysis: How did this cyber attack affect the energy market?
  • Financial context in fraud investigation

Limitations

  • Correlation does not imply causation
  • Requires sufficient history (minimum 14 days of data)
  • Spurious correlations are common in high-volatility data
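To make the idea and its limitations concrete, here is a minimal sketch that assumes the correlation is a plain Pearson coefficient between two aligned daily series (for example mention counts and a market price). The document does not say which statistic correlate_with_market actually uses, so the pearson helper and the MIN_DAYS constant are illustrative assumptions; only the 14-day minimum comes from the list above.

```python
import math
from typing import Sequence

MIN_DAYS = 14  # minimum history stated in the limitations above


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two aligned daily series (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if len(xs) < MIN_DAYS:
        raise ValueError(f"need at least {MIN_DAYS} days of data, got {len(xs)}")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Usage with synthetic data: a series that moves exactly with the other.
mentions = [float(d) for d in range(14)]
price = [2.0 * m + 1.0 for m in mentions]
print(round(pearson(mentions, price), 6))  # 1.0
```

A coefficient near 1 or -1 on a short, volatile window is exactly the kind of result the limitations warn about, so it is best treated as a lead to investigate rather than a finding.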

Narrative Detector

Detects and groups narrative clusters in media coverage using GDELT data.

from src.analysis.narrative_detector import detect_narratives
result = await detect_narratives("Russia Ukraine cyber operations", days=7)

What it returns

  • Narrative clusters identified (e.g. “attributions to APT groups”, “impact on critical infrastructure”)
  • Most active sources in each narrative
  • Dominant tone by narrative
  • Temporal evolution of each narrative
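The per-narrative fields listed above can be pictured as a simple aggregation over articles that have already been assigned to a cluster. The sketch below covers only that aggregation step, not the clustering itself. The record layout (narrative, source, tone, date keys) and the summarize_narratives function are hypothetical and do not describe the module's real return format.

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize_narratives(articles):
    """Aggregate already-clustered articles per narrative (illustrative only)."""
    by_narrative = defaultdict(list)
    for article in articles:
        by_narrative[article["narrative"]].append(article)

    summary = {}
    for name, items in by_narrative.items():
        sources = Counter(a["source"] for a in items)
        per_day = Counter(a["date"] for a in items)
        summary[name] = {
            "articles": len(items),
            # Most active sources, most frequent first.
            "top_sources": [s for s, _ in sources.most_common(3)],
            # Average tone score; negative means more negative coverage.
            "mean_tone": mean(a["tone"] for a in items),
            # Article count per day, sorted by date, as a crude timeline.
            "per_day": dict(sorted(per_day.items())),
        }
    return summary


# Usage with made-up records.
sample = [
    {"narrative": "attribution", "source": "a.example", "tone": -4.0, "date": "2024-01-01"},
    {"narrative": "attribution", "source": "a.example", "tone": -2.0, "date": "2024-01-02"},
    {"narrative": "infrastructure", "source": "b.example", "tone": -6.0, "date": "2024-01-02"},
]
print(summarize_narratives(sample)["attribution"]["mean_tone"])  # -3.0
```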

Use cases

  • Disinformation campaign analysis: identify coordinated narratives
  • Geopolitical context: understand how different countries cover the same event
  • Attribution tracking: how a cyber attack attribution develops in the media

Integration with the agent

The ReAct agent can invoke these modules when it detects they are relevant:

osint> Analyze if the Lazarus group is more active than normal this month
→ check_entity_anomaly("Lazarus APT")
→ gdelt_entity_search("Lazarus Group")
→ detect_narratives("Lazarus North Korea cyber")

The results are integrated into the final report as intelligence context.
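One way to picture the tool invocation shown in the transcript above is a registry that maps tool names to async callables, which the agent then runs in order. The sketch below is an assumption about the general pattern, not the agent's real implementation: the TOOLS registry, run_plan and the stub functions are hypothetical, and the stubs stand in for the real modules so the example runs on its own.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

Tool = Callable[[str], Awaitable[dict]]


# Stubs standing in for the real analysis functions (hypothetical outputs).
async def _stub_anomaly(query: str) -> dict:
    return {"tool": "check_entity_anomaly", "query": query}


async def _stub_gdelt(query: str) -> dict:
    return {"tool": "gdelt_entity_search", "query": query}


async def _stub_narratives(query: str) -> dict:
    return {"tool": "detect_narratives", "query": query}


TOOLS: Dict[str, Tool] = {
    "check_entity_anomaly": _stub_anomaly,
    "gdelt_entity_search": _stub_gdelt,
    "detect_narratives": _stub_narratives,
}


async def run_plan(plan: List[Tuple[str, str]]) -> List[dict]:
    """Run each (tool name, argument) step in order and collect the results."""
    results = []
    for name, arg in plan:
        tool = TOOLS.get(name)
        if tool is None:
            raise KeyError(f"unknown tool: {name}")
        results.append(await tool(arg))
    return results


# The three steps from the transcript above.
plan = [
    ("check_entity_anomaly", "Lazarus APT"),
    ("gdelt_entity_search", "Lazarus Group"),
    ("detect_narratives", "Lazarus North Korea cyber"),
]
results = asyncio.run(run_plan(plan))
print([r["tool"] for r in results])
```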