# Caching System

## Why cache exists
OSINT APIs have rate limits, and some are paid. Investigating the same IOC twice in one day wastes quota. The cache solves this: the first analysis hits the API; subsequent ones read from the local cache.
## How it works
Each call to an OSINT tool passes through a transparent cache layer:
```
virustotal_ip_lookup("185.220.101.34")
        ↓
Is it in the cache and not expired?
 ├─ YES → return the cached result (fast, no API call)
 └─ NO  → call the real API → save the result in the cache → return the result
```

The cache key is the hash of `(tool_name, normalized_input)`. The value is the complete JSON result.
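The lookup flow above can be sketched as a get-or-call wrapper. This is a minimal sketch with hypothetical names; the real layer persists to `data/osint_cache.db` rather than an in-memory dict:

```python
import hashlib
import time

_cache = {}  # in-memory stand-in for data/osint_cache.db
CACHE_TTL_SECONDS = 86400

def cache_key(tool_name, raw_input):
    """Key = hash of (tool_name, normalized_input)."""
    normalized = raw_input.strip().lower()
    return hashlib.sha256(f"{tool_name}:{normalized}".encode()).hexdigest()

def cached_call(tool_name, raw_input, api_fn):
    """Return a fresh cached result, or call the API and store the result."""
    key = cache_key(tool_name, raw_input)
    entry = _cache.get(key)
    if entry and time.time() - entry["ts"] < CACHE_TTL_SECONDS:
        return entry["result"]          # hit: no API call
    result = api_fn(raw_input)          # miss or expired: real API call
    _cache[key] = {"ts": time.time(), "result": result}
    return result
```

Because the key is built from the normalized input, `"185.220.101.34"` and `" 185.220.101.34 "` resolve to the same cache entry.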
## Configuration
```
CACHE_TTL_SECONDS=86400   # Cache lifetime (default: 24 hours)
```

To adjust the duration:

```
# 1-hour cache (for very volatile data)
CACHE_TTL_SECONDS=3600

# 7-day cache (for long investigations with a limited API budget)
CACHE_TTL_SECONDS=604800

# No cache (always call the API)
CACHE_TTL_SECONDS=0
```

## View cache statistics
```
osint> /cache stats
OSINT Cache
 ├─ Total entries: 247
 ├─ Current entries: 189
 ├─ Expired entries: 58
 ├─ Disk size: 2.3 MB
 └─ Session hit rate: 34%
```

## Clear the cache
```
osint> /cache clear
```

Or directly:

```
rm data/osint_cache.db
```

The database is automatically recreated on the next startup.
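Automatic recreation is typically a create-if-missing step at startup. A sketch, assuming a simple SQLite schema (the actual schema and table name are not documented here):

```python
import sqlite3

def open_cache(path="data/osint_cache.db"):
    """Open the cache DB, creating the file and schema if they are missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS cache (
               key        TEXT PRIMARY KEY,   -- hash of (tool_name, normalized_input)
               created_at REAL NOT NULL,      -- Unix timestamp, used for TTL checks
               result     TEXT NOT NULL       -- complete JSON result
           )"""
    )
    conn.commit()
    return conn
```

With `CREATE TABLE IF NOT EXISTS`, deleting the file is always safe: the next startup rebuilds an empty cache.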
## Cache per tool
Each tool has its own implicit TTL logic:
| Tool | Recommended TTL | Reason |
|---|---|---|
| VirusTotal IP/Domain | 24h | Reputation changes slowly |
| AbuseIPDB | 12h | Reports are more frequent |
| Shodan | 48h | Services change little |
| Threat Feeds | 1h | Feeds update constantly |
| WHOIS | 7 days | Records very stable |
| DNS | 1h | DNS TTLs are short |
| GDELT | 30min | News in real time |
With a global TTL of 24h you get a good balance between freshness and API call savings.
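The per-tool recommendations could be layered on top of the global default like this. This is a hypothetical sketch; the tool as documented only honors the single global `CACHE_TTL_SECONDS`:

```python
# Recommended per-tool TTLs in seconds (from the table above)
TOOL_TTLS = {
    "virustotal":   24 * 3600,
    "abuseipdb":    12 * 3600,
    "shodan":       48 * 3600,
    "threat_feeds":  1 * 3600,
    "whois":     7 * 24 * 3600,
    "dns":           1 * 3600,
    "gdelt":        30 * 60,
}

GLOBAL_TTL = 86400  # CACHE_TTL_SECONDS fallback

def ttl_for(tool_name):
    """Per-tool TTL when one is recommended, otherwise the global default."""
    return TOOL_TTLS.get(tool_name, GLOBAL_TTL)
```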
## Impact on rate limiting
Batch mode with active cache can process long lists without exhausting quota:
- First run: N API calls (one per unique IOC)
- Subsequent runs within 24h: 0 API calls
This is especially useful for daily feed analysis: IOCs that repeat on consecutive days don't consume additional quota.
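The quota savings can be estimated with a small model. Assuming a 24h TTL, only IOCs not cached from the previous day's run cost an API call:

```python
def api_calls_needed(daily_ioc_lists):
    """Estimate API calls across consecutive daily runs with a 24h TTL:
    an IOC already cached from the previous day costs nothing."""
    calls = 0
    fresh = set()  # IOCs cached within the last 24 hours
    for day in daily_ioc_lists:
        todays = set(day)              # deduplicate within the run
        calls += len(todays - fresh)   # only uncached IOCs hit the API
        fresh = todays                 # older entries expire before the next run
    return calls
```

For example, two daily runs sharing one IOC (`["a", "b"]` then `["b", "c"]`) cost three API calls instead of four.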