A FastAPI service that scrapes SEC Form 4 insider trading data from openinsider.com, stores it in SQLite, and serves it through a secured, rate-limited REST API — built as a backend data layer for my trade journal.
While building a personal trade journal, I wanted to track what company insiders (executives, directors, and large shareholders) were doing with their own stocks. The idea: if a CEO is buying shares, that's a signal worth watching.
The data is publicly available via SEC Form 4 filings, and openinsider.com aggregates them well. But I needed it programmatically; queryable by ticker, date range, and trade type without hammering a third-party site on every request.
So I built a dedicated API service to scrape, store, and serve the data.
Open Insider Trades is a FastAPI service that:
routers/ ← HTTP endpoints (auth, admin, transactions)
services/ ← Business logic (scraping, auth, rate limiting)
models/ ← SQLAlchemy ORM definitions
schemas/ ← Pydantic validation models
The scraper uses BeautifulSoup to parse HTML tables from openinsider.com, extracting 13 fields per trade: filing date, trade date, ticker, company name, insider name, title, trade type, price, quantity, shares owned, delta, and total value.
Data is written to CSV files first, then bulk-imported into SQLite - keeping the scraping step decoupled from the database write.
client, admin, and super_admin. POST /auth/token # login
POST /auth/refresh # get a new access token
POST /auth/logout # revoke token
A sliding window rate limiter runs on every request. Redis is the primary store; with a SQLite fallback if Redis is unavailable. Limits are role-based:
| Role | Limit |
|---|---|
| client | 5 req/min |
| admin | 50 req/min |
| super_admin | 100 req/min |
A background scheduler runs every morning to pull the previous day's trades. It can also be triggered manually via an admin endpoint for backfilling historical data.
GET /admin/bootstrap?start_year=2013 # full historical import
GET /admin/daily_sync # manual trigger
GET /insider_trades/{ticker} # trades by ticker
GET /insider_trades?from_date=... # trades by date range
Both support optional to_date and transaction_type filters (Buy P, Sell S, Sell+OE SOE).
| Layer | Tool |
|---|---|
| Framework | FastAPI + Uvicorn |
| ORM | SQLAlchemy 2.0 |
| Database | SQLite |
| Auth | python-jose (JWT) + passlib (bcrypt) |
| Cache / Rate Limiting | Redis (SQLite fallback) |
| Scraping | BeautifulSoup4 + Requests |
| Scheduling | APScheduler |
| Containerization | Docker + Docker Compose |
FastAPI's lifespan context manager is the right place for startup tasks since connecting to Redis, seeding the super admin, and starting the scheduler all happen there in one clean block.
ThreadPoolExecutor for concurrent scraping made historical imports significantly faster. Each month is scraped in a separate thread, which matters when pulling 10+ years of data bootstrapping.
SQLite as a rate limiting fallback is pragmatic for a hobby project but there is Redis cache, but the service still works without it. Good for local dev.
HttpOnly cookies for refresh tokens prevent JavaScript from ever reading the token.