Data Model
Supershyft OS uses a combination of D1 (SQLite), R2 (object storage), and Vectorize (embeddings) for data persistence.
D1 Tables
Companies
Primary entity storing company information.
sql
CREATE TABLE companies (
id TEXT PRIMARY KEY, -- 'cmp_' + nanoid()
name TEXT NOT NULL,
country TEXT,
revenue_eur REAL, -- in M€
revenue_growth REAL, -- percentage
ebitda_margin REAL, -- percentage
employee_count INTEGER,
llm_score_overall INTEGER, -- 0-10,000
status TEXT DEFAULT 'active',
source TEXT, -- Data provenance
created_at TEXT DEFAULT CURRENT_TIMESTAMP,
updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);Indexes:
sql
CREATE INDEX idx_companies_name ON companies(name);
CREATE INDEX idx_companies_status ON companies(status);
CREATE INDEX idx_companies_source ON companies(source);Theses
Investment thesis definitions.
sql
CREATE TABLE theses (
id TEXT PRIMARY KEY, -- 'th_' + nanoid()
title TEXT NOT NULL,
code TEXT UNIQUE, -- e.g., "TH-001"
content TEXT, -- Markdown report
status TEXT DEFAULT 'active',
created_at TEXT DEFAULT CURRENT_TIMESTAMP,
updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);Dealflow
Pipeline tracking for companies.
sql
CREATE TABLE dealflow (
id TEXT PRIMARY KEY, -- 'df_' + nanoid()
company_id TEXT NOT NULL,
stage TEXT NOT NULL, -- 'initial', 'qualified', etc.
probability INTEGER, -- 0-100
estimated_value_eur REAL,
next_action TEXT,
next_action_date TEXT,
owner_email TEXT,
notes TEXT,
created_at TEXT DEFAULT CURRENT_TIMESTAMP,
updated_at TEXT DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (company_id) REFERENCES companies(id)
);Stages:
initial- Initial contactqualified- Meets criteriameeting- Meeting scheduleddiligence- Due diligenceoffer- Offer stageclosed_won- Deal wonclosed_lost- Deal lost
Activity Log
Audit trail for all actions.
sql
CREATE TABLE activity_log (
id TEXT PRIMARY KEY, -- 'act_' + nanoid()
user_email TEXT NOT NULL,
action TEXT NOT NULL, -- 'company_created', 'stage_changed'
entity_type TEXT, -- 'company', 'thesis'
entity_id TEXT,
metadata TEXT, -- JSON string
created_at TEXT DEFAULT CURRENT_TIMESTAMP
);Common Actions:
company_createdcompany_updateddealflow_stage_changedembeddings_generatedbulk_action_started
Bulk Actions
Background job tracking.
sql
CREATE TABLE bulk_actions (
id TEXT PRIMARY KEY, -- 'ba_' + nanoid()
user_email TEXT NOT NULL,
action_type TEXT NOT NULL, -- 'generate_embeddings'
status TEXT DEFAULT 'pending', -- 'pending', 'running', 'completed'
total_items INTEGER,
processed_items INTEGER DEFAULT 0,
created_at TEXT DEFAULT CURRENT_TIMESTAMP,
completed_at TEXT
);Users
User profiles and preferences.
sql
CREATE TABLE users (
email TEXT PRIMARY KEY,
name TEXT,
avatar_url TEXT,
role TEXT DEFAULT 'analyst', -- 'analyst', 'partner', 'admin'
preferences TEXT, -- JSON string
created_at TEXT DEFAULT CURRENT_TIMESTAMP,
updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);R2 Storage
Company Profiles
Path: companies/{company_id}/profile.json
json
{
"id": "cmp_abc123",
"name": "Acme Corp",
"enrichment": {
"description": "AI-generated company description",
"products": ["Product A", "Product B"],
"competitors": ["Comp 1", "Comp 2"],
"technology_stack": ["React", "Node.js"]
},
"sources": {
"website": "https://acme.com",
"linkedin": "https://linkedin.com/company/acme"
}
}Source Data
Path: companies/{company_id}/source.csv
Original import data for audit trail.
Vectorize
Company Embeddings
Index: company-embeddingsDimensions: 768 (Workers AI @cf/baai/bge-base-en-v1.5)
typescript
interface VectorMetadata {
company_id: string
name: string
description: string
created_at: string
}ID Generation
All IDs use nanoid with prefixes:
cmp_- Companiesth_- Thesesdf_- Dealflowact_- Activity logba_- Bulk actions
typescript
import { nanoid } from 'nanoid'
const companyId = `cmp_${nanoid()}`Relationships
mermaid
erDiagram
COMPANIES ||--o{ DEALFLOW : "tracks"
COMPANIES ||--o{ ACTIVITY_LOG : "logs"
USERS ||--o{ ACTIVITY_LOG : "performs"
USERS ||--o{ BULK_ACTIONS : "initiates"Data Consistency
- Foreign Keys - Enforced in D1
- Transactions - Used for multi-table updates
- Optimistic Locking -
updated_attimestamp checks - Activity Logging - All mutations logged
Migrations
Located in apps/api-worker/migrations/:
bash
0001_initial.sql
0002_add_theses.sql
0003_add_dealflow.sqlApply with:
bash
wrangler d1 migrations apply supershyft-os --local