Data Model

Data Model

ScopeOut's database is structured around a four-table pipeline that carries a discovered product from a raw directory URL through to a fully scored, AI-analysed dossier.

Pipeline Overview

Directory URL
     │
     ▼
discovered_products   ← root entity, one row per product URL
     │
     ├──▶ product_scrape_data   ← raw scraped content (1:1)
     │
     ├──▶ product_scores        ← computed opportunity scores (1:1)
     │
     └──▶ product_dossiers      ← AI-generated strategic output (1:1)

Each of the three child tables is in a 1:1 relationship with discovered_products and is cascade deleted when the parent row is removed.

`discovered_products`

The root entity table. A row is inserted here as soon as a product URL is found in a supplier directory crawl.

Key Columns

Column	Type	Notes
`id`	text (PK)	UUID
`name`	text	Product name
`url`	text	Unique product URL
`source_directory`	text	The directory it was found in (e.g. `kerfuffle.com`)
`category`	text	Product category tag
`description`	text	Short description from the directory listing
`status`	enum	Lifecycle state (see below)
`created_at` / `updated_at`	timestamp	Auto-managed

Status Lifecycle

discovered
    │
    ▼
scraping ──▶ scraped
                │
                ▼
            scoring ──▶ scored
                            │
                            ▼
                   generating_dossier ──▶ complete
                            │
                            └──▶ error  (any stage)

Indexes

status
source_directory
category

`product_scrape_data`

Populated by the scraper once a product page has been fetched and parsed.

Key Columns

Column	Type	Notes
`product_id`	text (FK)	References `discovered_products.id`
`rating`	numeric(3,2)	Average star rating
`review_count`	integer	Total number of reviews
`reviews_json`	JSONB	Array of review objects (see below)
`pricing_tier`	enum	`free \| freemium \| paid \| enterprise \| unknown`
`features_json`	JSONB	String array of listed features
`raw_html`	text	Full page HTML for reprocessing

`reviews_json` Shape

[
  {
    author: string;
    rating: number;
    text: string;
    date: string;
    sentiment: string; // e.g. "positive" | "negative" | "neutral"
  }
]

Indexes

product_id
pricing_tier

`product_scores`

Stores the four-dimension opportunity scores computed for each product, plus a composite rank used to drive the dashboard.

Key Columns

Column	Type	Range	Description
`product_id`	text (FK)	—	References `discovered_products.id`
`replicability`	integer	0–100	How buildable the product is with the internal stack
`market_demand`	integer	0–100	Review volume and sentiment signal
`revenue_potential`	integer	0–100	Pricing tier and addressable market estimate
`competitive_gap`	integer	0–100	Insight derived from negative review patterns
`composite`	integer	0–100	Weighted overall opportunity score
`scoring_notes`	JSONB	—	Rationale per dimension

`scoring_notes` Shape

{
  replicability?: string;
  market_demand?: string;
  revenue_potential?: string;
  competitive_gap?: string;
}

Indexes

composite — primary sort key for the ranked dashboard

`product_dossiers`

AI-generated strategic output, created for products that clear the scoring threshold.

Key Columns

Column	Type	Notes
`product_id`	text (FK)	References `discovered_products.id`
`mission_statement`	text	Auto-drafted one-paragraph product mission
`feature_list`	JSONB	Prioritised feature array (see below)
`weakness_analysis`	text	Competitor weakness narrative from negative review data
`generated_at`	timestamp	When the dossier was generated

`feature_list` Shape

[
  {
    name: string;
    description: string;
    priority: "must-have" | "should-have" | "nice-to-have";
  }
]

TypeScript Types

All four tables expose Drizzle-inferred types for safe query authoring:

import {
  // discovered_products
  DiscoveredProduct,
  NewDiscoveredProduct,

  // product_scrape_data
  ProductScrapeData,
  NewProductScrapeData,

  // product_scores
  ProductScores,
  NewProductScores,

  // product_dossiers
  ProductDossier,
  NewProductDossier,

  // Hydrated join across all four tables
  FullProduct,
} from "@/db/schema";

FullProduct is a convenience union type for queries that join all four tables into a single hydrated result object.

Data Model

Pipeline Overview

discovered_products

Key Columns

Status Lifecycle

Indexes

product_scrape_data

Key Columns

reviews_json Shape

Indexes

product_scores

Key Columns

scoring_notes Shape

Indexes

product_dossiers

Key Columns

feature_list Shape

TypeScript Types

`discovered_products`

`product_scrape_data`

`reviews_json` Shape

`product_scores`

`scoring_notes` Shape

`product_dossiers`

`feature_list` Shape