Teachable Scraper — Media Asset Migration

Overview

When a course is imported from Teachable, the platform now downloads every image and document attachment referenced in the course content and re-uploads it to the platform's own storage (Vercel Blob). The asset URLs in the imported content are then rewritten to point to the new platform-hosted locations.

This ensures that, once an import completes with every asset migrated, no content in the platform depends on Teachable's CDN.

What Gets Migrated

The scraper identifies and migrates the following asset types during an import:

Asset Type	Examples
Images	Inline images embedded in lesson pages, section headers, descriptions
Document attachments	PDFs, slide decks, and other downloadable files attached to lessons

Video links (e.g. Wistia, Vimeo, YouTube embeds) are handled separately and are not downloaded — only their embed references are preserved.

How It Works

1. Asset Extraction

During the scrape phase, the import engine traverses the full course structure — lectures, lesson pages, and attachment lists — and collects every media URL originating from Teachable's CDN.
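
As a rough illustration of the extraction step, the sketch below collects src and href values from a lesson's HTML and keeps only those on a CDN host. It is not the import engine's actual code: the hostname cdn.teachable.example and the names AssetUrlCollector and extract_asset_urls are hypothetical, since this document does not list the real Teachable CDN hostnames.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical host list: the real Teachable CDN hostnames are not given here.
ASSUMED_CDN_HOSTS = {"cdn.teachable.example"}


class AssetUrlCollector(HTMLParser):
    """Collects src/href attribute values that point at an assumed CDN host."""

    def __init__(self, cdn_hosts):
        super().__init__()
        self.cdn_hosts = cdn_hosts
        self.urls = []  # first-seen order, duplicates skipped

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name not in ("src", "href") or not value:
                continue
            host = urlparse(value).hostname
            if host in self.cdn_hosts and value not in self.urls:
                self.urls.append(value)


def extract_asset_urls(html, cdn_hosts=ASSUMED_CDN_HOSTS):
    """Return the CDN-hosted asset URLs found in one HTML fragment."""
    collector = AssetUrlCollector(cdn_hosts)
    collector.feed(html)
    collector.close()
    return collector.urls
```

Because the filter is by hostname, a video embed such as an iframe pointing at a third-party player is left out of the result, which matches the rule that embeds are preserved rather than downloaded.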

2. Download from Teachable CDN

Each discovered asset is downloaded directly from the Teachable CDN at import time. This happens as part of the import job and requires no manual action.

3. Re-upload to Vercel Blob

Downloaded assets are uploaded to Vercel Blob under a path prefix scoped to the importing organization. The path structure isolates each tenant's assets:

/<org-id>/courses/<course-id>/assets/<filename>

This multi-tenant isolation ensures that assets belonging to one organization are never accessible under another organization's path.
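
The path layout above can be sketched as a small helper. This is a minimal illustration, not the platform's implementation: build_asset_path is a hypothetical name, and the rule that IDs are slug-like (letters, digits, underscore, hyphen) is an assumption added so that an ID can never contain a path separator. The actual Vercel Blob upload call is not shown.

```python
import posixpath
import re
from urllib.parse import unquote, urlparse

# Assumed rule: IDs are slug-like, so they cannot contain '/' or '..'.
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def build_asset_path(org_id, course_id, source_url):
    """Return '/<org-id>/courses/<course-id>/assets/<filename>' for one asset."""
    for label, value in (("org_id", org_id), ("course_id", course_id)):
        if not _SAFE_ID.fullmatch(value):
            raise ValueError(f"unsafe {label}: {value!r}")
    # Use the last path segment of the source URL; the query string is ignored.
    filename = posixpath.basename(unquote(urlparse(source_url).path))
    if filename in ("", ".", ".."):
        raise ValueError(f"no usable filename in {source_url!r}")
    return f"/{org_id}/courses/{course_id}/assets/{filename}"
```

The sketch does not handle two source assets that share a filename within one course; this document does not say how the platform resolves that case.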

4. URL Rewriting

Once an asset is successfully uploaded, every occurrence of the original Teachable CDN URL in the imported content is replaced with the new Vercel Blob URL. This rewrite is applied across:

Lesson body content (HTML/rich text)
Section and lecture descriptions
Attachment metadata

For every asset that was migrated successfully, the imported content stored in the platform no longer references Teachable's CDN.
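
One way to picture the rewrite is a single pass over the content with a map from source URL to new URL. The sketch below assumes such a map already exists; rewrite_asset_urls and the example hostnames are hypothetical, and the platform's own rewrite logic may differ.

```python
import re


def rewrite_asset_urls(content, url_map):
    """Replace each migrated source URL in content with its new URL."""
    if not url_map:
        return content
    # Longest first, so a URL that is a prefix of another does not match early.
    alternatives = sorted(url_map, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(url) for url in alternatives))
    # One pass over the text: replaced output is never scanned again.
    return pattern.sub(lambda match: url_map[match.group(0)], content)
```

URLs that are missing from the map are left untouched, so content that references an asset which was not migrated keeps its original link.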

Storage & Tenancy

Assets are stored in Vercel Blob and served via the platform's own CDN-backed URLs.
Each organization's assets are namespaced under their unique organization ID, enforcing tenant isolation at the storage layer.
There is no shared storage between organizations — an asset uploaded during one organization's import is not accessible from another organization's namespace.

Behaviour During Import

Asset download and re-upload occur automatically as part of the Teachable import job. No additional configuration is required.
If an individual asset download fails (e.g. the source URL is no longer accessible on Teachable's CDN), the import job records the failure for that asset and continues processing remaining content. The original URL is retained for any asset that could not be migrated.
The import is considered complete only after all reachable assets have been processed and URLs rewritten.
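
The failure behaviour described above can be sketched as a loop that records errors per asset instead of aborting. This is an illustration under assumptions: migrate_assets is a hypothetical name, and download and upload are stand-in callables supplied by the caller, not real Teachable or Vercel Blob APIs.

```python
def migrate_assets(source_urls, download, upload):
    """Try every asset; return (url_map, failures) instead of stopping on error.

    download(url) -> bytes and upload(url, data) -> new URL are hypothetical
    callables injected by the caller.
    """
    url_map = {}   # source URL -> new platform URL, successes only
    failures = {}  # source URL -> error message, kept for the import record
    for source_url in source_urls:
        try:
            data = download(source_url)
            url_map[source_url] = upload(source_url, data)
        except Exception as exc:  # broad on purpose: one bad asset must not stop the job
            failures[source_url] = str(exc)
    return url_map, failures
```

Only the URLs in url_map would then be rewritten, so a failed asset keeps its original URL in the imported content while the failure remains available for reporting.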

Before This Release

Prior to v1.0.18, the Teachable scraper imported course structure, lesson text, and embedded media references, but left all image and document URLs pointing at Teachable's CDN. This meant:

Imported content remained dependent on the source Teachable school staying active.
Assets could become inaccessible if the Teachable school was unpublished, paused, or deleted.
Organizations did not own or control the assets embedded in their imported courses.

As of v1.0.18, images and document attachments are migrated to platform storage at import time, under the importing organization's namespace.
