Behind the Bug: Silent Duplicate Sections & Lessons During Course Import
Behind the Bug: Silent Duplicate Sections & Lessons During Course Import
Affects: Import pipeline ·
import-materialize.ts· v1.0.48
What Happened
During a routine verification pass of the course import pipeline, we found a subtle but impactful correctness issue in the way sections and lessons are written to the database when materialising an imported Teachable course.
The materializer used an onConflictDoUpdate (upsert) pattern intended to make import runs idempotent — so that re-running an import for the same course would update existing records rather than creating new ones. Unfortunately, the conflict targets specified in the code were never actually active.
The Technical Detail
An ON CONFLICT clause in SQL only fires when the database detects a uniqueness violation on an indexed column or column group. The materializer specified:
ON CONFLICT (courseId, position)→ for sectionsON CONFLICT (sectionId, courseId, position)→ for lessons
But the schema in src/db/schema.ts had no unique indexes defined for either of these combinations. Without a backing unique index, PostgreSQL (and most SQL databases) treats the conflict target as unmatchable — every statement becomes a plain INSERT, with no update path ever taken.
// import-materialize.ts (illustrative — simplified)
await db
.insert(sections)
.values(sectionData)
.onConflictDoUpdate({
target: [sections.courseId, sections.position], // ← no unique index backs this
set: { title: sectionData.title },
});
Because the conflict is never triggered, each re-run of an import job inserts a fresh set of section and lesson rows rather than updating the existing ones.
Why This Matters
Idempotency is a core property of any reliable import pipeline. Without it:
- Re-running a failed import leaves behind a mix of old and new rows.
- Re-importing updated course content from Teachable doubles (or further multiplies) the section and lesson count instead of reflecting the latest state.
- Learner-facing course pages may show duplicate chapters or lessons.
- Downstream analytics and progress tracking can produce incorrect results if they assume one row per positional slot.
The Fix Options
Option A: Add Unique Constraints (Recommended)
Adding unique constraints to the schema makes the existing upsert logic work as intended and provides an additional data integrity guarantee at the database level.
// src/db/schema.ts (illustrative)
export const sections = pgTable(
'sections',
{
id: uuid('id').primaryKey().defaultRandom(),
courseId: uuid('course_id').notNull(),
position: integer('position').notNull(),
title: text('title').notNull(),
},
(table) => ({
uqCoursePosition: unique().on(table.courseId, table.position), // ← add this
})
);
export const lessons = pgTable(
'lessons',
{
id: uuid('id').primaryKey().defaultRandom(),
sectionId: uuid('section_id').notNull(),
courseId: uuid('course_id').notNull(),
position: integer('position').notNull(),
title: text('title').notNull(),
},
(table) => ({
uqSectionCoursePosition: unique().on(
table.sectionId,
table.courseId,
table.position
), // ← add this
})
);
A corresponding database migration must be generated and applied.
Option B: SELECT-then-Update Pattern
If adding unique constraints is not desirable (for example, if position values are intentionally mutable during an import), the materializer can be changed to perform explicit duplicate detection:
// For each section being materialised:
const existing = await db.query.sections.findFirst({
where: and(
eq(sections.courseId, sectionData.courseId),
eq(sections.position, sectionData.position)
),
});
if (existing) {
await db
.update(sections)
.set({ title: sectionData.title })
.where(eq(sections.id, existing.id));
} else {
await db.insert(sections).values(sectionData);
}
This approach is more verbose but does not depend on unique indexes and allows finer-grained control over what constitutes a "match".
Recommended Action
If you have an existing deployment that has processed import jobs more than once, we recommend:
- Audit for duplicate rows — query for sections and lessons sharing the same
(course_id, position)or(section_id, course_id, position)within the same course. - Deduplicate before applying the schema constraint, otherwise the migration will fail on the unique index creation.
- Apply the schema fix (Option A or B) and deploy.
Status
This issue was caught during a verification sweep of the import pipeline. A fix is being tracked and will ship in a subsequent release.