Architecture: Declarative Animation

ScrollTube 2.0 is a State-Snapshot Engine. Unlike traditional animation libraries that rely on imperative callbacks, this engine treats the entire scroll project as a piece of data.

1. The Project Configuration (`scrolltube.json`)

The heart of every project is a JSON file that describes the entire experience. This allows the state to be:

Portable: Can be generated by an AI, a server, or a visual editor.
Serializable: Can be stored in a database or passed via API.
Versioned: Changes to the animation can be tracked in version control, like code.

Core Schema Overview

The engine expects a ProjectConfiguration object (defined in src/core/types.ts):

settings: Base resolutions, scroll modes, and base path.
assets: An array of SequenceAsset objects, each with multiple variants (for example, mobile and desktop).
timeline: A map of scenes and layers.
source: (New) Relative path to the original source video file, preserved for future edits or variant regenerations.
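
As a rough illustration of the schema above, the sketch below shows one way such a configuration could be typed and round-tripped through JSON. Only the four top-level keys (settings, assets, timeline, source) and the SequenceAsset name come from this document. Every other field name (baseResolution, scrollMode, basePath, label, path, layers, and so on) and the `Sketch` suffixes are assumptions made for the example; the authoritative definitions are the ones in src/core/types.ts.

```typescript
// Hypothetical shapes for illustration only; the real types live in src/core/types.ts.
interface AssetVariantSketch {
  label: string;   // assumed: human-readable variant name
  path: string;    // assumed: folder holding this variant's frames
  width: number;   // frame width in pixels
  height: number;  // frame height in pixels
}

interface SequenceAssetSketch {
  id: string;
  variants: AssetVariantSketch[];
}

interface LayerSketch {
  text: string; // assumed: minimal layer payload
}

interface SceneSketch {
  layers: Record<string, LayerSketch>;
}

interface ProjectConfigurationSketch {
  settings: {
    baseResolution: { width: number; height: number };
    scrollMode: string;
    basePath: string;
  };
  assets: SequenceAssetSketch[];
  timeline: Record<string, SceneSketch>; // a map of scenes, each with layers
  source?: string; // relative path to the original source video
}

const exampleConfig: ProjectConfigurationSketch = {
  settings: {
    baseResolution: { width: 1920, height: 1080 },
    scrollMode: "pinned",
    basePath: "/sequences",
  },
  assets: [
    {
      id: "hero",
      variants: [
        { label: "mobile", path: "hero/mobile", width: 720, height: 1280 },
        { label: "desktop", path: "hero/desktop", width: 1920, height: 1080 },
      ],
    },
  ],
  timeline: {
    intro: { layers: { headline: { text: "Hello" } } },
  },
  source: "source/hero.mp4",
};

// Because the project is plain data, it survives a JSON round trip unchanged.
const serialized: string = JSON.stringify(exampleConfig);
const restoredConfig = JSON.parse(serialized) as ProjectConfigurationSketch;
```

Keeping the configuration free of functions and class instances is what makes it safe to store in a database or pass through an API.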

2. Decoupled Rendering Pipeline

State management is separated from rendering.

Core Engine: A non-UI class that manages the scroll -> frame math, image preloading, and subject tracking coordinates.
React Provider: Wraps the Engine in a reactive context.
UI Components: <ScrollTubeCanvas /> and <SubjectLayer /> represent the “view.” You can have multiple view layers tied to the same engine state.
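
The separation described above can be sketched as a small non-UI class that owns the scroll-to-frame math and notifies any number of subscribed views. This is a minimal sketch, not the actual Core Engine: the class name `ScrollEngineSketch`, its methods, and the rounding rule are all assumptions made for illustration.

```typescript
type FrameListener = (frame: number) => void;

// Hypothetical stand-in for the Core Engine: no DOM, no React, only state.
class ScrollEngineSketch {
  private frameCount: number;
  private frame = 0;
  private listeners = new Set<FrameListener>();

  constructor(frameCount: number) {
    // Guard against an empty sequence so the math below stays valid.
    this.frameCount = Math.max(1, Math.floor(frameCount));
  }

  // Map scroll progress in [0, 1] to a frame index in [0, frameCount - 1].
  setProgress(progress: number): void {
    const clamped = Math.min(1, Math.max(0, progress));
    const next = Math.round(clamped * (this.frameCount - 1));
    if (next === this.frame) return; // no change, no notification
    this.frame = next;
    this.listeners.forEach((notify) => notify(next));
  }

  getFrame(): number {
    return this.frame;
  }

  // Views subscribe to state changes; the returned function unsubscribes.
  subscribe(listener: FrameListener): () => void {
    this.listeners.add(listener);
    listener(this.frame); // deliver the current state immediately
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Two independent "views" driven by the same engine state.
const engine = new ScrollEngineSketch(120);
const canvasFrames: number[] = [];
const overlayFrames: number[] = [];
engine.subscribe((frame) => canvasFrames.push(frame));
engine.subscribe((frame) => overlayFrames.push(frame));
engine.setProgress(0.5);
```

A React provider would, under these assumptions, call `subscribe` once and expose the current frame through context, so components never talk to scroll events directly.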

How variant selection works:

Detection: On initialization and on resize, the engine calculates the required Physical Resolution (width * devicePixelRatio) from the canvas element’s own dimensions.
Selection: It selects the variant that best matches or exceeds the required resolution for the current orientation (Portrait vs Landscape).
Hot-Swapping: If the container size changes or the phone is rotated, the engine immediately swaps to the better-fitting variant’s image folder without losing the scroll position.
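
The detection and selection steps above might look roughly like the following. This is a sketch under stated assumptions: the `VariantCandidate` shape, the `selectVariant` name, the width-only comparison, and the fallback to the largest variant when none is big enough are all hypothetical. The engine's actual matching rule is not specified in this document.

```typescript
type OrientationSketch = "portrait" | "landscape";

// Hypothetical variant description; field names are assumptions.
interface VariantCandidate {
  path: string;
  width: number;
  height: number;
  orientation: OrientationSketch;
}

function selectVariant(
  variants: VariantCandidate[],
  cssWidth: number,
  cssHeight: number,
  devicePixelRatio: number,
): VariantCandidate | undefined {
  // Detection: physical resolution = CSS width * devicePixelRatio.
  const requiredWidth = cssWidth * devicePixelRatio;
  const orientation: OrientationSketch =
    cssHeight > cssWidth ? "portrait" : "landscape";

  // Prefer variants for the current orientation; fall back to all of them.
  const matching = variants.filter((v) => v.orientation === orientation);
  const pool = matching.length > 0 ? matching : variants;
  const sorted = [...pool].sort((a, b) => a.width - b.width);

  // Selection: smallest variant that meets or exceeds the requirement,
  // otherwise the largest one available.
  return sorted.find((v) => v.width >= requiredWidth) ?? sorted[sorted.length - 1];
}

const candidates: VariantCandidate[] = [
  { path: "hero/mobile-sd", width: 720, height: 1280, orientation: "portrait" },
  { path: "hero/mobile-hd", width: 1080, height: 1920, orientation: "portrait" },
  { path: "hero/desktop-hd", width: 1920, height: 1080, orientation: "landscape" },
  { path: "hero/desktop-4k", width: 3840, height: 2160, orientation: "landscape" },
];

const forLaptop = selectVariant(candidates, 1440, 900, 1);
const forRetinaLaptop = selectVariant(candidates, 1440, 900, 2);
const forPhone = selectVariant(candidates, 390, 844, 3);
```

Hot-swapping then amounts to re-running the selection on resize or rotation and keeping the current frame index, so only the image folder changes.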

3. Subject-Relative Coordinates

This is the key technological shift:

Traditional: Content is pinned to a fixed viewport position, such as x: 50vw.
v2.0: Content is anchored to a Subject.

Each asset variant can contain subjectTracking data: frame-by-frame (x, y) coordinates of the main object. The engine combines the image’s “Local Coordinates” with your layer’s “Relative Offset” to calculate the final screen position.

Formula: Screen Position = Subject Center (x, y) + Relative Offset

Because the offset is applied to the tracked subject rather than to the viewport, the text follows the product even if the image is cropped or scaled for mobile.
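
The formula can be worked through in code. The sketch below assumes that tracking coordinates are normalized to the image (0 to 1) and that the image is drawn with "cover" scaling, centered in its container. Neither assumption is confirmed by this document, and the names `localToScreen` and `layerScreenPosition` are hypothetical.

```typescript
interface Point2D {
  x: number;
  y: number;
}

interface Size2D {
  width: number;
  height: number;
}

// Convert a normalized image-local point to container pixels,
// assuming centered "cover" scaling (the image fills the container and is cropped).
function localToScreen(local: Point2D, image: Size2D, container: Size2D): Point2D {
  const scale = Math.max(container.width / image.width, container.height / image.height);
  const drawnWidth = image.width * scale;
  const drawnHeight = image.height * scale;
  const originX = (container.width - drawnWidth) / 2;   // negative when cropped
  const originY = (container.height - drawnHeight) / 2;
  return {
    x: originX + local.x * drawnWidth,
    y: originY + local.y * drawnHeight,
  };
}

// Screen Position = Subject Center + Relative Offset
function layerScreenPosition(subjectCenter: Point2D, relativeOffset: Point2D): Point2D {
  return {
    x: subjectCenter.x + relativeOffset.x,
    y: subjectCenter.y + relativeOffset.y,
  };
}

// A landscape frame cropped into a portrait phone viewport.
const subjectOnScreen = localToScreen(
  { x: 0.5, y: 0.5 },
  { width: 1920, height: 1080 },
  { width: 390, height: 844 },
);
const labelPosition = layerScreenPosition(subjectOnScreen, { x: 20, y: -40 });
```

In this example the subject sits at the image center, so it lands at the center of the 390 by 844 viewport, and the label ends up 20 px to the right and 40 px above it regardless of how much of the frame was cropped.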

4. 3D Parallax & Depth Maps

ScrollTube 2.0 introduces Depth-Aware Rendering. By supplying a grayscale depth map (where white is closer and black is farther), the WebGL shader can apply a subtle displacement effect based on mouse or gyro movement. This gives the scroll sequence a premium “3D Parallax” feel without the weight of actual 3D models.
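
The displacement idea can be illustrated on the CPU with the arithmetic a fragment shader would apply per pixel. This is only a sketch of one common approach and not the engine's actual shader: the function name `parallaxOffset`, the normalized input ranges, and the linear scaling by `strength` are assumptions.

```typescript
interface Vec2 {
  x: number;
  y: number;
}

// depth:    0 (black, far) to 1 (white, near), sampled from the depth map
// pointer:  mouse or gyroscope input normalized to [-1, 1] on each axis
// strength: maximum shift in texture-coordinate units (assumed small, e.g. 0.02)
function parallaxOffset(depth: number, pointer: Vec2, strength: number): Vec2 {
  const clampedDepth = Math.min(1, Math.max(0, depth));
  // Near pixels move the most; far pixels stay almost still.
  return {
    x: pointer.x * clampedDepth * strength,
    y: pointer.y * clampedDepth * strength,
  };
}

const pointerInput: Vec2 = { x: 1, y: 0 };
const nearShift = parallaxOffset(1, pointerInput, 0.02);
const farShift = parallaxOffset(0, pointerInput, 0.02);
```

Adding the returned offset to each pixel's texture coordinate before sampling the frame produces the depth-dependent shift; keeping `strength` small is what keeps the effect subtle.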