Backup Flow

The backup operation is orchestrated by the BackupManager in internal/engine/. Here’s the complete flow:

1. Initialization

// From AGENTS.md and internal/engine/
1. BackupManager acquires a shared lock
2. Load the previous snapshot (if any) for this source identity
3. Extract the previous HAMT root reference
Source identity is determined by SourceInfo:
  • type — e.g. “gdrive”, “local”, “onedrive”
  • account — Google account email, hostname, etc.
  • path — root folder ID, filesystem path, etc.
Only snapshots matching all three fields are considered “previous” for incremental purposes.
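Expressed as code, the matching rule is a three-field equality test. This is a minimal sketch; the struct and function names are assumptions mirroring the bullets above, not the engine's actual API:

```go
package main

import "fmt"

// SourceInfo identifies a backup source. The field names here are
// assumptions based on the three attributes described above.
type SourceInfo struct {
	Type    string // e.g. "gdrive", "local", "onedrive"
	Account string // Google account email, hostname, etc.
	Path    string // root folder ID, filesystem path, etc.
}

// isPrevious reports whether a stored snapshot's source matches the
// current source identity: all three fields must be equal.
func isPrevious(current, stored SourceInfo) bool {
	return current.Type == stored.Type &&
		current.Account == stored.Account &&
		current.Path == stored.Path
}

func main() {
	cur := SourceInfo{Type: "gdrive", Account: "a@example.com", Path: "root123"}
	same := SourceInfo{Type: "gdrive", Account: "a@example.com", Path: "root123"}
	other := SourceInfo{Type: "gdrive", Account: "b@example.com", Path: "root123"}
	fmt.Println(isPrevious(cur, same), isPrevious(cur, other)) // true false
}
```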

2. Source Scanning

Full scan (for local, sftp, gdrive, onedrive):
source.Walk(func(entry SourceEntry) error {
    // Process each file/folder
})
Incremental scan (for gdrive-changes, onedrive-changes):
changeToken := previousSnapshot.ChangeToken
source.WalkChanges(changeToken, func(entry SourceEntry) error {
    // Process only changed files
})
For each file encountered:
  1. Look up the file ID in the old HAMT:
    oldMetaRef, err := hamtTree.Lookup(oldRoot, entry.FileID)
    
  2. Fast-check metadata (name, size, mtime, type, parents):
    • If identical and the source doesn’t provide a content hash, carry the old hash forward
    • This avoids false-positive diffs for metadata-only changes
  3. Determine action:
    • Unchanged: Re-insert into new HAMT by reference (structural sharing)
    • Changed or new: Queue for upload
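The fast-check and the resulting action can be sketched as follows. The `Entry` type and `decide` function are illustrative names, not the engine's actual API; the hash carry-forward rule from step 2 is modeled explicitly:

```go
package main

import "fmt"

// Entry holds the metadata compared by the fast-check. Field names are
// illustrative, mirroring the attributes listed above.
type Entry struct {
	Name, Type, ContentHash string
	Size, Mtime             int64
	Parents                 []string
}

func sameParents(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// decide returns "unchanged" (re-insert the old ref) or "upload". When the
// metadata is identical and the source provided no content hash, the old
// hash is carried forward to avoid a false-positive diff.
func decide(old, cur *Entry) string {
	if old == nil {
		return "upload" // new file
	}
	metaSame := old.Name == cur.Name && old.Size == cur.Size &&
		old.Mtime == cur.Mtime && old.Type == cur.Type &&
		sameParents(old.Parents, cur.Parents)
	if metaSame && cur.ContentHash == "" {
		cur.ContentHash = old.ContentHash // carry forward
		return "unchanged"
	}
	if metaSame && cur.ContentHash == old.ContentHash {
		return "unchanged"
	}
	return "upload"
}

func main() {
	old := &Entry{Name: "a.txt", Size: 10, Mtime: 1, ContentHash: "h1"}
	cur := &Entry{Name: "a.txt", Size: 10, Mtime: 1} // local source: no hash
	action := decide(old, cur)
	fmt.Println(action, cur.ContentHash) // unchanged h1
}
```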

3. Upload Phase

Changed/new files are processed by concurrent workers (default: 10 workers):
for _, queuedFile := range uploadQueue {
    go func(file SourceEntry) { // bounded by a worker pool in practice (see Performance Optimizations)
        // 1. Stream file content
        reader := source.Open(file.FileID)
        
        // 2. Content-defined chunking
        chunks := fastCDC(reader, minSize, avgSize, maxSize)
        
        // 3. For each chunk:
        for chunk := range chunks {
            // Compute hash (HMAC-SHA256 or SHA-256)
            hash := computeChunkHash(chunk.Data, dedupKey)
            
            // Deduplicate
            if store.Exists("chunk/" + hash) {
                continue // Skip, already stored
            }
            
            // Compress and write
            compressed := zstd.Compress(chunk.Data)
            store.Put("chunk/" + hash, compressed)
        }
        
        // 4. Create content object
        contentObj := Content{
            Type: "content",
            Size: file.Size,
            Chunks: chunkRefs,
        }
        contentHash := sha256(file.RawData)
        store.Put("content/" + contentHash, json(contentObj))
        
        // 5. Create filemeta object
        fileMeta := FileMeta{
            FileID: file.FileID,
            Name: file.Name,
            Type: file.Type,
            ContentHash: contentHash,
            Size: file.Size,
            Mtime: file.Mtime,
            Parents: parentRefs,
            ...
        }
        metaRef := "filemeta/" + sha256(json(fileMeta))
        store.Put(metaRef, json(fileMeta))
        
        // 6. Insert into new HAMT (serialized with a mutex:
        // concurrent inserts would race on newRoot)
        hamtMu.Lock()
        newRoot = hamtTree.Insert(newRoot, file.FileID, metaRef)
        hamtMu.Unlock()
    }(queuedFile)
}

FastCDC Chunking

From internal/engine/chunker.go and the spec:
Parameter   Value
Min size    512 KiB
Avg size    1 MiB
Max size    8 MiB
FastCDC uses a rolling hash to find content-defined boundaries:
  1. Compute a rolling hash of a 64-byte window
  2. When the hash matches a pattern (e.g. last 20 bits are zero), create a boundary
  3. Enforce min/max size constraints
  4. The final chunk may be smaller than the minimum
Content-defined chunking ensures that inserting bytes at the start of a file doesn’t invalidate all subsequent chunks — only the chunks containing modified data change.
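The boundary-finding loop can be illustrated with a toy content-defined chunker in the spirit of FastCDC. The gear table, mask, and sizes below are illustrative, not the engine's real parameters (those are 512 KiB / 1 MiB / 8 MiB):

```go
package main

import "fmt"

// gear maps each byte value to a pseudo-random 64-bit constant; the rolling
// hash mixes in the gear value of each byte as it slides forward.
var gear [256]uint64

func init() {
	// Deterministic splitmix64-style mixing to fill the gear table.
	var s uint64 = 0x9E3779B97F4A7C15
	for i := range gear {
		s += 0x9E3779B97F4A7C15
		z := s
		z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9
		z = (z ^ (z >> 27)) * 0x94D049BB133111EB
		gear[i] = z ^ (z >> 31)
	}
}

// chunk cuts data where the rolling hash's low bits are all zero, while
// enforcing min/max sizes. The final chunk may be smaller than the minimum.
func chunk(data []byte, minSize, maxSize int, mask uint64) [][]byte {
	var chunks [][]byte
	start := 0
	var h uint64
	for i, b := range data {
		h = (h << 1) + gear[b] // shift pushes old bytes out of the low bits
		size := i - start + 1
		if (size >= minSize && h&mask == 0) || size >= maxSize {
			chunks = append(chunks, data[start:i+1])
			start = i + 1
			h = 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

func main() {
	data := make([]byte, 4096)
	for i := range data {
		data[i] = byte(i * 31)
	}
	chunks := chunk(data, 64, 1024, 0x3F)
	total := 0
	for _, c := range chunks {
		total += len(c)
	}
	fmt.Println(total == len(data), len(chunks) > 1) // true true
}
```

Because a cut depends only on the bytes inside the hash window (plus the min/max constraints), most boundaries typically realign after an insertion earlier in the file.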

4. HAMT Flush

After all files are uploaded and inserted into the HAMT:
// TransactionalStore buffers nodes in memory
// Only flush the reachable subset from the final root
txnStore.Flush(newRoot)
Flush algorithm (BFS from root):
  1. Start a queue with the root node ref
  2. For each node in the queue:
    • If it’s in the in-memory buffer, write it to persistent storage
    • If it’s an internal node, add all child refs to the queue
  3. Discard any buffered nodes that were never visited
This avoids uploading intermediate superseded nodes created during tree construction.
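A minimal sketch of that flush, assuming a simple Node type and an in-memory buffer map (both illustrative):

```go
package main

import "fmt"

// Node is a buffered HAMT node: Children holds the refs of child nodes for
// internal nodes (empty for leaves). Types are illustrative.
type Node struct {
	Children []string
}

// flush walks breadth-first from root and writes only buffered nodes that
// are reachable; superseded intermediate nodes left in the buffer are
// simply never visited, so they are discarded.
func flush(root string, buffer map[string]Node, write func(ref string, n Node)) {
	queue := []string{root}
	visited := map[string]bool{}
	for len(queue) > 0 {
		ref := queue[0]
		queue = queue[1:]
		if visited[ref] {
			continue
		}
		visited[ref] = true
		n, ok := buffer[ref]
		if !ok {
			continue // already persisted by a previous snapshot
		}
		write(ref, n)
		queue = append(queue, n.Children...)
	}
}

func main() {
	buffer := map[string]Node{
		"root":       {Children: []string{"a", "b"}},
		"a":          {},
		"b":          {},
		"superseded": {}, // created mid-construction; never written
	}
	written := 0
	flush("root", buffer, func(ref string, n Node) { written++ })
	fmt.Println(written) // 3
}
```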

5. Snapshot Commit

// 1. Create snapshot object
snapshot := Snapshot{
    Version: 1,
    Created: time.Now().Format(time.RFC3339),
    Root: newRoot,
    Seq: previousSeq + 1,
    Source: sourceInfo,
    ChangeToken: newChangeToken, // For incremental sources
    ...
}
snapshotRef := "snapshot/" + sha256(json(snapshot))
store.Put(snapshotRef, json(snapshot))

// 2. Update index/latest (the commit point)
index := Index{
    LatestSnapshot: snapshotRef,
    Seq: snapshot.Seq,
}
store.Put("index/latest", json(index))

// 3. Update index/snapshots catalog
catalog = append(catalog, SnapshotSummary{...})
store.Put("index/snapshots", json(catalog))

// 4. Update index/packs catalog (if packfiles enabled)
// Automatically handled by the PackStore layer
The commit point is the write to index/latest. Until this write completes, the previous snapshot remains the “latest” and the repository stays in a consistent state.

6. Lock Release

The backup lock is released, allowing other operations to proceed.

Restore Flow

The restore operation is orchestrated by the RestoreManager in internal/engine/.

1. Snapshot Resolution

// Resolve snapshot by:
// - Explicit ref ("snapshot/<hash>")
// - Sequence number (42)
// - "latest" keyword
snapshot := resolveSnapshot(snapshotID)
root := snapshot.Root

2. HAMT Traversal

Walk the HAMT to collect all file metadata entries:
var entries []FileMeta
hamtTree.Walk(root, func(key, metaRef string) error {
    // Load filemeta object
    metaData := store.Get(metaRef)
    meta := json.Unmarshal(metaData)
    entries = append(entries, meta)
    return nil
})

3. Topological Sort

Ensure parent directories are created before their children:
// Build parent dependency graph
graph := buildDependencyGraph(entries)

// Topological sort
sorted := topologicalSort(graph)
This handles Google Drive’s multi-parent semantics where a file can appear in multiple directories.
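A sketch of that sort using Kahn's algorithm, with illustrative types; each parent reference contributes one edge, so multi-parent entries are handled naturally:

```go
package main

import "fmt"

// Item pairs a file/folder ID with its parent IDs, standing in for the
// dependency graph built from the filemeta Parents fields.
type Item struct {
	ID      string
	Parents []string
}

// topoSort orders items so every parent precedes its children (Kahn's
// algorithm): repeatedly emit nodes whose remaining in-degree is zero.
func topoSort(items []Item) []string {
	indeg := map[string]int{}
	children := map[string][]string{}
	for _, it := range items {
		indeg[it.ID] += 0 // ensure every item has an entry
		for _, p := range it.Parents {
			children[p] = append(children[p], it.ID)
			indeg[it.ID]++
		}
	}
	var queue, order []string
	for _, it := range items {
		if indeg[it.ID] == 0 {
			queue = append(queue, it.ID)
		}
	}
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		order = append(order, id)
		for _, c := range children[id] {
			indeg[c]--
			if indeg[c] == 0 {
				queue = append(queue, c)
			}
		}
	}
	return order
}

func main() {
	order := topoSort([]Item{
		{ID: "file", Parents: []string{"dirA", "dirB"}}, // multi-parent
		{ID: "dirA"},
		{ID: "dirB"},
	})
	fmt.Println(order) // [dirA dirB file]
}
```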

4. Path Reconstruction

Walk the parent chain of each entry to reconstruct the full relative path:
func buildPath(meta FileMeta) string {
    if len(meta.Parents) == 0 {
        return meta.Name // Root entry
    }
    
    // Load first parent (arbitrary choice if multi-parent)
    parentRef := meta.Parents[0]
    parentData := store.Get(parentRef)
    parentMeta := json.Unmarshal(parentData)
    
    parentPath := buildPath(parentMeta)
    return parentPath + "/" + meta.Name
}
For files with multiple parents (Google Drive shared folders), only the first parent is used to construct the primary path. Other parents are ignored during restore.

5. ZIP Archive Creation

Write entries to a ZIP archive in topologically-sorted order:
zipWriter := zip.NewWriter(output)

for _, meta := range sortedEntries {
    path := buildPath(meta)
    
    if meta.Type == "folder" {
        // Create directory entry
        header := &zip.FileHeader{
            Name: path + "/",
            Method: zip.Store,
        }
        header.Modified = time.Unix(meta.Mtime, 0) // Modified replaces the deprecated SetModTime
        zipWriter.CreateHeader(header)
    } else {
        // Create file entry
        header := &zip.FileHeader{
            Name: path,
            Method: zip.Deflate,
        }
        header.Modified = time.Unix(meta.Mtime, 0) // Modified replaces the deprecated SetModTime
        
        writer, _ := zipWriter.CreateHeader(header)
        
        // Load content object
        contentData := store.Get("content/" + meta.ContentHash)
        content := json.Unmarshal(contentData)
        
        // Stream chunks
        for _, chunkRef := range content.Chunks {
            // Fetch and decompress chunk
            compressedData := store.Get(chunkRef)
            rawData := zstd.Decompress(compressedData)
            writer.Write(rawData)
        }
    }
}

zipWriter.Close()
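The loop above condenses into a runnable sketch against the standard library's archive/zip; the entry names and content here are illustrative, and the FileHeader.Modified field is the modern replacement for the deprecated SetModTime:

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"time"
)

// buildArchive writes one directory entry and one deflated file entry, the
// same shapes the restore loop produces. Names and content are illustrative.
func buildArchive() []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)

	// Directory entry: trailing slash, stored uncompressed.
	dir := &zip.FileHeader{Name: "docs/", Method: zip.Store}
	dir.Modified = time.Unix(1700000000, 0)
	zw.CreateHeader(dir)

	// File entry: deflate-compressed, written via the returned writer.
	fh := &zip.FileHeader{Name: "docs/readme.txt", Method: zip.Deflate}
	fh.Modified = time.Unix(1700000000, 0)
	w, err := zw.CreateHeader(fh)
	if err == nil {
		w.Write([]byte("restored content"))
	}
	zw.Close()
	return buf.Bytes()
}

func main() {
	fmt.Println(len(buildArchive()) > 0) // true
}
```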

6. Output

The ZIP archive is:
  • Written to stdout (CLI)
  • Returned as a byte stream (web API)
  • Saved to a file (with -o flag)

Performance Optimizations

Concurrent Upload

The backup manager uses a worker pool (default: 10 concurrent workers) to parallelize file uploads:
var wg sync.WaitGroup
workerPool := make(chan struct{}, 10) // Semaphore

for _, file := range uploadQueue {
    workerPool <- struct{}{} // Acquire
    wg.Add(1)
    go func(f SourceEntry) {
        defer wg.Done()
        defer func() { <-workerPool }() // Release
        processFile(f)
    }(file)
}
wg.Wait() // All uploads complete before the HAMT flush

Chunk-Level Deduplication

Before writing a chunk:
if store.Exists("chunk/" + hash) {
    continue // Skip, already stored
}
The KeyCacheStore layer caches existence checks in a local bbolt database to avoid redundant Exists calls:
  • First check: network request → cache result
  • Subsequent checks: local cache hit (0 network requests)
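The cache-on-first-check behavior can be sketched with an in-memory map standing in for the bbolt database (types and names illustrative):

```go
package main

import "fmt"

// Store is the minimal interface needed for existence checks.
type Store interface {
	Exists(key string) bool
}

// cachedStore memoizes both positive and negative Exists results. In the
// real engine the cache is a local bbolt database; a map stands in here.
type cachedStore struct {
	backend Store
	cache   map[string]bool
	misses  int // backend round-trips, to show the effect
}

func (c *cachedStore) Exists(key string) bool {
	if v, ok := c.cache[key]; ok {
		return v // local cache hit: 0 network requests
	}
	c.misses++ // first check: one backend (network) request
	v := c.backend.Exists(key)
	c.cache[key] = v
	return v
}

type fakeBackend struct{ keys map[string]bool }

func (f fakeBackend) Exists(k string) bool { return f.keys[k] }

func main() {
	cs := &cachedStore{
		backend: fakeBackend{keys: map[string]bool{"chunk/abc": true}},
		cache:   map[string]bool{},
	}
	cs.Exists("chunk/abc")
	cs.Exists("chunk/abc")
	cs.Exists("chunk/abc")
	fmt.Println(cs.misses) // 1: only the first check reaches the backend
}
```

A real implementation must also update the cache on Put, so a cached negative result does not go stale once the object is written.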

Content-Level Deduplication

Before streaming a file, check if its content object already exists:
if store.Exists("content/" + file.ContentHash) {
    // Entire file upload skipped!
    // Only create new filemeta and HAMT nodes
}
For sources that provide content hashes (Google Drive MD5, OneDrive SHA1), this is a huge optimization.

Packfile Bundling

Small objects (< 512 KB) are bundled into 8 MB packfiles:
  • Reduces API calls from thousands to dozens
  • LRU cache (128MB) keeps hot packs in memory
  • Typical metadata read: 0-1 network requests after initial pack fetch

Structural Sharing

Unchanged files reuse their filemeta refs, and unchanged subtrees reuse their HAMT node refs:
Snapshot 1: 10,000 files → 10,000 filemeta + ~500 HAMT nodes
Snapshot 2: 10 files changed → 10 new filemeta + ~5 new HAMT nodes

Incremental storage: ~15 objects, not 10,510

Error Handling

Transient Errors

Network errors during upload are retried with exponential backoff:
retry.Do(
    func() error { return store.Put(key, data) },
    retry.Attempts(3),
    retry.Delay(time.Second),
    retry.DelayType(retry.BackOffDelay),
)

Permanent Errors

Permanent errors (authentication failure, permission denied) abort the backup immediately:
if isPermanentError(err) {
    return fmt.Errorf("backup failed: %w", err)
}

Partial Upload Cleanup

If a backup is interrupted, orphaned objects remain in the store but are not reachable from any snapshot. Running prune after an interrupted backup will:
  1. Mark all reachable objects from existing snapshots
  2. Sweep and delete any orphaned objects
  3. Repack fragmented packfiles
Orphaned objects are harmless — they consume storage but don’t affect backup correctness. Prune reclaims this space.
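The mark phase is a transitive reachability walk and the sweep is set subtraction; this sketch models the object graph as a plain map (all names illustrative):

```go
package main

import "fmt"

// reachable collects every object ref transitively referenced from the
// given snapshot roots. refs maps each object to the refs it points at
// (snapshot -> HAMT nodes -> filemeta -> content -> chunks).
func reachable(roots []string, refs map[string][]string) map[string]bool {
	marked := map[string]bool{}
	stack := append([]string{}, roots...)
	for len(stack) > 0 {
		r := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if marked[r] {
			continue
		}
		marked[r] = true
		stack = append(stack, refs[r]...)
	}
	return marked
}

// sweep returns the orphaned objects: everything in the store that no
// snapshot can reach.
func sweep(all []string, marked map[string]bool) []string {
	var orphans []string
	for _, k := range all {
		if !marked[k] {
			orphans = append(orphans, k)
		}
	}
	return orphans
}

func main() {
	refs := map[string][]string{
		"snapshot/s1": {"hamt/root"},
		"hamt/root":   {"filemeta/f1"},
		"filemeta/f1": {"content/c1"},
		"content/c1":  {"chunk/k1"},
	}
	all := []string{"snapshot/s1", "hamt/root", "filemeta/f1",
		"content/c1", "chunk/k1", "chunk/orphan"} // interrupted upload left chunk/orphan
	marked := reachable([]string{"snapshot/s1"}, refs)
	fmt.Println(sweep(all, marked)) // [chunk/orphan]
}
```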

Diff Operation

The diff command leverages the HAMT’s structural diff:
hamtTree.Diff(snapshot1.Root, snapshot2.Root, func(entry DiffEntry) error {
    if entry.OldValue == "" {
        fmt.Printf("+ %s\n", entry.Key) // Added
    } else if entry.NewValue == "" {
        fmt.Printf("- %s\n", entry.Key) // Removed
    } else {
        fmt.Printf("M %s\n", entry.Key) // Modified
    }
    return nil
})
The diff is structural: if two subtrees have the same root hash, they’re identical and the traversal skips them entirely. This makes diff extremely fast even for snapshots with millions of files.
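The pruning behavior can be sketched with a toy content-addressed tree; because equal hashes imply equal subtrees, the diff returns without descending (types illustrative):

```go
package main

import "fmt"

// Node models a content-addressed tree node: identical subtrees always
// carry identical hashes, which is what lets the diff prune them.
type Node struct {
	Hash     string
	Entries  map[string]string // key -> value stored at this node
	Children map[string]*Node  // bucket label -> subtree
}

// diff emits "+", "-", or "M" per key, skipping any subtree pair whose
// hashes match; skipped counts the pruned subtrees.
func diff(a, b *Node, emit func(kind, key string), skipped *int) {
	if a != nil && b != nil && a.Hash == b.Hash {
		*skipped++ // identical subtree: no traversal needed
		return
	}
	ae, be := map[string]string{}, map[string]string{}
	if a != nil {
		ae = a.Entries
	}
	if b != nil {
		be = b.Entries
	}
	for k, v := range ae {
		if nv, ok := be[k]; !ok {
			emit("-", k)
		} else if nv != v {
			emit("M", k)
		}
	}
	for k := range be {
		if _, ok := ae[k]; !ok {
			emit("+", k)
		}
	}
	labels := map[string]bool{}
	if a != nil {
		for l := range a.Children {
			labels[l] = true
		}
	}
	if b != nil {
		for l := range b.Children {
			labels[l] = true
		}
	}
	for l := range labels {
		var ca, cb *Node
		if a != nil {
			ca = a.Children[l]
		}
		if b != nil {
			cb = b.Children[l]
		}
		diff(ca, cb, emit, skipped)
	}
}

func main() {
	shared := &Node{Hash: "h-shared", Entries: map[string]string{"x": "1"}}
	s1 := &Node{Hash: "r1", Entries: map[string]string{"a": "v1"},
		Children: map[string]*Node{"0": shared}}
	s2 := &Node{Hash: "r2", Entries: map[string]string{"a": "v2", "b": "v"},
		Children: map[string]*Node{"0": shared}}
	skipped := 0
	diff(s1, s2, func(kind, key string) { fmt.Println(kind, key) }, &skipped)
	fmt.Println("skipped subtrees:", skipped) // 1: the shared child was pruned
}
```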