Frontend Guideline Document: Video Search macOS Desktop Client
1. Introduction
Project Name: Video Search macOS Client
Description: A native macOS application enabling offline video content retrieval via OCR-based text recognition (supporting English/Chinese). Targets professionals, researchers, and content creators for efficient local video indexing and keyword search.
2. Technology Stack
Component | Technology & Version | Rationale |
---|---|---|
UI Framework | SwiftUI (version shipped with macOS 13+) | Native macOS integration, declarative syntax, and Metal optimization. |
OCR Engine | Vision Framework (macOS 13+) | Apple’s on-device OCR for English/Chinese, offline support, high accuracy. |
Video Processing | AVFoundation, Core ML 4.0 | Hardware-accelerated decoding and ML-based frame analysis. |
Database | SQLite 3.38 + Core Data | Local storage for indexed video metadata (timestamps, OCR text). |
Concurrency | Swift Concurrency (async/await) | Non-blocking I/O for OCR and search tasks. |
3. Implementation Guidelines
3.1 Project Structure
VideoSearchApp/
├── App/ # Main application logic
├── Models/ # Core Data entities (Video, OCRTextSegment)
├── Services/ # OCRService, VideoIndexer, SearchEngine
├── Views/ # SwiftUI components
│ ├── SearchView.swift
│ ├── PlayerView.swift # Custom AVPlayer with timestamp navigation
│ └── SettingsView.swift
└── Utilities/ # Extensions (e.g., String localization, FileManager)
3.2 Key Workflows
Video Indexing:
- User selects video files (MP4, MOV, MKV) via NSOpenPanel. AVFoundation does not read MKV containers natively, so MKV input needs a separate demuxing or conversion step.
- VideoIndexer extracts frames at 1-second intervals using AVAssetImageGenerator.
- OCRService processes frames via Vision’s VNRecognizeTextRequest, storing results in SQLite as [videoPath, timestamp, text].
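The indexing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual VideoIndexer or OCRService: the names `sampleTimes`, `OCRRow`, `recognizeText`, and `indexVideo` are hypothetical, the language identifiers are assumed defaults, and the AVFoundation/Vision part is guarded so the pure sampling logic still compiles on platforms without those frameworks.

```swift
import Foundation
#if canImport(AVFoundation) && canImport(Vision)
import AVFoundation
import Vision
#endif

/// Sample points in seconds: 0, interval, 2 * interval, ... strictly below the duration.
func sampleTimes(durationSeconds: Double, interval: Double = 1.0) -> [Double] {
    guard durationSeconds.isFinite, durationSeconds > 0, interval > 0 else { return [] }
    return Array(stride(from: 0.0, to: durationSeconds, by: interval))
}

/// One row destined for the index: [videoPath, timestamp, text]. (Hypothetical type.)
struct OCRRow: Equatable {
    let videoPath: String
    let timestamp: Double
    let text: String
}

#if canImport(AVFoundation) && canImport(Vision)
/// Runs Vision text recognition on one frame; returns the top candidate per observation.
func recognizeText(in frame: CGImage, languages: [String]) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.recognitionLanguages = languages
    try VNImageRequestHandler(cgImage: frame, options: [:]).perform([request])
    return (request.results ?? []).compactMap { $0.topCandidates(1).first?.string }
}

/// Extracts one frame per second and collects the recognized text.
@available(macOS 13.0, iOS 16.0, tvOS 16.0, *)
func indexVideo(at url: URL,
                languages: [String] = ["en-US", "zh-Hans"]) async throws -> [OCRRow] {
    let asset = AVURLAsset(url: url)
    let duration = try await asset.load(.duration).seconds
    let generator = AVAssetImageGenerator(asset: asset)
    generator.appliesPreferredTrackTransform = true  // respect rotation metadata

    var rows: [OCRRow] = []
    for seconds in sampleTimes(durationSeconds: duration) {
        let time = CMTime(seconds: seconds, preferredTimescale: 600)
        let frame = try await generator.image(at: time).image
        let text = try recognizeText(in: frame, languages: languages)
            .joined(separator: "\n")
        if !text.isEmpty {
            rows.append(OCRRow(videoPath: url.path, timestamp: seconds, text: text))
        }
    }
    return rows
}
#endif
```

Writing the returned rows to SQLite is left out here; the sketch stops at producing the [videoPath, timestamp, text] tuples that the index stores.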
Search Execution:
- User enters a keyword (e.g., "budget meeting").
- SearchEngine performs an SQLite FTS5 query: SELECT videoPath, timestamp FROM OCRIndex WHERE text MATCH 'NEAR(budget meeting, 5)' AND language IN ('en', 'zh')
- Results are displayed as clickable timestamps; clicking one seeks the AVPlayer to that timestamp.
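One way the user's keyword could be turned into an FTS5 match expression is sketched below. The helper name `ftsMatchExpression` and the default distance of 5 are assumptions for illustration; the SQL text mirrors the query in the workflow, with the match expression passed as a bound parameter rather than interpolated into the statement.

```swift
import Foundation

/// Builds an FTS5 MATCH expression from free-text input.
/// Each term is wrapped in double quotes (embedded quotes doubled) so that
/// user input cannot be parsed as FTS5 operators.
func ftsMatchExpression(for keyword: String, maxDistance: Int = 5) -> String? {
    let terms = keyword
        .split(whereSeparator: { $0.isWhitespace })
        .map { term in "\"" + term.replacingOccurrences(of: "\"", with: "\"\"") + "\"" }
    switch terms.count {
    case 0:
        return nil                      // nothing to search for
    case 1:
        return terms[0]                 // single quoted term
    default:
        // FTS5 proximity syntax: NEAR(term1 term2 ..., N)
        return "NEAR(\(terms.joined(separator: " ")), \(maxDistance))"
    }
}

/// Query text for a prepared statement; the `?` is bound to the match expression.
let searchSQL = """
SELECT videoPath, timestamp
FROM OCRIndex
WHERE text MATCH ? AND language IN ('en', 'zh')
"""
```

For the input "budget meeting" this yields NEAR("budget" "meeting", 5), which would then be bound to the placeholder when the statement is executed.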
3.3 Localization
- Use LocalizedStringKey for UI elements.
- OCR language toggle via VNRecognizeTextRequest’s recognitionLanguages property (set to ["en-US", "zh-Hans"], adding "zh-Hant" if Traditional Chinese is needed).
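 
A small sketch of how the language toggle might map onto Vision's language identifiers. The `OCRLanguage` enum and both helper functions are hypothetical names, not part of the project; the Vision-specific part is guarded so the mapping itself compiles without the framework.

```swift
import Foundation
#if canImport(Vision)
import Vision
#endif

/// Hypothetical app-level setting; raw values are the identifiers passed to Vision.
enum OCRLanguage: String, CaseIterable {
    case english = "en-US"
    case simplifiedChinese = "zh-Hans"
    case traditionalChinese = "zh-Hant"
}

/// Converts the user's selection into an ordered, de-duplicated identifier list.
/// Order is preserved because Vision treats earlier entries as higher priority.
func recognitionLanguageCodes(for selection: [OCRLanguage]) -> [String] {
    var seen = Set<String>()
    return selection.map(\.rawValue).filter { seen.insert($0).inserted }
}

#if canImport(Vision)
/// Creates a text request configured for the selected languages.
func makeTextRequest(languages: [OCRLanguage]) -> VNRecognizeTextRequest {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.recognitionLanguages = recognitionLanguageCodes(for: languages)
    return request
}
#endif
```

The set of identifiers a given OS version accepts can be checked at runtime through the request's supported-languages query, which is safer than hard-coding the list.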
4. Performance Optimization
- Lazy Loading: Thumbnails and OCR results loaded on demand via LazyVStack.
- Background Processing: Frame extraction/OCR offloaded to DispatchQueue.global(qos: .userInitiated).
- Memory Management:
  - Use NSCache for decoded video thumbnails.
  - Batch OCR requests (max 4 concurrent operations).
- Indexing Speed: Pre-warm Core ML models on launch for faster OCR.
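The cap of 4 concurrent OCR operations could be enforced in several ways. The sketch below uses a Swift Concurrency task group rather than a dispatch queue, in line with the concurrency choice in the technology stack; `boundedMap` is a hypothetical helper, and the OCR call itself is represented by an arbitrary `transform` closure.

```swift
import Foundation

/// Applies `transform` to every input with at most `maxConcurrent` tasks in flight,
/// returning outputs in the same order as the inputs.
func boundedMap<Input: Sendable, Output: Sendable>(
    _ inputs: [Input],
    maxConcurrent: Int = 4,
    transform: @escaping @Sendable (Input) async throws -> Output
) async throws -> [Output] {
    precondition(maxConcurrent > 0, "maxConcurrent must be positive")
    return try await withThrowingTaskGroup(of: (Int, Output).self,
                                           returning: [Output].self) { group in
        var results = [Output?](repeating: nil, count: inputs.count)
        var next = 0

        // Seed the group with the first batch of tasks.
        while next < min(maxConcurrent, inputs.count) {
            let index = next
            let input = inputs[index]
            group.addTask {
                let output = try await transform(input)
                return (index, output)
            }
            next += 1
        }

        // Each time a task finishes, record its result and start the next one.
        while let (index, output) = try await group.next() {
            results[index] = output
            if next < inputs.count {
                let nextIndex = next
                let input = inputs[nextIndex]
                group.addTask {
                    let output = try await transform(input)
                    return (nextIndex, output)
                }
                next += 1
            }
        }
        return results.compactMap { $0 }
    }
}
```

In the indexer, `inputs` would be the extracted frames and `transform` the per-frame OCR call; keeping the limit as a parameter makes it easy to tune per machine.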
5. Security & Privacy
- Data Isolation:
  - All files processed locally; no network permissions.
  - SQLite database encrypted via SQLCipher (AES-256).
- Sandboxing: Enable App Sandbox in Entitlements:
  <key>com.apple.security.app-sandbox</key>
  <true/>
  <key>com.apple.security.files.user-selected.read-only</key>
  <true/>
- OCR Data Handling: Temporary frame data purged after processing.
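If frames are ever written to disk during processing, purging them could look like the sketch below: a scoped helper that creates a scratch directory and removes it when the work finishes, including when the work throws. `withScratchDirectory` is a hypothetical name, and the assumption that frames touch the disk at all is ours; frames kept purely in memory would not need this.

```swift
import Foundation

/// Runs `body` with a fresh temporary directory and deletes the directory
/// (and everything in it) afterwards, whether `body` returns or throws.
func withScratchDirectory<T>(_ body: (URL) throws -> T) throws -> T {
    let fileManager = FileManager.default
    let directory = fileManager.temporaryDirectory
        .appendingPathComponent("frames-" + UUID().uuidString, isDirectory: true)
    try fileManager.createDirectory(at: directory, withIntermediateDirectories: true)
    defer { try? fileManager.removeItem(at: directory) }  // purge on every exit path
    return try body(directory)
}
```

A caller would write intermediate frame files under the provided URL and return only the recognized text, so nothing from the frames outlives the call.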
6. Testing Strategy
Test Type | Tools/Methods | Coverage |
---|---|---|
Unit Tests | XCTest, Swift Concurrency testing | OCRService, SearchEngine logic |
UI Tests | XCUITest | View navigation, player controls |
Performance | XCTest performance metrics (XCTClockMetric), Instruments (Time Profiler) | Frame indexing < 50 ms per minute of video |
Localization | Pseudolocalization | Chinese/English UI consistency |
7. Build & Deployment
- Signing: Sign with an Apple Developer ID certificate and notarize the build.
- Packaging: Create .dmg via the create-dmg CLI tool.
- Distribution:
  - Mac App Store (MAS): Comply with sandboxing guidelines.
  - Direct download: Host SHA-256 checksum for verification.
- Updates: Integrate Sparkle 2.4 for offline-compatible delta updates.
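For the direct-download checksum, verification on the receiving side might be sketched as follows. The function names are hypothetical, and the example assumes CryptoKit is available (it ships with macOS 10.15 and later), so the code is guarded accordingly.

```swift
import Foundation
#if canImport(CryptoKit)
import CryptoKit

/// Lowercase hexadecimal SHA-256 digest of the given data.
func sha256Hex(of data: Data) -> String {
    SHA256.hash(data: data).map { String(format: "%02x", $0) }.joined()
}

/// Compares a file's digest with a published checksum, ignoring case
/// and surrounding whitespace in the published value.
func matchesChecksum(_ data: Data, expectedHex: String) -> Bool {
    let expected = expectedHex
        .trimmingCharacters(in: .whitespacesAndNewlines)
        .lowercased()
    return sha256Hex(of: data) == expected
}
#endif
```

For large disk images it would be preferable to feed the file to the hasher in chunks instead of loading it into memory at once; the one-shot form is used here only to keep the sketch short.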
8. Scalability & Extensions
- Plugin System: Future support for third-party OCR engines via NSBundle dynamic loading.
- Cloud Sync (Optional): End-to-end encrypted sync using CloudKit, disabled by default.
- Cross-Platform: Potential iOS/iPadOS target built as a multiplatform SwiftUI app with shared Core Data/SwiftUI logic (Mac Catalyst goes the other way, bringing iPad apps to macOS).
Document Revision: 1.0
Compatibility: macOS 13 Ventura or later, Apple Silicon/Intel.