AI Selection Architecture Document
Project Name: Duory - AI Language Learning Companion
Version: 1.0
Date: October 26, 2023
1. Introduction
Duory is a mobile-first language learning tool integrated with Duolingo, offering translation, romanization, and kana support. This document outlines the AI architecture for real-time language processing, focusing on scalability (target: 50K+ concurrent users), low-latency responses (<500ms), and offline capability.
2. AI Feature Requirements
Feature | AI Requirement | Criticality |
---|---|---|
Translation | Multilingual NLP (50+ languages) | High |
Romanization | Phonetic transcription (e.g., Japanese → Romaji) | Medium |
Kana Support | Kanji-to-Kana conversion (Japanese) | High |
Offline Mode | On-device lightweight ML models | Medium |
Adaptive Learning | User progress analytics for personalized tips | Low |
3. Technology Selection & Justification
3.1 Core AI Frameworks
- Translation & NLP:
  - Google Cloud Translation API v3 (dynamic batch processing, 110+ languages, $20/million characters); a request sketch follows this list.
  - Offline backup: ONNX Runtime running a distilled mBART-50 model (45MB, 85% accuracy offline).
- Romanization/Kana Conversion:
  - PyKakasi v2.0 (Python) for Japanese romanization (Apache 2.0 license).
  - Kuromoji (v0.9.0) for Kanji-to-Kana conversion (Java/Kotlin).
- Analytics:
  - TensorFlow Lite (v2.10) for on-device learning-pattern detection.
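The online translation path is a direct call to the v3 translateText method. The sketch below is a minimal TypeScript illustration, assuming requests are routed through the Apigee X gateway in production; the project ID and the access-token source are placeholders, not final values.

```typescript
// Minimal sketch: calling Cloud Translation API v3 (translateText) over REST.
// PROJECT_ID is a placeholder; in production the call goes through Apigee X
// rather than directly to the googleapis.com endpoint.

interface TranslateResult {
  translations: { translatedText: string }[];
}

const PROJECT_ID = "duory-prod"; // hypothetical project ID
const ENDPOINT = `https://translation.googleapis.com/v3/projects/${PROJECT_ID}/locations/global:translateText`;

export async function translateBatch(
  contents: string[],
  sourceLanguageCode: string,
  targetLanguageCode: string,
  accessToken: string,
): Promise<string[]> {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${accessToken}`,
      "Content-Type": "application/json",
    },
    // Batching: multiple strings are sent in a single `contents` array.
    body: JSON.stringify({ contents, sourceLanguageCode, targetLanguageCode, mimeType: "text/plain" }),
  });
  if (!res.ok) throw new Error(`Translation failed: ${res.status}`);
  const data = (await res.json()) as TranslateResult;
  return data.translations.map((t) => t.translatedText);
}
```

Example usage: `await translateBatch(["こんにちは"], "ja", "en", token)`.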
3.2 Infrastructure
- Cloud: Google Cloud Run (auto-scaling, cold-start <1s).
- Mobile: React Native (v0.70) with ML model caching via React Native FS.
- API Gateway: Apigee X (rate limiting, OAuth 2.0).
4. Architecture Overview
┌──────────────┐       ┌──────────────┐       ┌────────────────────┐
│  Mobile App  │──────▶│ API Gateway  │──────▶│ Cloud Translation  │
│   (React     │       │  (Apigee X)  │       │    & PyKakasi      │
│   Native)    │◀──────│              │◀──────│  (GCP Cloud Run)   │
└──────────────┘       └──────────────┘       └────────────────────┘
   ▲       │                                         │
   │       └────▶ On-Device Models (mBART-50)        │
   │                                                 ▼
   └────────── User Data Sync ──────────────▶ Firestore DB (Encrypted)
5. Implementation Steps
Phase 1: Core AI Pipeline (4 Weeks)
- Integrate Cloud Translation API:
  - Implement batch translation with fallback to mBART-50 when offline (see the fallback sketch after this list).
  - Optimize payloads using Protocol Buffers.
- Embed PyKakasi/Kuromoji:
  - Compile to WebAssembly for React Native (via Emscripten).
  - Benchmark target: <100ms per 100-character conversion.
- Model Caching:
  - Pre-load mBART-50 during app install using React Native FS.
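A minimal sketch of how the fallback and caching items fit together on the client, assuming react-native-fs for model caching and @react-native-community/netinfo for connectivity checks. `runOnDeviceTranslation` stands in for a hypothetical native bridge to ONNX Runtime with the distilled mBART-50 model, and the model download URL is a placeholder.

```typescript
// Phase 1 offline strategy sketch: online-first translation with a fallback
// to the cached on-device model, plus one-time model pre-loading.

import RNFS from "react-native-fs";
import NetInfo from "@react-native-community/netinfo";
import { translateBatch } from "./translateBatch"; // online path from the Section 3.1 sketch
import { runOnDeviceTranslation } from "./onnxBridge"; // hypothetical native bridge to ONNX Runtime

const MODEL_URL = "https://cdn.example.com/models/mbart50-distilled.onnx"; // placeholder URL
const MODEL_PATH = `${RNFS.DocumentDirectoryPath}/mbart50-distilled.onnx`;

// Pre-load the ~45MB model once, e.g. on first launch after install.
export async function ensureModelCached(): Promise<void> {
  if (await RNFS.exists(MODEL_PATH)) return;
  await RNFS.downloadFile({ fromUrl: MODEL_URL, toFile: MODEL_PATH }).promise;
}

// Try the Cloud Translation path first; fall back to the cached model
// when the device is offline or the API call fails.
export async function translateWithFallback(
  contents: string[],
  source: string,
  target: string,
  accessToken: string,
): Promise<string[]> {
  const net = await NetInfo.fetch();
  if (net.isConnected) {
    try {
      return await translateBatch(contents, source, target, accessToken);
    } catch {
      // Fall through to the on-device path on network/API errors.
    }
  }
  await ensureModelCached();
  return runOnDeviceTranslation(MODEL_PATH, contents, source, target);
}
```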
Phase 2: Offline & Security (3 Weeks)
- Data Encryption:
  - Encrypt user logs with AES-256; keys managed via the Android Keystore and iOS Keychain.
- Token-Based Access:
  - Validate Duolingo API tokens via OAuth 2.0 with PKCE (see the PKCE sketch after this list).
- Edge Caching:
  - Cache frequent translations via Cloud CDN (1-hour TTL).
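For the token exchange, PKCE (RFC 7636) requires a one-time code verifier and its S256 challenge. The sketch below uses Node's crypto module for clarity; in the React Native app this would come from a library such as react-native-app-auth or a crypto polyfill (an assumption, not a committed dependency).

```typescript
// Minimal PKCE sketch: generate a code_verifier and its S256 code_challenge,
// which is sent in the authorization request; the verifier is sent later in
// the token request.

import { createHash, randomBytes } from "crypto";

// Base64url-encode without padding, as required by RFC 7636.
function base64url(buf: Buffer): string {
  return buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

export function createPkcePair(): { verifier: string; challenge: string } {
  const verifier = base64url(randomBytes(32)); // 43-character verifier
  const challenge = base64url(createHash("sha256").update(verifier).digest());
  return { verifier, challenge };
}
```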
Phase 3: Scalability (Ongoing)
- Auto-scale Cloud Run (maximum 100 instances).
- Monitor latency via Cloud Trace; SLO: 95% of requests <500ms (see the sample-based check below).
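The SLO itself is a p95 threshold. As an illustration only (an assumption: a sample-based check outside Cloud Monitoring), the sketch below evaluates whether a batch of latency samples meets the 95%-under-500ms target.

```typescript
// Illustrative SLO check: true if at least 95% of the latency samples are
// below the threshold. In practice this is computed by Cloud Trace /
// Cloud Monitoring, not in application code.

export function meetsLatencySlo(latenciesMs: number[], thresholdMs = 500, quantile = 0.95): boolean {
  if (latenciesMs.length === 0) return true;
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const idx = Math.ceil(quantile * sorted.length) - 1; // index of the p95 sample
  return sorted[idx] < thresholdMs;
}
```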
6. Security & Compliance
- Data Privacy: GDPR/CCPA compliance; anonymize user data in analytics.
- Threat Mitigation:
  - DDoS: Cloud Armor with rate limiting at 1K requests/user/minute.
  - Model Tampering: JWT-signed model updates (see the verification sketch below).
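A hedged sketch of the model-update check: the manifest fields (modelUrl, sha256, version) and the public-key distribution are assumptions, and verification here uses the jsonwebtoken package rather than a mandated library.

```typescript
// Verify a JWT-signed model-update manifest before installing a new
// on-device model; rejects tampered or mis-signed updates.

import { verify } from "jsonwebtoken";

interface ModelManifest {
  modelUrl: string; // where to download the new model (hypothetical field)
  sha256: string;   // expected checksum of the model file (hypothetical field)
  version: string;  // model version string (hypothetical field)
}

export function verifyModelManifest(signedManifest: string, publicKeyPem: string): ModelManifest {
  // Throws if the signature is invalid or uses an unexpected algorithm.
  const payload = verify(signedManifest, publicKeyPem, { algorithms: ["RS256"] });
  return payload as unknown as ModelManifest;
}
```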
7. Performance Metrics
Component | Target |
---|---|
Translation Latency | ≤400ms (online), ≤800ms (offline) |
Romanization | ≤100ms |
API Uptime | 99.95% (SLA-backed) |
Model Size | <50MB (on-device) |
8. Future Extensibility
- Voice Recognition: Add Whisper.cpp (offline ASR) in v2.0.
- Personalization: Federated learning for adaptive content (TensorFlow Federated).
- Multi-Platform: Export PyKakasi to WebAssembly for web support.
Approvals:
- Lead Architect: [Signature]
- AI/ML Engineer: [Signature]