AI Selection Architecture Document
Project Name: Duory - AI Language Learning Companion
Version: 1.0
Date: October 26, 2023
1. Introduction
Duory is a mobile-first language learning tool integrated with Duolingo, offering translation, romanization, and kana support. This document outlines the AI architecture for real-time language processing, focusing on scalability (target: 50K+ concurrent users), low-latency responses (<500ms), and offline capability.
2. AI Feature Requirements
Feature | AI Requirement | Criticality |
---|---|---|
Translation | Multilingual NLP (50+ languages) | High |
Romanization | Phonetic transcription (e.g., Japanese → Romaji) | Medium |
Kana Support | Kanji-to-Kana conversion (Japanese) | High |
Offline Mode | On-device lightweight ML models | Medium |
Adaptive Learning | User progress analytics for personalized tips | Low |
3. Technology Selection & Justification
3.1 Core AI Frameworks
- Translation & NLP:
  - Google Cloud Translation API v3 (dynamic batch processing, 110+ languages, $20/million characters); a request sketch follows this list.
  - Offline backup: ONNX Runtime running a distilled mBART-50 model (45MB, 85% accuracy offline).
- Romanization/Kana Conversion:
  - PyKakasi v2.0 (Python) for Japanese romanization (Apache 2.0 license).
  - Kuromoji (v0.9.0) for Kanji-to-Kana conversion (Java/Kotlin).
- Analytics:
  - TensorFlow Lite (v2.10) for on-device learning-pattern detection.
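The online translation path is a direct call to the v3 translateText method. The sketch below is a minimal TypeScript illustration, assuming requests are routed through the Apigee X gateway in production; the project ID and the access-token source are placeholders, not final values.

```typescript
// Minimal sketch: calling Cloud Translation API v3 (translateText) over REST.
// PROJECT_ID is a placeholder; in production the call goes through Apigee X
// rather than directly to the googleapis.com endpoint.

interface TranslateResult {
  translations: { translatedText: string }[];
}

const PROJECT_ID = "duory-prod"; // hypothetical project ID
const ENDPOINT = `https://translation.googleapis.com/v3/projects/${PROJECT_ID}/locations/global:translateText`;

export async function translateBatch(
  contents: string[],
  sourceLanguageCode: string,
  targetLanguageCode: string,
  accessToken: string,
): Promise<string[]> {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${accessToken}`,
      "Content-Type": "application/json",
    },
    // Batching: multiple strings are sent in a single `contents` array.
    body: JSON.stringify({ contents, sourceLanguageCode, targetLanguageCode, mimeType: "text/plain" }),
  });
  if (!res.ok) throw new Error(`Translation failed: ${res.status}`);
  const data = (await res.json()) as TranslateResult;
  return data.translations.map((t) => t.translatedText);
}
```

Example usage: `await translateBatch(["こんにちは"], "ja", "en", token)`.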
3.2 Infrastructure
- Cloud: Google Cloud Run (auto-scaling, cold-start <1s).
- Mobile: React Native (v0.70) with ML model caching via React Native FS.
- API Gateway: Apigee X (rate limiting, OAuth 2.0).
4. Architecture Overview
┌──────────────┐       ┌──────────────┐       ┌────────────────────┐
│  Mobile App  │──────▶│ API Gateway  │──────▶│ Cloud Translation  │
│   (React     │       │  (Apigee X)  │       │    & PyKakasi      │
│   Native)    │◀──────│              │◀──────│  (GCP Cloud Run)   │
└──────────────┘       └──────────────┘       └────────────────────┘
   ▲       │                                         │
   │       └────▶ On-Device Models (mBART-50)        │
   │                                                 ▼
   └────────── User Data Sync ──────────────▶ Firestore DB (Encrypted)
5. Implementation Steps
Phase 1: Core AI Pipeline (4 Weeks)
- Integrate Cloud Translation API:
  - Implement batch translation with fallback to mBART-50 when offline (see the fallback sketch after this list).
  - Optimize payloads using Protocol Buffers.
- Embed PyKakasi/Kuromoji:
  - Compile to WebAssembly for React Native (via Emscripten).
  - Benchmark target: <100ms per 100-character conversion.
- Model Caching:
  - Pre-load mBART-50 during app install using React Native FS.
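A minimal sketch of how the fallback and caching items fit together on the client, assuming react-native-fs for model caching and @react-native-community/netinfo for connectivity checks. `runOnDeviceTranslation` stands in for a hypothetical native bridge to ONNX Runtime with the distilled mBART-50 model, and the model download URL is a placeholder.

```typescript
// Phase 1 offline strategy sketch: online-first translation with a fallback
// to the cached on-device model, plus one-time model pre-loading.

import RNFS from "react-native-fs";
import NetInfo from "@react-native-community/netinfo";
import { translateBatch } from "./translateBatch"; // online path from the Section 3.1 sketch
import { runOnDeviceTranslation } from "./onnxBridge"; // hypothetical native bridge to ONNX Runtime

const MODEL_URL = "https://cdn.example.com/models/mbart50-distilled.onnx"; // placeholder URL
const MODEL_PATH = `${RNFS.DocumentDirectoryPath}/mbart50-distilled.onnx`;

// Pre-load the ~45MB model once, e.g. on first launch after install.
export async function ensureModelCached(): Promise<void> {
  if (await RNFS.exists(MODEL_PATH)) return;
  await RNFS.downloadFile({ fromUrl: MODEL_URL, toFile: MODEL_PATH }).promise;
}

// Try the Cloud Translation path first; fall back to the cached model
// when the device is offline or the API call fails.
export async function translateWithFallback(
  contents: string[],
  source: string,
  target: string,
  accessToken: string,
): Promise<string[]> {
  const net = await NetInfo.fetch();
  if (net.isConnected) {
    try {
      return await translateBatch(contents, source, target, accessToken);
    } catch {
      // Fall through to the on-device path on network/API errors.
    }
  }
  await ensureModelCached();
  return runOnDeviceTranslation(MODEL_PATH, contents, source, target);
}
```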
Phase 2: Offline & Security (3 Weeks)
- Data Encryption:
  - Encrypt user logs with AES-256; keys managed via the Android Keystore and iOS Keychain.
- Token-Based Access:
  - Validate Duolingo API tokens via OAuth 2.0 with PKCE (see the PKCE sketch after this list).
- Edge Caching:
  - Cache frequent translations via Cloud CDN (1-hour TTL).
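For the token exchange, PKCE (RFC 7636) requires a one-time code verifier and its S256 challenge. The sketch below uses Node's crypto module for clarity; in the React Native app this would come from a library such as react-native-app-auth or a crypto polyfill (an assumption, not a committed dependency).

```typescript
// Minimal PKCE sketch: generate a code_verifier and its S256 code_challenge,
// which is sent in the authorization request; the verifier is sent later in
// the token request.

import { createHash, randomBytes } from "crypto";

// Base64url-encode without padding, as required by RFC 7636.
function base64url(buf: Buffer): string {
  return buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

export function createPkcePair(): { verifier: string; challenge: string } {
  const verifier = base64url(randomBytes(32)); // 43-character verifier
  const challenge = base64url(createHash("sha256").update(verifier).digest());
  return { verifier, challenge };
}
```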
Phase 3: Scalability (Ongoing)
- Auto-scale Cloud Run (maximum 100 instances).
- Monitor latency via Cloud Trace; SLO: 95% of requests <500ms (see the sample-based check below).
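The SLO itself is a p95 threshold. As an illustration only (an assumption: a sample-based check outside Cloud Monitoring), the sketch below evaluates whether a batch of latency samples meets the 95%-under-500ms target.

```typescript
// Illustrative SLO check: true if at least 95% of the latency samples are
// below the threshold. In practice this is computed by Cloud Trace /
// Cloud Monitoring, not in application code.

export function meetsLatencySlo(latenciesMs: number[], thresholdMs = 500, quantile = 0.95): boolean {
  if (latenciesMs.length === 0) return true;
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const idx = Math.ceil(quantile * sorted.length) - 1; // index of the p95 sample
  return sorted[idx] < thresholdMs;
}
```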
6. Security & Compliance
- Data Privacy: GDPR/CCPA compliance; anonymize user data in analytics.
- Threat Mitigation:
  - DDoS: Cloud Armor with rate limiting at 1K requests/user/minute.
  - Model Tampering: JWT-signed model updates (see the verification sketch below).
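A hedged sketch of the model-update check: the manifest fields (modelUrl, sha256, version) and the public-key distribution are assumptions, and verification here uses the jsonwebtoken package rather than a mandated library.

```typescript
// Verify a JWT-signed model-update manifest before installing a new
// on-device model; rejects tampered or mis-signed updates.

import { verify } from "jsonwebtoken";

interface ModelManifest {
  modelUrl: string; // where to download the new model (hypothetical field)
  sha256: string;   // expected checksum of the model file (hypothetical field)
  version: string;  // model version string (hypothetical field)
}

export function verifyModelManifest(signedManifest: string, publicKeyPem: string): ModelManifest {
  // Throws if the signature is invalid or uses an unexpected algorithm.
  const payload = verify(signedManifest, publicKeyPem, { algorithms: ["RS256"] });
  return payload as unknown as ModelManifest;
}
```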
7. Performance Metrics
Component | Target |
---|---|
Translation Latency | ≤400ms (online), ≤800ms (offline) |
Romanization | ≤100ms |
API Uptime | 99.95% (SLA-backed) |
Model Size | <50MB (on-device) |
8. Future Extensibility
- Voice Recognition: Add Whisper.cpp (offline ASR) in v2.0.
- Personalization: Federated learning for adaptive content (TensorFlow Federated).
- Multi-Platform: Export PyKakasi to WebAssembly for web support.
Approvals:
- Lead Architect: [Signature]
- AI/ML Engineer: [Signature]