Think & Speak
Technical Architecture & Core Features Workshop
A deep-dive into how our AI-powered English learning platform delivers real-time translation, phoneme-level feedback, and personalized learning paths — all in a single, integrated ecosystem.
React + TypeScript · Next.js · Azure Speech SDK · WebSocket · Redux · Ant Design
§1 Platform Architecture Overview
1 Technology Stack & Foundation
⚛️ Core Framework
Next.js + React + TypeScript frontend with server-side rendering. Redux state management for application-wide data consistency. Ant Design component library providing accessible, consistent UI elements across all modules.
🎨 CSS Modules Strategy
All styles use .module.scss files generating unique hashed class names — preventing any style leakage between Dashboard, Course Adventure, Vocabulary Builder, Live Translation, and Assessment modules. Each component is self-contained and portable.
// Scoped to this component only
import styles from './index.module.scss';

<div className={styles.container}>...</div>
📊 Data Visualization
ECharts for complex charts and graphs. react-circular-progressbar for skill completion indicators. Real-time data visualization throughout the Dashboard with animated progress rings.
2 Design Philosophy & Architecture
🏗️ Component Architecture
  • Modular component design — each feature is an isolated, reusable unit
  • Page Components pattern: src/pageComponents/account/pages/login/
  • Barrel exports in Index.tsx for clean imports and tree-shaking
  • Custom hooks: useCall* pattern for standardized API layers
  • SCSS variables at :root level for theming consistency
📱 Responsive System
  • Global classes: .mobile, .tablet, .desktop
  • Fluid typography via clamp() and rem units
  • CJK typography with line-height 1.6–1.8 for Chinese character clarity
  • 44px minimum touch targets for young learners (ages 6–8)
  • 48px targets on ruggedized school tablets
🌐 Multi-Environment
  • Production: app.thinkandspeak.com
  • UAT: uat-app.thinkandspeak.com
  • next-i18next for trilingual support (English / Cantonese / Mandarin)
  • JWT-based authentication with role-based routing
§2 Real-Time Live Translation Engine
3 WebSocket Communication Layer
🔌 Connection Architecture
Persistent WebSocket connections via wsRef React refs to backend signaling server (NEXT_PUBLIC_WS_ROOM_URL). Exponential backoff reconnection — max 3 attempts, 10-second delays — prevents connection storms during widespread network failures.
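// Connection lifecycle management
import { useRef } from 'react';

const wsRef = useRef<WebSocket | null>(null);
const reconnectTimeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);
const reconnectAttemptsRef = useRef(0);
// Timer handle for auto-refreshing recognition tokens every 10 min
const autoRefreshRecognitionTokenTimeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);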
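A minimal sketch of the reconnection schedule described above. The helper name and its parameters are hypothetical, and it assumes the 10-second figure is the base delay that doubles on each attempt; the real implementation may space attempts differently.

```typescript
// Hypothetical helper, not taken from the platform's codebase.
// Returns how long to wait before the next reconnect attempt,
// or null once the attempt ceiling has been reached.
function nextReconnectDelayMs(
  attemptsSoFar: number,
  baseDelayMs: number = 10000, // assumption: 10 s is the base delay
  maxAttempts: number = 3,
): number | null {
  if (attemptsSoFar >= maxAttempts) return null; // stop retrying
  return baseDelayMs * Math.pow(2, attemptsSoFar); // doubles each attempt
}

// Usage: the schedule for a connection that never recovers.
const schedule: Array<number | null> = [0, 1, 2, 3].map((n) => nextReconnectDelayMs(n));
console.log(schedule); // [10000, 20000, 40000, null]
```

Returning null rather than throwing lets the caller switch the connection indicator to "disconnected" without extra error handling.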
🎤 Azure Speech SDK Integration
Microsoft Azure Cognitive Services Speech SDK for enterprise-grade recognition. Source language en-US with targets: zh-CN (Simplified Chinese) and yue-HK (Cantonese). Real-time phoneme-level data streamed to WebSocket while structured Message objects dispatched to Redux.
📡 Polling & Sync
  • Room metadata polling: 10-second intervals via isContinuouslyFetchRoomsRef
  • Attendance roster polling: 10-second intervals via isContinuouslyFetchAttendantsRef
  • Failure-count ceilings prevent infinite retry loops
  • Handles rosters of up to 10,000 students in a single request with pageSize: 10000
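The failure-count ceiling can be sketched as a small state transition. The names below and the ceiling of 3 are assumptions for illustration; the source does not state the actual limit or whether failures are counted consecutively.

```typescript
// Hypothetical polling state: stops once too many polls fail in a row.
interface PollState {
  consecutiveFailures: number;
  active: boolean; // false means polling has been halted
}

function recordPollResult(
  state: PollState,
  succeeded: boolean,
  maxFailures: number = 3, // assumed ceiling
): PollState {
  if (!state.active) return state; // already halted, nothing to update
  if (succeeded) return { consecutiveFailures: 0, active: true };
  const failures = state.consecutiveFailures + 1;
  return { consecutiveFailures: failures, active: failures < maxFailures };
}

// Usage: three failed polls in a row halt the loop.
let pollState: PollState = { consecutiveFailures: 0, active: true };
for (const ok of [false, false, false]) {
  pollState = recordPollResult(pollState, ok);
}
console.log(pollState.active); // false
```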
4 Speech Recognition Pipeline
🔄 State Machine
Idle → Recognizing → Finalizing → Dispatch
Intermediate results ("recognizing" events) enable real-time subtitle streaming — don't wait for complete sentences. Final results ("recognized" events) populate session history for vocabulary extraction.
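A rough sketch of how the two event kinds could be routed: intermediate text overwrites the live subtitle, final text is appended to history. The event and state shapes are hypothetical and independent of the Azure SDK's own types.

```typescript
// Hypothetical event and state shapes, for illustration only.
type RecognitionEvent =
  | { type: "recognizing"; text: string } // intermediate hypothesis
  | { type: "recognized"; text: string }; // final result

interface SubtitleState {
  liveSubtitle: string; // what the learner sees right now
  history: string[]; // finalized utterances for vocabulary extraction
}

function applyRecognitionEvent(state: SubtitleState, event: RecognitionEvent): SubtitleState {
  switch (event.type) {
    case "recognizing":
      // Stream partial text immediately; do not touch history.
      return { liveSubtitle: event.text, history: state.history };
    case "recognized":
      // Commit the final text and clear the live line.
      return { liveSubtitle: "", history: state.history.concat(event.text) };
  }
}

// Usage
let subtitles: SubtitleState = { liveSubtitle: "", history: [] };
subtitles = applyRecognitionEvent(subtitles, { type: "recognizing", text: "Let's exp" });
subtitles = applyRecognitionEvent(subtitles, { type: "recognized", text: "Let's explore." });
console.log(subtitles.history.length); // 1
```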
📝 Message Object Structure
{
  "originalText": "Let's explore...",
  "translations": {
    "zh-CN": "让我们探索...",
    "yue-HK": "我哋探索下..."
  },
  "timestamp": "ISO8601",
  "senderId": "host_user_01"
}
Each utterance becomes a persistent learning asset — flowing from live WebSocket into Redux store, then to session history, then to vocabulary extraction.
🛡️ Token Security
RSA PEM-encoded key pairs (genKeyPair, decryptFromServer) encrypt Azure access tokens during transit. Enterprise-grade security for school deployments.
§3 Session Lifecycle & Classroom Management
5 Room-Based Session Architecture
🏠 Room Creation
  • Room name validation: max 50 characters, non-empty constraint
  • Duration range: 1–480 minutes (covers quick reviews to extended sessions)
  • ISO 8601 datetime: YYYY-MM-DDTHH:mm:ss
  • Hierarchical Tree component for school/class/student selection
  • Composite key: ${schoolId}-${studentId} for multi-tenant data
  • handleInviteAllStudents and handleInviteAllInSchool batch operations
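The room-creation constraints above could be checked roughly as follows. Function and field names are hypothetical; trimming the name and requiring a whole number of minutes are assumptions beyond what the slide states.

```typescript
// Hypothetical draft shape for a room before it is submitted.
interface RoomDraft {
  name: string;
  durationMinutes: number;
}

// Returns a list of human-readable problems; empty means the draft is valid.
function validateRoomDraft(draft: RoomDraft): string[] {
  const errors: string[] = [];
  const name = draft.name.trim(); // assumption: surrounding whitespace is ignored
  if (name.length === 0) errors.push("Room name must not be empty");
  if (name.length > 50) errors.push("Room name must be at most 50 characters");
  const d = draft.durationMinutes;
  if (!(Math.floor(d) === d && d >= 1 && d <= 480)) {
    errors.push("Duration must be between 1 and 480 minutes");
  }
  return errors;
}

// Composite key used to tell apart students with the same id in different schools.
function inviteKey(schoolId: string, studentId: string): string {
  return `${schoolId}-${studentId}`;
}

// Usage
console.log(validateRoomDraft({ name: "Unit 3 Review", durationMinutes: 45 })); // []
console.log(inviteKey("school7", "1024")); // "school7-1024"
```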
👥 Attendance States
Three-state model distinguishes participant engagement:
  • Attending: WebSocket-connected
  • Invited: API-authorized
  • Available: Eligible, unassigned
📊 Attendance & Heatmap
Real-time engagement monitoring via WebSocket join/leave events. attendingStudentIds and invitedStudentIds track presence. "Question" and "understand" signals populate the Live Learning Feedback & Heatmap — teachers see class comprehension at a glance.
6 Room Status State Machine
🔄 Status Transitions
NOT_STARTED → STARTED → IN_PROGRESS → COMPLETED
Auto-detection: onlineStudentCount > 0 triggers IN_PROGRESS. startTime comparison with dayjs library drives temporal triggers via callModifyRoomStatus API.
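One way the status could be derived from time and presence, as a sketch only. It uses plain Date arithmetic in place of dayjs to stay self-contained, and it assumes a room is COMPLETED once startTime plus its duration has passed; the source does not say what triggers completion.

```typescript
type RoomStatus = "NOT_STARTED" | "STARTED" | "IN_PROGRESS" | "COMPLETED";

// Hypothetical snapshot of the fields that drive the status.
interface RoomSnapshot {
  startTime: Date;
  durationMinutes: number;
  onlineStudentCount: number;
}

function deriveRoomStatus(room: RoomSnapshot, now: Date): RoomStatus {
  const startMs = room.startTime.getTime();
  const endMs = startMs + room.durationMinutes * 60000; // assumed end of session
  const nowMs = now.getTime();
  if (nowMs < startMs) return "NOT_STARTED";
  if (nowMs >= endMs) return "COMPLETED";
  // Within the session window: presence decides between the two live states.
  return room.onlineStudentCount > 0 ? "IN_PROGRESS" : "STARTED";
}

// Usage
const demoRoom: RoomSnapshot = {
  startTime: new Date(2024, 0, 15, 9, 0, 0),
  durationMinutes: 60,
  onlineStudentCount: 3,
};
console.log(deriveRoomStatus(demoRoom, new Date(2024, 0, 15, 9, 10, 0))); // "IN_PROGRESS"
```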
👤 Student Presence & Social Accountability
  • Avatar stacking with 20% overlap — up to 6 visible + overflow counter
  • Smart fallbacks: getAvatarUrl → emoji representations (👨/👩)
  • roomCardInactive: grayscale + 0.6 opacity for completed sessions
  • Dynamic theme hashing: roomCardBlue, roomCardYellow, roomCardPink
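A small sketch of the avatar overflow and theme hashing ideas. The hash function is an arbitrary choice for illustration, not the platform's; only the three class names and the six-avatar limit come from the slide.

```typescript
const ROOM_THEMES: string[] = ["roomCardBlue", "roomCardYellow", "roomCardPink"];

// Deterministically maps a room id to one of the theme classes,
// so a given room keeps its color across renders.
function themeForRoom(roomId: string): string {
  let hash = 0;
  for (let i = 0; i < roomId.length; i++) {
    hash = (hash * 31 + roomId.charCodeAt(i)) >>> 0; // keep within 32 bits
  }
  return ROOM_THEMES[hash % ROOM_THEMES.length];
}

// Splits a roster into the avatars to draw and the "+N" overflow count.
function splitAvatars<T>(students: T[], maxVisible: number = 6): { visible: T[]; overflow: number } {
  return {
    visible: students.slice(0, maxVisible),
    overflow: Math.max(0, students.length - maxVisible),
  };
}

// Usage
const stack = splitAvatars(["a", "b", "c", "d", "e", "f", "g", "h"]);
console.log(stack.visible.length, stack.overflow); // 6 2
```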
📋 RoomHeader Responsive
  • useBreakpoints hook adapts desktop ↔ mobile
  • Connection status: green (connected), blue (connecting), red (disconnected)
  • roomSelectorOptions dropdown for multi-session navigation
  • isLoadingExport state for transcript generation UX
§4 Smart Note Capture & MyNotebook
7 MyNotebook Ecosystem
📓 Nine-Category Taxonomy
📘 Lecture
📝 Notes
📅 Daily Log
❓ Questions
💬 Discussion
📋 Instructions
📄 Assignment
🔤 Vocabulary
📖 Grammar
labelToVariant mapping + slugifyTag for custom categories. transformNoteListToMyNoteBookDialogData parses comma-separated labels, counts per category, and maps to icon variants.
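The label parsing and per-category counting could look roughly like this. slugifyLabel and countLabels are hypothetical stand-ins, not the platform's slugifyTag or transformNoteListToMyNoteBookDialogData, and this slug rule handles ASCII labels only.

```typescript
// Hypothetical slug rule: lowercase, non-alphanumeric runs become "-".
// ASCII-only sketch; labels in other scripts would need a different rule.
function slugifyLabel(label: string): string {
  return label
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Counts notes per category from comma-separated label strings.
function countLabels(notes: Array<{ labels: string }>): { [slug: string]: number } {
  const counts: { [slug: string]: number } = {};
  for (const note of notes) {
    for (const raw of note.labels.split(",")) {
      const slug = slugifyLabel(raw);
      if (slug.length > 0) counts[slug] = (counts[slug] || 0) + 1;
    }
  }
  return counts;
}

// Usage
const labelCounts = countLabels([
  { labels: "Vocabulary, Grammar" },
  { labels: "vocabulary" },
  { labels: "Daily Log" },
]);
console.log(labelCounts); // { vocabulary: 2, grammar: 1, "daily-log": 1 }
```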
✏️ Dual-Mode Editing
  • Review mode: Tags displayed as static antd <Tag> components
  • Capture mode: Checkbox.Group for multi-label classification
  • Remark system: 250-character limit, autoSize: {minRows:1, maxRows:4}
  • Search: Dual-field across noteContent + comment via getCommentSearchText
  • Batch clear: Category-specific or all-notes deletion with confirmation modals
8 Session History & Vocabulary Pipeline
🔤 Word Extraction Engine
Unicode-aware regex tokenization using /[\p{L}]/gu — preserves multilingual word boundaries. Filters numeric-only tokens (/^\d+$/). Supports English-Chinese-Cantonese trilingual extraction.
// Tokenization pipeline
function separateWords(text: string): string[] {
  return text.split(' ')
    // keep tokens that contain at least one Unicode letter
    .filter(w => /\p{L}/u.test(w))
    // drop numeric-only tokens
    .filter(w => !/^\d+$/.test(w));
}
🎯 Entity Recognition & Color Coding
  • hasWordEntity / getWordEntity — case-insensitive matching against taggedWordListData
  • Known words → green (sessionHistoryWordWithEntity)
  • New words → blue (zone of proximal development)
  • Clicking known word → LiveWordPracticeDialog (dynamic import, ssr: false)
  • Clicking unknown word → categorization modal with custom tags
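Case-insensitive matching against the tagged word list can be sketched with a lowercase-keyed lookup. The function names and the WordEntity shape here are illustrative assumptions, not the platform's hasWordEntity / getWordEntity.

```typescript
// Hypothetical shape of one entry in the tagged word list.
interface WordEntity {
  word: string;
  tags: string[];
}

type EntityIndex = { [lowercasedWord: string]: WordEntity };

// Build once per list; keys are lowercased so lookups ignore case.
function buildEntityIndex(tagged: WordEntity[]): EntityIndex {
  const index: EntityIndex = Object.create(null); // no inherited keys
  for (const entity of tagged) {
    index[entity.word.toLowerCase()] = entity;
  }
  return index;
}

function findEntity(index: EntityIndex, word: string): WordEntity | undefined {
  return index[word.toLowerCase()];
}

// Known words render green, new words blue.
function wordColor(index: EntityIndex, word: string): "green" | "blue" {
  return findEntity(index, word) ? "green" : "blue";
}

// Usage
const entityIndex = buildEntityIndex([{ word: "Explore", tags: ["To Practice"] }]);
console.log(wordColor(entityIndex, "explore")); // "green"
console.log(wordColor(entityIndex, "galaxy")); // "blue"
```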
Temporal Organization
  • Today: time only (HH:mm:ss)
  • Current year: month/day + time
  • Previous years: full date via date-fns
  • Chronological listing with delete capabilities
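The three display rules above, sketched with plain Date getters instead of date-fns so the example stays self-contained. The exact separators and field order are assumptions; only the today / current year / previous years split comes from the slide.

```typescript
function pad2(n: number): string {
  return (n < 10 ? "0" : "") + n;
}

// Picks the display format by comparing the entry time with "now" (local time).
function formatHistoryTimestamp(when: Date, now: Date): string {
  const time = `${pad2(when.getHours())}:${pad2(when.getMinutes())}:${pad2(when.getSeconds())}`;
  const monthDay = `${pad2(when.getMonth() + 1)}/${pad2(when.getDate())}`;
  const sameYear = when.getFullYear() === now.getFullYear();
  const sameDay =
    sameYear && when.getMonth() === now.getMonth() && when.getDate() === now.getDate();
  if (sameDay) return time; // today: time only
  if (sameYear) return `${monthDay} ${time}`; // current year: month/day + time
  return `${when.getFullYear()}/${monthDay}`; // other years: full date
}

// Usage
const nowDemo = new Date(2024, 4, 10, 12, 0, 0);
console.log(formatHistoryTimestamp(new Date(2024, 4, 10, 9, 5, 3), nowDemo)); // "09:05:03"
console.log(formatHistoryTimestamp(new Date(2024, 0, 2, 8, 0, 0), nowDemo)); // "01/02 08:00:00"
console.log(formatHistoryTimestamp(new Date(2023, 11, 31, 23, 59, 59), nowDemo)); // "2023/12/31"
```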
§5 Vocabulary Mastery & Flashcard System
9 Spaced Repetition Engine
🃏 Three-State Mastery System
  • To Practice: Active queue
  • Need Review: Requires attention
  • Mastered: Completed
convertToMastered() checks practice scores ≥ 50 before allowing mastery. Tags stored as comma-separated strings, parsed into filter arrays.
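A sketch of the mastery gate and tag parsing. Whether the real convertToMastered() looks at every score, the best score, or only the latest is not stated; this version assumes every recorded score must reach 50, and both function names are hypothetical.

```typescript
const MASTERY_MIN_SCORE = 50; // threshold from the slide

// Parses the comma-separated tag string into a clean filter array.
function parseTags(tags: string): string[] {
  return tags
    .split(",")
    .map((t) => t.trim())
    .filter((t) => t.length > 0);
}

// Assumption: a word with no practice history cannot be marked Mastered,
// and every recorded score must meet the threshold.
function canConvertToMastered(practiceScores: number[]): boolean {
  return (
    practiceScores.length > 0 &&
    practiceScores.every((score) => score >= MASTERY_MIN_SCORE)
  );
}

// Usage
console.log(parseTags("To Practice, Science ,")); // ["To Practice", "Science"]
console.log(canConvertToMastered([72, 55])); // true
console.log(canConvertToMastered([72, 40])); // false
```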
🎴 FlashCardDialog Architecture
  • Card stack metaphor with z-index layering and rotate(-10deg) / rotate(+10deg)
  • Gradient backgrounds: linear-gradient(124.76deg, #00A1FF)
  • 3D flip-card with transform-style: preserve-3d + backface-visibility: hidden
  • Rich metadata: headword, IPA transcription, part-of-speech tags
  • Audio playback with animated ripple effects on microphone button
  • Swipe left → Need Review / Swipe right → Mastered (score validation required)
10 Vocabulary Trainer — Ten Levels
📈 Progressive Difficulty System
Level | Name | Color Intensity
1–2 | First / Simple Words | Light Blue
3–4 | Basic / Learning Words | Blue
5–6 | Intermediate / Advanced | Strong Blue
7–8 | Expert / Master Words | Orange
9–10 | Specialist / Extreme | Deep Orange
🎯 Five Training Tasks Per Word
  • Learn & Read: Listen → Read → Record pronunciation (pronunciation-only scoring)
  • Spelling I: Identify correctly spelled sentence (pronunciation + attempt)
  • Spelling II: New spelling example (pronunciation + attempt)
  • Word Usage: Choose correct sentence and read aloud (pronunciation + attempt)
  • Picture Dictation: Listen and spell the word shown (attempt-only)
Each task retriable up to 3× with score deductions. Words scoring >80 move to Mastered library. AI Smart Tips personalize suggestions based on performance patterns.
§6 Conversation Role Play & Pronunciation
11 Phoneme-Level Pronunciation Assessment
🎤 Four-Dimensional Scoring
  • Accuracy: Phonetic precision
  • Fluency: Rhythm & prosody
  • Completeness: Lexical coverage
  • Intonation: Pitch & stress
🎨 Traffic-Light Color System
  • Green (90–100): Native-like production — recommend maintaining
  • Yellow (70–89): Good but non-native — room for improvement
  • Orange (40–69): Intelligible but non-native — needs practice
  • Red (0–39): Communication-impairing — targeted remediation
Word-level spans color-coded per score. Clicking a word reveals phoneme-level breakdown — distinguish /θ/ vs /s/, /æ/ vs /ɛ/ — critical for Cantonese-speaking students.
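The score bands above map directly to a small lookup. The function name and the range check are illustrative additions; the thresholds are the ones listed in the traffic-light system.

```typescript
type ScoreBand = "green" | "yellow" | "orange" | "red";

// Maps a 0–100 pronunciation score to its traffic-light band.
function scoreBand(score: number): ScoreBand {
  if (!(score >= 0 && score <= 100)) {
    throw new RangeError(`score out of range: ${score}`);
  }
  if (score >= 90) return "green";
  if (score >= 70) return "yellow";
  if (score >= 40) return "orange";
  return "red";
}

// Usage: color each word span by its own score.
const wordScores = [
  { word: "think", score: 38 },
  { word: "about", score: 92 },
  { word: "that", score: 71 },
];
console.log(wordScores.map((w) => `${w.word}:${scoreBand(w.score)}`).join(" "));
// think:red about:green that:yellow
```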
🔊 Audio Pipeline
RecordRTC with StereoAudioRecorder, 16kHz sampling (human speech range: 300Hz–3.4kHz). 300–500ms debounce prevents double-clicks. 15-second safety timeout prevents dead air. WAV MIME preserved on upload.
12 Speaking Practice Modes
🎭 Role-Play Scenarios
  • Presentations: Structured academic speaking with timed delivery
  • Debates: Argument frameworks with opposing viewpoint practice
  • Discussions: Multi-turn conversation simulating DSE Paper 3
  • Read & Speak: Input-to-output transformation exercises
🤖 Dual AI Tutor Model
  • Model.GPT (General English): everyday vocabulary, casual discussions, fluency building
  • Model.RAG (Exam Prep): HKDSE-aligned content, structured arguments, evidence evaluation
Switch disabled during server processing — prevents mid-conversation confusion. ChatHeader integration with Model enum.
📊 Assessment Metrics
  • Interruption count — measures turn-taking etiquette (DSE criterion)
  • Phase-based analysis — Opening, Exploration, Challenge, Synthesis
  • Part A + Part B separation — mirrors official DSE scoring
  • Rank badge — overall performance level indicator
§7 Reading Comprehension & DSE Preparation
§8 Assessment & Personalization Engine
15 Eight-Dimension Learner Profiling
🧠 Assessment Dimensions
# | Dimension | Options
1 | Age Groups | 6–8, 9–11, 12–14, 15–17, 18+
2 | Gender | Avatar & theme customization
3 | Interest Topics | 11 categories
4 | Learning Personas | 9 types
5 | Motivation Factors | 8 categories
6 | Learning Styles | 8 methods
7 | Proficiency Level | Scale 1–10
8 | Curious Subjects | 22 academic topics
⚠️ This Is NOT a Test
Students don't "pass" or "fail." The system learns preferences to deliver personalized content. Profiling happens once during onboarding, then informs everything.
16 Personalization Pipeline
🔄 Data Flow
Assessment Input → Learner Profile Created → Content Adaptation + Difficulty Adjustment + Topic Recommendations → Personalized Dashboard
📊 Age-Based Color Psychology
  • 6–8: Warm Gold
  • 9–11: Teal
  • 12–14: Cool Blue
  • 15–17: Orange
  • 18+: Deep Orange
data-age-group attribute selectors cascade color themes through the entire interface — from authentication to Course Adventure progression to Live Translation cohort indicators.
🎓 Interesting Subjects (22 Topics)
Math, Science, Biology, Chemistry, Physics, History, Geography, Economics, Business Studies, Literature, Music, Art, PE, Computer Science, General Studies, Religious Studies, Philosophy, Psychology, Sociology, Environmental Science, Astronomy, Law Basics — includes HKDSE-relevant areas.
§9 Dashboard & Progress Analytics
17 Student Dashboard Metrics
  • Practice Sessions: 127
  • Words Mastered: 342
  • Time Spent: 48h
  • Courses Done: 8
⚙️ Technical Architecture
  • useCallGetUserDashboard — custom API hook for real-time data
  • Redux store synchronization for state consistency
  • useModuleGuardian — access control for module sequencing
  • Optimistic UI updates with rollback on API failure
  • Skeleton loading states during data fetch
🏆 Gamification Elements
  • Points system earned through completions, streaks, and scores
  • Achievement badges unlocking at milestones
  • Login streak tracking with celebration animations
  • Progress bars for each module with animated fills
18 Multi-Tab Reporting Suite
📋 Three Report Categories
🎓 Courses Report
Enrollment tracking, completion rates, skill development (Fluency, Accuracy, Completeness, Intonation)
🎤 Scripts Report
Speaking practice monitoring, script mastery, best performance highlighting, improvement trends
📚 VocabBank Report
5-skill radar chart: Vocab Guide, Spelling L1, Spelling L2, Vocab Usage, Dictation
⏱️ Time Range & Comparison
  • Preset filters: 7 days, 30 days, 90 days, all-time
  • Period comparison: current vs. previous statistical analysis
  • Circular progress visualization for overall performance
  • AI-powered insights: summary and improvement suggestions
🎯 Spaced Repetition Integration
Ebbinghaus forgetting curve optimization: 1 day → 3 days → 1 week → 2 weeks → 1 month → 3 months. Students using spaced repetition retain vocabulary 3× better than mass exposure. "Words Mastered" requires 4+ successful review cycles.
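The review schedule above can be sketched as an interval table plus a cycle counter. Treating "1 month" as 30 days and "3 months" as 90 days, and resetting the counter after a failed review, are assumptions; the slide only gives the interval sequence and the 4-cycle mastery rule.

```typescript
// 1 day → 3 days → 1 week → 2 weeks → 1 month → 3 months (months approximated)
const REVIEW_INTERVALS_DAYS: number[] = [1, 3, 7, 14, 30, 90];
const MASTERY_CYCLES = 4; // "Words Mastered" requires 4+ successful reviews

interface ReviewState {
  successfulCycles: number;
}

// Days until the next review, capped at the longest interval.
function nextReviewInDays(state: ReviewState): number {
  const i = Math.min(state.successfulCycles, REVIEW_INTERVALS_DAYS.length - 1);
  return REVIEW_INTERVALS_DAYS[i];
}

// Assumption: a failed review restarts the schedule from the first interval.
function applyReview(state: ReviewState, passed: boolean): ReviewState {
  return { successfulCycles: passed ? state.successfulCycles + 1 : 0 };
}

function isMastered(state: ReviewState): boolean {
  return state.successfulCycles >= MASTERY_CYCLES;
}

// Usage: four passes in a row reach mastery.
let review: ReviewState = { successfulCycles: 0 };
for (let i = 0; i < 4; i++) review = applyReview(review, true);
console.log(isMastered(review), nextReviewInDays(review)); // true 30
```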
§10 AI Tutor & Thinking Lab
19 AI Tutor System
🤖 Dual Tutor Modes
ChatBox/ChatFooter component with dual-model system (GPT vs. RAG) and input mode flexibility (text/audio). Adapts to proficiency — beginners type while advanced students use speech-to-speech.
  • Model.GPT: General English conversation, fluency, everyday vocabulary
  • Model.RAG (Exam Prep): HKDSE content, structured arguments, evidence quality
💭 Conversational UX
  • Auto-scrolling: scrollContainer.scrollTo({top: scrollHeight}) mimics eye contact
  • Activity state awareness: isUserSending, isSystemThinking, isSystemAnswering
  • "Clear History" hidden during active AI explanation — protects impulsive young learners
  • Message count logged to analytics before clearing
  • Sticky messages: system/host messages pinned to screen for broadcast instructions
20 Thinking Lab — Critical Thinking
🧪 Debate & Argumentation Module
  • Distinct learning vertical for debate and critical thinking
  • External bridge to nexus-ari-ai.replit.app for specialized debate environments
  • Structured arguments: opening, evidence, counterpoint, conclusion
  • Academic vocabulary for economics, science, social issues
  • Real-time feedback on argument structure and evidence quality
🎨 Mascot Emotional Feedback (ARI)
  • Oops (0–50%): Encouraging
  • Keep Up (51–70%): Supportive
  • Amazing! (71–100%): Celebrating
Affective Filter reduction (Krashen): anxiety reduction through friendly, non-textual feedback. Blue mascot creates emotional safety — no harsh red X marks.
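The mascot bands can be expressed as a simple threshold lookup. The type and function names are hypothetical, and treating fractional values just above 50 or 70 as the next band up is an assumption.

```typescript
// Hypothetical mascot state: the label shown and the tone of the animation.
interface MascotFeedback {
  label: "Oops" | "Keep Up" | "Amazing!";
  tone: "encouraging" | "supportive" | "celebrating";
}

// Maps a 0–100 percentage to ARI's feedback band.
function mascotFeedback(percent: number): MascotFeedback {
  if (!(percent >= 0 && percent <= 100)) {
    throw new RangeError(`percent out of range: ${percent}`);
  }
  if (percent <= 50) return { label: "Oops", tone: "encouraging" };
  if (percent <= 70) return { label: "Keep Up", tone: "supportive" };
  return { label: "Amazing!", tone: "celebrating" };
}

// Usage
console.log(mascotFeedback(64).label); // "Keep Up"
```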
Skip Button — Interruptibility
opacity: 0 → opacity: 100 on hover. Students can interrupt the AI mid-explanation, just like interjecting in human conversation. Builds speaking confidence through agency.
§11 Read & Speak & Scripts Module
21 Read & Speak Module
📖 Daily Reading Comprehension
  • AI selects readings based on assessment interests/topics
  • Text complexity dynamically adjusts to student proficiency
  • Guided Learning Mode with AI hints and scaffolding
  • Exam Mode with timed practice under authentic conditions
📝 Personalized Script Practice
  • Teachers upload scripts in bulk; students can upload their own
  • Listening and recording functions with pronunciation comparison
  • Practice record history and scores tracked per script
  • Top progress bar: total practice hours used vs. remaining
🔄 Batch Script Generation
Multi-word selections (≥5 words) auto-added to Scripts sidebar via pendingMultiSelectCommitRef with 400ms debounce. 180-character minimum ensures substantive practice passages. Authenticated discourse becomes personalized speaking material.
22 Learning Sidebar — Command Center
📜 Scripts Tab (Orange)
  • Contextual sentence collection for Read & Speak practice
  • Source room reference tracking and timestamps
  • Multi-word selection auto-add (≥5 words, 400ms debounce)
  • Script history archival with frequency tracking
📚 Categories Tab (Green)
  • Tag-based taxonomy: Subject, Theme, Difficulty, Custom Tags
  • Default "To Practice" tag auto-filtering
  • toPracticeCount: filteredWordList.filter(item => item.tags.includes('To Practice')).length
  • Flashcard queue management per category
📓 My Notebook Integration
myNotebookDialogOpenInfo.edit({roomId}) — seamless transition from live translation to personal study. Complete learning loop: listen → capture → review → master. Decorative CSS layers (polygons, stars, card layers) create game-like flashcard aesthetic.
§12 UI Architecture & Visual Design
23 Cognitive Load Management
Property | Desktop | Mobile
Font Size | 24px | 18px
Line Height | 32px | 26px
Max Width | 75% | 90%
Padding Top | 60px | 24px
Expanded line height prevents crowding during bilingual EN+ZH reading. 75% max-width = ~66 characters (optimal reading line length).
🌐 CJK Font Stack
font-family: system-ui,
  "Segoe UI", Roboto,
  "Noto Sans CJK SC",
  "Noto Sans CJK TC",
  sans-serif;
24 AI Presence Indicators
🤖 nodRocket Animation
@keyframes nodRocket {
  0% { transform: rotate(0deg); }
  25% { transform: rotate(-5deg); }
  75% { transform: rotate(5deg); }
  100% { transform: rotate(0deg); }
}
Blue mascot physically "nods" while processing — mimics human teacher non-verbal cues. Addresses the "uncanny valley" — system lag feels like active engagement, not dead time.
🎨 Iconography System
  • IconNote (34×34px) — in-context capture with pencil overlay
  • IconMyNotes (16×16px) — stroke-based for dense note lists
  • IconVocabulary — abstract "A" formation, stroke-linecap="round"
  • IconQuestions (22×22px) — concentric circles for confusion signals
  • IconExport (40×40px) — folder with upward arrow for learning assets
  • All icons use aria-hidden="true" and currentColor strategy
25 Animation & Performance
LinesAnimation (9s loop)
  • SVG SMIL animation (<animateTransform>) — GPU-accelerated 60fps
  • Three opacity layers: 0.55 / 0.35 / 0.22 (hint levels)
  • Sine wave paths: organic, non-linear motion
  • Pill-shaped clip path: rx="30" ry="30"
🚀 Performance Optimizations
  • content-visibility: auto for off-screen elements
  • contain: layout style paint isolation
  • will-change: transform for GPU acceleration
  • Dynamic imports with ssr: false for heavy components
  • Debounced search (500ms) and message deduplication
Accessibility
  • WCAG 2.1 AA: aria-hidden on decorative SVGs
  • prefers-reduced-motion support
  • Keyboard navigation: onPressEnter, autoFocus
  • 44px+ touch targets for young learners
§13 Summary & Getting Started
26 The Closed-Loop Learning System
🔄 Listen → Capture → Review → Master
🎤 LISTEN — Real-time bilingual subtitles (EN ↔ ZH)
📝 CAPTURE — Smart notes + vocabulary extraction
📖 REVIEW — Flashcards + spaced repetition
🏆 MASTER — Measurable progress + AI insights
🎯 For Your Classroom
  • Reduced prep time: Automated vocabulary extraction from lessons
  • Real-time engagement: Live feedback on student understanding
  • Differentiated instruction: AI adapts to each student's level
  • Multilingual support: Teach diverse classrooms without language barriers
  • Every class = a learning asset: Structured, reusable content
27 Getting Started
🚀 Demo Credentials
Role | Email | Password
Teacher | test_teacher@seechange-edu.com | Aa123456
Student | test_student1@seechange-edu.com | Aa123456
🌐 Access URLs
  • Production Environment: app.thinkandspeak.com
  • UAT / Testing Environment: uat-app.thinkandspeak.com
💬 Next Steps
  • Try the platform with the test credentials above
  • Explore the Live Translation feature with a colleague
  • Check the Dashboard analytics after completing a practice session
  • Contact us for school-wide deployment setup
"Teach with confidence. Learn without limits."