Think & Speak — Teacher Workshop Technical Deep-Dive

Think & Speak

Technical Architecture & Core Features Workshop

A deep-dive into how our AI-powered English learning platform delivers real-time translation, phoneme-level feedback, and personalized learning paths — all in a single, integrated ecosystem.

React + TypeScript · Next.js · Azure Speech SDK · WebSocket · Redux · Ant Design

1 Technology Stack & Foundation

⚛️ Core Framework

Next.js + React + TypeScript frontend with server-side rendering. Redux state management for application-wide data consistency. Ant Design component library providing accessible, consistent UI elements across all modules.

🎨 CSS Modules Strategy

All styles use .module.scss files generating unique hashed class names — preventing any style leakage between Dashboard, Course Adventure, Vocabulary Builder, Live Translation, and Assessment modules. Each component is self-contained and portable.

// Scoped to this component only
import styles from './index.module.scss';

<div className={styles.container}>...</div>

📊 Data Visualization

ECharts for complex charts and graphs. react-circular-progressbar for skill completion indicators. Real-time data visualization throughout the Dashboard with animated progress rings.

2 Design Philosophy & Architecture

🏗️ Component Architecture

Modular component design — each feature is an isolated, reusable unit
Page Components pattern: src/pageComponents/account/pages/login/
Barrel exports in Index.tsx for clean imports and tree-shaking
Custom hooks: useCall* pattern for standardized API layers
SCSS variables at :root level for theming consistency

📱 Responsive System

Global classes: .mobile, .tablet, .desktop
Fluid typography via clamp() and rem units
CJK typography with line-height 1.6–1.8 for Chinese character clarity
44px minimum touch targets for young learners (ages 6–8)
48px targets on ruggedized school tablets

🌐 Multi-Environment

Production: app.thinkandspeak.com
UAT: uat-app.thinkandspeak.com
next-i18next for trilingual support (English / Cantonese 粵語 / Mandarin 普通话)
JWT-based authentication with role-based routing

3 WebSocket Communication Layer

🔌 Connection Architecture

Persistent WebSocket connections are held in wsRef React refs and point at the backend signaling server (NEXT_PUBLIC_WS_ROOM_URL). Exponential backoff reconnection (max 3 attempts, 10-second delays) prevents connection storms during widespread network failures.

// Connection lifecycle management (declared inside the component)
const wsRef = useRef(null);
const reconnectTimeoutRef = useRef(null);
const reconnectAttemptsRef = useRef(0);
// Auto-refresh tokens every 10 min
const autoRefreshRecognitionTokenTimeoutRef = useRef(null);
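The reconnection policy above can be read as a small, pure delay calculator. The following is a minimal sketch under stated assumptions, not the platform's code: it treats the 10-second figure as a cap on the delay, and the 2.5-second base delay and the names reconnectDelayMs, BASE_RECONNECT_DELAY_MS, and MAX_RECONNECT_DELAY_MS are hypothetical.

```typescript
// Only the 3-attempt ceiling and the 10-second figure come from the text;
// the base delay and all identifiers here are illustrative assumptions.
const MAX_RECONNECT_ATTEMPTS = 3;
const MAX_RECONNECT_DELAY_MS = 10000;
const BASE_RECONNECT_DELAY_MS = 2500;

// Returns the wait before reconnect attempt `attempt` (1-based),
// or null once the attempt ceiling has been exceeded.
function reconnectDelayMs(attempt: number): number | null {
  if (attempt < 1 || attempt > MAX_RECONNECT_ATTEMPTS) return null;
  const exponential = BASE_RECONNECT_DELAY_MS * Math.pow(2, attempt - 1);
  return Math.min(exponential, MAX_RECONNECT_DELAY_MS);
}
```

In a component, the returned value would typically be passed to setTimeout and the timer handle stored in reconnectTimeoutRef, with reconnectAttemptsRef reset to 0 after a successful open.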

🎤 Azure Speech SDK Integration

Microsoft Azure Cognitive Services Speech SDK for enterprise-grade recognition. Source language en-US with targets: zh-CN (Simplified Chinese) and yue-HK (Cantonese). Real-time phoneme-level data is streamed to the WebSocket while structured Message objects are dispatched to Redux.

📡 Polling & Sync

Room metadata polling: 10-second intervals via isContinuouslyFetchRoomsRef
Attendance roster polling: 10-second intervals via isContinuouslyFetchAttendantsRef
Failure-count ceilings prevent infinite retry loops
Fetches rosters of up to 10,000 students in a single request with pageSize: 10000
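The failure-count ceiling described above can be sketched as a pure state update that a polling loop consults after each request. This is a minimal sketch: the ceiling of 5 consecutive failures and the names PollState and nextPollState are assumptions for illustration, since the source gives only the 10-second interval.

```typescript
// Hypothetical shape: how many polls in a row have failed, and whether
// the loop should keep scheduling further polls.
interface PollState {
  consecutiveFailures: number;
  active: boolean;
}

const POLL_INTERVAL_MS = 10000; // from the text: 10-second intervals
const MAX_CONSECUTIVE_FAILURES = 5; // assumed ceiling, not from the source

// Computes the state after one poll. A success resets the counter;
// reaching the ceiling deactivates polling so it cannot retry forever.
function nextPollState(state: PollState, succeeded: boolean): PollState {
  if (!state.active) return state;
  if (succeeded) return { consecutiveFailures: 0, active: true };
  const failures = state.consecutiveFailures + 1;
  return { consecutiveFailures: failures, active: failures < MAX_CONSECUTIVE_FAILURES };
}
```

A loop built on this would schedule the next poll after POLL_INTERVAL_MS only while active is true.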

4 Speech Recognition Pipeline

🔄 State Machine

Idle → Recognizing → Finalizing → Dispatch

Intermediate results ("recognizing" events) enable real-time subtitle streaming, so subtitles appear without waiting for complete sentences. Final results ("recognized" events) populate session history for vocabulary extraction.
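The four states above can be modelled as a transition table. This is a sketch under assumptions: only the state names and the "recognizing" / "recognized" events come from the text, while the "start", "dispatch", and "reset" events and the function name nextRecognitionState are hypothetical.

```typescript
type RecognitionState = "Idle" | "Recognizing" | "Finalizing" | "Dispatch";
// "recognizing" and "recognized" mirror the events named in the text;
// the other event names are illustrative assumptions.
type RecognitionEvent = "start" | "recognizing" | "recognized" | "dispatch" | "reset";

const RECOGNITION_TRANSITIONS: Record<
  RecognitionState,
  Partial<Record<RecognitionEvent, RecognitionState>>
> = {
  Idle: { start: "Recognizing" },
  // Interim results keep the machine in Recognizing (live subtitles).
  Recognizing: { recognizing: "Recognizing", recognized: "Finalizing" },
  Finalizing: { dispatch: "Dispatch" },
  Dispatch: { reset: "Idle" },
};

// Returns the next state, or the current state if the event is not allowed.
function nextRecognitionState(
  state: RecognitionState,
  event: RecognitionEvent
): RecognitionState {
  const next = RECOGNITION_TRANSITIONS[state][event];
  return next === undefined ? state : next;
}
```

Ignoring disallowed events, rather than throwing, is a design choice that keeps late or duplicate SDK callbacks from corrupting the state.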

📝 Message Object Structure

{
  "originalText": "Let's explore...",
  "translations": {
    "zh-CN": "让我们探索...",
    "yue-HK": "我哋探索下..."
  },
  "timestamp": "ISO8601",
  "senderId": "host_user_01"
}

Each utterance becomes a persistent learning asset — flowing from live WebSocket into Redux store, then to session history, then to vocabulary extraction.
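Because these objects arrive over a WebSocket before reaching Redux, a runtime check at the boundary is useful. The sketch below mirrors the JSON shape shown above; the names TranslationMessage and isTranslationMessage are hypothetical, and requiring the timestamp to be parseable by Date.parse is an assumption.

```typescript
// Mirrors the Message object fields shown above.
interface TranslationMessage {
  originalText: string;
  translations: Record<string, string>; // keyed by locale, e.g. "zh-CN"
  timestamp: string; // ISO 8601
  senderId: string;
}

// Type guard for untrusted payloads (for example, the result of JSON.parse).
function isTranslationMessage(value: unknown): value is TranslationMessage {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const originalText = candidate["originalText"];
  const translations = candidate["translations"];
  const timestamp = candidate["timestamp"];
  const senderId = candidate["senderId"];
  if (typeof originalText !== "string" || typeof senderId !== "string") return false;
  if (typeof timestamp !== "string" || isNaN(Date.parse(timestamp))) return false;
  if (typeof translations !== "object" || translations === null) return false;
  if (Array.isArray(translations)) return false;
  const byLocale = translations as Record<string, unknown>;
  return Object.keys(byLocale).every((locale) => typeof byLocale[locale] === "string");
}
```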

🛡️ Token Security

RSA PEM-encoded key pairs (genKeyPair, decryptFromServer) encrypt Azure access tokens during transit. Enterprise-grade security for school deployments.

5 Room-Based Session Architecture

🏠 Room Creation

Room name validation: max 50 characters, non-empty constraint
Duration range: 1–480 minutes (covers quick reviews to extended sessions)
ISO 8601 datetime: YYYY-MM-DDTHH:mm:ss
Hierarchical Tree component for school/class/student selection
Composite key: ${schoolId}-${studentId} for multi-tenant data
handleInviteAllStudents and handleInviteAllInSchool batch operations
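The creation constraints listed above can be sketched as a validation function. This is an illustration only: RoomDraft, validateRoomDraft, and inviteKey are hypothetical names, and trimming whitespace before checking the name and requiring whole minutes are assumptions.

```typescript
// Hypothetical form payload for creating a room.
interface RoomDraft {
  roomName: string;
  durationMinutes: number;
}

// Returns a list of human-readable problems; an empty list means valid.
function validateRoomDraft(draft: RoomDraft): string[] {
  const errors: string[] = [];
  const trimmedName = draft.roomName.trim(); // assumption: whitespace-only counts as empty
  if (trimmedName.length === 0) errors.push("Room name must not be empty");
  if (trimmedName.length > 50) errors.push("Room name must be at most 50 characters");
  const minutes = draft.durationMinutes;
  const isWholeNumber = Math.floor(minutes) === minutes;
  if (!isWholeNumber || minutes < 1 || minutes > 480) {
    errors.push("Duration must be a whole number of minutes from 1 to 480");
  }
  return errors;
}

// Composite key from the text: `${schoolId}-${studentId}`.
function inviteKey(schoolId: string, studentId: string): string {
  return `${schoolId}-${studentId}`;
}
```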

👥 Attendance States

Three-state model distinguishes participant engagement:

Attending: WebSocket-connected
Invited: API-authorized
Available: Eligible unassigned

📊 Attendance & Heatmap

Real-time engagement monitoring via WebSocket join/leave events. attendingStudentIds and invitedStudentIds track presence. "Question" and "understand" signals populate the Live Learning Feedback & Heatmap — teachers see class comprehension at a glance.

6 Room Status State Machine

🔄 Status Transitions

NOT_STARTED → STARTED → IN_PROGRESS → COMPLETED

Auto-detection: onlineStudentCount > 0 triggers IN_PROGRESS. startTime comparison with dayjs library drives temporal triggers via callModifyRoomStatus API.
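The transitions above can be sketched as a function that derives a status from the clock and the online count. This is one possible reading, not the platform's logic: it assumes a room completes at startTime plus its duration, uses plain Date objects instead of dayjs to stay self-contained, and the names RoomSnapshot and deriveRoomStatus are hypothetical.

```typescript
type RoomStatus = "NOT_STARTED" | "STARTED" | "IN_PROGRESS" | "COMPLETED";

// Hypothetical snapshot of the fields the status depends on.
interface RoomSnapshot {
  startTime: Date;
  durationMinutes: number;
  onlineStudentCount: number;
}

function deriveRoomStatus(room: RoomSnapshot, now: Date): RoomStatus {
  const startMs = room.startTime.getTime();
  // Assumption: the session ends exactly durationMinutes after it starts.
  const endMs = startMs + room.durationMinutes * 60000;
  const nowMs = now.getTime();
  if (nowMs < startMs) return "NOT_STARTED";
  if (nowMs >= endMs) return "COMPLETED";
  // From the text: onlineStudentCount > 0 triggers IN_PROGRESS.
  return room.onlineStudentCount > 0 ? "IN_PROGRESS" : "STARTED";
}
```

In the application, a change in the derived value would be the point at which callModifyRoomStatus is invoked.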

👤 Student Presence & Social Accountability

Avatar stacking with 20% overlap — up to 6 visible + overflow counter
Smart fallbacks: getAvatarUrl → emoji representations (👨/👩)
roomCardInactive: grayscale + 0.6 opacity for completed sessions
Dynamic theme hashing: roomCardBlue, roomCardYellow, roomCardPink
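The avatar overflow counter and the theme hashing above can be sketched as two small helpers. The hash function shown is an assumption (the source names only the three theme classes), and splitAvatarStack and roomCardTheme are hypothetical names.

```typescript
const MAX_VISIBLE_AVATARS = 6; // from the text: up to 6 visible

// Splits a roster into the avatars to render and the "+N" overflow count.
function splitAvatarStack<T>(students: T[]): { visible: T[]; overflowCount: number } {
  return {
    visible: students.slice(0, MAX_VISIBLE_AVATARS),
    overflowCount: Math.max(0, students.length - MAX_VISIBLE_AVATARS),
  };
}

const ROOM_CARD_THEMES = ["roomCardBlue", "roomCardYellow", "roomCardPink"];

// Picks a stable theme class for a room so the same room always gets the
// same color. The 31-multiplier string hash is an illustrative choice.
function roomCardTheme(roomId: string): string {
  let hash = 0;
  for (let i = 0; i < roomId.length; i++) {
    hash = (hash * 31 + roomId.charCodeAt(i)) >>> 0; // keep within 32 bits
  }
  return ROOM_CARD_THEMES[hash % ROOM_CARD_THEMES.length];
}
```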

📋 RoomHeader Responsive

useBreakpoints hook adapts desktop ↔ mobile
Connection status: green (connected), blue (connecting), red (disconnected)
roomSelectorOptions dropdown for multi-session navigation
isLoadingExport state for transcript generation UX

7 MyNotebook Ecosystem

📓 Nine-Category Taxonomy

📘 Lecture

📝 Notes

📅 Daily Log

❓ Questions

💬 Discussion

📋 Instructions

📄 Assignment

🔤 Vocabulary

📖 Grammar

labelToVariant mapping + slugifyTag for custom categories. transformNoteListToMyNoteBookDialogData parses comma-separated labels, counts per category, and maps to icon variants.
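The label parsing and per-category counting described above might look like the following. This is a sketch, not the real transformNoteListToMyNoteBookDialogData: the slug rules (lowercase, ASCII letters and digits only, hyphen-separated) and the names slugifyTagSketch and countLabelsByCategory are assumptions.

```typescript
// Illustrative slug rule: lowercase, runs of other characters become "-".
// This ASCII-only version drops non-Latin labels; the real slugifyTag may differ.
function slugifyTagSketch(label: string): string {
  return label
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Counts notes per category from comma-separated label strings.
function countLabelsByCategory(notes: Array<{ labels: string }>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const note of notes) {
    const labels = note.labels.split(",").map((l) => l.trim()).filter((l) => l.length > 0);
    for (const label of labels) {
      const slug = slugifyTagSketch(label);
      if (slug.length === 0) continue; // nothing left after slugging
      counts[slug] = (counts[slug] || 0) + 1;
    }
  }
  return counts;
}
```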

✏️ Dual-Mode Editing

Review mode: Tags displayed as static antd <Tag> components
Capture mode: Checkbox.Group for multi-label classification
Remark system: 250-character limit, autoSize: {minRows:1, maxRows:4}
Search: Dual-field across noteContent + comment via getCommentSearchText
Batch clear: Category-specific or all-notes deletion with confirmation modals

8 Session History & Vocabulary Pipeline

🔤 Word Extraction Engine

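Unicode-aware regex tokenization using /\p{L}/u keeps any token that contains at least one letter in any script, which preserves multilingual words. Numeric-only tokens are filtered out (/^\d+$/). Supports English-Chinese-Cantonese trilingual extraction.

// Tokenization pipeline
function separateWords(text) {
  return text.split(' ')
    // Keep tokens with at least one Unicode letter.
    // No g flag: test() on a global regex is stateful via lastIndex.
    .filter(w => /\p{L}/u.test(w))
    // Drop numeric-only tokens.
    .filter(w => !/^\d+$/.test(w));
}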

🎯 Entity Recognition & Color Coding

hasWordEntity / getWordEntity — case-insensitive matching against taggedWordListData
Known words → green (sessionHistoryWordWithEntity)
New words → blue (zone of proximal development)
Clicking known word → LiveWordPracticeDialog (dynamic import, ssr: false)
Clicking unknown word → categorization modal with custom tags
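The case-insensitive lookup and the green/blue split above can be sketched as follows. The TaggedWord shape and the function names are hypothetical stand-ins for hasWordEntity / getWordEntity, whose real signatures are not shown in the source.

```typescript
// Hypothetical shape of an entry in taggedWordListData.
interface TaggedWord {
  word: string;
  tags: string[];
}

// Case-insensitive lookup; returns undefined when the word is not tagged yet.
function findWordEntity(word: string, tagged: TaggedWord[]): TaggedWord | undefined {
  const needle = word.toLowerCase();
  for (const entry of tagged) {
    if (entry.word.toLowerCase() === needle) return entry;
  }
  return undefined;
}

// Known words render green, new words blue, as listed above.
function wordHighlight(word: string, tagged: TaggedWord[]): "green" | "blue" {
  return findWordEntity(word, tagged) !== undefined ? "green" : "blue";
}
```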

⏰ Temporal Organization

Today: time only (HH:mm:ss)
Current year: month/day + time
Previous years: full date via date-fns
Chronological listing with delete capabilities
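The three display rules above can be sketched as a single formatter. The source uses date-fns; this sketch uses plain Date methods to stay self-contained, and the exact separators (MM/DD, YYYY/MM/DD) and the name formatHistoryTimestamp are assumptions.

```typescript
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Formats `at` relative to `now`:
//   same day     -> HH:mm:ss
//   same year    -> MM/DD HH:mm:ss
//   another year -> YYYY/MM/DD (full date)
function formatHistoryTimestamp(at: Date, now: Date): string {
  const time = `${pad2(at.getHours())}:${pad2(at.getMinutes())}:${pad2(at.getSeconds())}`;
  const sameYear = at.getFullYear() === now.getFullYear();
  const sameDay =
    sameYear && at.getMonth() === now.getMonth() && at.getDate() === now.getDate();
  if (sameDay) return time;
  const monthDay = `${pad2(at.getMonth() + 1)}/${pad2(at.getDate())}`;
  if (sameYear) return `${monthDay} ${time}`;
  return `${at.getFullYear()}/${monthDay}`;
}
```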

9 Spaced Repetition Engine

🃏 Three-State Mastery System

To Practice: Active queue
Need Review: Requires attention
Mastered: Completed

convertToMastered() checks practice scores ≥ 50 before allowing mastery. Tags stored as comma-separated strings, parsed into filter arrays.
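The score gate and the comma-separated tag storage described above can be sketched together. This is an illustration, not the real convertToMastered(): the PracticeWord shape, the single bestScore field, and the decision to replace the other two state tags with "Mastered" are all assumptions.

```typescript
const MASTERY_MIN_SCORE = 50; // from the text: scores of 50 or more

// Hypothetical record: tags as a comma-separated string, plus one score.
interface PracticeWord {
  tags: string;
  bestScore: number;
}

// Parses "A, B,,C" into ["A", "B", "C"].
function parseTagString(raw: string): string[] {
  return raw.split(",").map((t) => t.trim()).filter((t) => t.length > 0);
}

// Returns the word unchanged if the score gate fails; otherwise swaps any
// state tag for "Mastered" while keeping the remaining tags.
function convertToMasteredSketch(word: PracticeWord): PracticeWord {
  if (word.bestScore < MASTERY_MIN_SCORE) return word;
  const stateTags = ["To Practice", "Need Review", "Mastered"];
  const kept = parseTagString(word.tags).filter((t) => stateTags.indexOf(t) === -1);
  return { tags: kept.concat(["Mastered"]).join(","), bestScore: word.bestScore };
}
```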

🎴 FlashCardDialog Architecture

Card stack metaphor with z-index layering and rotate(-10deg) / rotate(+10deg)
Gradient backgrounds: linear-gradient(124.76deg, #00A1FF)
3D flip-card with transform-style: preserve-3d + backface-visibility: hidden
Rich metadata: headword, IPA transcription, part-of-speech tags
Audio playback with animated ripple effects on microphone button
Swipe left → Need Review / Swipe right → Mastered (score validation required)

10 Vocabulary Trainer — Ten Levels

📈 Progressive Difficulty System

Level	Name	Color Intensity
1–2	First / Simple Words	Light Blue
3–4	Basic / Learning Words	Blue
5–6	Intermediate / Advanced	Strong Blue
7–8	Expert / Master Words	Orange
9–10	Specialist / Extreme	Deep Orange

🎯 Five Training Tasks Per Word

Learn & Read: Listen → Read → Record pronunciation (pronunciation-only scoring)
Spelling I: Identify correctly spelled sentence (pronunciation + attempt)
Spelling II: New spelling example (pronunciation + attempt)
Word Usage: Choose correct sentence and read aloud (pronunciation + attempt)
Picture Dictation: Listen and spell the word shown (attempt-only)

Each task retriable up to 3× with score deductions. Words scoring >80 move to Mastered library. AI Smart Tips personalize suggestions based on performance patterns.

11 Phoneme-Level Pronunciation Assessment

🎤 Four-Dimensional Scoring

Accuracy: Phonetic precision
Fluency: Rhythm & prosody
Completeness: Lexical coverage
Intonation: Pitch & stress

🎨 Traffic-Light Color System

Green (90–100): Native-like production — recommend maintaining
Yellow (70–89): Good but non-native — room for improvement
Orange (40–69): Intelligible but non-native — needs practice
Red (0–39): Communication-impairing — targeted remediation

Word-level spans color-coded per score. Clicking a word reveals phoneme-level breakdown — distinguish /θ/ vs /s/, /æ/ vs /ɛ/ — critical for Cantonese-speaking students.
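The score bands above map directly to a lookup function. The thresholds come from the list; the function name scoreBand and the handling of out-of-range values (anything below 40 is red, anything from 90 up is green) are assumptions.

```typescript
type ScoreBand = "green" | "yellow" | "orange" | "red";

// Maps a 0-100 pronunciation score to its display color.
function scoreBand(score: number): ScoreBand {
  if (score >= 90) return "green";  // 90-100: native-like
  if (score >= 70) return "yellow"; // 70-89: good but non-native
  if (score >= 40) return "orange"; // 40-69: intelligible, needs practice
  return "red";                     // 0-39: targeted remediation
}
```

Each word-level span would call this with its own score to pick a color class.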

🔊 Audio Pipeline

RecordRTC with StereoAudioRecorder at 16 kHz sampling, which captures frequencies up to 8 kHz and so covers the 300 Hz–3.4 kHz core speech band. A 300–500 ms debounce prevents double-clicks. A 15-second safety timeout prevents dead air. The WAV MIME type is preserved on upload.

12 Speaking Practice Modes

🎭 Role-Play Scenarios

Presentations: Structured academic speaking with timed delivery
Debates: Argument frameworks with opposing viewpoint practice
Discussions: Multi-turn conversation simulating DSE Paper 4
Read & Speak: Input-to-output transformation exercises

🤖 Dual AI Tutor Model

Model.GPT: General English (everyday vocabulary, casual discussions, fluency building)
Model.RAG: Exam Prep (HKDSE-aligned content, structured arguments, evidence evaluation)

Switch disabled during server processing — prevents mid-conversation confusion. ChatHeader integration with Model enum.

📊 Assessment Metrics

Interruption count — measures turn-taking etiquette (DSE criterion)
Phase-based analysis — Opening, Exploration, Challenge, Synthesis
Part A + Part B separation — mirrors official DSE scoring
Rank badge — overall performance level indicator

13 Reading Comprehension Module

📖 Exam Simulation Mode

Timed reading tasks building speed and focus under exam conditions
Exam-authentic passages aligned to HKDSE standards
Real exam pressure training with countdown timers
Strategy-based skill training: main idea identification, tone interpretation, logical conclusions

🤖 AI Coach Support Mode

Students can ask questions directly during reading tasks
Mentor-style hints that scaffold understanding without giving away answers
Adaptive difficulty based on assessment profile and real-time performance
Toggle between Exam Mode and Guided Mode mid-session

📚 Course Adventure Integration

5-task lesson structure: Video → Quiz → Glossary → Download → Speaking
Progressive disclosure: icons shift from full opacity to 30% for locked content
Skills tracked: Listening, Speaking, Reading, Vocabulary
Milestone badges unlock at completion thresholds

14 DSE Paper 4 Group Discussion

🎯 Authentic Exam Format

Part A Prep (10 min) → Part A Discussion (8 min) → Part B Individual (10 × 60s)

Four discussion phases: Opening → Exploration → Challenge → Synthesis. Six AI persona roles participate in the discussion.

👥 AI Persona Roles

🎯 Facilitator — guides flow

⚡ Starter — initiates topics

🔧 Builder — develops ideas

🔥 Challenger — questions assumptions

🔗 Connector — links concepts

💡 Clarifier — explains complexities

🎤 Real-Time Voice Processing

Azure Cognitive Services for speech recognition + forced alignment + phoneme evaluation. Real-time transcription streamed via WebSocket. Session recordings preserved for portfolio building and parent-teacher conferences.

15 8-Dimension Learner Profiling

🧠 Assessment Dimensions

#	Dimension	Options
1	Age Groups	6–8, 9–11, 12–14, 15–17, 18+
2	Gender	Avatar & theme customization
3	Interest Topics	11 categories
4	Learning Personas	9 types
5	Motivation Factors	8 categories
6	Learning Styles	8 methods
7	Proficiency Level	Scale 1–10
8	Curious Subjects	22 academic topics

⚠️ This Is NOT a Test

Students don't "pass" or "fail." The system learns preferences to deliver personalized content. Profiling happens once during onboarding, then informs everything.

16 Personalization Pipeline

🔄 Data Flow

Assessment Input

↓

Learner Profile Created

↓

Content Adaptation · Difficulty Adjustment · Topic Recommendations

↓

Personalized Dashboard

📊 Age-Based Color Psychology

6–8: Warm Gold
9–11: Teal
12–14: Cool Blue
15–17: Orange
18+: Deep Orange

data-age-group attribute selectors cascade color themes through the entire interface — from authentication to Course Adventure progression to Live Translation cohort indicators.
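The age-to-theme mapping behind the data-age-group attribute can be sketched as a lookup. The group boundaries and color names come from the list above; the string values used for the attribute, and the names AgeGroup, AGE_GROUP_COLORS, and ageGroupFor, are assumptions.

```typescript
// Assumed attribute values; the real data-age-group strings may differ.
type AgeGroup = "6-8" | "9-11" | "12-14" | "15-17" | "18+";

const AGE_GROUP_COLORS: Record<AgeGroup, string> = {
  "6-8": "Warm Gold",
  "9-11": "Teal",
  "12-14": "Cool Blue",
  "15-17": "Orange",
  "18+": "Deep Orange",
};

// Returns the profiling age group, or null for ages below 6 or invalid input.
function ageGroupFor(age: number): AgeGroup | null {
  if (!(age >= 6)) return null; // also rejects NaN
  if (age <= 8) return "6-8";
  if (age <= 11) return "9-11";
  if (age <= 14) return "12-14";
  if (age <= 17) return "15-17";
  return "18+";
}
```

In the browser, the result would be written once to a root element's data-age-group attribute so that CSS attribute selectors can cascade the theme.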

🎓 Interesting Subjects (22 Topics)

Math, Science, Biology, Chemistry, Physics, History, Geography, Economics, Business Studies, Literature, Music, Art, PE, Computer Science, General Studies, Religious Studies, Philosophy, Psychology, Sociology, Environmental Science, Astronomy, Law Basics — includes HKDSE-relevant areas.

17 Student Dashboard Metrics

127 Practice Sessions
342 Words Mastered
48h Time Spent
8 Courses Done

⚙️ Technical Architecture

useCallGetUserDashboard — custom API hook for real-time data
Redux store synchronization for state consistency
useModuleGuardian — access control for module sequencing
Optimistic UI updates with rollback on API failure
Skeleton loading states during data fetch

🏆 Gamification Elements

Points system earned through completions, streaks, and scores
Achievement badges unlocking at milestones
Login streak tracking with celebration animations
Progress bars for each module with animated fills

18 Multi-Tab Reporting Suite

📋 Three Report Categories

🎓 Courses Report

Enrollment tracking, completion rates, skill development (Fluency, Accuracy, Completeness, Intonation)

🎤 Scripts Report

Speaking practice monitoring, script mastery, best performance highlighting, improvement trends

📚 VocabBank Report

5-skill radar chart: Vocab Guide, Spelling L1, Spelling L2, Vocab Usage, Dictation

⏱️ Time Range & Comparison

Preset filters: 7 days, 30 days, 90 days, all-time
Period comparison: current vs. previous statistical analysis
Circular progress visualization for overall performance
AI-powered insights: summary and improvement suggestions

🎯 Spaced Repetition Integration

Ebbinghaus forgetting curve optimization: 1 day → 3 days → 1 week → 2 weeks → 1 month → 3 months. Students using spaced repetition retain vocabulary 3× better than with massed practice (cramming). "Words Mastered" requires 4+ successful review cycles.
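The review ladder above can be sketched as an interval table plus a date calculation. This is a minimal sketch: treating "1 month" as 30 days and "3 months" as 90 days is an assumption, as are the names REVIEW_INTERVAL_DAYS, nextReviewDate, and countsAsMastered.

```typescript
// 1 day, 3 days, 1 week, 2 weeks, ~1 month, ~3 months (month lengths assumed).
const REVIEW_INTERVAL_DAYS = [1, 3, 7, 14, 30, 90];

// Returns when a word is next due, given how many reviews have already
// succeeded. Counts past the end of the ladder reuse the last interval.
function nextReviewDate(lastReview: Date, successfulReviews: number): Date {
  const lastIndex = REVIEW_INTERVAL_DAYS.length - 1;
  const index = Math.min(Math.max(successfulReviews, 0), lastIndex);
  const next = new Date(lastReview.getTime());
  next.setDate(next.getDate() + REVIEW_INTERVAL_DAYS[index]);
  return next;
}

// From the text: "Words Mastered" requires 4+ successful review cycles.
function countsAsMastered(successfulReviews: number): boolean {
  return successfulReviews >= 4;
}
```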

19 AI Tutor System

🤖 Dual Tutor Modes

ChatBox/ChatFooter component with dual-model system (GPT vs. RAG) and input mode flexibility (text/audio). Adapts to proficiency — beginners type while advanced students use speech-to-speech.

Model.GPT: General English conversation, fluency, everyday vocabulary
Model.RAG: Exam Prep (HKDSE content, structured arguments, evidence quality)

💭 Conversational UX

Auto-scrolling: scrollContainer.scrollTo({top: scrollHeight}) mimics eye contact
Activity state awareness: isUserSending, isSystemThinking, isSystemAnswering
"Clear History" hidden during active AI explanation — protects impulsive young learners
Message count logged to analytics before clearing
Sticky messages: system/host messages pinned to screen for broadcast instructions

20 Thinking Lab — Critical Thinking

🧪 Debate & Argumentation Module

Distinct learning vertical for debate and critical thinking
External bridge to nexus-ari-ai.replit.app for specialized debate environments
Structured arguments: opening, evidence, counterpoint, conclusion
Academic vocabulary for economics, science, social issues
Real-time feedback on argument structure and evidence quality

🎨 Mascot Emotional Feedback (ARI)

Oops (0–50%): Encouraging
Keep Up (51–70%): Supportive
Amazing! (71–100%): Celebrating

Affective Filter reduction (Krashen): anxiety reduction through friendly, non-textual feedback. Blue mascot creates emotional safety — no harsh red X marks.

⚡ Skip Button — Interruptibility

opacity: 0 → opacity: 1 (fully visible) on hover. Students can interrupt the AI mid-explanation, just like interjecting in human conversation. Builds speaking confidence through agency.

21 Read & Speak Module

📖 Daily Reading Comprehension

AI selects readings based on assessment interests/topics
Text complexity dynamically adjusts to student proficiency
Guided Learning Mode with AI hints and scaffolding
Exam Mode with timed practice under authentic conditions

📝 Personalized Script Practice

Teachers upload scripts in bulk; students can upload their own
Listening and recording functions with pronunciation comparison
Practice record history and scores tracked per script
Top progress bar: total practice hours used vs. remaining

🔄 Batch Script Generation

Multi-word selections (≥5 words) are auto-added to the Scripts sidebar via pendingMultiSelectCommitRef with a 400ms debounce. A 180-character minimum ensures substantive practice passages. Authentic classroom discourse becomes personalized speaking material.

22 Learning Sidebar — Command Center

📜 Scripts Tab (Orange)

Contextual sentence collection for Read & Speak practice
Source room reference tracking and timestamps
Multi-word selection auto-add (≥5 words, 400ms debounce)
Script history archival with frequency tracking

📚 Categories Tab (Green)

Tag-based taxonomy: Subject, Theme, Difficulty, Custom Tags
Default "To Practice" tag auto-filtering
toPracticeCount: filteredWordList.filter(item => item.tags.includes('To Practice')).length
Flashcard queue management per category

📓 My Notebook Integration

myNotebookDialogOpenInfo.edit({roomId}) — seamless transition from live translation to personal study. Complete learning loop: listen → capture → review → master. Decorative CSS layers (polygons, stars, card layers) create game-like flashcard aesthetic.

23 Cognitive Load Management

Property	Desktop	Mobile
Font Size	24px	18px
Line Height	32px	26px
Max Width	75%	90%
Padding Top	60px	24px

Expanded line height prevents crowding during bilingual EN+ZH reading. 75% max-width = ~66 characters (optimal reading line length).

🌐 CJK Font Stack

font-family: system-ui,
  "Segoe UI", Roboto,
  "Noto Sans CJK SC",
  "Noto Sans CJK TC",
  sans-serif;

24 AI Presence Indicators

🤖 nodRocket Animation

@keyframes nodRocket {
  0% { transform: rotate(0deg); }
  25% { transform: rotate(-5deg); }
  75% { transform: rotate(5deg); }
  100% { transform: rotate(0deg); }
}

Blue mascot physically "nods" while processing, mimicking a human teacher's non-verbal cues, so system lag feels like active engagement rather than dead time.

🎨 Iconography System

IconNote (34×34px) — in-context capture with pencil overlay
IconMyNotes (16×16px) — stroke-based for dense note lists
IconVocabulary — abstract "A" formation, stroke-linecap="round"
IconQuestions (22×22px) — concentric circles for confusion signals
IconExport (40×40px) — folder with upward arrow for learning assets
All icons use aria-hidden="true" and currentColor strategy

25 Animation & Performance

✨ LinesAnimation (9s loop)

SVG SMIL animation (<animateTransform>) — GPU-accelerated 60fps
Three opacity layers: 0.55 / 0.35 / 0.22 (hint levels)
Sine wave paths: organic, non-linear motion
Pill-shaped clip path: rx="30" ry="30"

🚀 Performance Optimizations

content-visibility: auto for off-screen elements
contain: layout style paint isolation
will-change: transform for GPU acceleration
Dynamic imports with ssr: false for heavy components
Debounced search (500ms) and message deduplication

♿ Accessibility

WCAG 2.1 AA: aria-hidden on decorative SVGs
prefers-reduced-motion support
Keyboard navigation: onPressEnter, autoFocus
44px+ touch targets for young learners

26 The Closed-Loop Learning System

🔄 Listen → Capture → Review → Master

🎤 LISTEN — Real-time bilingual subtitles (EN ↔ ZH)

↓

📝 CAPTURE — Smart notes + vocabulary extraction

↓

📖 REVIEW — Flashcards + spaced repetition

↓

🏆 MASTER — Measurable progress + AI insights

🎯 For Your Classroom

Reduced prep time: Automated vocabulary extraction from lessons
Real-time engagement: Live feedback on student understanding
Differentiated instruction: AI adapts to each student's level
Multilingual support: Teach diverse classrooms without language barriers
Every class = a learning asset: Structured, reusable content

27 Getting Started

🚀 Demo Credentials

Role	Email	Password
Teacher	test_teacher@seechange-edu.com	Aa123456
Student	test_student1@seechange-edu.com	Aa123456

🌐 Access URLs

app.thinkandspeak.com (Production Environment)
uat-app.thinkandspeak.com (UAT / Testing Environment)

💬 Next Steps

Try the platform with the test credentials above
Explore the Live Translation feature with a colleague
Check the Dashboard analytics after completing a practice session
Contact us for school-wide deployment setup

"Teach with confidence. Learn without limits."