Status History
May 29, 04:16 PM Pending
May 31, 12:56 PM In Progress
Intake Form
Technical Specification
Frontend
Next.js + React (existing) with WebSocket/Socket.IO client, WebRTC audio capture
Backend
NestJS API Gateway (Node.js) + FastAPI PersonaPlex Service (Python) + Python AI Orchestrator + Python Moshi Streaming Runtime + Python Telephony Service
Database
MongoDB (long-term memory, personas, conversation history) + Redis (hot session cache, pub/sub, stream buffering)
Hosting
Docker + Docker Compose on GPU-enabled bare-metal or cloud VM (L40S / A100 / H100 / RTX 4090), NGINX reverse proxy, NVIDIA Container Toolkit
Summary

Aziza is a production-grade, GPU-accelerated, real-time multilingual AI voice platform built on a strict microservices architecture. An NGINX-fronted NestJS API Gateway handles all external WebSocket and REST traffic. Behind it, isolated internal services (PersonaPlex on FastAPI, the AI Orchestrator on Python asyncio, and the Moshi Streaming Runtime on PyTorch CUDA) coordinate persona persistence, GPU scheduling, and streaming inference with a time-to-first-token target under 120ms across concurrent sessions. A two-tier Redis/MongoDB memory architecture, combined with the PersonaPlex persona engine, keeps conversations emotionally aware and language-adaptive across sessions. Telephony clients connect through Asterisk ARI/RTP bridging. All services are deployed via Docker Compose with Prometheus/Grafana/Loki observability, and optional LoRA fine-tuning targets Russian, Uzbek, and English quality.

File Structure
docker-compose.yml Orchestrates all services: nginx, nestjs-gateway, personaplex, ai-orchestrator, moshi-runtime, telephony, redis, mongodb, prometheus, grafana, loki
.env.example All environment variable templates: JWT secrets, Redis/Mongo credentials, HuggingFace tokens, GPU config, service URLs — never hardcoded
nginx/nginx.conf NGINX reverse proxy config: routes /api to NestJS gateway, /ws to websocket layer, /metrics to Prometheus, enforces HTTPS, rate limiting headers
nginx/Dockerfile NGINX container build with SSL cert mounting and custom config
services/gateway/Dockerfile NestJS API Gateway container build, Node.js 20 LTS base, production build step
services/gateway/package.json NestJS dependencies: @nestjs/core, @nestjs/websockets, @nestjs/throttler, socket.io, passport-jwt, class-validator, winston
services/gateway/src/main.ts NestJS bootstrap: enables CORS, Socket.IO adapter, global validation pipe, Helmet security, starts on port 3001
services/gateway/src/app.module.ts Root module importing AuthModule, SessionModule, StreamModule, PersonaModule, HealthModule, WebSocketGatewayModule
services/gateway/src/auth/auth.module.ts Auth module wiring: JwtModule, PassportModule, AuthService, AuthController, JwtStrategy, RefreshTokenStrategy
services/gateway/src/auth/auth.service.ts JWT issuance, refresh token rotation, bcrypt password hashing, token blacklist via Redis
services/gateway/src/auth/auth.controller.ts POST /auth/login, POST /auth/register, POST /auth/refresh, POST /auth/logout endpoints
services/gateway/src/auth/jwt.strategy.ts Passport JWT strategy validating Bearer tokens against Redis blacklist
services/gateway/src/session/session.module.ts Session module: SessionService, SessionController, Redis-backed session store
services/gateway/src/session/session.service.ts Creates, retrieves, updates, destroys sessions in Redis with TTL; maps session_id to user_id and persona_id
services/gateway/src/session/session.controller.ts GET /session/:id, POST /session/create, DELETE /session/:id with JWT guard
services/gateway/src/stream/stream.module.ts Stream module: StreamController for HTTP stream lifecycle endpoints
services/gateway/src/stream/stream.controller.ts POST /stream/start, POST /stream/stop — publishes events to Redis pub/sub for AI Orchestrator consumption
services/gateway/src/persona/persona.module.ts Persona proxy module: forwards requests to PersonaPlex internal service via HTTP
services/gateway/src/persona/persona.controller.ts GET /persona/:id, POST /persona/load, POST /persona/switch — proxied to PersonaPlex with internal auth header
services/gateway/src/websocket/voice.gateway.ts Socket.IO gateway: handles audio chunk events (audio:chunk, audio:start, audio:stop), authenticates via JWT handshake, routes chunks to Redis stream queue, emits AI audio responses back to client
services/gateway/src/websocket/websocket.module.ts WebSocket module wiring VoiceGateway with Redis pub/sub subscriber for AI response events
services/gateway/src/health/health.controller.ts GET /health — returns service status, Redis ping, MongoDB ping, upstream service reachability
services/gateway/src/common/redis.service.ts Shared Redis client wrapper using ioredis: pub/sub, get/set/del, stream operations, connection pooling
services/gateway/src/common/logger.service.ts Winston structured logger with JSON output, log levels from env, request correlation IDs
services/personaplex/Dockerfile FastAPI PersonaPlex container: Python 3.11 slim base, installs requirements, runs with uvicorn workers
services/personaplex/requirements.txt FastAPI, uvicorn, motor (async MongoDB), redis[asyncio], pydantic, httpx, python-jose, langchain-core (for summarization utilities)
services/personaplex/main.py FastAPI app factory: mounts persona router, memory router, prompt router, adds internal auth middleware, startup/shutdown Redis and MongoDB connections
services/personaplex/routers/persona.py POST /persona/load, POST /persona/switch, GET /persona/state — loads persona from MongoDB, caches in Redis, returns assembled persona profile
services/personaplex/routers/memory.py POST /memory/store (persist turn to MongoDB + update Redis context window), POST /memory/retrieve (Redis-first with MongoDB fallback, returns ranked memory refs)
services/personaplex/routers/prompt.py POST /prompt/build — assembles final system prompt from persona profile + emotional state + active context window + memory refs in <30ms, returns prompt string
services/personaplex/services/persona_service.py Business logic: load persona by ID from MongoDB, apply language adaptation, track emotional state transitions, persist updates
services/personaplex/services/memory_service.py Hot/cold memory strategy: writes to Redis active_context_window (capped sliding window), async writes to MongoDB, triggers summarization worker when window exceeds threshold
services/personaplex/services/prompt_service.py Prompt assembly engine: injects persona system prompt, emotional modifiers, language instructions, summarized memory, recent context window into final prompt dict
services/personaplex/workers/summarization_worker.py Async background worker: polls Redis for sessions needing summarization, calls LLM summarization endpoint, stores summary to MongoDB, updates memory_refs in session
services/personaplex/models/persona.py Pydantic models: PersonaProfile, EmotionState, LanguageConfig, PersonaSession matching the PersonaPlex session model schema
services/personaplex/models/memory.py Pydantic models: MemoryEntry, ContextWindow, MemoryRef, SummarizedMemory
services/personaplex/middleware/internal_auth.py FastAPI middleware validating X-Internal-Token header on all requests — PersonaPlex is never publicly exposed
services/personaplex/db/mongo.py Motor async MongoDB client: connection pool, collections for personas, memories, conversation_history, analytics
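services/personaplex/db/redis_client.py redis.asyncio async client (from the redis[asyncio] package in requirements.txt, which supersedes the standalone aioredis library): session hot cache, pub/sub, context window storage with TTL management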
services/ai-orchestrator/Dockerfile Python 3.11 orchestrator container with asyncio, redis, httpx dependencies
services/ai-orchestrator/requirements.txt redis[asyncio], httpx, pydantic, structlog, prometheus-client (asyncio ships with the Python standard library and is not a pip dependency)
services/ai-orchestrator/main.py Orchestrator entrypoint: starts Redis pub/sub listeners, GPU worker pool, session router, inference scheduler event loops
services/ai-orchestrator/orchestrator/session_router.py Routes incoming session stream requests to available GPU workers; maintains session-to-worker mapping in Redis; handles worker failover
services/ai-orchestrator/orchestrator/gpu_scheduler.py GPU worker pool manager: tracks worker availability, GPU memory headroom, assigns sessions to least-loaded workers, implements backpressure
services/ai-orchestrator/orchestrator/stream_lifecycle.py Manages full stream lifecycle: start → PersonaPlex prompt build → Moshi inference → audio stream → stop; handles interruptions and reconnects
services/ai-orchestrator/orchestrator/pubsub_coordinator.py Redis pub/sub coordinator: subscribes to session:start, session:stop, audio:chunk channels; publishes to inference:queue and audio:response channels
services/ai-orchestrator/orchestrator/service_discovery.py Internal service registry: resolves Moshi runtime, PersonaPlex, telephony service URLs from env/Redis; health-checks upstreams
services/ai-orchestrator/orchestrator/failover.py Failover handler: detects worker crashes via heartbeat, reassigns sessions, publishes reconnect events to gateway
services/moshi-runtime/Dockerfile GPU-enabled container: FROM nvcr.io/nvidia/pytorch:24.01-py3, installs moshi, transformers, vllm/TGI, transformer-engine, bitsandbytes, CUDA toolkit
services/moshi-runtime/requirements.txt torch, moshi, transformers, accelerate, bitsandbytes, transformer-engine, vllm, redis[asyncio], numpy, soundfile
services/moshi-runtime/main.py Moshi runtime entrypoint: loads model to GPU, starts Redis queue consumer, starts async inference loop, starts audio stream publisher
services/moshi-runtime/runtime/model_loader.py Loads Moshi model from HuggingFace Hub with auth token, configures FP8/BF16 precision, moves to CUDA device, initializes KV-cache
services/moshi-runtime/runtime/inference_engine.py Async streaming inference: consumes audio chunks from Redis queue, runs Moshi forward pass with CUDA streams, yields token stream, publishes audio response chunks to Redis
services/moshi-runtime/runtime/audio_processor.py FFmpeg-backed audio transcoding: converts incoming WebRTC audio to model input format, converts model output to client-compatible format (opus/pcm)
services/moshi-runtime/runtime/kv_cache_manager.py KV-cache lifecycle: allocates per-session cache slots, evicts on session end, implements memory pooling to avoid CUDA OOM
services/moshi-runtime/runtime/cuda_stream_manager.py CUDA stream pool: assigns dedicated CUDA streams per session, enables concurrent inference without serialization
services/moshi-runtime/runtime/dynamic_batcher.py Dynamic batching: collects inference requests within a configurable time window, batches compatible requests, dispatches to GPU, unbatches responses
services/moshi-runtime/runtime/fp8_optimizer.py transformer-engine FP8 quantization wrapper: wraps model layers for FP8 forward pass on GPUs with FP8 tensor cores (H100, L40S); the A100 has no FP8 support and falls back to BF16
services/moshi-runtime/runtime/memory_pool.py GPU memory pool: pre-allocates tensor buffers for expected concurrent sessions, recycles buffers on session end to avoid fragmentation
services/moshi-runtime/runtime/latency_monitor.py Per-request latency tracking: measures time-to-first-token, total generation time, publishes metrics to Prometheus pushgateway
services/telephony/Dockerfile Python 3.11 telephony service container with Asterisk ARI client, RTP libraries
services/telephony/requirements.txt panoramisk (Asterisk AMI/FastAGI client; ARI itself is reached over HTTP and WebSocket), redis[asyncio], httpx, pydantic, structlog
services/telephony/main.py Telephony service entrypoint: connects to Asterisk ARI websocket, starts call event listener, starts RTP bridge manager
services/telephony/telephony/ari_client.py Asterisk ARI client: handles StasisStart/StasisEnd events, answers calls, bridges channels, controls call lifecycle
services/telephony/telephony/rtp_bridge.py RTP bridge: receives RTP audio from Asterisk, converts to PCM chunks, publishes to Redis audio queue; subscribes to AI audio responses, sends RTP back to Asterisk
services/telephony/telephony/sip_handler.py SIP session management: maps SIP call IDs to Aziza session IDs, creates sessions in gateway via internal API, handles call routing
services/telephony/telephony/call_lifecycle.py Call lifecycle manager: start call → create session → stream audio → end call → cleanup session; handles reconnects and call drops
services/telephony/telephony/recording.py Optional call recording: streams audio to file or object storage, manages recording lifecycle per call
monitoring/prometheus/prometheus.yml Prometheus scrape config: targets for NestJS metrics, FastAPI metrics, Moshi runtime metrics, Redis exporter, MongoDB exporter, NVIDIA GPU exporter
monitoring/grafana/dashboards/aziza-overview.json Grafana dashboard: GPU memory utilization, concurrent sessions, token throughput, WebSocket connections, latency percentiles, error rates
monitoring/grafana/dashboards/gpu-performance.json GPU-specific dashboard: per-GPU memory, CUDA utilization, FP8 throughput, KV-cache hit rate, OOM events
monitoring/grafana/dashboards/session-health.json Session health dashboard: active sessions, session creation rate, session errors, persona load times, prompt assembly latency
monitoring/loki/loki-config.yml Loki log aggregation config: ingests structured JSON logs from all services via Promtail, retention policies
monitoring/promtail/promtail-config.yml Promtail config: scrapes Docker container logs, adds service labels, ships to Loki
monitoring/alertmanager/alerts.yml Alert rules: GPU OOM, latency >200ms, session failure rate >5%, Redis connection loss, MongoDB write failures
fine-tuning/Dockerfile Fine-tuning container: GPU-enabled, installs PEFT (for LoRA and QLoRA), bitsandbytes, datasets, accelerate, transformers
fine-tuning/requirements.txt peft, transformers, datasets, accelerate, bitsandbytes, torch, sentencepiece, evaluate
fine-tuning/scripts/prepare_dataset.py Dataset preparation: loads Russian, Uzbek Latin, Uzbek Cyrillic, English conversation datasets, formats for instruction tuning, splits train/eval
fine-tuning/scripts/audit_tokenizer.py Tokenizer audit: checks coverage for all four language/script variants (Russian, Uzbek Latin, Uzbek Cyrillic, English), identifies missing tokens, generates extension vocabulary if needed
fine-tuning/scripts/train_lora.py LoRA/QLoRA training script: loads base model, applies PEFT LoRA config, trains on multilingual dataset, saves adapter weights
fine-tuning/scripts/validate_streaming.py Post-training validation: loads adapter, runs streaming inference, measures latency impact, validates multilingual response quality
fine-tuning/configs/lora_config.json LoRA hyperparameters: rank, alpha, target modules, dropout, task type — tuned for minimal latency impact
fine-tuning/configs/training_args.json Training arguments: batch size, gradient accumulation, learning rate, warmup, FP16/BF16, output dir
scripts/deploy.sh Production deployment script: pulls latest images, runs docker-compose up with GPU flags, waits for health checks, runs smoke tests
scripts/health_check.sh Validates all services are healthy: curls /health endpoints, checks Redis ping, MongoDB connection, GPU visibility
scripts/gpu_check.sh Verifies NVIDIA Container Toolkit setup, GPU visibility in containers, CUDA version compatibility
scripts/seed_personas.py Seeds MongoDB with initial persona profiles for testing: multilingual personas with emotional state configs
shared/schemas/session.schema.json Shared PersonaPlex session model JSON schema used across services for validation
shared/schemas/persona.schema.json Shared persona profile schema: name, language, emotional_range, system_prompt_template, memory_config
shared/proto/aziza.proto Optional gRPC proto definitions for high-performance internal service communication (Orchestrator ↔ Moshi runtime)
Features (12)
Phase 1: Dockerized GPU Infrastructure Foundation P1
Establishes the complete containerized infrastructure with GPU support, Redis, MongoDB, and NGINX as the stable foundation for all subsequent services.
  • docker-compose up brings all infrastructure containers online without errors
  • nvidia-smi is accessible inside GPU-enabled containers confirming CUDA visibility
  • Redis responds to PING with PONG and persists data across container restarts
  • MongoDB accepts authenticated connections and persists data across restarts
  • NGINX routes /api and /ws traffic correctly to upstream services
  • All containers have health checks defined and pass within 30 seconds of startup
  • Environment variables are loaded from .env file with no hardcoded secrets
  • Centralized logging aggregates container logs to Loki via Promtail
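The "no hardcoded secrets" criterion can be illustrated with a fail-fast configuration loader. This is a minimal sketch, not the project's actual code: the variable names `REDIS_URL`, `MONGO_URI`, and `JWT_SECRET` and the `InfraConfig` type are assumptions made for illustration; the real names would come from `.env.example`.

```python
import os
from dataclasses import dataclass


class MissingConfigError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


@dataclass(frozen=True)
class InfraConfig:
    redis_url: str
    mongo_uri: str
    jwt_secret: str


def load_config(env=None) -> InfraConfig:
    """Build the config from environment variables, failing fast on gaps."""
    env = os.environ if env is None else env
    # Hypothetical variable names; the real ones would be listed in .env.example.
    required = {
        "redis_url": "REDIS_URL",
        "mongo_uri": "MONGO_URI",
        "jwt_secret": "JWT_SECRET",
    }
    missing = [name for name in required.values() if not env.get(name)]
    if missing:
        raise MissingConfigError(
            "missing environment variables: " + ", ".join(sorted(missing))
        )
    return InfraConfig(**{field: env[name] for field, name in required.items()})
```

Calling `load_config()` at service startup makes a container exit immediately when a secret is missing, which a Docker health check would then surface, instead of failing later on the first Redis or MongoDB call.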
Phase 2: NestJS API Gateway with JWT Auth and WebSocket P2
Builds the sole external-facing backend service handling authentication, session management, rate limiting, and WebSocket audio event routing.
  • POST /auth/login returns signed JWT access token and refresh token
  • POST /auth/refresh rotates refresh token and issues new access token
  • Expired or blacklisted tokens return 401 on all protected endpoints
  • WebSocket connections require valid JWT in handshake query or header
  • Rate limiting blocks requests exceeding configured thresholds per IP
  • GET /health returns 200 with Redis and MongoDB connectivity status
  • POST /session/create returns a unique session_id stored in Redis
  • Socket.IO audio:chunk events are received and published to Redis stream queue
  • Structured JSON logs include correlation IDs for every request
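The per-IP rate limiting criterion can be sketched as a sliding-window counter. The gateway itself is NestJS, so the Python below only illustrates the logic; the class name and the in-memory storage are assumptions (a real deployment would more likely keep counters in Redis so that all gateway instances share them).

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str, now: float) -> bool:
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # caller would answer with HTTP 429
        hits.append(now)
        return True
```

Passing `now` explicitly keeps the limiter deterministic under test; production code would supply `time.monotonic()`.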
Phase 3: Real-Time Duplex Audio Streaming Pipeline P3
Establishes the end-to-end async audio streaming pipeline from WebSocket client through FFmpeg transcoding to Redis queues and back.
  • Audio chunks received via WebSocket are published to Redis stream within 10ms
  • FFmpeg transcodes incoming WebRTC audio (opus) to model-compatible PCM format without blocking
  • Audio response chunks are streamed back to client as they are generated, not buffered
  • Client disconnection triggers graceful stream cleanup without orphaned Redis entries
  • Reconnecting client resumes session state from Redis within 500ms
  • Interruption signal (user speaks while AI responds) halts current generation within 50ms
  • Stream jitter remains below 20ms under normal network conditions
  • Async queue depth stays below 50 chunks under sustained load
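The interruption criterion (the user speaks while the AI is responding) comes down to checking a cancellation signal between chunks. The sketch below assumes an `asyncio.Event` as the interrupt flag and a bounded `asyncio.Queue` of 50 entries to mirror the queue-depth limit; the function names are hypothetical, and the sketch does not demonstrate the 50ms budget itself.

```python
import asyncio


async def stream_response(chunks, out_queue: asyncio.Queue, interrupted: asyncio.Event) -> int:
    """Emit chunks one by one, stopping as soon as the interrupt flag is set."""
    sent = 0
    for chunk in chunks:
        if interrupted.is_set():
            break  # barge-in: abandon the rest of the generation
        await out_queue.put(chunk)  # blocks when the queue is full (backpressure)
        sent += 1
        await asyncio.sleep(0)  # yield so other tasks can raise the interrupt
    return sent


async def demo():
    out_queue: asyncio.Queue = asyncio.Queue(maxsize=50)
    interrupted = asyncio.Event()
    task = asyncio.create_task(
        stream_response([b"a", b"b", b"c", b"d"], out_queue, interrupted)
    )
    await out_queue.get()  # the client received the first chunk
    interrupted.set()      # the user starts speaking
    sent = await task
    return sent, out_queue.qsize()
```

How quickly generation halts depends on how long a single chunk takes to produce, since the flag is only checked between chunks.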
Phase 4: Moshi Streaming Inference Runtime P4
Deploys the GPU-backed Moshi model with async token streaming, KV-cache management, and Redis queue integration for real-time AI voice generation.
  • Moshi model loads successfully from HuggingFace Hub using authenticated token
  • Model is loaded to CUDA device and confirmed via torch.cuda.is_available()
  • First token is generated within 120ms of receiving audio input chunk
  • Tokens are streamed to Redis audio:response channel as they are generated
  • KV-cache is allocated per session and released on session end
  • No GPU memory leaks detected after 100 consecutive session cycles
  • Inference runs fully async without blocking the event loop
  • Model handles empty or malformed audio input gracefully without crashing
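Per-session KV-cache allocation and release can be modelled as a fixed pool of slot indices. The following is a sketch under the assumption that the cache is pre-partitioned into equal slots; `KVCacheSlots` is a hypothetical name, and the tensors themselves are omitted so the bookkeeping stays visible.

```python
class KVCacheSlots:
    """Fixed pool of cache slot indices, one per active session."""

    def __init__(self, num_slots: int):
        # Reversed so that pop() hands out slot 0 first.
        self._free = list(range(num_slots - 1, -1, -1))
        self._by_session: dict[str, int] = {}

    def acquire(self, session_id: str) -> int:
        if session_id in self._by_session:
            return self._by_session[session_id]  # idempotent on reconnect
        if not self._free:
            raise RuntimeError("no free KV-cache slot; reject the session")
        slot = self._free.pop()
        self._by_session[session_id] = slot
        return slot

    def release(self, session_id: str) -> None:
        slot = self._by_session.pop(session_id, None)
        if slot is not None:
            self._free.append(slot)  # slot is reusable by the next session

    @property
    def free_slots(self) -> int:
        return len(self._free)
```

Because every `acquire` is paired with a `release` on session end, the number of free slots returns to its starting value after any number of session cycles, which is the property the leak criterion checks for.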
Phase 5: GPU Optimization for Concurrent Sessions P5
Optimizes the inference runtime for FP8 precision, CUDA stream parallelism, dynamic batching, and memory pooling to support 8-10 concurrent sessions at sub-120ms latency.
  • FP8 inference enabled via transformer-engine on H100/L40S GPUs with measurable throughput improvement
  • 8 concurrent streaming sessions run simultaneously without CUDA OOM errors
  • Per-session CUDA streams prevent serialization between concurrent inference calls
  • Dynamic batcher groups compatible requests within 5ms window before GPU dispatch
  • GPU memory pool pre-allocates buffers for max_sessions and recycles without fragmentation
  • Time-to-first-token remains below 120ms at 8 concurrent sessions
  • Latency monitor publishes p50/p95/p99 metrics to Prometheus every 10 seconds
  • GPU scheduler rejects new sessions when memory headroom is insufficient rather than crashing
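The dynamic batcher criterion (group requests within a 5ms window before GPU dispatch) can be sketched with `asyncio.wait_for`. The function name `collect_batch` and its parameters are assumptions for illustration; deciding which requests are "compatible" is left out.

```python
import asyncio


async def collect_batch(queue: asyncio.Queue, max_batch: int, window_s: float) -> list:
    """Wait for one request, then gather more until the window closes or the batch fills."""
    batch = [await queue.get()]
    loop = asyncio.get_running_loop()
    deadline = loop.time() + window_s
    while len(batch) < max_batch:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        try:
            batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
        except asyncio.TimeoutError:
            break  # window elapsed with no further requests
    return batch
```

With `window_s=0.005`, a lone request waits at most about 5ms before dispatch, so the window size is a direct trade between batching efficiency and added time-to-first-token.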
Phase 6: PersonaPlex Microservice — Persona and Memory Orchestration P6
Deploys the isolated PersonaPlex FastAPI service providing persona profile management, Redis-backed hot memory, MongoDB cold storage, multilingual adaptation, emotional state tracking, and sub-30ms prompt assembly.
  • POST /persona/load retrieves persona from MongoDB, caches in Redis, returns within 50ms
  • POST /persona/switch atomically updates session persona in Redis and MongoDB
  • GET /persona/state returns current emotional state and active context window from Redis
  • POST /memory/store writes conversation turn to Redis context window and async to MongoDB
  • POST /memory/retrieve returns ranked memory refs from Redis-first with MongoDB fallback
  • POST /prompt/build assembles complete system prompt in under 30ms
  • Summarization worker triggers when context window exceeds 20 turns and stores summary to MongoDB
  • Persona state persists correctly across session reconnects
  • Multilingual persona correctly adapts system prompt language based on session language field
  • PersonaPlex rejects all requests without valid X-Internal-Token header with 403
  • Emotional state transitions are tracked and stored per session turn
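The sliding context window and prompt assembly described above might look roughly like this. It is a sketch only: the 20-turn threshold comes from the criteria, while `ContextWindow`, `build_prompt`, and the prompt wording are hypothetical, and the window lives in process memory here rather than in Redis.

```python
from collections import deque

SUMMARIZE_AFTER_TURNS = 20  # threshold taken from the acceptance criteria


class ContextWindow:
    def __init__(self, max_turns: int = SUMMARIZE_AFTER_TURNS):
        self.max_turns = max_turns
        self.turns = deque()
        self.pending_summary = []  # evicted turns awaiting the summarization worker

    def add_turn(self, text: str) -> bool:
        """Append a turn; return True when evicted turns await summarization."""
        self.turns.append(text)
        while len(self.turns) > self.max_turns:
            self.pending_summary.append(self.turns.popleft())
        return bool(self.pending_summary)


def build_prompt(persona_prompt: str, emotion: str, language: str,
                 summary: str, window: ContextWindow) -> str:
    """Join persona, emotional modifier, language instruction, summary, and recent turns."""
    parts = [
        persona_prompt,
        f"Current emotional state: {emotion}.",
        f"Respond in: {language}.",
    ]
    if summary:
        parts.append(f"Summary of earlier conversation: {summary}")
    parts.extend(window.turns)
    return "\n".join(parts)
```

Assembly is plain string joining over data already in memory, which is what makes a sub-30ms budget plausible; the slow step (summarization) happens off the request path.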
Phase 7: AI Orchestrator — Centralized Session and GPU Coordination P7
Builds the Python asyncio orchestration service that coordinates session routing, GPU worker allocation, stream lifecycle, Redis pub/sub, and failover across all runtime services.
  • Orchestrator subscribes to session:start Redis channel and routes to available GPU worker within 20ms
  • Session-to-worker mapping stored in Redis prevents cross-session data leakage
  • GPU scheduler assigns sessions to workers with sufficient memory headroom
  • Worker heartbeat failure triggers session reassignment within 5 seconds
  • Stream lifecycle correctly sequences: prompt build → inference start → audio stream → session end
  • Redis pub/sub coordinator handles 50 concurrent channel subscriptions without message loss
  • Service discovery resolves all upstream URLs from environment and validates health on startup
  • Orchestrator exposes /health endpoint confirming all downstream service connectivity
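Worker assignment by memory headroom and heartbeat-based failover can be sketched as follows. The `Worker` and `Scheduler` types, the fixed per-session memory cost, and the in-memory registry are all assumptions for illustration; the spec keeps the session-to-worker mapping in Redis.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    worker_id: str
    free_memory_mb: int
    last_heartbeat: float
    sessions: set = field(default_factory=set)


class Scheduler:
    def __init__(self, session_cost_mb: int, heartbeat_timeout_s: float = 5.0):
        self.session_cost_mb = session_cost_mb
        self.heartbeat_timeout_s = heartbeat_timeout_s
        self.workers = {}  # worker_id -> Worker

    def _alive(self, now: float):
        return [w for w in self.workers.values()
                if now - w.last_heartbeat <= self.heartbeat_timeout_s]

    def assign(self, session_id: str, now: float):
        """Pick the least-loaded live worker with enough headroom, or None."""
        candidates = [w for w in self._alive(now)
                      if w.free_memory_mb >= self.session_cost_mb]
        if not candidates:
            return None  # backpressure: reject rather than risk OOM
        worker = min(candidates, key=lambda w: (len(w.sessions), -w.free_memory_mb))
        worker.sessions.add(session_id)
        worker.free_memory_mb -= self.session_cost_mb
        return worker.worker_id

    def fail_over(self, now: float) -> dict:
        """Remove workers with expired heartbeats and reassign their sessions."""
        moved = {}
        for w in list(self.workers.values()):
            if now - w.last_heartbeat > self.heartbeat_timeout_s:
                del self.workers[w.worker_id]
                for sid in sorted(w.sessions):
                    moved[sid] = self.assign(sid, now)
        return moved
```

A session whose entry in the `fail_over` result is `None` could not be placed anywhere; the orchestrator would then need to notify the gateway rather than silently drop it.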
Phase 8: Telephony Integration via Asterisk ARI P8
Enables AI voice conversations over traditional telephony by integrating Asterisk ARI for SIP call handling and RTP audio bridging to the Aziza streaming pipeline.
  • Asterisk ARI WebSocket connection established and StasisStart events received on incoming calls
  • Incoming SIP call creates a corresponding Aziza session via internal API
  • RTP audio from caller is converted to PCM and published to Redis audio queue
  • AI audio responses are received from Redis and sent back via RTP to caller
  • Call end (StasisEnd) triggers graceful session cleanup
  • RTP round-trip latency stays below 200ms under normal conditions
  • Call recording optionally captures both sides of conversation to file
  • Telephony service reconnects to Asterisk ARI automatically after connection loss
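Before RTP audio can be converted to PCM, the bridge has to strip the RTP header. The sketch below parses the fixed 12-byte header defined in RFC 3550 and skips CSRC entries, an optional header extension, and padding; the `RtpPacket` type and `parse_rtp` name are illustrative, and codec decoding (for example G.711 to linear PCM) is not shown.

```python
import struct
from typing import NamedTuple


class RtpPacket(NamedTuple):
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(packet: bytes) -> RtpPacket:
    """Split an RTP packet into header fields and payload (RFC 3550 layout)."""
    if len(packet) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("unsupported RTP version")
    offset = 12 + 4 * (b0 & 0x0F)  # skip CSRC identifiers
    if b0 & 0x10:  # X bit: a header extension follows
        if len(packet) < offset + 4:
            raise ValueError("truncated extension header")
        (ext_words,) = struct.unpack("!H", packet[offset + 2:offset + 4])
        offset += 4 + 4 * ext_words
    if offset > len(packet):
        raise ValueError("truncated RTP packet")
    payload = packet[offset:]
    if b0 & 0x20:  # P bit: the last payload byte holds the padding length
        if not payload or payload[-1] == 0 or payload[-1] > len(payload):
            raise ValueError("invalid RTP padding")
        payload = payload[:-payload[-1]]
    return RtpPacket(b1 & 0x7F, seq, ts, ssrc, payload)
```

The sequence number and timestamp returned here are what a jitter buffer would use to reorder packets before publishing PCM chunks to the Redis audio queue.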
Phase 9: Monitoring and Observability Stack P9
Deploys Prometheus, Grafana, Loki, and alerting to provide full production observability across GPU performance, session health, latency, and error rates.
  • Prometheus scrapes metrics from all services every 15 seconds without gaps
  • Grafana overview dashboard shows GPU memory, concurrent sessions, token throughput, and WebSocket connections in real time
  • GPU performance dashboard shows per-GPU CUDA utilization, FP8 throughput, and KV-cache hit rate
  • Loki receives structured JSON logs from all containers with correct service labels
  • Alert fires within 2 minutes when GPU memory exceeds 90% utilization
  • Alert fires when p95 latency exceeds 200ms for 5 consecutive minutes
  • Alert fires when session error rate exceeds 5% over a 1-minute window
  • NVIDIA GPU exporter exposes DCGM metrics for L40S/A100/H100 GPU health
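The latency alert criterion (p95 above 200ms for 5 consecutive minutes) combines a percentile with a sustained-breach check. In practice Prometheus and Alertmanager evaluate this; the Python below only illustrates the arithmetic, using the nearest-rank percentile definition and hypothetical function names.

```python
def nearest_rank_percentile(samples, pct: int):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def sustained_breach(window_values, threshold: float, required_windows: int) -> bool:
    """True when the most recent `required_windows` values all exceed the threshold."""
    if len(window_values) < required_windows:
        return False
    return all(v > threshold for v in window_values[-required_windows:])
```

Feeding one p95 value per minute into `sustained_breach(values, 200, 5)` reproduces the "for 5 consecutive minutes" condition: a single good minute resets the alert.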
Phase 10: Multilingual Fine-Tuning with LoRA Adapters P10
Creates language-specific LoRA adapters for Russian, Uzbek Latin, Uzbek Cyrillic, and English that integrate with the streaming runtime while keeping time-to-first-token below 150ms.
  • Dataset preparation script produces formatted train/eval splits for all four language/script variants
  • Tokenizer audit confirms adequate coverage for Uzbek Cyrillic and Latin scripts
  • LoRA adapter training completes without OOM on target GPU with QLoRA if needed
  • Adapter weights are under 500MB per language for practical deployment
  • Loading LoRA adapter adds less than 15ms to model initialization time
  • Streaming inference with adapter maintains time-to-first-token below 150ms
  • Multilingual response quality validated by native speaker evaluation on 50-turn test set
  • Adapter can be hot-swapped per session based on PersonaPlex language field
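Whether an adapter fits the 500MB budget can be estimated before training. LoRA adds two low-rank matrices per target layer, A (r × d_in) and B (d_out × r), so each layer contributes r · (d_in + d_out) parameters. The layer shapes in the usage note below are hypothetical and are not Moshi's actual dimensions.

```python
def lora_param_count(rank: int, layer_shapes) -> int:
    """Total LoRA parameters: rank * (d_in + d_out) for every adapted layer."""
    return sum(rank * (d_in + d_out) for d_in, d_out in layer_shapes)


def adapter_size_mb(rank: int, layer_shapes, bytes_per_param: int = 2) -> float:
    """Approximate adapter size in MiB (2 bytes per parameter for FP16/BF16)."""
    return lora_param_count(rank, layer_shapes) * bytes_per_param / (1024 ** 2)
```

For an assumed model with 32 transformer blocks, two adapted 4096 × 4096 projections per block, and rank 16, `adapter_size_mb(16, [(4096, 4096)] * 64)` gives 16.0, well under the limit. Size grows linearly with rank and with the number of target modules.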
JWT Authentication and WebSocket Security P1
Ensures all external communication is authenticated and all internal services are isolated from public access.
  • All REST endpoints except /auth/login and /auth/register require valid JWT
  • WebSocket handshake validates JWT before accepting connection
  • Refresh token rotation invalidates previous refresh token on use
  • Internal services (PersonaPlex, Orchestrator, Moshi) are not exposed via NGINX
  • X-Internal-Token header required for all inter-service communication
  • Redis and MongoDB require authentication credentials from environment variables
  • HTTPS enforced at NGINX layer with TLS termination
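Refresh token rotation with invalidation of the previous token can be sketched as below. The in-memory dictionaries stand in for the Redis-backed store, and `RefreshTokenStore` is a hypothetical name; the gateway's real implementation would be in TypeScript and would also handle token expiry.

```python
import hashlib
import secrets


class RefreshTokenStore:
    """In-memory stand-in for a Redis-backed refresh token store."""

    def __init__(self):
        self._active = {}      # token digest -> user_id
        self._revoked = set()  # digests of tokens that were already used

    @staticmethod
    def _digest(token: str) -> str:
        # Store only a hash so a leaked store does not expose usable tokens.
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)
        self._active[self._digest(token)] = user_id
        return token

    def rotate(self, token: str) -> str:
        """Exchange a valid refresh token for a new one; the old one becomes unusable."""
        digest = self._digest(token)
        if digest in self._revoked:
            raise PermissionError("refresh token reuse detected")
        user_id = self._active.pop(digest, None)
        if user_id is None:
            raise PermissionError("unknown refresh token")
        self._revoked.add(digest)
        return self.issue(user_id)
```

Tracking revoked digests separately lets the service distinguish a replayed token (a possible theft signal) from a token it never issued.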
PersonaPlex Session Model and Hot/Cold Memory Strategy P2
Implements the defined session schema and two-tier memory architecture ensuring sub-30ms context retrieval for active sessions.
  • Session model matches defined schema: session_id, user_id, persona_id, language, emotion_state, conversation_summary, active_context_window, memory_refs
  • Active context window stored in Redis with configurable TTL and sliding window eviction
  • Emotional state updated per turn and reflected in next prompt assembly
  • Summarized memories stored in MongoDB and referenced via memory_refs array
  • Redis-first retrieval returns context in under 5ms for active sessions
  • MongoDB fallback retrieves cold memory in under 100ms
  • Memory persists correctly when session reconnects after disconnect
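The Redis-first retrieval with MongoDB fallback follows a standard two-tier read path: serve from the hot tier while the entry is fresh, otherwise read the cold tier and re-warm. In this sketch plain dictionaries replace both stores and the clock is injectable for testing; `TwoTierMemory` is a hypothetical name, and the cold write is synchronous here although the spec makes it asynchronous.

```python
import time


class TwoTierMemory:
    def __init__(self, cold_store: dict, ttl_seconds: float, clock=time.monotonic):
        self._hot = {}           # session_id -> (expires_at, context); stands in for Redis
        self._cold = cold_store  # stands in for MongoDB
        self._ttl = ttl_seconds
        self._clock = clock

    def store(self, session_id: str, context) -> None:
        self._hot[session_id] = (self._clock() + self._ttl, context)
        self._cold[session_id] = context  # write-through to the cold tier

    def retrieve(self, session_id: str):
        """Return (context, tier) where tier is 'hot', 'cold', or 'miss'."""
        now = self._clock()
        entry = self._hot.get(session_id)
        if entry is not None and entry[0] > now:
            return entry[1], "hot"
        self._hot.pop(session_id, None)  # expired or absent
        if session_id in self._cold:
            context = self._cold[session_id]
            self._hot[session_id] = (now + self._ttl, context)  # re-warm the hot tier
            return context, "cold"
        return None, "miss"
```

Re-warming on a cold read means a session that reconnects after its TTL expired pays the slower lookup once and is served from the hot tier afterwards.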
Build Log
scoping Starting AI-powered tech spec generation
scoping Tech spec generated successfully
start Build orchestration started for project 2
attempt Build attempt 1/3
generate Attempt 1 failed: AI generation failed: Unterminated string in JSON at position 45824 failed
retry Retrying (2/3)...
attempt Build attempt 2/3
generate Attempt 2 failed: AI generation failed: 429 Daily token limit reached (100,000 tokens). Resets at midnight UTC. failed
retry Retrying (3/3)...
attempt Build attempt 3/3
generate Attempt 3 failed: AI generation failed: 429 Daily token limit reached (100,000 tokens). Resets at midnight UTC. failed
complete Build failed after 3 attempts failed
status Project status updated to Build Failed
Deliverables
Deliverables become available once project reaches Review status.