System Overview
GenNet is a cloud-native microservices platform for Gene Regulatory Network analysis, designed for scalability, reliability, and modern scientific collaboration.
🎯 Scope and Capabilities
The GenNet platform provides comprehensive tools for:
- Multi-Scale Analysis: Single-cell to tissue-level GRN modeling
- Automated Inference: ML-driven GRN reconstruction from omics data
- Qualitative Modeling: Formal verification using Computation Tree Logic (CTL)
- Hybrid Modeling: Time-delay analysis with HyTech integration
- Real-Time Collaboration: WebSocket-based multi-user editing
- HPC Orchestration: Kubernetes-native job scheduling with GPU support
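As a rough illustration of the HPC Orchestration item above, the sketch below builds a Kubernetes `batch/v1` Job manifest that requests a GPU. The function name `build_gpu_job`, the job name, the image, and the command are all hypothetical. The `nvidia.com/gpu` resource name assumes the cluster runs the NVIDIA device plugin, which the source does not state.

```python
import json


def build_gpu_job(name: str, image: str, command: list, gpus: int = 1) -> dict:
    """Return a batch/v1 Job manifest whose single container requests `gpus` GPUs."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "backoffLimit": 2,  # retry a failed pod at most twice (illustrative value)
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "command": command,
                            # Assumption: NVIDIA device plugin exposes this resource.
                            "resources": {"limits": {"nvidia.com/gpu": gpus}},
                        }
                    ],
                }
            },
        },
    }


# Hypothetical image and entry point, used only to show the manifest shape.
manifest = build_gpu_job(
    "grn-inference",
    "example.org/gennet/ml-worker:latest",
    ["python", "infer.py"],
)
print(json.dumps(manifest, indent=2))
```

The resulting dictionary can be serialized to JSON or YAML and submitted with any Kubernetes client. How GenNet actually submits jobs is not specified here.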
🏗️ Architecture Components
Core Services
- API Gateway (Kong)
  - Request routing and load balancing
  - JWT authentication and authorization
  - Rate limiting and API versioning
  - CORS handling and security headers
- Auth Service
  - User registration and login
  - JWT token generation and validation
  - Role-Based Access Control (RBAC)
  - Session management and refresh tokens
- GRN Service
  - CRUD operations for GRNs
  - Graph storage and querying (Neo4j)
  - Network validation and consistency checks
  - Import/export in multiple formats (SBML, BioPAX, JSON)
- Workflow Service
  - Orchestration of analysis pipelines
  - Job queuing with Redis
  - Status tracking and progress monitoring
  - Result aggregation and caching
Analysis Services
- Qualitative Service
  - CTL formula verification using SMBioNet
  - K-parameter generation and optimization
  - State graph generation and analysis
  - Parameter space exploration
- Hybrid Service
  - Time delay computation with HyTech
  - Hybrid automata modeling
  - Trajectory simulation and analysis
- ML Service
  - GRN inference algorithms (ARACNE, GENIE3, GRNBoost2)
  - Parameter prediction using Graph Neural Networks
  - Anomaly detection in gene expression data
  - Disease prediction models
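To make the Qualitative Service's state graph generation and CTL verification more concrete, here is a deliberately simplified sketch. It uses a Boolean network with asynchronous updates rather than the multi-valued K-parameter models that SMBioNet handles, and it checks only the CTL operator `EF` (some path eventually reaches a goal state). The two-gene network and the function names are hypothetical.

```python
from itertools import product

# Toy negative feedback loop (hypothetical): x activates y, y inhibits x.
GENES = ("x", "y")
TARGETS = {
    "x": lambda s: 1 - s["y"],  # x is on when its inhibitor y is off
    "y": lambda s: s["x"],      # y is on when its activator x is on
}


def async_state_graph(genes, targets):
    """Map each Boolean state to its asynchronous successors.

    Asynchronous semantics: in one step, exactly one gene whose current
    value differs from its target value moves to that target.
    """
    graph = {}
    for values in product((0, 1), repeat=len(genes)):
        state = dict(zip(genes, values))
        successors = set()
        for index, gene in enumerate(genes):
            target = targets[gene](state)
            if target != state[gene]:
                nxt = list(values)
                nxt[index] = target
                successors.add(tuple(nxt))
        # A state with no pending update is steady and loops on itself.
        graph[values] = successors or {values}
    return graph


def ef(graph, goal):
    """Return the states satisfying `EF goal` by backward reachability.

    Starts from the goal states and repeatedly adds any state that has
    at least one successor already in the set, until nothing changes.
    """
    satisfied = set(goal)
    changed = True
    while changed:
        changed = False
        for state, successors in graph.items():
            if state not in satisfied and successors & satisfied:
                satisfied.add(state)
                changed = True
    return satisfied


graph = async_state_graph(GENES, TARGETS)
reachable = ef(graph, {(1, 1)})
print(sorted(reachable))
```

In this toy loop the four states form a single cycle, so every state satisfies `EF (x=1 and y=1)`. A network with a steady state that traps trajectories would yield a strict subset.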
Supporting Services
- Collaboration Service
  - Real-time WebSocket communication
  - Operational Transformation for conflict-free editing
  - Presence tracking and user activity monitoring
- Metadata Service
  - Centralized data catalog
  - Metadata indexing and search
  - Data lineage tracking
- GraphQL Service
  - Flexible API for complex queries
  - Schema stitching for federated services
  - Real-time subscriptions
- HPC Orchestrator
  - Kubernetes Job and CronJob management
  - GPU resource allocation
  - Distributed computing with Ray/Dask
  - Batch processing pipelines
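The Operational Transformation mentioned under the Collaboration Service can be illustrated with its simplest case: two users inserting concurrently. The sketch below uses a plain text buffer as a stand-in, since the source does not describe how network edits are represented. The `Insert` type, the `site` tie-breaker, and the function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index at which the text is inserted
    text: str
    site: int  # unique editor id, used only to break ties at equal positions


def apply_op(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite `op` so it can be applied after `against` has already been applied.

    If `against` inserted before `op`'s position (or at the same position
    with a lower site id), `op` must shift right by the inserted length.
    """
    lands_first = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if lands_first:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op


# Two editors start from the same document and insert at the same index.
base = "ACGT"
a = Insert(pos=2, text="NN", site=1)
b = Insert(pos=2, text="TT", site=2)

# Each site applies its own edit first, then the transformed remote edit.
at_site_1 = apply_op(apply_op(base, a), transform(b, a))
at_site_2 = apply_op(apply_op(base, b), transform(a, b))
print(at_site_1, at_site_2)
```

Both sites end with the same document even though they applied the edits in opposite orders, which is the convergence property OT aims for. A complete implementation would also need to handle deletions and operation histories.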
🔄 Data Flow Architecture
```mermaid
sequenceDiagram
    participant U as User
    participant UI as Web UI
    participant GW as API Gateway
    participant AS as Auth Service
    participant WS as Workflow Service
    participant GS as GRN Service
    participant DB as Databases
    participant HPC as HPC Cluster
    participant S3 as Object Storage

    U->>UI: Login Request
    UI->>GW: Authenticate
    GW->>AS: Validate Credentials
    AS->>DB: Check User
    AS-->>GW: JWT Token
    GW-->>UI: Token

    U->>UI: Create Network
    UI->>GW: POST /networks
    GW->>GS: Create Network
    GS->>DB: Store Graph
    GS-->>GW: Network ID
    GW-->>UI: Success

    U->>UI: Start Analysis
    UI->>GW: POST /workflows
    GW->>WS: Create Workflow
    WS->>DB: Queue Job
    WS->>HPC: Submit Job
    HPC->>HPC: Execute Analysis
    HPC->>S3: Store Results
    HPC-->>WS: Job Complete
    WS->>DB: Update Status
    WS-->>GW: Results Ready
    GW-->>UI: Push Update
```
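The analysis phase of the flow above (queue, submit, execute, complete) implies a job lifecycle that the Workflow Service has to track. One way to model it, purely as a sketch, is a small state machine that rejects out-of-order updates. The state names, the `FAILED` state, and the `Workflow` class are assumptions. The diagram shows only the success path.

```python
from enum import Enum
from typing import Optional


class JobState(Enum):
    QUEUED = "queued"        # WS->>DB: Queue Job
    SUBMITTED = "submitted"  # WS->>HPC: Submit Job
    RUNNING = "running"      # HPC->>HPC: Execute Analysis
    COMPLETE = "complete"    # HPC-->>WS: Job Complete
    FAILED = "failed"        # assumption: not shown in the diagram


# Legal transitions; terminal states have no successors.
ALLOWED = {
    JobState.QUEUED: {JobState.SUBMITTED, JobState.FAILED},
    JobState.SUBMITTED: {JobState.RUNNING, JobState.FAILED},
    JobState.RUNNING: {JobState.COMPLETE, JobState.FAILED},
    JobState.COMPLETE: set(),
    JobState.FAILED: set(),
}


class Workflow:
    def __init__(self, workflow_id: str) -> None:
        self.workflow_id = workflow_id
        self.state = JobState.QUEUED
        self.history = [JobState.QUEUED]
        self.result_uri: Optional[str] = None

    def advance(self, new_state: JobState, result_uri: Optional[str] = None) -> None:
        """Move to `new_state`, refusing transitions the lifecycle does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        if new_state is JobState.COMPLETE:
            # Results are stored before completion is reported (HPC->>S3).
            if result_uri is None:
                raise ValueError("a completed job must reference its stored results")
            self.result_uri = result_uri
        self.state = new_state
        self.history.append(new_state)


wf = Workflow("wf-001")
wf.advance(JobState.SUBMITTED)
wf.advance(JobState.RUNNING)
# Hypothetical bucket and key, for illustration only.
wf.advance(JobState.COMPLETE, result_uri="s3://example-bucket/results/wf-001.json")
print([state.value for state in wf.history])
```

Recording the history alongside the current state gives the status tracking and progress monitoring features something to report, without assuming any particular storage backend.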
🗂️ Data Architecture
```mermaid
graph TD
    subgraph "Primary Data"
        PG[(PostgreSQL<br/>Metadata & Users)]
        NEO[(Neo4j<br/>GRN Graphs)]
    end
    subgraph "Caching & Sessions"
        REDIS[(Redis<br/>Sessions & Cache)]
    end
    subgraph "Time Series"
        INF[(InfluxDB<br/>Metrics & Logs)]
    end
    subgraph "Object Storage"
        S3[(S3<br/>Results & Files)]
    end
    subgraph "Services"
        GS[GRN Service] --> NEO
        GS --> PG
        WS[Workflow Service] --> REDIS
        WS --> INF
        WS --> S3
        AS[Auth Service] --> PG
        CS[Collaboration] --> REDIS
    end
```
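The service-to-store edges in the diagram above can also be kept as a small lookup table, which makes questions such as "which services are affected if Redis is unavailable?" easy to answer. The mapping below transcribes only the edges shown. The names `SERVICE_STORES` and `services_using` are hypothetical.

```python
# Edges copied from the data architecture diagram (service -> data stores).
SERVICE_STORES = {
    "GRN Service": {"Neo4j", "PostgreSQL"},
    "Workflow Service": {"Redis", "InfluxDB", "S3"},
    "Auth Service": {"PostgreSQL"},
    "Collaboration": {"Redis"},
}


def services_using(store: str) -> list:
    """Return the services that depend on `store`, in alphabetical order."""
    return sorted(
        name for name, stores in SERVICE_STORES.items() if store in stores
    )


print(services_using("PostgreSQL"))  # ['Auth Service', 'GRN Service']
print(services_using("Redis"))       # ['Collaboration', 'Workflow Service']
```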
🛠️ Technology Stack
- Languages: Python 3.11+, Go 1.21+, TypeScript
- Frameworks: FastAPI, Next.js 14, React
- Databases: PostgreSQL, Neo4j, Redis, InfluxDB
- Cloud: AWS (EKS, RDS, Neptune, S3, ElastiCache)
- Container: Docker, Kubernetes
- ML: PyTorch, TensorFlow, scikit-learn
- Messaging: Kafka, WebSockets
- Monitoring: Prometheus, Grafana, ELK Stack