
隽戈
Core Capabilities
Tech Stack
Community & Open Source
Active Communities
- Expert Ops Community
- HAMi Project Community
- Xinference Community
- CNCF Cloud-Native Southwest China
- ArkSphere AI Community
Open Source Experience
AIOps Capability Pipeline
Familiar with the complete pipeline, from multi-source data ingestion to closed-loop intelligent operations.
Career Background
Held a key technical leadership role in the Singapore Telecom Smart City project, driving the full infrastructure evolution from early Mesos to a modern Kubernetes-based cloud-native architecture.
While at Ant Group and leading internet banks, was deeply involved in the cloud-native transformation of financial-grade core systems. Focused on high-availability infrastructure and platform engineering, significantly improving delivery efficiency and system elasticity while maintaining financial-grade stability.
Featured Projects
Enterprise DevOps Platform
Built an enterprise-grade private R&D workflow platform from scratch, based on GitOps + CI/CD + Kubernetes. Standardized pipelines and automated delivery significantly shortened development cycles. Led the platform to DevOps Level 3 Certification, establishing industry-leading engineering standards.
AI Heterogeneous Computing Infrastructure
Built a high-performance heterogeneous computing platform based on HAMi + Kubernetes, achieving vGPU resource pooling and dynamic elastic scheduling. Deployed containerized vLLM / SGLang LLM serving, providing unified, efficient, and scalable computing infrastructure for multi-scenario inference tasks.
Next-Gen AIOps System
Explored deep LLM applications in the operations domain using Dify + OpenClaw + n8n. Implemented intelligent log attribution analysis, automated fault diagnosis, and a self-service ops chatbot, building a closed-loop AIOps system that moves from reactive response to proactive governance.
Large-Scale Observability Platform
Led the design and implementation of a multi-tenant Prometheus + OpenTelemetry observability platform covering end-to-end collection and analysis of Metric / Log / Trace / Event data. Built a real-time alert stream processing engine with Kafka + Flink for automated alert deduplication, anomaly detection, and root cause analysis, supporting billions of metrics daily.
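As a rough illustration of the alert deduplication idea only, the following plain-Python sketch suppresses repeats of the same alert inside a time window. It is not the platform's Kafka + Flink implementation; the `Alert` fields, the `AlertDeduplicator` class, the grouping key, and the 300-second window are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert shape; a real platform would carry more labels.
    tenant: str
    name: str
    instance: str
    timestamp: float  # seconds since epoch


class AlertDeduplicator:
    """Suppress repeats of the same alert seen within `window_seconds`."""

    def __init__(self, window_seconds=300.0):
        self.window_seconds = window_seconds
        # Maps (tenant, name, instance) -> timestamp of the last emitted alert.
        self._last_emitted = {}

    def should_emit(self, alert):
        key = (alert.tenant, alert.name, alert.instance)
        last = self._last_emitted.get(key)
        if last is not None and alert.timestamp - last < self.window_seconds:
            return False  # Duplicate inside the window: drop it.
        self._last_emitted[key] = alert.timestamp
        return True


def deduplicate(alerts, window_seconds=300.0):
    """Return the alerts that survive deduplication, in arrival order."""
    dedup = AlertDeduplicator(window_seconds)
    return [a for a in alerts if dedup.should_emit(a)]
```

Keying on the tenant keeps one tenant's alerts from suppressing another's, which matters on a multi-tenant platform. In a streaming engine the same per-key state would typically live in keyed state with a time-to-live rather than in an in-memory dictionary.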
Personal Vision
Beyond work, committed to connecting with the community through video content creation and technical sharing. Continuously producing high-quality tech vlogs that share practical experience in a visual, systematic way to promote the adoption of cloud-native and AI technologies.
- 📚 Knowledge: Building systematic tech knowledge base
- 💡 Insights: Sharing industry trend perspectives
- 📷 Life: Documenting colorful moments beyond technology