Chief Technology Officer · AI Architect · Robotics Researcher
Architecting production AI systems at the intersection of deep learning, embodied robotics, and real-world deployment.
As CTO, I make long-term bets on the technology directions that will matter over the next five years of robotics and AI.
End-to-end intelligence for physical robots — from perception to action. Building systems where humanoids and robot arms reason about the physical world and execute tasks autonomously in unstructured factory environments.
Training foundation models that close the loop between language instructions, visual observation, and motor actions. Moving robots from pre-programmed routines to general-purpose instruction-following agents.
Building latent world models that let robots predict consequences before acting — enabling safer planning without relying purely on environmental feedback. Key for legged locomotion on unstructured terrain.
Leveraging large vision-language models as the reasoning backbone for robots — enabling semantic scene understanding, natural-language task decomposition, and open-vocabulary object interaction on edge hardware in real time.
6+ years shipping AI to real enterprise clients — banking (Maybank, ShinHan), manufacturing, and national infrastructure. Deep expertise in model optimization, edge deployment, and scaling AI systems that handle millions of daily inferences.
Building high-performance research engineering teams from scratch. Defining technical culture, research roadmap, and product strategy that translate cutting-edge AI papers into deployed real-world systems — not demos.
Weekly updates on what I'm building, researching, and breaking.
Working on a data pipeline to extract clean action-observation pairs from wrist-camera and ego-view recordings of human demonstrations. Training an initial behavior cloning (BC) policy on Unitree G1 arm tasks: pick, place, and handover.
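One step a pipeline like this typically needs is aligning camera frames with recorded actions in time. The sketch below is illustrative only, not the actual pipeline: it assumes both streams carry sorted timestamps in seconds, and the function name `pair_by_timestamp` and the `max_gap` tolerance are hypothetical.

```python
from bisect import bisect_left
from typing import List, Tuple


def pair_by_timestamp(
    frame_ts: List[float],
    action_ts: List[float],
    max_gap: float = 0.02,  # hypothetical tolerance in seconds
) -> List[Tuple[int, int]]:
    """Pair each frame index with the index of the nearest action in time.

    Both timestamp lists are assumed to be sorted in ascending order.
    Frames with no action within `max_gap` seconds are dropped, which is
    one simple way to keep only "clean" pairs.
    """
    pairs = []
    for i, t in enumerate(frame_ts):
        # Position of the first action timestamp >= t.
        j = bisect_left(action_ts, t)
        # The nearest action is either just before or just at/after t.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(action_ts)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(action_ts[k] - t))
        if abs(action_ts[best] - t) <= max_gap:
            pairs.append((i, best))
    return pairs


frames = [0.00, 0.10, 0.20]
actions = [0.01, 0.11, 0.35]
print(pair_by_timestamp(frames, actions))
```

In this example the third frame is discarded because its nearest action is 0.09 s away, well outside the tolerance.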
Experimenting with RSSM-style recurrent world models to predict future terrain elevation maps for the legged robot. Early runs in Isaac Sim show a 30% improvement in stair-climbing stability over the baseline RL policy.
Integrating a quantized Qwen2.5-3B model as a high-level task planner for the robot arm. The LLM decomposes natural-language instructions into sequences of motion primitives, which the low-level controller executes.
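A planner's output usually has to be validated before it reaches the controller. The sketch below shows one way to do that, under stated assumptions: the planner is prompted to return a JSON list of steps, and the primitive names (`move_to`, `grasp`, `release`) and the `parse_plan` helper are hypothetical, not the actual interface used on the robot. No model call is made here; the input is a plain string.

```python
import json
from typing import Any, Dict, List, Tuple

# Hypothetical set of primitives the low-level controller understands.
ALLOWED_PRIMITIVES = {"move_to", "grasp", "release"}


def parse_plan(llm_output: str) -> List[Tuple[str, Dict[str, Any]]]:
    """Turn the planner's JSON text into a validated list of (primitive, args).

    Raises ValueError if the output is not a list of step objects or if a
    step names a primitive outside ALLOWED_PRIMITIVES, so malformed plans
    never reach the controller.
    """
    steps = json.loads(llm_output)
    if not isinstance(steps, list):
        raise ValueError("plan must be a JSON list of steps")
    plan = []
    for step in steps:
        if not isinstance(step, dict):
            raise ValueError("each step must be a JSON object")
        name = step.get("primitive")
        if name not in ALLOWED_PRIMITIVES:
            raise ValueError(f"unknown primitive: {name!r}")
        plan.append((name, step.get("args", {})))
    return plan


example = '[{"primitive": "move_to", "args": {"target": "cup"}}, {"primitive": "grasp"}]'
for primitive, args in parse_plan(example):
    print(primitive, args)
```

Rejecting unknown primitives outright, rather than skipping them, keeps a partially valid plan from being executed out of context.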
Deployed a fine-tuned InternVL2-2B on the humanoid's onboard GPU for real-time semantic scene understanding. The model correctly identifies objects, obstacles, and affordances at 12 FPS in factory-floor conditions.
Open to strategic advisory roles, research collaborations, and technology leadership opportunities.