Software engineer specialising in GPU programming, real-time rendering, and low-latency system optimisation. I build high-performance rendering engines, CUDA compute kernels, and systems-level infrastructure — from Vulkan pipelines and warp-synchronous GPU algorithms to SIMD-optimised CPU rasterisers.
Vulkan, OpenGL, GLSL, CUDA, Render Pipelines, Path Tracing, Ray Tracing, Rasterisation
C++17/20 (primary), C, Python, ARM Assembly
SIMD (SSE/AVX2), TBB, OpenMP, Lock-Free, Warp-Level Primitives, perf, Nsight
Linux, Boost.Asio, gRPC, Docker, AWS, CMake, GDB/LLDB, Valgrind
A Vulkan 1.3 real-time rendering engine built from scratch, featuring multi-queue architecture, timeline semaphore synchronisation, compute + graphics pipelines, GPU particle systems, and dynamic rendering — designed for maximum GPU utilisation and minimal CPU overhead.
A CPU-side rendering engine combining conventional rasterisation, Whitted-style ray tracing, and Monte Carlo path tracing with importance sampling. Optimised with AVX2/SSE vectorisation and TBB multithreading for both real-time (60+ FPS) and offline (1024 SPP) workloads.
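As a rough illustration of the importance sampling mentioned above (a minimal sketch, not code taken from the engine), cosine-weighted hemisphere sampling is one common choice for diffuse surfaces. The names `Vec3`, `sampleCosineHemisphere`, `cosineHemispherePdf`, and `lambertianWeight` are hypothetical, and the sketch assumes a local frame whose +Z axis is the surface normal.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal vector type for this sketch.
struct Vec3 { float x, y, z; };

constexpr float kPi = 3.14159265358979f;

// Map two uniform random numbers u1, u2 in [0, 1) to a unit direction on the
// hemisphere around +Z, distributed with density pdf(w) = cos(theta) / pi.
inline Vec3 sampleCosineHemisphere(float u1, float u2) {
    assert(u1 >= 0.0f && u1 < 1.0f && u2 >= 0.0f && u2 < 1.0f);
    const float r = std::sqrt(u1);          // radius on the unit disk
    const float phi = 2.0f * kPi * u2;      // azimuth angle
    const float z = std::sqrt(1.0f - u1);   // cos(theta), projected up from the disk
    return {r * std::cos(phi), r * std::sin(phi), z};
}

// Probability density of the direction w under the sampler above.
inline float cosineHemispherePdf(const Vec3& w) {
    return w.z > 0.0f ? w.z / kPi : 0.0f;
}

// Monte Carlo weight for a Lambertian BRDF (f = albedo / pi):
// f * cos(theta) / pdf. With cosine-weighted sampling the cosine and pi
// cancel, so the weight reduces to the albedo.
inline float lambertianWeight(float albedo, const Vec3& w) {
    const float pdf = cosineHemispherePdf(w);
    if (pdf <= 0.0f) return 0.0f;
    return (albedo / kPi) * w.z / pdf;
}
```

Because the sampling density matches the cosine term of the rendering equation, every sample carries the same weight for a diffuse surface, which is what reduces variance compared with uniform hemisphere sampling.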
A CUDA and C++ high-performance computing library featuring warp-synchronous algorithms, hierarchical sparse grid frameworks, and GPU kernel profiling infrastructure. Demonstrates design of latency-bounded, high-throughput compute kernels at scale.
A low-latency, event-driven messaging system sustaining 8,000+ concurrent sessions with jitter control, non-blocking I/O, explicit backpressure, and bounded queues — built with Boost.Asio, gRPC, and Qt6.
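To make "explicit backpressure" and "bounded queues" concrete, here is a minimal sketch of the general idea, not the system's actual implementation. The class name `BoundedQueue` and its methods are hypothetical, and the sketch uses a mutex for brevity rather than a lock-free design.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <optional>
#include <utility>

// Hypothetical fixed-capacity queue. A full queue rejects new items instead
// of growing, so the producer sees the overload and must react to it.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns false when the queue is full: this is the backpressure signal.
    bool tryPush(T value) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.size() >= capacity_) return false;
        items_.push_back(std::move(value));
        return true;
    }

    // Returns the oldest item, or std::nullopt when the queue is empty.
    std::optional<T> tryPop() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return std::nullopt;
        T value = std::move(items_.front());
        items_.pop_front();
        return value;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::deque<T> items_;
};
```

A caller that receives `false` from `tryPush` can pause reading from the session, drop the message, or slow the sender, which keeps memory use and queueing delay bounded under load.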
Actively seeking GPU software engineering and graphics roles. Open to opportunities in Shanghai and across the APAC region.