- Python: Queue
- Python: iterable vs. iterator
- Python: __init__ vs. __new__
- Python: __slots__
- Python: mixin
- Special methods
- Python: generator
- Python: decorator
- Python: async/await
- Python: multiprocessing vs. threading
- Use fewer for-loops in Python
- pdb
- Garbage collection
- Python Virtual Machine (PVM)
- torch: ProcessGroup
- torch: DeviceMesh
- Functional functions
- Declare variable length array
- Clockwise Rule
- Initialize vector with value
- C++: reference vs. pointer
- C++: typedef vs. #define
- C++: std::vector vs. std::list
- C++: Heaps & Priority Queues
- The Rule of Three
- Iterator increment
- C++: emplace_back() vs. push_back()
- Where is const stored?
- C++: extern
- C++: inline function
- C++: lambda expression
- C++: Pointers
- C++: const string&
- C++: producer-consumer model
- Read Write Lock
- Return Value Optimization (RVO)
- g++ cheatsheet
- C++11 and C++14
- C
- Optimize your Java code
- If-else optimization
- Java: hashCode() & equals()
- Java: concurrency
- Java: HashMap
- Java: access control
- How to write Go code
- Go: struct{}{}
- Go: context
- Go: error handling
- Go: panic & recover
- Go: sync.RWMutex
- Go: sync.Cond
- Go: Limiter
- Gin: middleware
- gRPC: interceptor
- (FP) Map Reduce Filter
- (FP) Functional Options
- (FP) Pipeline
- (FP) OptionSet
- Implement a locked cache
- Implement a cronjob
- Implement a buffered IO
- Implement a customized logger
- API Practices in Go
- Some Go concurrency patterns
- Chain-of-responsibility
- Tree-of-responsibility
- MVC-MVVM
- Factory Pattern
- Singleton Pattern
- Expressiveness by Abstraction
- Dependency Injection
- Lazy ideas in programming
- Proxy Pattern
- Survey on efficient LLM inference
- Survey on model deployment and serving
- Text generation strategies
- SplitWise
- Multi-Head Attention
- KV Cache
- PTD Parallelism
- (WIP) Tensor Parallelism
- (WIP) vLLM
- How ByteDance scales large data size offline inference with Ray
- Transformer inference arithmetic
- Basic concepts
- SSH basics
- SSH Tunneling
- SSH ProxyJump
- Page
- Singularity
- Scheduler
- System admin
- Transfer files
- Modules
- Compilation
- MPI
- Timing CPU
- InfiniBand
- SadServers
- Linux performance tools
- Linux Load Averages
- Crontab
- PCIe, NVLink, InfiniBand
- GPU networking
- NCCL
- How to trace a thread with high CPU usage
- How to create a shared directory across nodes
- Distribute files to tenants on server
- GPU Bound
- Timing CUDA kernels
- Least Common Multiple
- Reservoir sampling
- Sliding window
- Sieve of Eratosthenes
- Sorting
- LRU cache
- LRU-K and 2Q Algorithm
- Kadane's Algorithm
- Longest Common Subsequence
- Longest Increasing Subsequence
- Monotonic Stack
- Brian Kernighan Algorithm
- Asymptotic Analysis
- Binary Search
- Quick sort partition
- Date
- Check whether a given x is an integer
- C++ style template
- Constraints
- Matching Machine
- Token Bucket
- Topological Sort
- Trie
- RankedDict
- DFS template
- BFS template
- Library Procedure
- Process Identifier
- Zombie process vs. Orphan process
- Shell Scripting
- Error Handling
- Common Gateway Interface
- IO paradigms
- Semaphores and Mutexes
- Zero-copy
- OOM, memory leak
- Page Fault
- Instruction pipelining
- OS-level program performance optimization
- Copy-on-write
- Signals
- Coroutine
- Microservice
- Load Balance
- Database Indexes
- Tiny Web Server
- Proxy vs. Reverse Proxy
- Jump Server
- Better GitHub Workflow
- Singleton class with DCL
- Lock Free Queue
- Code of Connect 2023
- Why can't we inspect the Twitter web page?
- CAP theorem
- Actor model
- Distributed AI task scheduling
- How to deploy log shipper in K8S
- Kafka use cases
- Database transaction
- Database indexing
- Backend development basics
- API optimization
- API pagination
- How to maintain login status
- How to store historical data