-
M1 GPU Acceleration
May 20, 2022
Setup guide for GPU acceleration on Apple Silicon M1 with PyTorch MPS backend and HuggingFace Transformers.
1 min read · dev-tools, dl
-
Grad Cache
April 12, 2022
An approach that enables large-batch contrastive learning under memory constraints, similar to gradient accumulation.
1 min read · ml, ml-engineering, +2
-
AutoML
November 28, 2021
Introduction to AutoML and hyperparameter optimization using Bayesian Optimization with Gaussian Process Regression.
4 min read · ml, dl, +1
-
Model Compression Competition
November 28, 2021
Key considerations for model compression competitions, covering FLOPs, memory access cost, and speed optimization guidelines from ShuffleNet v2.
1 min read · dl
-
Model Compression Overview
November 28, 2021
Overview of model compression techniques: NAS, network pruning, knowledge distillation, tensor decomposition, and quantization.
3 min read · dl
-
Ensemble
August 31, 2021
Ensemble methods for AI competitions: hard voting, soft voting, and weighted voting to improve model performance.
1 min read · dl, pytorch
-
Training Process
August 30, 2021
How gradient accumulation works in PyTorch for effective large-batch training on limited GPU memory.
1 min read · dl, pytorch
-
Confusing Training Methods
August 29, 2021
Clarifying common mistakes in PyTorch training loops, validation ordering, and K-fold cross-validation usage.
1 min read · dl, pytorch
-
Hyperparameter Tuning
August 22, 2021
Overview of hyperparameter tuning methods including grid search, random search, and Bayesian optimization, with an introduction to Ray for parallel tuning.
1 min read · pytorch, dl
-
Multi GPU
August 20, 2021
Guide to multi-GPU training in PyTorch covering model parallelism, DataParallel, and DistributedDataParallel with code examples.
1 min read · pytorch, dl
-
Generative Models
August 13, 2021
Introduction to generative models: probability distributions, independence assumptions, chain rule, and auto-regressive models.
4 min read · dl, naver-boostcamp, +1
-
Transformer
August 13, 2021
Core concepts of the Transformer model: encoder self-attention, Query/Key/Value embeddings, and multi-head attention mechanics.
4 min read · dl, nlp
-
Parameter Count
August 13, 2021
Discussion of how the understood relationship between model parameter count and generalization performance shifted with scaling law findings.
1 min read · dl, naver-boostcamp
-
Convolution Practice
August 12, 2021
Practical CNN implementation in PyTorch: add_module, training loops, and batch normalization.
1 min read · computer-vision, dl
-
CNN Key Concepts
August 11, 2021
Key CNN architectures from ILSVRC: AlexNet, VGGNet, GoogLeNet, and ResNet, with analysis of receptive fields and 1x1 convolutions.
3 min read · computer-vision, dl
-
Weight Initialization
August 11, 2021
Why weight initialization matters in deep learning and why zero initialization should be avoided.
1 min read · dl, ml, +1
-
Convolution
August 11, 2021
Convolution fundamentals: stride, padding, parameter counting, and 1x1 convolutions.
2 min read · computer-vision, dl
-
Optimizer Practice
August 10, 2021
Practical comparison of SGD, Momentum, and Adam optimizers on function approximation with noisy data in PyTorch.
1 min read · dl, pytorch
-
Optimization
August 10, 2021
Deep learning optimization fundamentals: generalization, overfitting, cross-validation, bias-variance tradeoff, bootstrapping, bagging, and boosting.
6 min read · dl
-
Deep Learning
January 1, 2021
Deep learning fundamentals: key components (data, model, loss, optimizer) and a brief history from AlexNet to GPT-3.
3 min read · dl, ml
-
Neural Network
January 1, 2021
Introduction to neural networks covering linear regression, softmax classification, activation functions, and why deep layers are preferred.
2 min read · dl, ml
-
NN & Multi Layer Perceptron
January 1, 2021
Neural networks as function approximators: linear models, activation functions, multi-layer perceptrons, universal approximation theorem, and loss functions.
2 min read · dl
-
PyTorch
January 1, 2021
Practical PyTorch tips covering parameter initialization, model.eval(), tensor views, and the training loop.
2 min read · dl, pytorch