Code DAO

We deliver AI algorithm tutorials and the latest news, curated by a team of researchers from Peking University, Shanghai Jiao Tong University, Central South University, and leading AI companies such as Huawei, Kuaishou, and SenseTime. Join us in the AI alchemy—making life better!

100 Articles · 0 Likes · 0 Views · 0 Comments

Latest from Code DAO

Code DAO
Dec 22, 2021 · Artificial Intelligence

Understanding SimCLR: A Simple Contrastive Learning Framework for Visual Representations

This article explains SimCLR, the 2020 Google Research framework that advances self‑supervised visual pre‑training by using extensive data augmentations, a ResNet encoder, a projection‑head MLP, and the NT‑Xent loss to learn robust image representations that outperform many prior methods on ImageNet and other benchmarks.

Computer Vision · NT-Xent loss · ResNet
0 likes · 7 min read
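As a taste of the article, the NT‑Xent loss can be sketched in a few lines of NumPy. This is an illustrative re‑implementation (function name and batch layout are ours, not SimCLR's reference TensorFlow code):

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """Minimal NT-Xent sketch: z1, z2 are (N, d) L2-normalized embeddings
    of two augmented views of the same N images; tau is the temperature."""
    z = np.concatenate([z1, z2], axis=0)         # stack both views: (2N, d)
    sim = z @ z.T / tau                          # cosine similarity / temperature
    np.fill_diagonal(sim, -np.inf)               # a sample is never its own candidate
    n = z1.shape[0]
    # the positive for row i is row i+N, and vice versa
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_denom = np.log(np.exp(sim).sum(axis=1))  # log of the softmax denominator
    return float(np.mean(log_denom - sim[np.arange(2 * n), pos]))
```

Matching views pull the positive similarity up while every other sample in the batch acts as a negative, which is why SimCLR benefits from very large batch sizes.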
Code DAO
Dec 21, 2021 · Artificial Intelligence

Four Keras Techniques for Preprocessing Text for Deep Learning

This article explains four Keras utilities—text_to_word_sequence, hashing_trick, one_hot, and Tokenizer—showing how each converts raw text into token lists, hash indices, integer encodings, or document matrices, with code examples and sample outputs.

Keras · Tokenizer · hashing_trick
0 likes · 6 min read
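The idea behind hashing_trick is easy to see in plain Python: hash each word and take it modulo the vocabulary size, so no word index ever has to be stored. A rough sketch of the concept (not the Keras code itself, which also filters punctuation):

```python
import hashlib

def hashing_trick(text, n, lower=True, split=" "):
    """Map each word to an integer in [1, n-1] by hashing (illustrative sketch).
    md5 is used because Python's built-in hash() is salted per process."""
    if lower:
        text = text.lower()
    words = [w for w in text.split(split) if w]
    return [int(hashlib.md5(w.encode()).hexdigest(), 16) % (n - 1) + 1 for w in words]
```

Collisions are possible (two different words can share an index); that is the price of not storing a vocabulary.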
Code DAO
Dec 20, 2021 · Artificial Intelligence

Exploring Latent Space with a Variational Autoencoder in TensorFlow

This article explains the theory behind variational autoencoders, details their KL‑divergence loss, provides a complete TensorFlow implementation, and demonstrates reconstruction, latent‑space visualization, and novel image generation through sampling and interpolation.

Image Generation · KL divergence · Latent Space
0 likes · 13 min read
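The KL term the article details has a closed form for a diagonal Gaussian encoder, which takes only a few lines of NumPy (taking log‑variance as input, as most VAE implementations do):

```python
import numpy as np

def kl_divergence(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions.
    Closed form: -0.5 * sum(1 + log sigma^2 - mu^2 - sigma^2)."""
    return -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var), axis=-1)
```

It is zero exactly when the encoder outputs a standard normal, which is how this term keeps the latent space well‑behaved for sampling and interpolation.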
Code DAO
Dec 20, 2021 · Artificial Intelligence

Building Efficient Data Pipelines with TensorFlow’s tf.data API

This article explains how to use TensorFlow’s tf.data API to construct high‑performance, flexible data pipelines—from loading images or tensors, applying transformations and data augmentation, to batching, shuffling, caching, prefetching, and feeding the pipeline directly into model.fit for training.

Python · TensorFlow · data loading
0 likes · 9 min read
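The shuffle → map → batch pattern that tf.data encourages can be imitated with plain Python generators. This framework‑free sketch (our own helper names) shows what the stages do, though it lacks tf.data's parallelism, caching, and prefetching:

```python
import random
from itertools import islice

def shuffle(items, buffer_size, seed=0):
    """Buffered shuffle in the spirit of Dataset.shuffle: keep a small buffer,
    emit a random element from it, refill from the stream."""
    rng = random.Random(seed)
    it = iter(items)
    buf = list(islice(it, buffer_size))
    for x in it:
        i = rng.randrange(len(buf))
        yield buf[i]
        buf[i] = x
    rng.shuffle(buf)
    yield from buf

def batch(items, size):
    """Group consecutive elements into lists of `size`; the last batch may be short."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk

# shuffle -> map (a stand-in for augmentation) -> batch
pipeline = batch(map(lambda x: x * 2, shuffle(range(10), buffer_size=4)), size=3)
```

Because everything is lazy, elements flow through one at a time, which is the same property that lets tf.data handle datasets larger than memory.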
Code DAO
Dec 19, 2021 · Artificial Intelligence

Exploring Latent Space with TensorFlow Autoencoders (Part 1)

This tutorial walks through building a TensorFlow 2.0 autoencoder from scratch, preparing the Fashion‑MNIST dataset, visualizing raw images, projecting them into PCA and t‑SNE spaces, constructing encoder and decoder layers, training the model, and visualizing the resulting latent space to reveal image clusters.

Autoencoder · Latent Space · PCA
0 likes · 13 min read
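For the PCA projection step, the whole operation is a centered SVD; here is a NumPy sketch of projecting flattened images to 2‑D (function name ours):

```python
import numpy as np

def pca_project(x, k=2):
    """Project rows of x (samples x features) onto the top-k principal components."""
    x = x - x.mean(axis=0)                            # center each feature
    _, _, vt = np.linalg.svd(x, full_matrices=False)  # rows of vt = principal directions
    return x @ vt[:k].T
```

PCA is linear and deterministic while t‑SNE is neither, which is why the two projections of the same images can look quite different.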
Code DAO
Dec 18, 2021 · Artificial Intelligence

Essential Feature Selection Techniques for Machine Learning

This article explains why feature selection is crucial for building robust machine‑learning models and walks through popular filter, wrapper, and embedded methods—including information gain, chi‑square, LASSO, random‑forest importance, and PCA—providing code examples and practical guidance.

PCA · Regularization · embedded methods
0 likes · 18 min read
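Of the filter methods covered, information gain is the easiest to write from scratch: it measures how much knowing a feature's value reduces the entropy of the labels. A NumPy sketch for a categorical feature (function names ours):

```python
import numpy as np
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a 1-D label array."""
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after splitting on feature."""
    gain = entropy(labels)
    for v in set(feature):
        mask = feature == v
        gain -= mask.mean() * entropy(labels[mask])
    return gain
```

Filter methods like this score each feature independently of any model; wrapper and embedded methods (LASSO, random‑forest importance) account for feature interactions at higher cost.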
Code DAO
Dec 18, 2021 · Artificial Intelligence

Implement Random Forest Regression in Python using Scikit-Learn

This article explains the fundamentals of random forest regression, describes why it outperforms single decision trees for nonlinear or noisy data, defines bootstrapping and bagging, and provides a step‑by‑step Python example using NumPy, Pandas, and Scikit‑Learn’s RandomForestRegressor with data loading, preprocessing, model training, prediction, and evaluation via MSE and R².

Bootstrapping · Python · machine learning
0 likes · 6 min read
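The bootstrapping‑plus‑bagging recipe behind a random forest fits in a few lines. This sketch bags one‑feature regression stumps instead of full decision trees and omits the per‑split feature subsampling that makes a forest "random" (helper names ours, not Scikit‑Learn's API):

```python
import numpy as np

def fit_stump(x, y):
    """Fit a one-split regression stump on a single feature (sketch)."""
    best_sse, best = np.inf, None
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, (t, left.mean(), right.mean())
    t, lo, hi = best
    return lambda q: np.where(q <= t, lo, hi)

def bagged_regressor(x, y, n_estimators=25, seed=0):
    """Bootstrap-aggregate stumps: resample with replacement, fit, average."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(x), size=len(x))   # bootstrap sample
        models.append(fit_stump(x[idx], y[idx]))
    return lambda q: np.mean([m(q) for m in models], axis=0)
```

Averaging many high‑variance learners trained on resampled data smooths out the noise any single tree would overfit, which is the core reason a forest beats one decision tree on noisy data.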
Code DAO
Dec 18, 2021 · Artificial Intelligence

Accelerating Gradient Boosting with CatBoost

This article explains how CatBoost implements gradient boosting, handles categorical features without preprocessing, lists its key advantages, details common training parameters, and provides a step‑by‑step regression example with code for fitting, cross‑validation, grid search, tree visualization, and parameter inspection.

CatBoost · gradient boosting · hyperparameter tuning
0 likes · 7 min read
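Stripped of CatBoost's innovations (ordered boosting and its categorical‑feature encoding), gradient boosting for squared loss is just repeatedly fitting a weak learner to the current residuals. A toy sketch with binned‑mean weak learners (all names ours):

```python
import numpy as np

def fit_binned_learner(x, y, bins):
    """Weak learner: predict the mean target within fixed bins of x (sketch)."""
    edges = np.linspace(x.min(), x.max(), bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    means = np.array([y[idx == b].mean() if (idx == b).any() else 0.0
                      for b in range(bins)])
    return lambda q: means[np.clip(np.digitize(q, edges) - 1, 0, bins - 1)]

def gradient_boost(x, y, n_rounds=30, lr=0.3, bins=4):
    """Squared-loss gradient boosting: each round fits the current residuals."""
    base = y.mean()
    pred = np.full_like(y, base, dtype=float)
    learners = []
    for _ in range(n_rounds):
        f = fit_binned_learner(x, y - pred, bins)  # residuals = negative gradient
        pred += lr * f(x)
        learners.append(f)
    return lambda q: base + lr * sum(f(q) for f in learners)
```

Each round the learning rate shrinks the correction, so many small steps gradually drive the residuals toward zero; CatBoost's ordered boosting additionally fights the target leakage this naive scheme suffers from.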
Code DAO
Dec 17, 2021 · Artificial Intelligence

How to Accelerate XGBoost Training with Tree Methods, Cloud Computing, and Ray

The article explains why XGBoost training can be slow despite its speed focus and presents three acceleration techniques—choosing an optimal tree_method, leveraging cloud resources for larger memory, and using Ray for distributed training—complete with code examples and benchmark results.

Ray · XGBoost · cloud computing
0 likes · 5 min read
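For reference, tree_method is a single entry in the XGBoost parameter dict; the values below are the documented options in the 1.x releases the article targets:

```python
# Passed as the first argument to xgb.train(params, dtrain, ...)
params = {
    "objective": "reg:squarederror",
    "tree_method": "hist",        # histogram-based split finding; usually fastest on CPU
    # "tree_method": "exact",     # enumerate all split candidates; slow on large data
    # "tree_method": "approx",    # quantile-sketch approximation
    # "tree_method": "gpu_hist",  # histogram algorithm on the GPU (CUDA build required)
}
```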
Code DAO
Dec 17, 2021 · Artificial Intelligence

How to Scale XGBoost with Ray for Distributed Multi‑GPU Training

XGBoost‑Ray provides a fault‑tolerant, multi‑node, multi‑GPU backend for XGBoost that integrates seamlessly with Ray Tune and supports distributed data loading; with only three code changes, it scales training and inference across large clusters.

GPU · Ray · Ray Tune
0 likes · 8 min read
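The "three code changes" follow the pattern documented for the xgboost_ray package; the sketch below uses its public names (RayDMatrix, RayParams, train) from memory and is not runnable without a Ray cluster:

```python
# 1. Import train from xgboost_ray instead of xgboost:
#    - import xgboost as xgb
#    + from xgboost_ray import RayDMatrix, RayParams, train
#
# 2. Wrap the training data in a RayDMatrix instead of a DMatrix:
#    - dtrain = xgb.DMatrix(X, y)
#    + dtrain = RayDMatrix(X, y)
#
# 3. Pass RayParams to control the distributed actors (and GPUs per actor):
#    - bst = xgb.train(params, dtrain)
#    + bst = train(params, dtrain, ray_params=RayParams(num_actors=4, gpus_per_actor=1))
```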