Blog posts

2026

Simplify ModelOps with Amazon SageMaker AI Projects using Amazon S3-based templates

Managing ModelOps workflows can be complex and time-consuming. Amazon SageMaker AI Projects now offers an easier path with Amazon S3-based templates. With this new capability, you can store AWS CloudFormation templates directly in Amazon S3 and manage their entire lifecycle using familiar S3 features such as versioning, lifecycle policies, and cross-region replication.

Full text here, and GitHub repository here

2025

Track LLM model evaluation using Amazon SageMaker managed MLflow and FMEval

In this post, we show how to use FMEval and Amazon SageMaker to programmatically evaluate LLMs. FMEval is an open source LLM evaluation library, designed to provide data scientists and machine learning (ML) engineers with a code-first experience to evaluate LLMs for various aspects, including accuracy, toxicity, fairness, robustness, and efficiency.

Full text here, and GitHub repository here

2023

Secure MLflow in AWS: Fine-grained access control with AWS native services

MLflow and Amazon SageMaker are two of the many tools available to help data scientists implement end-to-end machine learning workloads. SageMaker can run these workloads entirely within its own ecosystem, which was designed to address challenges specific to the ML lifecycle. One of the great traits of the SageMaker ecosystem, however, is its flexibility and openness to integration with other tools. In this post, we show how you can securely integrate SageMaker with MLflow using native AWS services to enable access control on the open-source version of MLflow.

Full text here, and GitHub repository here

2022

Track your ML experiments end to end with Data Version Control and Amazon SageMaker Experiments

Data scientists often work towards understanding the effects of various data preprocessing and feature engineering strategies in combination with different model architectures and hyperparameters. Doing so requires you to cover large parameter spaces iteratively, and it can be overwhelming to keep track of previously run configurations and results while keeping experiments reproducible.

Full text here, and GitHub repository here