Track LLM model evaluation using Amazon SageMaker managed MLflow and FMEval

In this post, we show how to use FMEval and Amazon SageMaker to programmatically evaluate LLMs. FMEval is an open source LLM evaluation library designed to give data scientists and machine learning (ML) engineers a code-first experience for evaluating LLMs across dimensions such as accuracy, toxicity, fairness, robustness, and efficiency.

Full text here, and GitHub repository here.

We demonstrate how to combine FMEval with Amazon SageMaker managed MLflow to track and compare LLM evaluation results, enabling systematic model selection and governance for your generative AI workflows.