About me
Managing ModelOps workflows can be complex and time-consuming. Amazon SageMaker AI Projects now offers an easier path with Amazon S3-based templates. With this new capability, you can store AWS CloudFormation templates directly in Amazon S3 and manage their entire lifecycle using familiar S3 features such as versioning, lifecycle policies, and cross-region replication.
In this post, we show how to use FMEval and Amazon SageMaker to programmatically evaluate LLMs. FMEval is an open source LLM evaluation library, designed to provide data scientists and machine learning (ML) engineers with a code-first experience to evaluate LLMs for various aspects, including accuracy, toxicity, fairness, robustness, and efficiency.
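To give a feel for the code-first evaluation pattern described above, here is a minimal stdlib-only sketch. The helper names and the keyword-based toxicity proxy are hypothetical stand-ins, not the FMEval API; the real library ships ready-made eval algorithms for accuracy, toxicity, and the other aspects listed.

```python
# Hypothetical stand-ins for a code-first LLM evaluation loop; the real FMEval
# library provides these as built-in eval algorithms.

def exact_match_accuracy(samples):
    """Fraction of samples whose model output matches the target exactly."""
    hits = sum(1 for s in samples
               if s["model_output"].strip() == s["target_output"].strip())
    return hits / len(samples)

def keyword_toxicity(samples, toxic_words=("idiot", "stupid")):
    """Crude toxicity proxy: share of outputs containing a flagged word."""
    flagged = sum(1 for s in samples
                  if any(w in s["model_output"].lower() for w in toxic_words))
    return flagged / len(samples)

samples = [
    {"target_output": "Paris", "model_output": "Paris"},
    {"target_output": "Berlin", "model_output": "Rome"},
]
print(exact_match_accuracy(samples))  # 0.5
print(keyword_toxicity(samples))      # 0.0
```

The point of the pattern is that each metric is an ordinary function over a dataset of records, so evaluations can be scripted, versioned, and run inside a SageMaker job like any other code.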
MLflow and Amazon SageMaker are two of the many tools on the market that help data scientists implement end-to-end machine learning workloads. SageMaker can run these workloads entirely within its own ecosystem, as it was designed to address common challenges specific to the ML lifecycle. Nonetheless, one of the great traits of the SageMaker ecosystem is its flexibility and openness to integration with other tools. In this post, we show how you can securely integrate SageMaker with MLflow using native AWS services to enable access control on the open source version of MLflow.
Building a machine learning (ML) model is an iterative process: you keep refining until you find a candidate model that performs well and is ready to deploy. As data scientists iterate through that process, they need a reliable method to easily track experiments to understand how each model version was built and how it performed.
Data scientists often work towards understanding the effects of various data preprocessing and feature engineering strategies in combination with different model architectures and hyperparameters. Doing so requires you to cover large parameter spaces iteratively, and it can be overwhelming to keep track of previously run configurations and results while keeping experiments reproducible.
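One common way to keep a large parameter sweep tractable and reproducible is to key each run by a stable identifier derived from its configuration. The sketch below is a hypothetical illustration of that idea using only the standard library; the grid values and the placeholder training function are invented for the example.

```python
import itertools, json, hashlib

# Hypothetical sketch: enumerate a hyperparameter grid and key each run by a
# stable hash of its configuration, so repeated configurations are never
# re-run and every result can be traced back to the exact settings used.
grid = {
    "learning_rate": [0.01, 0.1],
    "max_depth": [3, 5],
}

def config_id(cfg):
    """Stable, order-independent identifier for a configuration."""
    return hashlib.sha256(json.dumps(cfg, sort_keys=True).encode()).hexdigest()[:12]

def run_experiment(cfg):
    # Placeholder for actual training; returns a fabricated metric.
    return {"accuracy": 0.8 + 0.01 * cfg["max_depth"] - cfg["learning_rate"]}

results = {}
for values in itertools.product(*grid.values()):
    cfg = dict(zip(grid.keys(), values))
    cid = config_id(cfg)
    if cid not in results:  # skip configurations already tracked
        results[cid] = {"config": cfg, "metrics": run_experiment(cfg)}

print(len(results))  # 4 distinct configurations
```

Tools like SageMaker Experiments automate this bookkeeping (and persist it beyond a single process), but the underlying principle is the same: every metric is stored alongside the full configuration that produced it.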
J.C. O'Sullivan, P. Di Francesco, U.K. Anyanwu, L.A. DaSilva, and A.B. MacKenzie. 2011. "Multi-hop MAC Implementations for Affordable SDR Hardware." IEEE DySPAN, Aachen, Germany, pp. 632-636.
Y. Xiao, Y. Chau, P. Di Francesco, and L.A. DaSilva. 2013. "Dynamic Spectrum Scheduling for Carrier Aggregation: A Game Theoretic Approach." IEEE ICC, Budapest, Hungary, pp. 2672-2676.
L.A. DaSilva, J. Kibiłda, P. Di Francesco, T.K. Forde, and L.E. Doyle. 2013. "Customized Services over Virtual Wireless Networks: The Path towards Networks without Borders." Future Network and MobileSummit, Lisbon, Portugal.
P. Di Francesco, S. McGettrick, U.K. Anyanwu, J.C. O'Sullivan, A.B. MacKenzie, and L.A. DaSilva. 2013. "A Split Architecture for Random Access MAC for SDR Platforms." CROWNCOM, Washington, DC.
A. Puschmann, P. Di Francesco, M.A. Kalil, L.A. DaSilva, and A. Mitschele-Thiel. 2013. "Enhancing the Performance of Random Access MAC Protocols for Low-cost SDRs." ACM WiNTECH, Miami, FL.
P. Di Francesco, F. Malandrino, and L.A. DaSilva. 2014. "Mobile Network Sharing Between Operators: A Demand Trace-Driven Study." ACM SIGCOMM CSWS, Chicago, IL, pp. 39-44.
P. Di Francesco, S. McGettrick, U.K. Anyanwu, A.B. MacKenzie, and L.A. DaSilva. 2015. "A Split MAC Approach for SDR Platforms." IEEE Transactions on Computers 64(4): 912-924. doi:10.1109/TC.2014.2308197
J. Kibiłda, P. Di Francesco, F. Malandrino, and L.A. DaSilva. 2015. "Infrastructure and Spectrum Sharing Trade-offs in Mobile Networks." IEEE DySPAN, Stockholm, Sweden.
P. Di Francesco, F. Malandrino, T. Forde, and L.A. DaSilva. 2018. "A Sharing and Competition Aware Framework for Cellular Network Evolution Planning." IEEE Transactions on Cognitive Communications and Networking 1(4): 464-470. doi:10.1109/TCCN.2017.2663060
P. Di Francesco, F. Malandrino, and L.A. DaSilva. 2017. "Sensitivity Analysis on Service-Driven Network Planning." IEEE/ACM Transactions on Networking 25(3): 1417-1430. doi:10.1109/TNET.2016.2633417
P. Di Francesco, F. Malandrino, and L.A. DaSilva. 2018. "Assembling and Using a Cellular Dataset for Mobile Network Analysis and Planning." IEEE Transactions on Big Data 4(4): 614-620.
This talk presented a comprehensive overview of MLOps on AWS, covering the journey from experimental notebooks to production-ready ML systems using Amazon SageMaker. Starting from the premise that ML code is only a small fraction of a real-world ML system, the session walked through an MLOps maturity framework across four phases — Initial, Repeatable, Reliable, and Scalable — mapping each to specific AWS services and capabilities. Topics included SageMaker Studio for experimentation, SageMaker Experiments for tracking, SageMaker Pipelines for workflow automation, Model Registry for versioning and promotion, SageMaker Projects for one-click CI/CD provisioning, shadow testing and deployment guardrails, Model Monitor for drift detection, and Model Cards and Dashboard for governance. The talk also covered team structures, multi-account strategies, and custom project templates for enterprise-scale MLOps.
Co-presented with Ankit Anand and Matt Nightingale, this session explored the challenges of training foundation models at scale and how Amazon SageMaker HyperPod addresses them. The talk covered the generative AI landscape and the growing computational demands of FM development, from prompt engineering and RAG to full pre-training. We introduced SageMaker HyperPod as a resilient, performant, and customizable environment for large-scale distributed training — featuring self-healing clusters that automatically detect hardware failures, replace faulty instances, and resume training jobs from checkpoints, reducing training time by up to 20%. The session went under the hood of HyperPod, covering cluster architecture, instance groups, lifecycle scripts, Elastic Fabric Adapter (EFA) for high-speed inter-node communication, distributed training software stacks for both GPU and Trainium, and job scheduling with auto-healing. Customer stories from Stability AI, Perplexity AI, and Hugging Face illustrated real-world benefits.
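HyperPod's auto-healing is a managed capability, but the checkpoint-resume semantics it relies on can be sketched in a few lines. The following is an illustrative stdlib-only toy, not HyperPod's actual implementation: a training loop that checkpoints its progress and, after a simulated node failure, resumes from the last checkpoint instead of restarting from step zero.

```python
import json, os, tempfile

# Illustrative toy (not HyperPod's implementation): checkpoint every step,
# and on restart resume from the last saved step rather than from scratch.
def train(total_steps, ckpt_path, fail_at=None):
    step = 0
    if os.path.exists(ckpt_path):  # resume if a checkpoint exists
        with open(ckpt_path) as f:
            step = json.load(f)["step"]
    while step < total_steps:
        step += 1  # stand-in for one training step
        with open(ckpt_path, "w") as f:
            json.dump({"step": step}, f)  # persist progress
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated hardware failure")
    return step

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
try:
    train(10, ckpt, fail_at=4)  # first attempt dies at step 4
except RuntimeError:
    pass
resumed = train(10, ckpt)       # restart picks up at step 4, not step 0
print(resumed)  # 10
```

In a real distributed run the expensive part is the work between checkpoints, which is why automatic failure detection and instance replacement, combined with resume-from-checkpoint, translate into the training-time savings the session described.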