Machine Learning Engineer

Remote
Full Time
Artificial Intelligence
Experienced

Summary 

As a Machine Learning Engineer at Reflex Media, you will bridge the critical gap between experimental AI models and scalable, production-grade systems. You will be responsible for taking AI models and inference code developed by our data scientists and deploying them into a rock-solid, high-availability production environment.

We are an AI-forward organization looking for a builder with an "owner" mindset. You will not only execute on current infrastructure needs using the full AWS suite but also drive our future architecture by implementing advanced SageMaker features and AgentCore. You will serve as the connective tissue between our AI researchers and our application development teams, instilling DevOps culture and software engineering best practices into the machine learning lifecycle.


Key Responsibilities

Production Deployment & Architecture

  • Model Productionization: Take ownership of deploying AI models and inference code to production. Ensure deployments are seamless, and monitor latency, throughput, and error rates.
  • Infrastructure Architecture: Design, architect, and maintain a rock-solid, scalable infrastructure. You are responsible for ensuring the system is easy to maintain, cost-efficient, and capable of handling varying loads without downtime.
  • AWS Ecosystem Management: Utilize deep expertise in AWS to architect solutions. You must have hands-on experience configuring and managing SageMaker, Lambda, ECR, S3, SQS, Redshift (SQL), CloudWatch, and SNS.
  • Networking & Routing: Manage application networking and routing services (ALB, API Gateway, VPC configurations) to ensure secure and efficient communication between inference endpoints and the core application.

MLOps & CI/CD

  • DevOps Integration: Instill and refine DevOps culture within the AI wing. Design and implement robust CI/CD pipelines for machine learning (MLOps) to automate retraining, testing, and deployment processes.
  • Monitoring & Alerting: Leverage CloudWatch and custom tools to ensure that model drift, infrastructure health issues, and inference errors are detected and resolved proactively.

Collaboration & Innovation

  • Forward-Looking Innovation: Evaluate and implement emerging technologies to keep us ahead of the curve, with a specific focus on advanced SageMaker features and AWS AgentCore for agentic workflows.
  • Cross-Functional Development: Work directly with the Application Development team to clarify technical details, API contracts, and integration points.
  • Best Practices Consultation: Consult with the AI/Data Science team to refine software development best practices, including code modularity, version control, and testing standards, ensuring "research code" is transformed into "production code."

Required Skills & Experience

  • Education: Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field.
  • Production ML Experience: 3+ years of experience specifically in deploying Machine Learning models to high-traffic production environments.
  • AWS Mastery: Deep, practical experience with the AWS stack. You must be comfortable working with SageMaker (endpoint configuration, training jobs), Lambda (serverless inference), ECR (container management), and Redshift (data warehousing).
  • Networking Knowledge: Strong understanding of application networking, including Load Balancers (ALB/NLB), routing, DNS, and security groups.
  • DevOps & MLOps: Proven track record of building CI/CD pipelines (GitHub Actions, Bitbucket Pipelines, Jenkins, or AWS CodePipeline) and managing Infrastructure as Code (Terraform or CloudFormation).
  • Software Engineering Foundation: Strong proficiency in Python. You write clean, maintainable, and tested code. You have a firm working knowledge of the SDLC and strong Git skills.
  • Communication: Ability to translate complex technical constraints to data scientists and product managers alike.

Nice to Have

  • Education: A Master's or Ph.D. is preferred.
  • Advanced AWS Features: Prior experience with Bedrock, SageMaker JumpStart, or specifically AgentCore.
  • Container Orchestration: Advanced experience with Kubernetes (EKS) for managing complex ML workloads.
  • AI-Assisted Execution: Practical experience using AI tools (for discovery, content, or workflow acceleration).
  • Industry-Specific Experience: Experience improving or expanding dating, social networking, or social media platforms.
  • Certifications: AWS Certified Machine Learning – Specialty or AWS Certified DevOps Engineer – Professional.