Job Description
ML Engineer — Video & Audio → Text Event Detection
Location: Remote
Level: Mid to Senior
Reports to: Engineering Lead / Head of ML
For this role, we are hiring only candidates based in Pakistan.
About the Company
We are an early-stage company building machine learning–powered visibility for time-sensitive, high-stakes environments. Our platform uses video and audio from fixed cameras to detect structured workflow events, enabling real-time coordination and insights, while maintaining a strong focus on privacy and compliance.
⚠️ Important Requirement (Please Read Before Applying)
We are specifically looking for professionals who can demonstrate and present their work.
All candidates must be able to:
- Showcase real-world projects or systems they have built or contributed to
- Clearly explain their role, decisions, and impact
- Demonstrate high professional standards in their current or previous positions
Applications without demonstrable work or the ability to present it will not be considered.
Role Summary
As an ML Engineer, you will design, build, and improve systems that detect structured events from video and audio streams in controlled environments. You will work across computer vision, speech-to-text pipelines, and multimodal ML systems.
Key Responsibilities
Event Detection Pipeline
- Build and optimize object detection systems (e.g., YOLO-based models)
- Develop temporal models (e.g., transformer-based) for event classification
- Optimize inference for edge (e.g., Jetson) and cloud environments
Audio-Based Event Detection
- Implement speech-to-text pipelines (e.g., Whisper)
- Detect protocol or safety-related events using keyword/phrase recognition
- Ensure anonymization and timestamp accuracy for downstream use
Multimodal Fusion
- Combine video and audio signals for improved detection accuracy
- Define fusion strategies and confidence calibration
Training & Evaluation
- Design annotation strategies and leverage active learning
- Define and track key metrics (accuracy, F1, false positives, temporal precision)
Model Lifecycle
- Manage model versioning, training, and deployment
- Support A/B testing, monitoring, and rollback strategies
Documentation
- Maintain clear documentation for models, experiments, and design decisions
Technical Skills Required
- Bachelor’s degree (or equivalent experience) in a relevant technical field
- 3+ years of hands-on experience in at least two of the following:
  - Computer vision (object detection, tracking, activity recognition)
  - Speech recognition or NLP for event detection
  - Multimodal ML systems
- Strong Python skills and experience with PyTorch (or similar frameworks)
- Experience with inference optimization (TensorRT, ONNX, quantization)
- Experience building and evaluating ML training pipelines
- Ability to work from structured requirements and iterate with stakeholders
- Strong communication skills in a collaborative, remote environment
Nice to Have
- Experience in healthcare or other high-stakes, real-time systems
- Familiarity with edge deployment (e.g., NVIDIA Jetson) and/or cloud ML (e.g., AWS)
- Experience with privacy-aware ML and data handling
- Knowledge of multi-object tracking (e.g., ByteTrack, BoT-SORT)
- Experience with Whisper-based pipelines and voice activity detection
- Exposure to clinical or regulated environments
- Experience with structured workflows and event sequencing
- Interest in explainability and confidence calibration
- Experience working in distributed, remote teams
Benefits & Perks
- Fully remote, globally distributed team
- Opportunity to own and shape a core ML pipeline
- Work on meaningful, real-world ML applications in high-impact environments
- Collaborative and fast-moving engineering culture
- Competitive compensation, benefits, and equity (based on experience)