Staff ML Engineer: Autonomous Agent Architect (City of Sydney)
City of Sydney, Australia
Posted: less than a week ago
Description
Staff ML Engineer: Agentic AI
Team: AI Agents | Location: Melbourne / Sydney / Remote (AU)
What we have built
We run production AI agents that autonomously resolve customer service tickets across 100,000+ Zendesk accounts. These agents take a customer issue, plan a multi-step resolution, execute real actions (refunds, order modifications, escalations) through live APIs, and close the ticket without a human in the loop.
The agent core uses a proprietary iterative architecture: the agent decomposes goals into plans, pulls reusable skills from a registry, executes, evaluates the outcome, and refines. Each iteration feeds back into the next attempt. We have a working self‑learning mechanism where successful resolution patterns are synthesized into new skills and fed back into the registry, so the system improves from its own execution history.
On multi‑step tool‑use benchmarks (GAIA‑class), our agents perform at parity with the best published results. Our internal evaluation suite runs 158+ scenario‑based tests from real Zendesk tickets, scored continuously through Braintrust with regression detection on every deploy.
What we need help with
Pushing the architecture further. The iterative planner works, but there are open questions we have not solved yet: how to handle plan decomposition when the goal is ambiguous, how to manage interference between memory tiers under concurrent sessions, how to make skill acquisition more selective (the agent acquires skills too eagerly today), and how to design multi‑agent delegation patterns where one agent hands off subtasks to specialized agents via A2A (the Agent‑to‑Agent protocol).
Domain‑specialized agent models. We are building toward training our own models, specialized for customer service resolution via RL on production trajectories. The data pipeline is already being instrumented (resolution outcomes, escalation patterns, user satisfaction signals). The next step is the RL training infrastructure itself: reward curricula, rollout systems, and the feedback loops that turn a capable base model into a specialist that matches or beats frontier models on our task distribution at significantly lower inference cost. This is a 6‑12 month build, and we need someone who can own both the science and the systems.
Hardening evaluation. We run 158+ scenario evals continuously with regression detection, but multi‑turn evaluation and automated trajectory analysis (pinpointing where reasoning diverged) are still early. We need quality gates that block deploys when agent performance drops, and we need them integrated into CI, not run as an afterthought.
Guardrails at enterprise scale. The threat surface for autonomous agents includes tool misuse, cascading action chains, prompt injection, and hallucination loops that burn tokens before anyone notices. We need multi‑layered defenses with supervisor patterns, capabilities‑based access control, and output validation that works across thousands of concurrent sessions without adding meaningful latency.
What we are looking for
- 5+ years building production ML/AI systems, with hands‑on experience in agent architectures (planning, tool dispatch, memory, error recovery). If you have only used LangChain tutorials, this is not the right fit.
- Strong evaluation instincts. You understand why public benchmarks diverge from production performance and you have built internal evals to close that gap.
- OPTIONAL: Experience with or genuine depth in RL for language models: reward shaping, online/offline tradeoffs, reward hacking as a diagnostic signal. We are building toward domain‑specialized training and need someone who can lead that work.
- Python and PyTorch fluency. Familiarity with at least one agent framework, combined with the judgment to know when to build custom.
Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace. Individuals seeking employment and employees at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, disability, military or veteran status, or any other characteristic protected by applicable law. We are an AA/EEO/Veterans/Disabled employer. If you are based in the United States and would like more information about your EEO rights under the law, please click here.
Zendesk endeavors to make reasonable accommodations for applicants with disabilities and disabled veterans pursuant to applicable federal and state law. If you are an individual with a disability and require a reasonable accommodation to submit this application, complete any pre‑employment testing, or otherwise participate in the employee selection process, please send an e‑mail to with your accommodation request.
Highlights
- Company name: Zendesk
- Job position: Staff ML Engineer: Autonomous Agent Architect (City of Sydney)