Staff ML Engineer: Autonomous Agent Architect (City of Sydney)

City of Sydney, Australia
Posted: less than a week ago

Description

Staff ML Engineer: Agentic AI
Team: AI Agents | Location: Melbourne / Sydney / Remote (AU)

What we have built

We run production AI agents that autonomously resolve customer service tickets across 100,000+ Zendesk accounts. These agents take a customer issue, plan a multi‑step resolution, execute real actions (refunds, order modifications, escalations) through live APIs, and close the ticket without a human in the loop.

The agent core uses a proprietary iterative architecture: the agent decomposes goals into plans, pulls reusable skills from a registry, executes, evaluates the outcome, and refines. Each iteration feeds back into the next attempt. We have a working self‑learning mechanism where successful resolution patterns are synthesized into new skills and fed back into the registry, so the system improves from its own execution history.

On multi‑step tool‑use benchmarks (GAIA‑class), our agents perform at parity with the best published results. Our internal evaluation suite runs 158+ scenario‑based tests from real Zendesk tickets, scored continuously through Braintrust with regression detection on every deploy.

What we need help with

Pushing the architecture further. The iterative planner works, but there are open questions we have not solved yet: how to handle plan decomposition when the goal is ambiguous, how to manage interference between memory tiers under concurrent sessions, how to make skill acquisition more selective (the agent acquires skills too eagerly today), and how to design multi‑agent delegation patterns where one agent hands off subtasks to specialized agents via A2A (the Agent‑to‑Agent protocol).

Domain‑specialized agent models. We are building toward training our own models, specialized for customer service resolution via RL on production trajectories. The data pipeline is already being instrumented (resolution outcomes, escalation patterns, user satisfaction signals). The next step is the RL training infrastructure itself: reward curricula, rollout systems, and the feedback loops that turn a capable base model into a specialist that matches or beats frontier models on our task distribution at significantly lower inference cost. This is a 6‑12 month build, and we need someone who can own both the science and the systems.

Hardening evaluation. We run 158+ scenario evals continuously with regression detection, but multi‑turn evaluation and automated trajectory analysis (pinpointing where reasoning diverged) are still early. We need quality gates that block deploys when agent performance drops, and we need them integrated into CI, not run as an afterthought.

Guardrails at enterprise scale. The threat surface for autonomous agents includes tool misuse, cascading action chains, prompt injection, and hallucination loops that burn tokens before anyone notices. We need multi‑layered defenses with supervisor patterns, capabilities‑based access control, and output validation that works across thousands of concurrent sessions without adding meaningful latency.

What we are looking for

- 5+ years building production ML/AI systems, with hands‑on experience in agent architectures (planning, tool dispatch, memory, error recovery). If you have only used LangChain tutorials, this is not the right fit.
- Strong evaluation instincts. You understand why public benchmarks diverge from production performance and you have built internal evals to close that gap.
- OPTIONAL: Experience with or genuine depth in RL for language models: reward shaping, online/offline tradeoffs, reward hacking as a diagnostic signal. We are building toward domain‑specialized training and need someone who can lead that work.
- Python and PyTorch fluency. Familiarity with at least one agent framework, combined with the judgment to know when to build custom.

Zendesk is an equal opportunity employer, and we’re proud of our ongoing efforts to foster global diversity, equity, & inclusion in the workplace. Individuals seeking employment and employees at Zendesk are considered without regard to race, color, religion, national origin, age, sex, gender, gender identity, gender expression, sexual orientation, marital status, medical condition, ancestry, disability, military or veteran status, or any other characteristic protected by applicable law. We are an AA/EEO/Veterans/Disabled employer. If you are based in the United States and would like more information about your EEO rights under the law, please click here.

Zendesk endeavors to make reasonable accommodations for applicants with disabilities and disabled veterans pursuant to applicable federal and state law. If you are an individual with a disability and require a reasonable accommodation to submit this application, complete any pre‑employment testing, or otherwise participate in the employee selection process, please send an e‑mail to with your accommodation request.

Apply on Kit Job: kitjobau.com/job/3rscjf

Highlights

Company name: Zendesk
Job position: Staff ML Engineer: Autonomous Agent Architect (City of Sydney)
Ad ID: 8814480175