Papers, patents, and research outputs.
Synced with the current Google Scholar profile, with public paper and patent links where available.
Compares UI-grounded and API-grounded agent control surfaces across reliability, robustness, security, latency, auditability, coverage, cost, and human-in-the-loop design.
Frames modern AI design around a task-system unification axis, showing why many production systems combine task-level end-to-end learning with system-level modular control.
Defines tools, skills, sub-agents, and agents by control characteristics, with practical orchestration guidance for production agentic systems.
Separates memory, learning, and personalization into explicit sub-agent responsibilities to improve adaptation and controllability in long-horizon agent systems.
Argues for treating specifications and contracts as the primary artifact while implementation is generated, checked, and refined against formal intent.
Describes a post-ASR correction model that flags likely misrecognitions, selects a corrected interpretation, and routes it to response generation.
Provides a design framework for choosing between agent-centric and environment-centric intelligence, with emphasis on reliability and structured task settings.
Makes the case for speech-native architectures and audio-first data distributions as voice interaction becomes habitual rather than exceptional.
Connects Unix's file abstraction to agentic AI, proposing file/code-centric interfaces for composability, auditability, and robust tool use.
Synthesizes hiring and organizational signals to argue that AI research and engineering roles increasingly overlap along a practical capability continuum.
Introduces a professional-speech ASR benchmark focused on entity-critical errors across finance, medicine, legal, and technical domains.
Covers an LLM-guided correction workflow in which a prompt encodes a multi-step tool plan and the tool outputs are used to update user-provided data.
Evaluates multi-turn agents by whether the user's goal is achieved, using teacher models and a taxonomy of failure causes.
Introduces a routing engine that selects models per request using user-defined tradeoffs across accuracy, latency, cost, and ethics.
Python framework for simulating role-based dialogues and generating structured conversational data.
Explores LLM-agent behavior for improving automated ASR transcription workflows by imitating human correction patterns.
Covers techniques for generating candidate utterances from intent and target data, then evaluating them for downstream voice/NLU use.
Amazon Machine Learning Conference work on using web-shopping context to improve voice-shopping ASR.
Amazon Machine Learning Conference work on retrieval-and-ranking-based ASR correction for long-tail domains.
Amazon Machine Learning Conference work on identifying product affinity signals in webpages for contextual advertising.