Alibaba's HDPO framework trains AI agents to skip unnecessary tool calls, cutting redundant invocations from 98% to 2% while ...
By putting the weights of a highly capable, 33B-parameter agentic model in the hands of researchers and startups, Poolside is ...