AI Research Highlights | Week 43, 2023

1. Theory of Mind for Multi-Agent Collaboration via Large Language Models

Source: https://arxiv.org/abs/2310.10701

Researchers evaluated LLM-based agents in a multi-agent cooperative text game with Theory of Mind (ToM) inference tasks, comparing their performance with Multi-Agent Reinforcement Learning (MARL) and planning-based baselines. The results demonstrate that LLM-based agents can handle complex multi-agent collaborative tasks at a level comparable to the state-of-the-art RL algorithm. Emergent collaborative behaviors and high-order ToM capabilities were also observed among those agents.

2. Large Language Model Unlearning

Source: https://arxiv.org/abs/2310.10683

Scientists from ByteDance Research presented three scenarios in which aligning LLMs with human preferences can benefit from unlearning: (1) removing harmful responses, (2) erasing copyright-protected content on request, and (3) eliminating hallucinations. An ablation study shows that unlearning can still achieve better alignment performance than RLHF with only a fraction of its computational time. This is pioneering work in the field of LLM unlearning.

3. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Source: https://arxiv.org/abs/2310.11511

This paper introduced Self-Reflective Retrieval-Augmented Generation (Self-RAG), a new framework that enhances an LM’s quality and factuality through retrieval and self-reflection, outperforming ChatGPT and retrieval-augmented Llama2-chat on six tasks.

The project can be found here.

4. OpenAgents: An Open Platform for Language Agents in the Wild

Source: https://arxiv.org/abs/2310.10634

OpenAgents is an open, in-the-wild agent platform that currently offers three agents: (1) Data Agent for data analysis with Python/SQL and data tools; (2) Plugins Agent with 200+ daily API tools; (3) Web Agent for autonomous web browsing. The project is open-source: Code, Demos, Docs.

5. Eliminating Reasoning via Inferring with Planning: A New Framework to Guide LLMs’ Non-linear Thinking

Source: https://arxiv.org/abs/2310.12342

Researchers proposed Inferential Exclusion Prompting (IEP), a novel prompting method that combines the principles of elimination and inference to guide LLMs to think non-linearly. Combining IEP with chain-of-thought (CoT) prompting equips LLMs with more complete logic processes. The researchers also introduced the Mental Ability Reasoning Benchmark (MARB), a benchmark comprising 6 subtasks with 9,115 questions, which will be made available at an anonymous link soon.

*The corresponding author, Dr. Shang, is also a science consultant at MindOS, providing scientific guidance for the agent’s abilities.

6. AgentTuning: Enabling Generalized Agent Abilities for LLMs

Source: https://arxiv.org/abs/2310.12823

Researchers from Tsinghua University proposed AgentTuning, enhancing agent abilities of LLMs while maintaining their general capabilities. They also provided a lightweight instruction-tuning dataset AgentInstruct to be combined with open-source instructions from general domains, leading to a hybrid strategy. AgentLM-70B, based on the Llama2 series with AgentTuning, is comparable to GPT-3.5-turbo on unseen agent tasks. You can find the project here.
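The hybrid strategy mentioned above, mixing agent-specific instruction data with general-domain instructions, can be sketched roughly as follows. This is a minimal illustration rather than the authors' code: the function name `mix_instruction_data`, the `agent_ratio` parameter, and the toy examples are all hypothetical, and the actual mixing ratio used in the paper is not stated here.

```python
import random


def mix_instruction_data(agent_examples, general_examples, agent_ratio, n_samples, seed=0):
    """Build a training mix from two instruction pools.

    Each sample is drawn from the agent pool with probability `agent_ratio`
    and from the general pool otherwise. `agent_ratio` is a hypothetical
    knob for illustration, not a value taken from the paper.
    """
    if not 0.0 <= agent_ratio <= 1.0:
        raise ValueError("agent_ratio must be between 0 and 1")
    rng = random.Random(seed)
    mixed = []
    for _ in range(n_samples):
        pool = agent_examples if rng.random() < agent_ratio else general_examples
        mixed.append(rng.choice(pool))
    return mixed


# Toy pools standing in for AgentInstruct-style and general-domain data.
agent_pool = [{"source": "agent", "text": "Navigate to the checkout page."}]
general_pool = [{"source": "general", "text": "Summarize this paragraph."}]

batch = mix_instruction_data(agent_pool, general_pool, agent_ratio=0.5, n_samples=8)
```

Sampling per example keeps the sketch simple; a real pipeline might instead fix the exact number of examples taken from each pool.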

7. DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning

Source: https://arxiv.org/abs/2310.12128

Researchers from UNC Chapel Hill presented DiagrammerGPT, a novel two-stage text-to-diagram (T2D) generation framework. The first stage is Diagram Planning, where LLMs are used to generate and iteratively refine ‘diagram plans’ in a planner-auditor feedback loop. In the second stage, a diagram generator, DiagramGLIGEN, and a text label rendering module generate diagrams that follow the diagram plans. The researchers also introduced AI2D-Caption, a densely annotated diagram dataset, to benchmark the T2D generation task. The results showed that DiagrammerGPT outperformed existing text-to-image (T2I) models. The project can be found here.
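The planner-auditor feedback loop in the first stage can be pictured with a small sketch like the one below. It only illustrates the control flow under stated assumptions: `refine_plan`, `toy_planner`, `toy_auditor`, the plan format, and the `max_rounds` cap are hypothetical stand-ins, whereas in the paper both roles are played by LLMs.

```python
def refine_plan(prompt, planner, auditor, max_rounds=3):
    """Ask the planner for a plan, then revise it until the auditor has no complaints."""
    plan = planner(prompt, None, [])
    for _ in range(max_rounds):
        feedback = auditor(prompt, plan)
        if not feedback:  # auditor found nothing to fix
            break
        plan = planner(prompt, plan, feedback)
    return plan


def toy_planner(prompt, previous_plan, feedback):
    """Hypothetical planner: one plan item per comma-separated entity in the prompt."""
    entities = [e.strip() for e in prompt.split(",")]
    # Only add text labels once the auditor has asked for them.
    add_labels = any("label" in note for note in feedback)
    return [{"entity": e, "label": e if add_labels else None} for e in entities]


def toy_auditor(prompt, plan):
    """Hypothetical auditor: flag every plan item that lacks a text label."""
    return [f"missing label for {item['entity']}" for item in plan if item["label"] is None]


final_plan = refine_plan("sun, earth", toy_planner, toy_auditor)
```

Capping the number of rounds is a design choice made here so the loop always terminates, even if the auditor keeps raising issues.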

*The researchers behind the publications deserve full credit for their work.
