AI Research Highlights | Week 45, 2023

1. ChatCoder: Chat-based Refine Requirement Improves LLMs’ Code Generation

Source: https://arxiv.org/abs/2311.00272

PKU researchers proposed ChatCoder, a method for refining requirements by chatting with LLMs. They designed a chat scheme in which the LLM guides human users to express their requirements more precisely, unambiguously, and completely than before. According to the authors, experiments show that ChatCoder improves existing LLMs’ code generation performance by a large margin.

2. Small Language Models Fine-tuned to Coordinate Larger Language Models improve Complex Reasoning

Source: https://arxiv.org/abs/2310.18338

This paper introduced DaSLaM (Decomposition And Solution LAnguage Models), which uses a decomposition generator to break complex problems into subproblems that require fewer reasoning steps. These subproblems are then answered by a solver. The evaluation shows that, with this method, a 175-billion-parameter LM (text-davinci-003) can perform competitively with, or even better than, its orders-of-magnitude larger successor, GPT-4.

3. DCQA: Document-Level Chart Question Answering towards Complex Reasoning and Common-Sense Understanding

Source: https://arxiv.org/abs/2310.18983

The researchers presented DCQA, an extensive document-level chart question answering dataset that covers a wide range of chart styles and includes question-answer pairs requiring complex reasoning and common-sense knowledge. They framed chart question answering as a document-level task and proposed TOT-Doctor, a transformer-based, OCR-free model, to address it.

4. Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents

Source: https://arxiv.org/abs/2310.19923

Researchers introduced Jina Embeddings v2, a novel embedding model based on a modified BERT architecture. The model drops positional embeddings and instead uses bi-directional ALiBi slopes to capture positional information. The result is a new suite of open-source embedding models capable of encoding texts of up to 8192 tokens, a 16x increase in maximum sequence length over leading open-source embedding models.
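
To make the positional mechanism concrete, here is a minimal sketch of a symmetric (bi-directional) ALiBi bias. It assumes the standard ALiBi slope schedule for a power-of-two head count and a penalty proportional to the absolute distance between tokens; the function names are illustrative, and the paper's exact configuration may differ.

```python
def alibi_slopes(num_heads):
    """Geometric slope schedule from the original ALiBi recipe.

    Assumes num_heads is a power of two; head h (1-indexed) gets
    slope 2 ** (-8 * h / num_heads).
    """
    return [2.0 ** (-8.0 * h / num_heads) for h in range(1, num_heads + 1)]


def bidirectional_alibi_bias(seq_len, slope):
    """Symmetric bias matrix with bias[i][j] = -slope * |i - j|.

    The bias is added to the attention logits, so distant tokens are
    penalized equally in both directions and no learned positional
    embedding is needed.
    """
    return [[slope * -abs(i - j) for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    slopes = alibi_slopes(8)
    for row in bidirectional_alibi_bias(4, slopes[0]):
        print(row)
```

Because the bias depends only on the distance between tokens, the same rule can be applied to sequences longer than those seen in training, which is the property that makes long-context encoding plausible.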

5. Unlearn What You Want to Forget: Efficient Unlearning for LLMs

Source: https://arxiv.org/abs/2310.20150

In this work, researchers proposed EUL (Efficient Unlearning method for LLMs), which removes user-requested data from a model by training lightweight unlearning layers with a selective teacher-student objective. They further introduced a fusion mechanism that merges different unlearning layers into one unified layer, so that a sequence of unlearning requests can be handled dynamically. Experiments in different settings demonstrated the effectiveness of EUL compared to state-of-the-art baselines.

6. AI Alignment: A Comprehensive Survey

Source: https://arxiv.org/abs/2310.19852

In this paper, the researchers identified the RICE principles as the key objectives of AI alignment: Robustness, Interpretability, Controllability, and Ethicality. Guided by these four principles, they outlined the landscape of current alignment research and decomposed it into two key components: forward alignment and backward alignment. The former aims to make AI systems aligned via alignment training, while the latter aims to gain evidence about the systems’ alignment and govern them appropriately to avoid exacerbating misalignment risks. The authors also provide a project page for the survey.

7. ControlLLM: Augment Language Models with Tools by Searching on Graphs

Source: https://arxiv.org/abs/2310.17796

This paper presented ControlLLM, a novel framework that enables LLMs to use multi-modal tools for solving complex real-world tasks. The framework comprises three key components: (1) a task decomposer that breaks down a complex task into clear subtasks with well-defined inputs and outputs; (2) a Thoughts-on-Graph (ToG) paradigm that searches for the optimal solution path on a pre-built tool-resource graph, which specifies the parameter and dependency relations among different tools; and (3) an execution engine with a rich toolbox that interprets the solution path and runs the tools efficiently on different computational devices. The authors state that the code will be released soon.
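
The idea of searching a tool-resource graph for a solution path can be illustrated with a small sketch. This is not the paper's ToG algorithm: it swaps in a plain breadth-first search that returns the shortest tool sequence, and the toolbox, tool names, and resource types below are all hypothetical.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Hypothetical toolbox: tool name -> (required input resource types, produced resource type).
TOOLS: Dict[str, Tuple[Set[str], str]] = {
    "image_captioner": ({"image"}, "text"),
    "text_to_speech": ({"text"}, "audio"),
    "object_detector": ({"image"}, "boxes"),
    "image_cropper": ({"image", "boxes"}, "image_crop"),
}


def search_path(start: Set[str], target: str) -> Optional[List[str]]:
    """Breadth-first search for the shortest tool sequence that yields `target`.

    A state is the set of resource types available so far. A tool can be
    applied when all of its inputs are available and its output is new.
    """
    start_state: FrozenSet[str] = frozenset(start)
    queue = deque([(start_state, [])])
    seen = {start_state}
    while queue:
        available, path = queue.popleft()
        if target in available:
            return path
        for name, (inputs, output) in TOOLS.items():
            if inputs <= available and output not in available:
                nxt = available | {output}
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [name]))
    return None  # no tool sequence produces the target


if __name__ == "__main__":
    print(search_path({"image"}, "audio"))
    print(search_path({"image"}, "image_crop"))
```

In this toy setup the dependency between `object_detector` and `image_cropper` is what forces a two-step path; an execution engine would then run the returned tools in order.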

8. Generative Input: Towards Next-Generation Input Methods Paradigm

Source: https://arxiv.org/abs/2311.01166

In this study, the authors proposed a novel Generative Input paradigm named GeneInput. It uses prompts to handle all input scenarios and other intelligent auxiliary input functions, and optimizes the model with user feedback to deliver personalized results. The authors report state-of-the-art performance, for the first time, on the Full-mode Key-sequence to Characters (FK2C) task. Furthermore, they introduced four novel reward-model training methods based on user feedback, which allow online model updates without external annotated data and yield state-of-the-art performance across all tasks.

9. Conan: Active Reasoning in an Open-World Environment

Source: https://arxiv.org/abs/2311.02018

The researchers proposed Conan, a new open-world environment tailored for abductive reasoning. Unlike traditional single-round passive reasoning benchmarks, Conan requires agents to actively explore their surroundings and perform multi-round abductive inference, combining evidence collected in situ with pre-existing knowledge. The questions in Conan span several levels of abstraction, from localized intentions (Intent) to overarching objectives (Goal) and survival states (Survival). The researchers also proposed a new learning paradigm, Abduction from Deduction (AfD), which turns the problem of abduction into one of deduction, exploiting the problem structure through Bayesian principles. A project page accompanies the paper.
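
As a rough illustration of how Bayesian principles can recast abduction as deduction, consider Bayes' rule over candidate explanations. The notation is ours rather than the paper's: H is a candidate hypothesis (for example, an agent's goal) and O is the collected evidence.

```latex
P(H \mid O) = \frac{P(O \mid H)\, P(H)}{\sum_{H'} P(O \mid H')\, P(H')}
```

The likelihood P(O | H) is a deductive quantity, since it asks what evidence would follow if H were true. The abductive posterior P(H | O) can therefore be obtained by scoring each candidate hypothesis deductively and normalizing.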

*The researchers behind the publications deserve full credit for their work.