
AI Research Highlights | Week 3, 2024

Updated: Feb 19





1. Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

In this paper, researchers focused on Personal LLM Agents: LLM-based agents that are deeply integrated with personal data and personal devices and are used for personal assistance. They envision that Personal LLM Agents will become a major software paradigm for end-users in the upcoming era. They first discussed several important questions about Personal LLM Agents, including their architecture, capability, efficiency, and security, starting with a summary of the key components and design choices in the architecture of Personal LLM Agents, followed by an in-depth analysis of opinions collected from domain experts. Next, they discussed the key challenges in achieving intelligent, efficient, and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges. You can find the project here, which contains a must-read paper list.



2. The Impact of Reasoning Step Length on Large Language Models

In this work, researchers explore whether the reasoning steps are the most critical component of the prompts that make CoT work. They conduct experiments to investigate this while maintaining strict control over variables, and the findings reveal that the key factor appears to be the length of the thinking chain rather than its accuracy. Specifically: (1) for few-shot CoT, there is a direct linear correlation between step count and accuracy, which provides a quantifiable approach for optimizing CoT prompting in complex reasoning; (2) even incorrect rationales can yield favorable outcomes if they maintain the requisite length of inference; (3) the advantages of increasing reasoning steps are task-dependent: simpler tasks require fewer steps, whereas more complex tasks gain significantly from longer inference sequences; (4) increasing the number of reasoning steps in zero-shot CoT can also significantly improve LLM accuracy.
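To make the step-count manipulation concrete, here is a minimal sketch (not the authors' code) of how one might lengthen a few-shot CoT demonstration by padding it with extra reasoning steps; the example question and the padding strategy are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): probing the link between the number
# of reasoning steps in a few-shot CoT demonstration and downstream accuracy.
# The example question and the step-padding strategy are illustrative only.

BASE_STEPS = [
    "The cafeteria started with 23 apples.",
    "They used 20, leaving 23 - 20 = 3 apples.",
    "They bought 6 more, so 3 + 6 = 9 apples.",
]

def build_demo(steps, n_extra: int) -> str:
    """Build one worked example, padded with extra steps to lengthen the chain."""
    padded = list(steps)
    # Hypothetical padding: restate the final step to add length without new content.
    padded += [f"Restating: {steps[-1]}" for _ in range(n_extra)]
    numbered = "\n".join(f"Step {i + 1}: {s}" for i, s in enumerate(padded))
    return (
        "Q: The cafeteria had 23 apples. They used 20 and bought 6 more. "
        "How many apples do they have?\n"
        f"{numbered}\nA: 9\n\n"
    )

# Prompts with 0, 2, and 4 extra reasoning steps; in the paper's setup, accuracy
# on a held-out task would be compared across such variants.
for extra in (0, 2, 4):
    prompt = build_demo(BASE_STEPS, extra) + "Q: <new question>\n"
    print(f"--- demonstration with {len(BASE_STEPS) + extra} steps ---\n{prompt}")
```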



3. User Embedding Model for Personalized Language Prompting

Scientists from Google Research introduced a new User Embedding Module (UEM) that efficiently processes long, free-form user histories by compressing them into embeddings and using those embeddings as soft prompts to an LM. The embedding-based technique compresses the user's entire history into a sequence of representative user-embedding tokens. This embedded representation enhances the model's ability to comprehend user preferences and, in turn, to generate predictions that align more closely with the user's interests. Further, since the UEM is co-trained with the LM, the representations are learned in context for the specific tasks. Compared to the naive approach of concatenating the raw user history and incurring the O(n^2) compute cost of self-attention, this approach offers a cheap way to incorporate history metadata as embeddings, dramatically reducing the required compute.
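As a rough illustration of the idea, the PyTorch sketch below compresses a long history-token sequence into a handful of soft-prompt vectors via learned query pooling. The module name mirrors the paper, but the architecture details (attention pooling, eight user tokens, the dimensions) are assumptions rather than the authors' design.

```python
# Illustrative sketch only (PyTorch), not Google's implementation: compress a
# long user-history token sequence into a few "user embedding" tokens and
# prepend them as soft prompts to a language model's input embeddings.
import torch
import torch.nn as nn

class UserEmbeddingModule(nn.Module):
    def __init__(self, vocab_size: int, d_model: int, n_user_tokens: int = 8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Learned queries attend over the full history and pool it down to
        # n_user_tokens soft-prompt vectors (one design choice among many).
        self.queries = nn.Parameter(torch.randn(n_user_tokens, d_model))
        self.pool = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

    def forward(self, history_ids: torch.Tensor) -> torch.Tensor:
        h = self.embed(history_ids)                       # (B, T_history, d)
        q = self.queries.unsqueeze(0).expand(h.size(0), -1, -1)
        user_tokens, _ = self.pool(q, h, h)               # (B, n_user_tokens, d)
        return user_tokens

# Usage: prepend the compressed history to the task prompt's embeddings, so the
# LM attends over n_user_tokens extra positions instead of the raw history.
uem = UserEmbeddingModule(vocab_size=32000, d_model=768)
history = torch.randint(0, 32000, (2, 4096))              # long free-form history
task_embeds = torch.randn(2, 64, 768)                      # embedded task prompt
lm_inputs = torch.cat([uem(history), task_embeds], dim=1)  # (2, 8 + 64, 768)
```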



4. Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

In this work, researchers demonstrated that: (1) models can be trained with backdoors that, when triggered, switch from writing safe code to inserting code vulnerabilities; (2) such backdoors can be made robust to the behavioral safety techniques of RL fine-tuning, supervised fine-tuning, and adversarial training; (3) the robustness of backdoored models to RL fine-tuning increases with model scale; (4) adversarial training tends to make backdoored models more accurate at implementing their backdoored behaviors, effectively hiding rather than removing them; (5) backdoored models can be trained to produce consistent, coherent reasoning about pursuing their backdoor, and such models show increased robustness to safety fine-tuning techniques even when that reasoning is distilled away. These results support the hypothesis that current behavioral training techniques would provide an insufficient defense against such threat models.




5. TrustLLM: Trustworthiness in Large Language Models

In this paper, the authors present TrustLLM, a unified framework supporting a comprehensive analysis of trustworthiness in LLMs, including a survey of existing work, organizing principles for the different dimensions of trustworthy LLMs, a novel benchmark, and a thorough evaluation of trustworthiness for mainstream LLMs. They identify eight facets of trustworthiness, select a comprehensive and diverse set of LLMs for investigation, and conduct benchmarking and evaluation across various tasks and datasets. As LLMs play a pivotal role in natural language processing and a variety of real-world applications, addressing trustworthiness concerns is essential to maximizing their utility and ensuring responsible deployment across domains. Only through collective effort can we build trustworthy LLMs.



6. Transformers are Multi-State RNNs

The paper demonstrates that decoder-only transformers can be conceptualized as infinite multi-state RNNs, i.e., RNN variants with an unlimited hidden-state size. It further shows that pretrained transformers can be converted into finite multi-state RNNs by fixing the size of their hidden state, and it introduces TOVA, a simple yet powerful multi-state RNN compression policy that outperforms all baseline policies while remaining nearly on par with the full (infinite) model. The results indicate that transformer decoder LLMs often behave in practice as finite RNNs, and the proposed method substantially reduces memory consumption during inference, with up to an 88% reduction in LLM cache size.
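A hedged NumPy sketch of what such a compression policy could look like: keep only the k cached key/value states that received the highest attention weight from the newest token's query, evicting the rest. This follows the paper's description only at a high level; the function and its interface are illustrative.

```python
# Minimal sketch of a TOVA-style cache compression policy (names illustrative):
# keep the k cached states with the highest attention weight from the newest
# token's query, so the "multi-state RNN" has a fixed number of states.
import numpy as np

def tova_evict(keys: np.ndarray, values: np.ndarray,
               attn_weights: np.ndarray, k: int):
    """keys/values: (T, d); attn_weights: (T,) from the newest token's query."""
    if keys.shape[0] <= k:
        return keys, values
    keep = np.argsort(attn_weights)[-k:]   # indices of the top-k weights
    keep.sort()                            # preserve original token order
    return keys[keep], values[keep]

# Toy usage: a cache of 6 states compressed to a fixed multi-state size of 4.
rng = np.random.default_rng(0)
K, V = rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
w = rng.random(6)
K_small, V_small = tova_evict(K, V, w, k=4)
print(K_small.shape)  # (4, 8)
```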



7. Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

In this paper, researchers proposed Lightning Attention-2 to address the issue that the cumulative summation (cumsum) required by the linear attention kernel trick prevents linear attention from reaching its theoretical training speed in the causal setting. The key idea is to leverage a "divide and conquer" strategy, handling the intra-block and inter-block components of the linear attention calculation separately. Specifically, for the intra-block components they retain the conventional attention computation to compute the product of QKV, while for the inter-block components they employ the linear attention kernel trick. Tiling techniques are implemented in both the forward and backward procedures to fully exploit GPU hardware capabilities. As a result, Lightning Attention-2 can train LLMs with unlimited sequence length at no extra cost, as its computational speed remains constant with increasing sequence length under fixed memory consumption.
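The NumPy sketch below (not the official CUDA kernels, and with the softmax-free normalization details omitted) illustrates the intra-/inter-block split for causal linear attention: within a block, a masked QK^T V product is used, while contributions from earlier blocks are carried in a running k^T v state, keeping the cost linear in sequence length.

```python
# Minimal sketch of the intra-/inter-block split for causal linear attention.
import numpy as np

def block_linear_attention(Q, K, V, block: int = 64):
    """Q, K, V: (T, d). Causal linear attention, computed block by block."""
    T, d = Q.shape
    out = np.zeros((T, d))
    state = np.zeros((d, d))                     # running sum of k^T v over past blocks
    for s in range(0, T, block):
        e = min(s + block, T)
        q, k, v = Q[s:e], K[s:e], V[s:e]
        mask = np.tril(np.ones((e - s, e - s)))  # causal mask within the block
        intra = (q @ k.T * mask) @ v             # conventional-style product
        inter = q @ state                        # contribution of earlier blocks
        out[s:e] = intra + inter
        state += k.T @ v                         # fold this block into the state
    return out

# Sanity check against the naive O(T^2) causal linear attention.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(256, 16)) for _ in range(3))
naive = np.tril(Q @ K.T) @ V
assert np.allclose(block_linear_attention(Q, K, V), naive)
```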



8. Secrets of RLHF in Large Language Models Part II: Reward Modeling

Reinforcement Learning from Human Feedback (RLHF) has become a crucial technology for aligning language models with human values and intentions. Reward models are trained as proxies for human preferences to drive reinforcement learning optimization. While reward models are often considered central to achieving high performance, they face the following challenges in practical applications: (1) incorrect and ambiguous preference pairs in the dataset may hinder the reward model from accurately capturing human intent; (2) reward models trained on data from a specific distribution often struggle to generalize to examples outside that distribution and are not suitable for iterative RLHF training. In this report, the authors attempt to address these two issues. (1) From a data perspective, they propose a method to measure the strength of preferences within the data based on a voting mechanism over multiple reward models. Experimental results confirm that data with varying preference strengths have different impacts on reward model performance, and the authors introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and to fully leverage high-quality preference data. (2) From an algorithmic standpoint, they introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses, thereby improving model generalization. Furthermore, they employ meta-learning so that the reward model maintains its ability to differentiate subtle differences in out-of-distribution samples, an approach that can be utilized for iterative RLHF optimization. The previous report, on RLHF's PPO, can be found at: https://arxiv.org/abs/2307.04964
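As a minimal sketch of the data-side idea, under stated assumptions: if one already has scores from an ensemble of reward models, preference strength for each pair can be approximated from the mean reward margin and the voting agreement across models. How the margins are produced and how strengths are thresholded are left out here, and the numbers below are toy values.

```python
# Hedged sketch: preference strength via an ensemble "vote" of reward models.
import numpy as np

def preference_strength(margins: np.ndarray):
    """margins: (n_models, n_pairs) reward differences r(chosen) - r(rejected)."""
    mean = margins.mean(axis=0)          # average margin across the ensemble
    agree = (margins > 0).mean(axis=0)   # fraction of models voting for "chosen"
    return mean, agree

# Toy usage: 5 reward models scoring 3 preference pairs.
margins = np.array([
    [ 1.2,  0.1, -0.8],
    [ 0.9, -0.2, -1.1],
    [ 1.4,  0.3, -0.6],
    [ 1.0, -0.1, -0.9],
    [ 1.1,  0.2, -1.0],
])
mean, agree = preference_strength(margins)
# Pair 0: strong, consistent preference; pair 1: ambiguous (small margin, split
# vote); pair 2: likely mislabeled (the ensemble prefers the "rejected" reply).
print(mean, agree)
```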




9. Batch-ICL: Effective, Efficient, and Order-Agnostic In-Context Learning

In this paper, researchers introduced Batch-ICL, an effective, efficient, and order-agnostic inference algorithm for in-context learning. The method runs N separate 1-shot forward computations and then aggregates the N corresponding meta-gradients at specific layers. These aggregated meta-gradients are subsequently applied in the forward computation when the LLM handles a zero-shot query, ultimately producing the final predictions. Batch-ICL thereby addresses concerns related to the order of examples in ICL: across a variety of tasks, it consistently exhibits higher accuracy than the average accuracy achieved over all permutations of ICL examples, and occasionally outperforms even the best order permutation. Meanwhile, Batch-ICL reduces the computational resources needed to execute an ICL sequence.
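Purely as a conceptual sketch, a toy transformer layer stands in for the LLM below, and the injection point is an assumption; the code mimics the flow described above: run N separate 1-shot passes, average the activation shifts they induce at the query positions (a stand-in for the aggregated meta-gradients), then add that aggregate back during a zero-shot pass. Because the deltas are averaged, the result is independent of example order.

```python
# Conceptual sketch only of the Batch-ICL flow, using a toy layer in PyTorch.
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True).eval()

def one_shot_delta(example_emb: torch.Tensor, query_emb: torch.Tensor):
    """Activation shift at the query positions caused by prepending one example."""
    with torch.no_grad():
        zero_shot = layer(query_emb)                              # (1, Lq, d)
        one_shot = layer(torch.cat([example_emb, query_emb], dim=1))
        return one_shot[:, -query_emb.size(1):] - zero_shot

# N separate 1-shot passes, then aggregate their deltas (order no longer matters).
query = torch.randn(1, 4, 32)
examples = [torch.randn(1, 6, 32) for _ in range(8)]
aggregated = torch.stack([one_shot_delta(e, query) for e in examples]).mean(0)

# Final zero-shot pass with the aggregated shift added back in.
with torch.no_grad():
    batch_icl_hidden = layer(query) + aggregated
print(batch_icl_hidden.shape)  # (1, 4, 32)
```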



*The researchers behind the publications deserve full credit for their work.


 




