arXiv2023

Number of papers: 16

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

Authors: Huang, Lei and Yu, Weijiang and Ma, Weitao and Zhong, Weihong and Feng, Zhangyin and Wang, Haotian and Chen, Qianglong and Peng, Weihua and Feng, Xiaocheng and Qin, Bing and others
Abstract: The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP), leading to remarkable advancements in text understanding and generation. Nevertheless, alongside these strides, LLMs exhibit a critical tendency to produce hallucinations, resulting in content that is inconsistent with real-world facts or user inputs. This phenomenon poses substantial challenges to their practical deployment and raises concerns over the reliability of LLMs in...
Link: Read Paper
Labels: hallucination in reasoning, survey

Cognitive architectures for language agents

Authors: Sumers, Theodore R and Yao, Shunyu and Narasimhan, Karthik and Griffiths, Thomas L
Abstract: Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents. While these agents have achieved substantial empirical success, we lack a systematic framework to organize existing agents and plan future developments. In this paper, we draw on the rich history of cognitive science and symbolic artificial intelligence to propose Cognitive Architectures for Language Agents (CoALA)...
Link: Read Paper
Labels: agent design, survey

Cumulative reasoning with large language models

Authors: Zhang, Yifan and Yang, Jingqin and Yuan, Yang and Yao, Andrew Chi-Chih
Abstract: While language models are powerful and versatile, they often fail to address highly complex problems. This is because solving complex problems requires deliberate thinking, which has been only minimally guided during training. In this paper, we propose a new method called Cumulative Reasoning (CR), which employs language models in a cumulative and iterative manner to emulate human thought processes. By decomposing tasks into smaller components, CR streamlines the problem-solving process, ...
Link: Read Paper
Labels: hallucination in reasoning, prompt strategy

Do Language Models Learn Semantics of Code? A Case Study in Vulnerability Detection

Authors: Benjamin Steenhoek and Md Mahbubur Rahman and Shaila Sharmin and Wei Le
Abstract: Recently, pretrained language models have shown state-of-the-art performance on the vulnerability detection task. These models are pretrained on a large corpus of source code, then fine-tuned on a smaller supervised vulnerability dataset. Due to the different training objectives and the performance of the models, it is interesting to consider whether the models have learned the semantics of code relevant to vulnerability detection, namely bug semantics, and if so, how the alignment to bug semant...
Link: Read Paper
Labels: static analysis, bug detection, empirical study

Do you still need a manual smart contract audit?

Authors: David, Isaac and Zhou, Liyi and Qin, Kaihua and Song, Dawn and Cavallaro, Lorenzo and Gervais, Arthur
Abstract: We investigate the feasibility of employing large language models (LLMs) for conducting the security audit of smart contracts, a traditionally time-consuming and costly process. Our research focuses on the optimization of prompt engineering for enhanced security analysis, and we evaluate the performance and accuracy of LLMs using a benchmark dataset comprising 52 Decentralized Finance (DeFi) smart contracts that have previously been compromised. Our findings reveal that, when applied to vuln...
Link: Read Paper
Labels: static analysis, bug detection

Finding inductive loop invariants using large language models

Authors: Kamath, Adharsh and Senthilnathan, Aditya and Chakraborty, Saikat and Deligiannis, Pantazis and Lahiri, Shuvendu K and Lal, Akash and Rastogi, Aseem and Roy, Subhajit and Sharma, Rahul
Abstract: Loop invariants are fundamental to reasoning about programs with loops. They establish properties about a given loop's behavior. When they additionally are inductive, they become useful for the task of formal verification that seeks to establish strong mathematical guarantees about program's runtime behavior. The inductiveness ensures that the invariants can be checked locally without consulting the entire program, thus are indispensable artifacts in a formal proof of correctness. Finding in...
Link: Read Paper
Labels: static analysis, program verification

How Far Have We Gone in Vulnerability Detection Using Large Language Models

Authors: Zeyu Gao and Hao Wang and Yuchen Zhou and Wenyu Zhu and Chao Zhang
Abstract: As software becomes increasingly complex and prone to vulnerabilities, automated vulnerability detection is critically important, yet challenging. Given the significant successes of large language models (LLMs) in various tasks, there is growing anticipation of their efficacy in vulnerability detection. However, a quantitative understanding of their potential in vulnerability detection is still missing. To bridge this gap, we introduce a comprehensive vulnerability benchmark VulBench. This bench...
Link: Read Paper
Labels: static analysis, bug detection, benchmark

Impact of large language models on generating software specifications

Authors: Xie, Danning and Yoo, Byungwoo and Jiang, Nan and Kim, Mijung and Tan, Lin and Zhang, Xiangyu and Lee, Judy S
Abstract: Software specifications are essential for ensuring the reliability of software systems. Existing specification extraction approaches, however, suffer from limited generalizability and require manual efforts. The recent emergence of Large Language Models (LLMs), which have been successfully applied to numerous software engineering tasks, offers a promising avenue for automating this process. In this paper, we conduct the first empirical study to evaluate the capabilities of LLMs for generating so...
Link: Read Paper
Labels: static analysis, specification inference

LLMs: Understanding Code Syntax and Semantics for Code Analysis

Authors: Ma, Wei and Liu, Shangqing and Lin, Zhihao and Wang, Wenhan and Hu, Qiang and Liu, Ye and Zhang, Cen and Nie, Liming and Li, Li and Liu, Yang
Abstract: Large language models (LLMs) demonstrate significant potential to revolutionize software engineering (SE) by exhibiting outstanding performance in SE tasks such as code and document generation. However, the high reliability and risk control requirements in software engineering raise concerns about the lack of interpretability of LLMs. To address this concern, we conducted a study to evaluate the capabilities of LLMs and their limitations for code analysis in SE. We break down the abilities neede...
Link: Read Paper
Labels: static analysis, data-flow analysis, call graph analysis, code model, code model training, source code model, empirical study

Large language model-powered smart contract vulnerability detection: New perspectives

Authors: Hu, Sihao and Huang, Tiansheng and İlhan, Fatih and Tekin, Selim Furkan and Liu, Ling
Abstract: This paper provides a systematic analysis of the opportunities, challenges, and potential solutions of harnessing Large Language Models (LLMs) such as GPT-4 to dig out vulnerabilities within smart contracts based on our ongoing research. For the task of smart contract vulnerability detection, achieving practical usability hinges on identifying as many true vulnerabilities as possible while minimizing the number of false positives. Nonetheless, our empirical study reveals contradictory yet intere...
Link: Read Paper
Labels: static analysis, bug detection

LmPa: Improving decompilation by synergy of large language model and program analysis

Authors: Xu, Xiangzhe and Zhang, Zhuo and Feng, Shiwei and Ye, Yapeng and Su, Zian and Jiang, Nan and Cheng, Siyuan and Tan, Lin and Zhang, Xiangyu
Abstract: Decompilation aims to recover the source code form of a binary executable. It has many applications in security and software engineering such as malware analysis, vulnerability detection and code reuse. A prominent challenge in decompilation is to recover variable names. We propose a novel method that leverages the synergy of large language model (LLM) and program analysis. Language models encode rich multi-modal knowledge, but its limited input size prevents providing sufficient global context ...
Link: Read Paper
Labels: static analysis, program decompilation, code model, code model training, binary code model

Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning

Authors: Nan Jiang, Chengxiao Wang, Kevin Liu, Xiangzhe Xu, Lin Tan, Xiangyu Zhang, and Petr Babkin
Abstract: Binary code analysis is the foundation of crucial tasks in the security domain; thus building effective binary analysis techniques is more important than ever. Large language models (LLMs) although have brought impressive improvement to source code tasks, do not directly generalize to assembly code due to the unique challenges of assembly: (1) the low information density of assembly and (2) the diverse optimizations in assembly code. To overcome these challenges, this work proposes a hierarchica...
Link: Read Paper
Labels: static analysis, program decompilation, code similarity analysis, code model, code model training, binary code model

Refining Decompiled C Code with Large Language Models

Authors: Wai Kin Wong, Huaijin Wang, Zongjie Li, Zhibo Liu, Shuai Wang, Qiyi Tang, Sen Nie, and Shi Wu
Abstract: A C decompiler converts an executable into source code. The recovered C source code, once re-compiled, is expected to produce an executable with the same functionality as the original executable. With over twenty years of development, C decompilers have been widely used in production to support reverse engineering applications. Despite the prosperous development of C decompilers, it is widely acknowledged that decompiler outputs are mainly used for human consumption, and are not suitable for aut...
Link: Read Paper
Labels: static analysis, program decompilation

SkipAnalyzer: An Embodied Agent for Code Analysis with Large Language Models

Authors: Mohammad Mahdi Mohajer and Reem Aleithan and Nima Shiri Harzevili and Moshi Wei and Alvine Boaye Belle and Hung Viet Pham and Song Wang
Abstract: We introduce SkipAnalyzer, a large language model (LLM)-powered tool for static code analysis. SkipAnalyzer has three components: 1) an LLM-based static bug detector that scans source code and reports specific types of bugs, 2) an LLM-based false-positive filter that can identify false-positive bugs in the results of static bug detectors (e.g., the result of step 1) to improve detection accuracy, and 3) an LLM-based patch generator that can generate patches for the detected bugs above. As a proo...
Link: Read Paper
Labels: static analysis, bug detection, agent design

The rise and potential of large language model based agents: A survey

Authors: Xi, Zhiheng and Chen, Wenxiang and Guo, Xin and He, Wei and Ding, Yiwen and Hong, Boyang and Zhang, Ming and Wang, Junzhe and Jin, Senjie and Zhou, Enyu and others
Abstract: For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. Actually, what the commu...
Link: Read Paper
Labels: survey, agent design

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities

Authors: Avishree Khare and Saikat Dutta and Ziyang Li and Alaia Solko-Breslin and Rajeev Alur and Mayur Naik
Abstract: While automated vulnerability detection techniques have made promising progress in detecting security vulnerabilities, their scalability and applicability remain challenging. The remarkable performance of Large Language Models (LLMs), such as GPT-4 and CodeLlama, on code-related tasks has prompted recent works to explore if LLMs can be used to detect vulnerabilities. In this paper, we perform a more comprehensive study by concurrently examining a higher number of datasets, languages and LLMs, an...
Link: Read Paper
Labels: static analysis, bug detection, empirical study
