arXiv2023

Number of papers: 16

  • Authors: Huang, Lei and Yu, Weijiang and Ma, Weitao and Zhong, Weihong and Feng, Zhangyin and Wang, Haotian and Chen, Qianglong and Peng, Weihua and Feng, Xiaocheng and Qin, Bing and others
  • Abstract: The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP), leading to remarkable advancements in text understanding and generation. Nevertheless, alongside these strides, LLMs exhibit a critical tendency to produce hallucinations, resulting in content that is inconsistent with real-world facts or user inputs. This phenomenon poses substantial challenges to their practical deployment and raises concerns over the reliability of LLMs in...
  • Link: Read Paper
  • Labels: hallucination in reasoning, survey
  • Authors: Sumers, Theodore R and Yao, Shunyu and Narasimhan, Karthik and Griffiths, Thomas L
  • Abstract: The prominent large language models (LLMs) of today differ from past language models not only in size, but also in the fact that they are trained on a combination of natural language and formal language (code). As a medium between humans and computers, code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity. In this survey, we present an overview of the various benefits of integrating code into LLMs' training data. Speci...
  • Link: Read Paper
  • Labels: agent design, survey
  • Authors: Zhang, Yifan and Yang, Jingqin and Yuan, Yang and Yao, Andrew Chi-Chih
  • Abstract: While language models are powerful and versatile, they often fail to address highly complex problems. This is because solving complex problems requires deliberate thinking, which has been only minimally guided during training. In this paper, we propose a new method called Cumulative Reasoning (CR), which employs language models in a cumulative and iterative manner to emulate human thought processes. By decomposing tasks into smaller components, CR streamlines the problem-solving process, ...
  • Link: Read Paper
  • Labels: hallucination in reasoning, prompt strategy
  • Authors: Benjamin Steenhoek and Md Mahbubur Rahman and Shaila Sharmin and Wei Le
  • Abstract: Recently, pretrained language models have shown state-of-the-art performance on the vulnerability detection task. These models are pretrained on a large corpus of source code, then fine-tuned on a smaller supervised vulnerability dataset. Due to the different training objectives and the performance of the models, it is interesting to consider whether the models have learned the semantics of code relevant to vulnerability detection, namely bug semantics, and if so, how the alignment to bug semant...
  • Link: Read Paper
  • Labels: static analysis, bug detection, empirical study
  • Authors: David, Isaac and Zhou, Liyi and Qin, Kaihua and Song, Dawn and Cavallaro, Lorenzo and Gervais, Arthur
  • Abstract: We investigate the feasibility of employing large language models (LLMs) for conducting the security audit of smart contracts, a traditionally time-consuming and costly process. Our research focuses on the optimization of prompt engineering for enhanced security analysis, and we evaluate the performance and accuracy of LLMs using a benchmark dataset comprising 52 Decentralized Finance (DeFi) smart contracts that have previously been compromised. Our findings reveal that, when applied to vuln...
  • Link: Read Paper
  • Labels: static analysis, bug detection
  • Authors: Kamath, Adharsh and Senthilnathan, Aditya and Chakraborty, Saikat and Deligiannis, Pantazis and Lahiri, Shuvendu K and Lal, Akash and Rastogi, Aseem and Roy, Subhajit and Sharma, Rahul
  • Abstract: Loop invariants are fundamental to reasoning about programs with loops. They establish properties about a given loop's behavior. When they additionally are inductive, they become useful for the task of formal verification that seeks to establish strong mathematical guarantees about program's runtime behavior. The inductiveness ensures that the invariants can be checked locally without consulting the entire program, thus are indispensable artifacts in a formal proof of correctness. Finding in...
  • Link: Read Paper
  • Labels: static analysis, program verification
  • Authors: Zeyu Gao and Hao Wang and Yuchen Zhou and Wenyu Zhu and Chao Zhang
  • Abstract: As software becomes increasingly complex and prone to vulnerabilities, automated vulnerability detection is critically important, yet challenging. Given the significant successes of large language models (LLMs) in various tasks, there is growing anticipation of their efficacy in vulnerability detection. However, a quantitative understanding of their potential in vulnerability detection is still missing. To bridge this gap, we introduce a comprehensive vulnerability benchmark VulBench. This bench...
  • Link: Read Paper
  • Labels: static analysis, bug detection, benchmark
  • Authors: Xie, Danning and Yoo, Byungwoo and Jiang, Nan and Kim, Mijung and Tan, Lin and Zhang, Xiangyu and Lee, Judy S
  • Abstract: Software specifications are essential for ensuring the reliability of software systems. Existing specification extraction approaches, however, suffer from limited generalizability and require manual efforts. The recent emergence of Large Language Models (LLMs), which have been successfully applied to numerous software engineering tasks, offers a promising avenue for automating this process. In this paper, we conduct the first empirical study to evaluate the capabilities of LLMs for generating so...
  • Link: Read Paper
  • Labels: static analysis, specification inference
  • Authors: Ma, Wei and Liu, Shangqing and Lin, Zhihao and Wang, Wenhan and Hu, Qiang and Liu, Ye and Zhang, Cen and Nie, Liming and Li, Li and Liu, Yang
  • Abstract: Large language models (LLMs) demonstrate significant potential to revolutionize software engineering (SE) by exhibiting outstanding performance in SE tasks such as code and document generation. However, the high reliability and risk control requirements in software engineering raise concerns about the lack of interpretability of LLMs. To address this concern, we conducted a study to evaluate the capabilities of LLMs and their limitations for code analysis in SE. We break down the abilities neede...
  • Link: Read Paper
  • Labels: static analysis, data-flow analysis, call graph analysis, code model, code model training, source code model, empirical study
  • Authors: Hu, Sihao and Huang, Tiansheng and İlhan, Fatih and Tekin, Selim Furkan and Liu, Ling
  • Abstract: This paper provides a systematic analysis of the opportunities, challenges, and potential solutions of harnessing Large Language Models (LLMs) such as GPT-4 to dig out vulnerabilities within smart contracts based on our ongoing research. For the task of smart contract vulnerability detection, achieving practical usability hinges on identifying as many true vulnerabilities as possible while minimizing the number of false positives. Nonetheless, our empirical study reveals contradictory yet intere...
  • Link: Read Paper
  • Labels: static analysis, bug detection
  • Authors: Xu, Xiangzhe and Zhang, Zhuo and Feng, Shiwei and Ye, Yapeng and Su, Zian and Jiang, Nan and Cheng, Siyuan and Tan, Lin and Zhang, Xiangyu
  • Abstract: Decompilation aims to recover the source code form of a binary executable. It has many applications in security and software engineering such as malware analysis, vulnerability detection and code reuse. A prominent challenge in decompilation is to recover variable names. We propose a novel method that leverages the synergy of large language model (LLM) and program analysis. Language models encode rich multi-modal knowledge, but its limited input size prevents providing sufficient global context ...
  • Link: Read Paper
  • Labels: static analysis, program decompilation, code model, code model training, binary code model
  • Authors: Nan Jiang and Chengxiao Wang and Kevin Liu and Xiangzhe Xu and Lin Tan and Xiangyu Zhang and Petr Babkin
  • Abstract: Binary code analysis is the foundation of crucial tasks in the security domain; thus building effective binary analysis techniques is more important than ever. Large language models (LLMs) although have brought impressive improvement to source code tasks, do not directly generalize to assembly code due to the unique challenges of assembly: (1) the low information density of assembly and (2) the diverse optimizations in assembly code. To overcome these challenges, this work proposes a hierarchica...
  • Link: Read Paper
  • Labels: static analysis, program decompilation, code similarity analysis, code model, code model training, binary code model
  • Authors: Wai Kin Wong and Huaijin Wang and Zongjie Li and Zhibo Liu and Shuai Wang and Qiyi Tang and Sen Nie and Shi Wu
  • Abstract: A C decompiler converts an executable into source code. The recovered C source code, once re-compiled, is expected to produce an executable with the same functionality as the original executable. With over twenty years of development, C decompilers have been widely used in production to support reverse engineering applications. Despite the prosperous development of C decompilers, it is widely acknowledged that decompiler outputs are mainly used for human consumption, and are not suitable for aut...
  • Link: Read Paper
  • Labels: static analysis, program decompilation
  • Authors: Mohammad Mahdi Mohajer and Reem Aleithan and Nima Shiri Harzevili and Moshi Wei and Alvine Boaye Belle and Hung Viet Pham and Song Wang
  • Abstract: We introduce SkipAnalyzer, a large language model (LLM)-powered tool for static code analysis. SkipAnalyzer has three components: 1) an LLM-based static bug detector that scans source code and reports specific types of bugs, 2) an LLM-based false-positive filter that can identify false-positive bugs in the results of static bug detectors (e.g., the result of step 1) to improve detection accuracy, and 3) an LLM-based patch generator that can generate patches for the detected bugs above. As a proo...
  • Link: Read Paper
  • Labels: static analysis, bug detection, agent design
  • Authors: Xi, Zhiheng and Chen, Wenxiang and Guo, Xin and He, Wei and Ding, Yiwen and Hong, Boyang and Zhang, Ming and Wang, Junzhe and Jin, Senjie and Zhou, Enyu and others
  • Abstract: For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. Actually, what the commu...
  • Link: Read Paper
  • Labels: survey, agent design
  • Authors: Avishree Khare and Saikat Dutta and Ziyang Li and Alaia Solko-Breslin and Rajeev Alur and Mayur Naik
  • Abstract: While automated vulnerability detection techniques have made promising progress in detecting security vulnerabilities, their scalability and applicability remain challenging. The remarkable performance of Large Language Models (LLMs), such as GPT-4 and CodeLlama, on code-related tasks has prompted recent works to explore if LLMs can be used to detect vulnerabilities. In this paper, we perform a more comprehensive study by concurrently examining a higher number of datasets, languages and LLMs, an...
  • Link: Read Paper
  • Labels: static analysis, bug detection, empirical study