Publication

Automated Classification of Data Races Under Both Strong and Weak Memory Models

Abstract

Data races are one of the main causes of concurrency problems in multithreaded programs. Whether all data races are bad, or some are harmful and others are harmless, is still the subject of vigorous scientific debate [Narayanasamy et al. 2007; Boehm 2012]. What is clear, however, is that today's code has many data races [Kasikci et al. 2012; Jin et al. 2012; Erickson et al. 2010], and fixing data races without introducing bugs is time consuming [Godefroid and Nagappan 2008]. Therefore, it is important to efficiently identify data races in code and understand their consequences to prioritize their resolution. We present Portend+, a tool that not only detects races but also automatically classifies them based on their potential consequences: Could they lead to crashes or hangs? Could their effects be visible outside the program? Do they appear to be harmless? How do their effects change under weak memory models? Our proposed technique achieves high accuracy by efficiently analyzing multiple paths and multiple thread schedules in combination, and by performing symbolic comparison between program outputs. We ran Portend+ on seven real-world applications: it detected 93 true data races and correctly classified 92 of them, with no human effort. Six of them were harmful races. Portend+'s classification accuracy is up to 89% higher than that of existing tools, and it produces easy-to-understand evidence of the consequences of "harmful" races, thus both proving their harmfulness and making debugging easier. We envision Portend+ being used for testing and debugging, as well as for automatically triaging bug reports.

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.