Browsing by Author "McCamant, Stephen"
Now showing 1 - 5 of 5
- Results Per Page
- Sort Options
Item Bit-Vector Model Counting using Statistical Estimation(2016-11-07) Kim, Seonmo; McCamant, StephenApproximate model counting for bit-vector SMT formulas (generalizing #SAT) has many applications such as probabilistic inference and quantitative information-flow security, but it is computationally difficult. Adding random parity constraints (XOR streamlining) and then checking satisfiability is an effective approximation technique, but it requires a prior hypothesis about the model count to produce useful results. We propose an approach inspired by statistical estimation to continually refine a probabilistic estimate of the model count for a formula, so that each XOR-streamlined query yields as much information as possible. We implement this approach, using estimated probability distributions, as a wrapper around an off-the-shelf SMT solver or SAT solver. Experimental results show that the implementation is faster than the most similar previous approach which used a simpler refinement strategy. The technique also lets us model count formulas over floating-point constraints, which we demonstrate with an application to a vulnerability in differential privacy mechanisms.Item Conservative Signed/Unsigned Type Inference for Binaries using Minimum Cut(2014-01-29) Yan, Qiuchen; McCamant, StephenRecovering variable types or other structural information from binaries is useful for reverse engineering in security, and to facilitate other kinds of analysis on binaries. However such reverse engineering tasks often lack precise problem definitions; some information is lost during compilation, and existing tools can exhibit a variety of errors. As a step in the direction of more principled reverse engineering algorithms, we isolate a sub-task of type inference, namely determining whether each integer variable is declared as signed or unsigned. The difficulty of this task arises from the fact that signedness information in a binary, when present at all, is associated with operations rather than with data locations. We propose a graph-based algorithm in which variables represent nodes and edges connect variables with the same signedness. In a program without casts or conversions, signed and unsigned variables would form distinct connected components, but when casts are present, signed and unsigned variables will be connected. Reasoning that developers prefer source code without casts, we compute a minimum cut between signed and unsigned variables, which corresponds to a minimal set of casts required for a legal typing. We evaluate this algorithm by erasing signedness information from debugging symbols, and testing how well our tool can recover it. Applying an intra-procedural version of the algorithm to the GNU Coreutils, we we observe that many variables are unconstrained as to signedness, but that it almost all cases our tool recovers either the type from the original source, or a type that yields the same program behavior.Item Contract discovery from black-box components(2018) Sharma, Vaibhav; Byun, Taejoon; McCamant, Stephen; Rayadurgam, Sanjai; Heimdahl, MatsComplex computer-controlled systems are commonly constructed in a middle-out fashion where existing subsystems and available components have a significant influence on system architecture and drive design decisions. During system design, the architect must verify that the components, put together as specified in the architecture, will achieve the desired system behavior. This typically leads to further design modifications or adjustments to requirements triggering another iteration of the design-verify cycle. For software components that are acquired from third-parties, often the only definitive source of information about the component's system-relevant behavior -- its contract -- is its object code. We posit that existing static and dynamic analysis techniques can be used to discover contracts that can help the system designer and specifically discuss how symbolic execution of object code may be particularly well-suited for this purpose.Item Discovering Instructions for Robust Binary-level Coverage Criteria(2017) Sharma, Vaibhav; Byun, Taejoon; McCamant, Stephen; Rayadurgam, Sanjai; Heimdahl, MatsObject-Branch Coverage (OBC) is often used to measure e ective- ness of test suites, when source code is unavailable. The traditional OBC de nition can be made more resilient to variations in compil- ers and the structure of generated code by creating more robust de nitions. However nding which instructions should be included in each new de nition is laborious, error-prone, and architecture- dependent. We automate the discovery of instructions to be in- cluded for an improved OBC de nition on the X86 and ARM archi- tectures. We discover all possible valid instructions by symbolically executing instruction decoders for X86 and ARM instructions. For each discovered instruction, we translate it to Vine IR, and check if the Vine IR translation satis es the OBC de nition. We verify the correctness of our tool by comparing its output with the X86 and ARM architecture manuals. Our automated instruction clas- si cation facilitates development of more robust OBC de nitions with better bug- nding ability and lesser sensitivity to compiler variations.Item Toward Rigorous Object-Code Coverage Criteria(2017-06-13) Byun, Taejoon; Sharma, Vaibhav; Rayadurgam, Sanjai; McCamant, Stephen; Heimdahl, MatsObject-branch coverage (OBC) is often used as a measure of the thoroughness of tests suites, augmenting or substituting source-code based structural criteria such as branch coverage and modified condition/decision coverage (MC/DC). In addition, with the increasing use of third-party components for which source-code access may be unavailable, robust object-code coverage criteria are essential to assess how well the components are exercised during testing. While OBC has the advantage of being programming language independent and is amenable to non-intrusive coverage measurement techniques, variations in compilers, and the optimizations they perform can substantially change what is seen as an object branch, which itself appears to be an informally understood concept. To address the need for a robust object coverage criterion, this paper proposes a rigorous definition of OBC such that it captures well the semantic of source code branches for a given instruction set architecture. We report an empirical assessment of these criteria for the Intel x86 instruction set on several examples from embedded control systems software. Preliminary results indicate that object-code coverage can be made robust to compilation variations and is comparable in its bug-finding efficacy to source level MC/DC.