Graph Embeddings for the extraction of Compiler Provenance features

Thesis or Dissertation


This thesis explores the use of graph embedding methods for compiler provenance identification. Graph embedding algorithms are widely used to analyze, compare, or distinguish networks and similar structures that are too large to inspect visually. Using graph embeddings to address the problem of compiler provenance identification is a novel approach. Our approach applies embedding algorithms to the control flow graphs of binaries. In this document, we explore two graph embedding methods: tiered approaches and alternative embedding representations for analysis. Our results indicate that the method has potential for use in compiler provenance identification. Experiments show that our approach can distinguish between individual compilers, compiler versions, and compiler version flags with above-average accuracy. Future work may explore extracting the significant graph embeddings from our generated model, recreating the generalized graph from the embeddings, and identifying significant structures for manual analysis.



University of Minnesota M.S. thesis. June 2019. Major: Computer Science. Advisor: Peter Peterson. 1 computer file (PDF); vii, 42 pages.


Suggested citation

Straumann, Aleksandar N. (2019). Graph Embeddings for the extraction of Compiler Provenance features. Retrieved from the University Digital Conservancy,
