Supplemental Information for the file: Barley_SNP_Annotations.txt

This documentation was obtained on July 19, 2016 from the GitHub.com repository "MorrellLAB/SNPMeta" as SNPMeta_Manual.md and converted into a text file by DRUM curator Lisa Johnston. To view the current version visit: https://github.com/MorrellLAB/SNPMeta/blob/master/SNPMeta_Manual.md.

Tabular Output

When the --outfmt tabular option is given, SNPMeta will write tab-delimited text as output. Each row corresponds to a single SNP, and the columns hold various pieces of annotation information. Columns are filled depending on how much annotation information is available for each SNP. If no data is available for a particular field, SNPMeta will write a - for that field. Some fields are mutually exclusive; that is, if one is filled, then the other will always be -. The table outputs 23 fields, which are explained below.

Field            Description
SNPName          The name of the SNP, taken from the FASTA record.
Organism         Organism from which the annotation was derived.
GenBankID        ID of the GenBank record used for annotation.
ProteinID        Protein ID of the CDS used for annotation, if applicable.
GeneShortName    Short name of the gene used for annotation, if applicable.
Position         Position of the SNP in the codon. Can take values of {1, 2, 3} if coding, and non-coding if not coding.
Downstream       How many bp downstream of the CDS the SNP lies, if non-coding.
Upstream         How many bp upstream of the CDS the SNP lies, if non-coding.
Silent           Whether or not the SNP is silent, that is, leaves the amino acid sequence unchanged. Synonymous and non-coding SNPs get Yes, and nonsynonymous SNPs get No.
AA1              Amino acid state present in the GenBank record.
AA2              Alternate amino acid state. May be identical to AA1, if synonymous.
GranthamScore    Score of amino acid substitution, as defined by Grantham (1974) in Science.
CDSPosition      Position of the affected residue in the CDS.
Codon1           Codon triplet present in the GenBank sequence.
Codon2           Alternate codon triplet.
AmbiguityCode    IUPAC ambiguity code that describes the two SNP states.
ProductName      Name of the CDS product, if applicable.
Notes            Messages and warnings about SNP files, or the nature of certain SNPs. Will be described in detail in the following section.
RelatedGene      Short name of a related gene, found by searching through the list of BLAST hits.
RelatedOrganism  Source organism for the RelatedGene.
ContextSequence  Warnings about malformed SNP sequences.
AlignScore       Alignment score reported by needle, divided by the query sequence length in base pairs. Useful for flagging SNPs that may be called incorrectly due to low-quality alignments.
DateTime         Date and time the SNP was annotated.

The ‘Notes’ Field

A variety of messages can be written into the ‘Notes’ field. These will be elaborations on special classes of mutations, or messages about problems when annotating a particular SNP. The range of messages and their meanings are given below.

Value
    Description
Empty File
    The file SNPMeta tried to read contains no data.
No BLAST Hits
    There were no BLAST hits for the sequence. If many SNPs have this message, it could be that either the list of target organisms is too restrictive, or there is a spelling error in the name of the organism.
No Annotations
    The GenBank records returned by BLAST did not contain any annotated CDS.
SNP aligns to a gap in the GenBank sequence
    When the GenBank sequence and the SNP contextual sequence are aligned, the SNP is placed over a gap in the GenBank sequence, making annotation impossible.
SNP aligns to a gap in the CDS
    When the GenBank sequence and the SNP contextual sequence are aligned, the SNP is placed over a gap in the CDS. This could either be due to an alignment error, or a break in the annotated CDS.
CDS annotation has a fuzzy start
    The CDS annotation in the GenBank record does not have an exact start position. This could cause issues with inferring reading frame.
SNP - Sequence Mismatch
    The base in the GenBank sequence is not represented by the IUPAC ambiguity code in the SNP sequence.
Methionine mutation
    The mutation changes an amino acid to a methionine, or from a methionine.
Disrupted STOP codon
    The mutation changes a STOP codon to a different codon.
Premature STOP codon
    The mutation creates a STOP codon before the end of the annotated CDS.
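
The tabular layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of SNPMeta itself. It assumes that the 23 columns appear in the order in which the fields are listed in this document, that a header row (if there is one) starts with SNPName, and that a ‘Notes’ message can be found by a simple substring search. The function names read_snpmeta_table, has_note and align_score are hypothetical helpers introduced only for this illustration.

```python
import csv

# Column order as listed in the field descriptions of this document
# (assumed to match the order of the columns in the output file).
FIELDS = [
    "SNPName", "Organism", "GenBankID", "ProteinID", "GeneShortName",
    "Position", "Downstream", "Upstream", "Silent", "AA1", "AA2",
    "GranthamScore", "CDSPosition", "Codon1", "Codon2", "AmbiguityCode",
    "ProductName", "Notes", "RelatedGene", "RelatedOrganism",
    "ContextSequence", "AlignScore", "DateTime",
]


def read_snpmeta_table(handle):
    """Yield one dict per SNP row; a lone '-' placeholder becomes None."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and a header row, if one is present (assumption).
        if not row or row[0] == "SNPName":
            continue
        if len(row) != len(FIELDS):
            raise ValueError(
                f"expected {len(FIELDS)} columns, found {len(row)}: {row[0]!r}"
            )
        yield {
            name: (None if value == "-" else value)
            for name, value in zip(FIELDS, row)
        }


def has_note(record, message):
    """Return True if the 'Notes' field of a record contains the message."""
    notes = record["Notes"]
    return notes is not None and message in notes


def align_score(record):
    """Return AlignScore as a float, or None if the field was not filled."""
    value = record["AlignScore"]
    return None if value is None else float(value)


# Example use (the file name is the one this document accompanies):
#
#     with open("Barley_SNP_Annotations.txt", newline="") as handle:
#         records = list(read_snpmeta_table(handle))
#     premature = [r for r in records if has_note(r, "Premature STOP codon")]
#     nonsynonymous = [r for r in records if r["Silent"] == "No"]
```

Converting - to None keeps unfilled fields distinct from real values, which matters for the mutually exclusive fields. Any AlignScore cutoff used to flag low-quality alignments would have to be chosen by the user; this document does not give one.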