Sven Boekhoff, M.Sc.

Ph.D. student at the Max Planck Institute for Dynamics and Self-Organization, Göttingen
FASTA DNA Codes


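The table below lists the one-letter codes for the four bases (A, C, T, G) and for ambiguous positions in a DNA sequence. These codes are commonly used in FASTA files and other DNA file formats. The Etymology column gives a mnemonic for memorizing each code. The Complement column gives the code that pairs with each entry on the opposite strand, and the Opposite column gives the code for exactly the bases that the entry excludes.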

Code | Meaning                | Etymology                    | Complement | Opposite
-----|------------------------|------------------------------|------------|---------
A    | A                      | Adenosine                    | T          | B
T/U  | T                      | Thymidine/Uridine            | A          | V
G    | G                      | Guanine                      | C          | H
C    | C                      | Cytidine                     | G          | D
K    | G or T                 | Keto                         | M          | M
M    | A or C                 | Amino                        | K          | K
R    | A or G                 | Purine                       | Y          | Y
Y    | C or T                 | Pyrimidine                   | R          | R
S    | C or G                 | Strong                       | S          | W
W    | A or T                 | Weak                         | W          | S
B    | C or G or T            | not A (B comes after A)      | V          | A
V    | A or C or G            | not T/U (V comes after U)    | B          | T/U
H    | A or C or T            | not G (H comes after G)      | D          | G
D    | A or G or T            | not C (D comes after C)      | H          | C
X/N  | G or A or T or C       | any                          | N          | .
.    | not G or A or T or C   |                              | .          | N
-    | gap of indeterminate length |                         |            |
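
The Complement and Opposite columns can both be derived from the Meaning column, which may be easier to remember than the table itself. The following is a minimal Python sketch of that idea, not a reference implementation. The names CODE_TO_BASES, complement, opposite and reverse_complement are made up for this illustration. It assumes that the aliases U and X have already been replaced by T and N, and it passes the gap symbol "-" through unchanged.

```python
# Base sets for each one-letter code, transcribed from the Meaning column.
# "." stands for "no base". The aliases U and X are assumed to have been
# replaced by T and N beforehand; the gap symbol "-" is handled separately.
CODE_TO_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "M": "AC", "R": "AG", "Y": "CT",
    "S": "CG", "W": "AT",
    "B": "CGT", "V": "ACG", "H": "ACT", "D": "AGT",
    "N": "ACGT", ".": "",
}
# Reverse lookup: set of bases -> one-letter code.
BASES_TO_CODE = {frozenset(bases): code for code, bases in CODE_TO_BASES.items()}
BASE_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
ALL_BASES = frozenset("ACGT")


def complement(code: str) -> str:
    """Return the code that matches the complement of every base `code` allows."""
    paired = frozenset(BASE_COMPLEMENT[base] for base in CODE_TO_BASES[code])
    return BASES_TO_CODE[paired]


def opposite(code: str) -> str:
    """Return the code that matches exactly the bases `code` excludes."""
    return BASES_TO_CODE[ALL_BASES - frozenset(CODE_TO_BASES[code])]


def reverse_complement(sequence: str) -> str:
    """Reverse-complement a sequence; gaps ("-") are kept as they are."""
    return "".join(
        symbol if symbol == "-" else complement(symbol)
        for symbol in reversed(sequence.upper())
    )
```

For example, complement("K") gives "M", opposite("S") gives "W", and reverse_complement("GATTACA") gives "TGTAATC". Codes outside the table raise a KeyError, which is a deliberate choice in this sketch so that unexpected symbols are not silently ignored.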

