Create and manipulate Seq, MutableSeq, and SeqRecord objects using Biopython. Use when creating sequences from strings, modifying sequence data in-place, or building annotated sequence records.
Reference examples tested with: BioPython 1.83+
Before using code patterns, verify installed versions match. If versions differ:
pip show <package> then help(module.function) to check signaturesIf code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.
Create and manipulate biological sequence objects using Biopython.
"Create a sequence object" → Wrap a raw string in a typed sequence container for biological operations.
Seq('ATGC') (BioPython) — string-like, supports complement/translateMutableSeq('ATGC') (BioPython) — supports in-place editsSeqRecord(Seq(...), id=...) (BioPython) — adds metadata for file I/Ofrom Bio.Seq import Seq, MutableSeq
from Bio.SeqRecord import SeqRecord
The basic sequence object. Immutable like Python strings.
seq = Seq('ATGCGATCGATCG')
Seq objects support string-like operations:
len(seq) # Length
seq[0] # First base
seq[-1] # Last base
seq[0:10] # Slice (returns Seq)
str(seq) # Convert to string
'ATG' in seq # Membership test
seq.count('G') # Count occurrences
seq.find('ATG') # Find position (-1 if not found)
seq.upper() # Uppercase
seq.lower() # Lowercase
seq * 3 # Repeat sequence
seq.strip() # Remove leading/trailing whitespace
For in-place modifications when performance matters.
mut_seq = MutableSeq('ATGCGATCG')
mut_seq[0] = 'C' # Modify single position
mut_seq[0:3] = 'GGG' # Replace slice
mut_seq.append('A') # Add to end
mut_seq.insert(0, 'G') # Insert at position
mut_seq.pop() # Remove and return last
mut_seq.remove('G') # Remove first occurrence
mut_seq.reverse() # Reverse in place
Convert between types:
seq = Seq(mut_seq) # MutableSeq to Seq
mut_seq = MutableSeq(seq) # Seq to MutableSeq
Sequence with metadata for file I/O and analysis.
record = SeqRecord(
Seq('ATGCGATCG'),
id='gene1',
name='example_gene',
description='An example gene sequence'
)
SeqRecord attributes:
record.seq # The Seq object
record.id # Identifier string
record.name # Name string
record.description # Description string
record.features # List of SeqFeature objects
record.annotations # Dict of annotations
record.letter_annotations # Per-letter annotations (quality scores)
record.dbxrefs # Database cross-references
Goal: Transform entire records while preserving metadata.
Approach: Use SeqRecord methods that return new records with features remapped to new coordinates.
# Reverse complement (preserves ID, updates features)
rc_record = record.reverse_complement(id='gene1_rc', description='reverse complement')
# Translate to protein (creates new SeqRecord with protein)
protein_record = record.translate(id='gene1_protein')
# Quick format output (returns string in file format)
fasta_str = record.format('fasta')
genbank_str = record.format('genbank')
Slicing preserves features (adjusted to new coordinates):
# Slice SeqRecord - features are clipped/adjusted automatically
subset = record[10:50] # Features outside range are dropped
dna = Seq('ATGCGATCGATCG')
rna = Seq('AUGCGAUCGAUCG')
protein = Seq('MRCRS')
record = SeqRecord(Seq('ATGCGATCG'), id='seq1', description='My sequence')
record = SeqRecord(Seq('ATGCGATCG'), id='gene1', description='Example')
record.annotations['organism'] = 'Homo sapiens'
record.annotations['molecule_type'] = 'DNA'
from Bio.SeqFeature import SeqFeature, FeatureLocation
record = SeqRecord(Seq('ATGCGATCGATCG'), id='gene1')
feature = SeqFeature(FeatureLocation(0, 9), type='CDS', qualifiers={'product': ['Example protein']})
record.features.append(feature)
sequences = ['ATGC', 'GCTA', 'TTAA']
records = [SeqRecord(Seq(s), id=f'seq_{i}') for i, s in enumerate(sequences)]
from copy import deepcopy
new_record = deepcopy(record)
new_record.id = 'modified_copy'
record = SeqRecord(Seq('ATGCGATCG'), id='seq1')
record.seq = Seq('GGGGGATCG') # Replace entire sequence
combined_seq = seq1 + Seq('NNNN') + seq2 # With linker
combined_record = SeqRecord(combined_seq, id='combined')
# Reverse complement a gene sequence
record = SeqRecord(Seq('ATGCGATCGATCG'), id='gene1', description='Forward strand')
rc_record = record.reverse_complement(id=f'{record.id}_rc', description='Reverse complement')
# Features are remapped to new coordinates
# Translate coding sequence
cds_record = SeqRecord(Seq('ATGCGATCGATCGTAA'), id='cds1', description='Coding sequence')
protein_record = cds_record.translate(id=f'{cds_record.id}_protein', to_stop=True)
record = SeqRecord(Seq('ATGCGATCG'), id='seq1', description='Example sequence')
print(record.format('fasta'))
# >seq1 Example sequence
# ATGCGATCG
| Error | Cause | Solution |
|---|---|---|
TypeError: 'Seq' object does not support item assignment | Trying to modify immutable Seq | Use MutableSeq instead |
TypeError: SeqRecord object argument must be a Seq object | Passed string instead of Seq | Wrap string in Seq() |
| Missing annotations in output | Didn't set required annotations | Add molecule_type to annotations for GenBank output |
Need to work with sequence data?
├── Just doing string-like operations?
│ └── Use Seq
├── Need to modify sequence in-place?
│ └── Use MutableSeq
├── Need metadata (ID, description, features)?
│ └── Use SeqRecord
└── Need to write to file?
└── Use SeqRecord with appropriate annotations