Name: Bio Entrez Search
Author: GPTomics

from Bio import Entrez

Entrez.email = 'your.email@example.com'  # Required by NCBI; use your own address
Entrez.api_key = 'your_api_key'          # Optional, raises rate limit 3->10 req/sec

handle = Entrez.esearch(db='nucleotide', term='human[orgn] AND BRCA1[gene]')
record = Entrez.read(handle)
handle.close()

print(f"Found {record['Count']} records")
print(f"IDs: {record['IdList']}")  # First 20 IDs by default

Parameter	Description	Default
`db`	Database to search	Required
`term`	Search query	Required
`retmax`	Max IDs to return	20
`retstart`	Starting index (pagination)	0
`usehistory`	Store results on server	'n'
`sort`	Sort order	database-specific
`datetype`	Date field to search	'pdat'
`reldate`	Records from last N days	None
`mindate`	Start date (YYYY/MM/DD)	None
`maxdate`	End date (YYYY/MM/DD)	None

record['Count']        # Total matching records (string)
record['IdList']       # List of record IDs
record['RetMax']       # Number of IDs returned
record['RetStart']     # Starting index
record['QueryKey']     # For history server (if usehistory='y')
record['WebEnv']       # For history server (if usehistory='y')
record['TranslationSet']  # Query translations applied
record['QueryTranslation']  # Final translated query
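
Because `Count` is returned as a string and `IdList` holds at most `retmax` IDs, it can help to normalize a parsed record before acting on it. The following is a minimal sketch: `summarize_search` is a hypothetical helper (not part of Biopython), and the example dict is a made-up stand-in for the result of `Entrez.read(handle)`.

```python
def summarize_search(record):
    """Summarize a parsed esearch record (hypothetical helper, not a Biopython API)."""
    total = int(record['Count'])        # Count arrives as a string
    start = int(record['RetStart'])     # Index of the first returned ID
    ids = list(record['IdList'])
    next_start = start + len(ids)
    return {
        'total': total,
        'returned': len(ids),
        'next_start': next_start,       # Value to pass as retstart for the next page
        'has_more': next_start < total,
    }

# Illustrative values only; a real record comes from Entrez.read(handle)
example = {'Count': '45', 'RetMax': '3', 'RetStart': '0', 'IdList': ['101', '102', '103']}
print(summarize_search(example))
```

If `has_more` is true, the remaining IDs can be requested with `retstart=next_start`.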

# List all available databases
handle = Entrez.einfo()
record = Entrez.read(handle)
handle.close()
print(record['DbList'])  # ['pubmed', 'protein', 'nucleotide', ...]

# Get info about specific database
handle = Entrez.einfo(db='nucleotide')
record = Entrez.read(handle)
handle.close()

print(f"Description: {record['DbInfo']['Description']}")
print(f"Record count: {record['DbInfo']['Count']}")

# List searchable fields
for field in record['DbInfo']['FieldList']:
    print(f"{field['Name']}: {field['Description']}")

record['DbInfo']['DbName']       # Database name
record['DbInfo']['Description']  # Database description
record['DbInfo']['Count']        # Total records in database
record['DbInfo']['LastUpdate']   # Last update date
record['DbInfo']['FieldList']    # Searchable fields
record['DbInfo']['LinkList']     # Available links to other databases
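
The field list can also be used to catch mistyped field tags before a search is sent. The sketch below assumes each `FieldList` entry carries a `Name` and a `FullName` key; `unknown_field_tags` is a hypothetical helper, and the two-entry list is a stand-in for a real `record['DbInfo']['FieldList']`.

```python
import re

def unknown_field_tags(term, field_list):
    """Return field tags in `term` that match no Name or FullName in `field_list`."""
    known = set()
    for field in field_list:
        known.add(field['Name'].lower())
        known.add(field['FullName'].lower())
    # Tags are the bracketed parts of the query, e.g. 'orgn' in 'human[orgn]'
    tags = re.findall(r'\[([^\[\]]+)\]', term)
    return [tag for tag in tags if tag.strip().lower() not in known]

# Stand-in for record['DbInfo']['FieldList']; a real list has many more entries
fields = [
    {'Name': 'ORGN', 'FullName': 'Organism'},
    {'Name': 'GENE', 'FullName': 'Gene Name'},
]
print(unknown_field_tags('human[orgn] AND BRCA1[gen]', fields))  # ['gen']
```

A tag flagged here may still be accepted by NCBI as an alias, so treat the result as a hint rather than a hard validation.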

handle = Entrez.egquery(term='CRISPR')
record = Entrez.read(handle)
handle.close()

for result in record['eGQueryResult']:
    if int(result['Count']) > 0:
        print(f"{result['DbName']}: {result['Count']} records")

# Search specific fields using [field_name]
term = 'BRCA1[gene]'                    # Gene name field
term = 'human[orgn]'                    # Organism field
term = 'Homo sapiens[ORGN]'             # Full organism name
term = 'NM_007294[accn]'                # Accession number
term = 'Smith J[auth]'                  # Author (PubMed)
term = 'Nature[jour]'                   # Journal (PubMed)
term = '1000:5000[slen]'                # Sequence length range
term = 'mRNA[fkey]'                     # Feature key

term = 'BRCA1 AND human'                # Both terms
term = 'cancer OR tumor'                # Either term
term = 'human NOT mouse'                # Exclude term
term = '(BRCA1 OR BRCA2) AND human'     # Grouping

# Using date parameters
handle = Entrez.esearch(
    db='pubmed',
    term='CRISPR',
    datetype='pdat',     # Publication date
    mindate='2023/01/01',
    maxdate='2024/12/31'
)

# Or in query string
term = 'CRISPR AND 2024[pdat]'
term = 'CRISPR AND 2023:2024[pdat]'

term = 'immun*'                         # Wildcard
term = '"breast cancer"[title]'         # Exact phrase
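
Longer queries are easier to keep correct when they are assembled from small pieces instead of hand-written strings. The helpers below are a sketch of that idea; `field`, `all_of`, and `any_of` are hypothetical names, and quoting multi-word values as phrases is a design choice made here, not an NCBI requirement.

```python
def field(value, tag):
    """Attach a field tag, quoting multi-word values as an exact phrase."""
    if ' ' in value and not value.startswith('"'):
        value = f'"{value}"'
    return f'{value}[{tag}]'

def _join(operator, parts):
    """Join query parts with a boolean operator, parenthesizing groups."""
    if not parts:
        raise ValueError('at least one query part is required')
    if len(parts) == 1:
        return parts[0]
    return '(' + f' {operator} '.join(parts) + ')'

def all_of(*parts):
    """Combine parts with AND."""
    return _join('AND', parts)

def any_of(*parts):
    """Combine parts with OR."""
    return _join('OR', parts)

term = all_of(
    any_of(field('BRCA1', 'gene'), field('BRCA2', 'gene')),
    field('Homo sapiens', 'orgn'),
)
print(term)  # ((BRCA1[gene] OR BRCA2[gene]) AND "Homo sapiens"[orgn])
```

The resulting string can be passed as `term` to `Entrez.esearch()`.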

Database	`db` value	Common Fields
PubMed	`pubmed`	`[auth]`, `[title]`, `[jour]`, `[pdat]`
Nucleotide	`nucleotide`	`[orgn]`, `[gene]`, `[accn]`, `[slen]`
Protein	`protein`	`[orgn]`, `[gene]`, `[accn]`, `[molwt]`
Gene	`gene`	`[orgn]`, `[sym]`, `[chr]`
SRA	`sra`	`[orgn]`, `[platform]`, `[strategy]`
Taxonomy	`taxonomy`	`[scin]`, `[comn]`, `[rank]`
Assembly	`assembly`	`[orgn]`, `[level]`, `[refseq]`

from Bio import Entrez

Entrez.email = 'your.email@example.com'  # Use your own address

def search_ncbi(db, term, max_results=100):
    handle = Entrez.esearch(db=db, term=term, retmax=max_results)
    record = Entrez.read(handle)
    handle.close()
    return record['IdList'], int(record['Count'])

ids, total = search_ncbi('nucleotide', 'human[orgn] AND insulin[gene]')
print(f'Retrieved {len(ids)} of {total} total records')

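def search_all_ids(db, term, batch_size=10000):
    """Collect all matching IDs by paging through esearch results.

    esearch returns at most 10,000 IDs per request. For PubMed, only the
    first 10,000 matches are reachable with retstart, so this approach
    suits the other databases.
    """
    all_ids = []

    # First request asks for no IDs (retmax=0), only the total count
    handle = Entrez.esearch(db=db, term=term, retmax=0)
    record = Entrez.read(handle)
    handle.close()
    total = int(record['Count'])

    # Page through the result set one batch at a time
    for start in range(0, total, batch_size):
        handle = Entrez.esearch(db=db, term=term, retstart=start, retmax=batch_size)
        record = Entrez.read(handle)
        handle.close()
        all_ids.extend(record['IdList'])

    return all_ids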

# Store results on NCBI server for subsequent fetching
handle = Entrez.esearch(db='nucleotide', term='human[orgn] AND mRNA[fkey]', usehistory='y')
record = Entrez.read(handle)
handle.close()

webenv = record['WebEnv']
query_key = record['QueryKey']
total = int(record['Count'])

# Use webenv and query_key with efetch for batch downloads
# See batch-downloads skill for details
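
The batch step mentioned in the comments above can be sketched without tying it to a particular download call. In this sketch `batch_windows` and `fetch_from_history` are hypothetical helpers, and `fetch` is assumed to be any callable that accepts `retstart`, `retmax`, `webenv`, and `query_key` keyword arguments and returns a handle with `read()` and `close()`.

```python
def batch_windows(total, batch_size):
    """Yield (retstart, retmax) pairs that together cover `total` records."""
    for start in range(0, total, batch_size):
        yield start, min(batch_size, total - start)

def fetch_from_history(fetch, webenv, query_key, total, batch_size=500):
    """Call `fetch` once per window and collect what each handle returns."""
    chunks = []
    for start, size in batch_windows(total, batch_size):
        handle = fetch(retstart=start, retmax=size,
                       webenv=webenv, query_key=query_key)
        try:
            chunks.append(handle.read())
        finally:
            handle.close()  # Always release the connection
    return chunks

print(list(batch_windows(1200, 500)))  # [(0, 500), (500, 500), (1000, 200)]
```

With Biopython, `fetch` could be something like `functools.partial(Entrez.efetch, db='nucleotide', rettype='fasta', retmode='text')`, with `webenv`, `query_key`, and `total` taken from the search above.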

# Records from last 30 days
handle = Entrez.esearch(db='pubmed', term='CRISPR', reldate=30, datetype='pdat')
record = Entrez.read(handle)
handle.close()

def get_search_fields(db):
    handle = Entrez.einfo(db=db)
    record = Entrez.read(handle)
    handle.close()
    return [(f['Name'], f['Description']) for f in record['DbInfo']['FieldList']]

fields = get_search_fields('nucleotide')
for name, desc in fields[:10]:
    print(f'{name}: {desc}')

handle = Entrez.esearch(db='nucleotide', term='human BRCA1')
record = Entrez.read(handle)
handle.close()

# See how NCBI interpreted your query
print(f"Your query was translated to: {record['QueryTranslation']}")
# e.g., '"homo sapiens"[Organism] AND BRCA1[All Fields]'

Error	Cause	Solution
`HTTPError 429`	Rate limit exceeded	Add delays or use API key
`HTTPError 400`	Invalid query syntax	Check field names and operators
Empty IdList	No matches or typo	Check QueryTranslation field
`UserWarning`	Missing email	Set `Entrez.email`
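
For the rate-limit case, "add delays" can take the form of a retry with exponential backoff. The sketch below assumes the failure surfaces as `urllib.error.HTTPError` with code 429; `call_with_backoff` is a hypothetical helper, and the retry count and delays are illustrative defaults.

```python
import time
from urllib.error import HTTPError

def call_with_backoff(func, *args, retries=3, base_delay=1.0,
                      sleep=time.sleep, **kwargs):
    """Call `func`, retrying on HTTP 429 with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except HTTPError as err:
            # Re-raise anything that is not a rate-limit error,
            # and give up once the retry budget is spent
            if err.code != 429 or attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

A call such as `call_with_backoff(Entrez.esearch, db='pubmed', term='CRISPR')` would then return the usual handle. The `sleep` parameter exists so the delay can be replaced in tests.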

Need to search NCBI?
├── Finding records in one database?
│   └── Use Entrez.esearch()
├── Search across all databases?
│   └── Use Entrez.egquery()
├── Need database field names?
│   └── Use Entrez.einfo(db='database')
├── List all available databases?
│   └── Use Entrez.einfo() (no db argument)
├── Results > 10,000 records?
│   └── Use usehistory='y', then batch fetch
└── Need to fetch actual records?
    └── See entrez-fetch skill

Bio Entrez Search

Version Compatibility

Entrez Search

Required Setup

Core Functions

Entrez.esearch() - Search a Database

Entrez.einfo() - Database Information

Entrez.egquery() - Global Query

Search Query Syntax

Field Tags

Boolean Operators

Date Ranges

Wildcards and Phrases

Common Databases

Code Patterns

Basic Search with Pagination

Paginated Search for Large Results

Search with History Server (for Large Results)

Recent Records Only

Get Available Fields for a Database

Check Query Translation

Common Errors

Decision Tree

Related Skills
