Kbase Query

Skills for querying the KBase/BERDL Datalake via the MCP REST API. Use this when users want to explore KBase databases, list tables, get schemas, sample data, or run SQL queries against the KBase data lake. Triggers on mentions of KBase, BERDL, or requests to query biological/microbiome data stored in KBase.

Occupation
Categories: Backend

Query the KBase/BERDL Datalake MCP Server via REST API using Python.

Setup

The KBASE_TOKEN environment variable must be set. Tokens expire after ~1 week.

Python API Functions

Use execute_code with these helper functions to query the lakehouse:

import os
import requests
import pandas as pd

KBASE_TOKEN = os.environ.get('KBASE_TOKEN')
BASE_URL = 'https://hub.berdl.kbase.us/apis/mcp'

def get_headers():
    return {
        'Authorization': f'Bearer {KBASE_TOKEN}',
        'Content-Type': 'application/json',
        'accept': 'application/json'
    }

def list_databases():
    '''List all available databases'''
    resp = requests.post(
        f'{BASE_URL}/delta/databases/list',
        headers=get_headers(),
        json={'use_hms': True, 'filter_by_namespace': True}
    )
    resp.raise_for_status()
    return resp.json()['databases']

def list_tables(database):
    '''List tables in a database'''
    resp = requests.post(
        f'{BASE_URL}/delta/databases/tables/list',
        headers=get_headers(),
        json={'database': database, 'use_hms': True}
    )
    resp.raise_for_status()
    return resp.json()['tables']

def query(sql, limit=100):
    '''Execute SQL query and return DataFrame'''
    resp = requests.post(
        f'{BASE_URL}/delta/tables/query',
        headers=get_headers(),
        json={'query': sql, 'limit': limit}
    )
    resp.raise_for_status()
    data = resp.json()
    rows = data.get('result', [])
    return pd.DataFrame(rows)

Kbase Query

Occupation
Categories: Backend

Python API Functions

Use execute_code with these helper functions to query the lakehouse:

import os import requests import pandas as pd KBASE_TOKEN = os.environ.get('KBASE_TOKEN') BASE_URL = 'https://hub.berdl.kbase.us/apis/mcp' def get_headers(): return { 'Authorization': f'Bearer {KBASE_TOKEN}', 'Content-Type': 'application/json', 'accept': 'application/json' } def list_databases(): '''List all available databases''' resp = requests.post( f'{BASE_URL}/delta/databases/list', headers=get_headers(), json={'use_hms': True, 'filter_by_namespace': True} ) resp.raise_for_status() return resp.json()['databases'] def list_tables(database): '''List tables in a database''' resp = requests.post( f'{BASE_URL}/delta/databases/tables/list', headers=get_headers(), json={'database': database, 'use_hms': True} ) resp.raise_for_status() return resp.json()['tables'] def query(sql, limit=100): '''Execute SQL query and return DataFrame''' resp = requests.post( f'{BASE_URL}/delta/tables/query', headers=get_headers(), json={'query': sql, 'limit': limit} ) resp.raise_for_status() data = resp.json() rows = data.get('result', []) return pd.DataFrame(rows)

Database	Description
`nmdc_core`	NMDC microbiome data (63 tables)
`globalusers_kepangenome_parquet_1`	Pangenomic data with GTDB taxonomy
`enigma_coral`	ENIGMA coral microbiome data

Table	Description
`kegg_ko_module`	KEGG ortholog to module mappings
`cog_categories`	COG functional categories
`go_terms`	Gene Ontology terms
`ec_terms`	Enzyme Commission terms
`annotation_terms_unified`	Unified annotation data

Endpoint	Method	Description
`/delta/databases/list`	POST	List databases
`/delta/databases/tables/list`	POST	List tables in database
`/delta/databases/tables/schema`	POST	Get table schema
`/delta/tables/query`	POST	Execute SQL query
`/delta/tables/sample`	POST	Sample rows (may timeout on large tables)
`/delta/tables/count`	POST	Get row count

Kbase Query

Setup

Python API Functions

Kbase Query

Setup

Python API Functions

Example Workflow

1. Explore available databases

2. List tables in a database

3. Query data with SQL

Key Databases

Key Tables in nmdc_core

API Reference

Available Endpoints

Tips

Xurl

Acp Router

Coding Standards

Api Design

Mcp Server Patterns

Backend Patterns