Retrieves engineering documents (compressor curves, mechanical drawings, data sheets, vendor docs) from document management systems for use in NeqSim engineering tasks. Supports local directories, manual upload, and pluggable retrieval backends (e.g., stidapi for STID). USE WHEN: a task needs vendor performance data, mechanical drawings, or as-built documentation for process equipment.
Retrieve engineering documents (compressor curves, mechanical drawings, data sheets, vendor reports) for use in NeqSim task-solving workflows.
ALL downloaded documents — STID drawings, PI historian exports,
vendor datasheets, P&IDs, literature PDFs — MUST be saved to
step1_scope_and_research/references/ within the task folder.
NEVER download or save task-related files to workspace-level directories like
output/, figures/, or any path outside task_solve/YYYY-MM-DD_slug/.
```python
# CORRECT — saves inside the task folder:
TASK_DIR = "task_solve/YYYY-MM-DD_slug"
out_dir = os.path.join(TASK_DIR, "step1_scope_and_research", "references")

# WRONG — saves outside the task folder:
out_dir = os.path.join(os.path.dirname(__file__), "..", "figures", "stid_docs")  # NEVER
out_dir = "output/stid_docs"  # NEVER
```
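To make this rule hard to violate, a small path guard can reject any output directory that escapes the task folder. This is an illustrative sketch, not part of devtools; `safe_out_dir` and the placeholder `TASK_DIR` are assumptions:

```python
import os

TASK_DIR = "task_solve/YYYY-MM-DD_slug"  # placeholder task folder

def safe_out_dir(*parts):
    """Join parts under TASK_DIR and refuse any path that escapes it."""
    root = os.path.abspath(TASK_DIR)
    path = os.path.abspath(os.path.join(TASK_DIR, *parts))
    if not path.startswith(root + os.sep):
        raise ValueError(f"Refusing to write outside the task folder: {path}")
    os.makedirs(path, exist_ok=True)
    return path

refs = safe_out_dir("step1_scope_and_research", "references")  # OK
# safe_out_dir("..", "output") would raise ValueError
```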
For PDF-to-PNG conversion, write the output to the task's `figures/` directory:

```bash
python devtools/pdf_to_figures.py task_solve/YYYY-MM-DD_slug/step1_scope_and_research/references/ \
  --outdir task_solve/YYYY-MM-DD_slug/figures/
```
This rule ensures every task is self-contained and portable.
This skill is backend-agnostic — it works with any document source. The task solver checks these sources in order:

1. Local files already placed in `step1_scope_and_research/references/`
2. A retrieval backend configured via `devtools/doc_retrieval_config.yaml` (this file is gitignored)

This means the workflow works for everyone: drop documents into `references/` and the same pipeline runs.

Place documents in the task's references folder:
```
task_solve/YYYY-MM-DD_task_slug/
└── step1_scope_and_research/
    └── references/
        ├── compressor_curves.pdf
        ├── mechanical_drawing.pdf
        └── equipment_datasheet.pdf
```
Or point to an existing directory when creating the task:

```bash
python devtools/new_task.py "compressor analysis" --type B \
  --refs-dir "/path/to/existing/docs"
```
The task solver will automatically:

- pick up documents placed in `references/`
- convert PDFs to PNG figures (`devtools/pdf_to_figures.py`)
- read the extracted images with `view_image`

When a retrieval backend is configured via `devtools/doc_retrieval_config.yaml`
(gitignored — never committed), the task solver can auto-fetch documents
by equipment tag. See the config template below for setup instructions.
Use devtools/stid_download.py to download STID documents directly into a
task folder. This ensures all documents end up in the right place:
```bash
# Download documents by tag — saves to task's references/ folder
python devtools/stid_download.py --task-dir task_solve/2026-04-16_my_task \
  --inst MYINST --tags 30PT0001 30PT0002 33AI0001

# Download + convert to PNG for AI analysis
python devtools/stid_download.py --task-dir task_solve/2026-04-16_my_task \
  --inst MYINST --tags 30PT0001 --convert-png

# Download specific document numbers
python devtools/stid_download.py --task-dir task_solve/2026-04-16_my_task \
  --inst MYINST --docs E001-AS-P-XB-00001-01 E001-AS-BI000-DS-00001
```
The helper:

- saves documents to `step1_scope_and_research/references/` inside the task folder
- writes `stid_retrieval_manifest.json` for traceability
- with `--convert-png`, renders pages into the task's `figures/` directory

```python
# Generic retrieval interface used by the task solver:
from devtools.doc_retriever import retrieve_documents

docs = retrieve_documents(
    tags=['35-KA001A'],
    doc_types=['CE', 'AA', 'MD', 'DS'],
    output_dir='step1_scope_and_research/references/'
)
# Returns list of downloaded file paths, or [] if no backend configured
```
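The internals of `devtools.doc_retriever` are not shown here, but a pluggable backend can be as simple as an object with a `fetch` method. The following `LocalDirBackend` is a hypothetical sketch of such a backend, not the real API:

```python
# Hypothetical sketch of a pluggable backend behind retrieve_documents().
# The real devtools.doc_retriever API may differ; names here are illustrative.
import os
import shutil

class LocalDirBackend:
    """Simplest possible backend: copy tag-matching files from a local directory."""

    def __init__(self, source_dir):
        self.source_dir = source_dir

    def fetch(self, tags, doc_types, output_dir):
        os.makedirs(output_dir, exist_ok=True)
        downloaded = []
        for name in sorted(os.listdir(self.source_dir)):
            # Keep any file whose name mentions one of the requested tags
            if any(tag in name for tag in tags):
                dst = os.path.join(output_dir, name)
                shutil.copy(os.path.join(self.source_dir, name), dst)
                downloaded.append(dst)
        return downloaded  # [] when nothing matches, like retrieve_documents
```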
| Code | Type | When Relevant |
|---|---|---|
| CE | Performance Curves / Calculations | Compressor, pump, turbine analysis |
| DS | Data Sheet | Any equipment analysis |
| AA | General Arrangement Drawing | Physical layout, sizing |
| MD | Mechanical Drawing | Detailed dimensions, nozzles |
| RV | Vendor Manual / Report | Operating procedures, maintenance |
| RE | Report | Background reference |
| ER | Assembly / Erection Drawing | Installation, coupling details |
| PL | Parts List | Spare parts, BOM |
| PI | P&ID | Process topology |
| PF | PFD | Process flow overview |
| IN | Instrument Data Sheet | Control system design |
| SP | Specification | Material/piping requirements |
The task solver filters documents by relevance to avoid wasting time on irrelevant content. Only documents above the relevance threshold are extracted and analyzed:
```python
DOC_RELEVANCE = {
    'compressor_analysis': {
        'CE': 1.0,  # Performance curves — essential
        'DS': 0.9,  # Data sheet — essential
        'AA': 0.7,  # General arrangement — useful
        'MD': 0.6,  # Mechanical drawing — useful
        'ER': 0.6,  # Assembly drawing — useful
        'RV': 0.5,  # Vendor manual — background
        'RE': 0.4,  # Report — background
        'PL': 0.2,  # Parts list — skip
        'SP': 0.3,  # Specification — skip
    },
    'heat_exchanger_analysis': {
        'DS': 1.0, 'CE': 0.9, 'AA': 0.7, 'MD': 0.6, 'RV': 0.5,
    },
    'separator_analysis': {
        'DS': 1.0, 'AA': 0.9, 'PI': 0.8, 'MD': 0.6, 'IN': 0.7,
    },
    'pipeline_design': {
        'DS': 1.0, 'SP': 0.9, 'CE': 0.7, 'MD': 0.6,
    },
    'general': {
        'DS': 1.0, 'CE': 0.9, 'AA': 0.7, 'PI': 0.7, 'MD': 0.6,
        'RV': 0.5, 'RE': 0.4, 'ER': 0.4, 'IN': 0.5, 'SP': 0.4,
        'PL': 0.2, 'PF': 0.6,
    },
}


def filter_relevant_docs(doc_list, task_type, min_relevance=0.5):
    """Filter documents by relevance to the task type.

    Args:
        doc_list: List of dicts with at least 'docType' or 'doc_type' key
        task_type: One of the keys in DOC_RELEVANCE
        min_relevance: Minimum score to keep (default 0.5)

    Returns:
        (relevant, filtered_out) — two lists
    """
    relevance_map = DOC_RELEVANCE.get(task_type, DOC_RELEVANCE['general'])
    relevant, filtered_out = [], []
    for doc in doc_list:
        dtype = doc.get('docType') or doc.get('doc_type', '')
        score = relevance_map.get(dtype, 0.0)
        if score >= min_relevance:
            relevant.append({**doc, '_relevance': score})
        else:
            filtered_out.append({**doc, '_relevance': score,
                                 '_reason': f'Below threshold ({score} < {min_relevance})'})
    return relevant, filtered_out
```
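A quick, self-contained illustration of the threshold behavior (the relevance map and filter logic are re-declared here in trimmed form so the snippet runs on its own):

```python
# Self-contained mini-demo of the relevance filter (trimmed map, same logic).
RELEVANCE = {'CE': 0.9, 'DS': 1.0, 'PL': 0.2}

def keep_relevant(doc_list, min_relevance=0.5):
    relevant, filtered_out = [], []
    for doc in doc_list:
        score = RELEVANCE.get(doc.get('doc_type', ''), 0.0)
        bucket = relevant if score >= min_relevance else filtered_out
        bucket.append({**doc, '_relevance': score})
    return relevant, filtered_out

docs = [{'doc_type': 'CE', 'filename': 'curves.pdf'},
        {'doc_type': 'PL', 'filename': 'parts_list.pdf'}]
keep, skip = keep_relevant(docs)
# keep: curves.pdf (0.9); skip: parts_list.pdf (0.2)
```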
After documents are in references/, convert to images for AI analysis:
```python
import os

import fitz  # pymupdf


def pdf_to_pngs(pdf_path, output_dir, dpi=200):
    """Convert PDF pages to numbered PNG images."""
    doc = fitz.open(pdf_path)
    base = os.path.splitext(os.path.basename(pdf_path))[0]
    paths = []
    for i, page in enumerate(doc):
        pix = page.get_pixmap(dpi=dpi)
        out = os.path.join(output_dir, f"{base}_page{i+1}.png")
        pix.save(out)
        paths.append(out)
    doc.close()
    return paths
```
Or use the built-in utility:

```bash
python devtools/pdf_to_figures.py step1_scope_and_research/references/ --outdir figures/
```

Then use `view_image` on extracted PNGs to read compressor curves,
mechanical drawings, and data sheets.
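Note that plain lexicographic sorting puts `_page10.png` before `_page2.png`, so pages can be viewed out of order. A small natural-sort key (an illustrative helper, not part of devtools) keeps the pages in reading order:

```python
import re

def natural_key(name):
    """Sort 'x_page10.png' after 'x_page2.png' by comparing digit runs numerically."""
    return [int(t) if t.isdigit() else t for t in re.split(r"(\d+)", name)]

pngs = ["doc_page10.png", "doc_page2.png", "doc_page1.png"]
assert sorted(pngs, key=natural_key) == ["doc_page1.png", "doc_page2.png", "doc_page10.png"]
```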
After retrieval/classification, create a manifest for traceability:
```python
manifest = {
    "source": "local",  # or "backend" / "manual"
    "retrieval_date": "2026-04-16",
    "task_type": "compressor_analysis",
    "tags_searched": ["35-KA001A", "35-KA001B"],
    "documents_retrieved": [
        {
            "filename": "performance_curves.pdf",
            "doc_type": "CE",
            "title": "Performance Curves Compressor B",
            "relevance": 1.0,
            "pages": 41,
            "used_in_analysis": True
        }
    ],
    "documents_filtered_out": [
        {
            "filename": "parts_list.pdf",
            "doc_type": "PL",
            "title": "Spare Parts List",
            "relevance": 0.2,
            "reason": "Below relevance threshold (0.5)"
        }
    ]
}
# Save as step1_scope_and_research/retrieval_manifest.json
```
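Writing the manifest is a plain `json.dump`; a minimal sketch (manifest abbreviated here, directory layout as described above):

```python
import json
import os

manifest = {"source": "local", "retrieval_date": "2026-04-16",
            "documents_retrieved": [], "documents_filtered_out": []}

out_dir = "step1_scope_and_research"
os.makedirs(out_dir, exist_ok=True)
path = os.path.join(out_dir, "retrieval_manifest.json")
with open(path, "w") as f:
    json.dump(manifest, f, indent=2)
```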
The task solver uses this manifest to record which documents informed the analysis and to report its data sources, for example:
## Data Sources
- **Equipment tags:** 35-KA001A, 35-KA001B (export compressors)
- **Document source:** Local directory / Auto-retrieval / User-provided
- **Key documents used:**
- performance_curves.pdf: Vendor performance maps (41 pages)
- as_built_curves.pdf: Shop test results (4 pages)
- general_arrangement.pdf: GA drawing with dimensions
- **Documents filtered out:** 8 (parts lists, generic specs — below relevance)
```python
import json
from pathlib import Path

TASK_DIR = Path("task_solve/YYYY-MM-DD_slug")

# Load retrieval manifest to know what's available
manifest_path = TASK_DIR / 'step1_scope_and_research' / 'retrieval_manifest.json'
if manifest_path.exists():
    with open(manifest_path) as f:
        manifest = json.load(f)

    # Work only with relevant documents
    curve_docs = [d for d in manifest['documents_retrieved']
                  if d['doc_type'] == 'CE' and d['used_in_analysis']]
    print(f"Analyzing {len(curve_docs)} performance curve documents")
```
```json
{
  "data_sources": {
    "retrieval_method": "local",
    "documents_retrieved": 13,
    "documents_analyzed": 5,
    "documents_filtered_out": 8,
    "key_documents": [
      "performance_curves.pdf — Vendor Performance Maps",
      "as_built_curves.pdf — Shop Test Results"
    ]
  }
}
```
```python
from neqsim import jneqsim

# Create compressor with performance curves from extracted data
compressor = jneqsim.process.equipment.compressor.Compressor("Export Comp", feed)

# If curve data has been digitized from the images:
chart = compressor.getCompressorChart()
chart.setHeadUnit("kJ/kg")
chart.setUseCompressorChart(True)

# Add speed curves (extracted from the performance map);
# curve_data maps speed -> [(flow, head, efficiency), ...]
for speed, points in curve_data.items():
    curve = jneqsim.process.equipment.compressor.CompressorCurve(speed)
    for flow, head, eff in points:
        curve.addCurveDataPoint(flow, head, eff)
    chart.addCurve(curve)
```
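The `curve_data` structure assumed above is not defined by NeqSim itself. One plausible shape, with made-up placeholder numbers, plus a cheap sanity check that head falls as flow rises at each speed (a common sign of digitization errors when it does not):

```python
# Illustrative digitized curve data: speed [rpm] -> [(flow, head, efficiency)].
# All numbers are made-up placeholders, not vendor data.
curve_data = {
    9000.0: [(4000.0, 95.0, 0.74), (5000.0, 88.0, 0.78), (6000.0, 76.0, 0.75)],
    10500.0: [(4500.0, 120.0, 0.73), (5500.0, 110.0, 0.77), (6500.0, 96.0, 0.74)],
}

def heads_decrease_with_flow(points):
    """Digitization sanity check: head should fall monotonically as flow rises."""
    heads = [head for _, head, _ in sorted(points)]
    return all(a > b for a, b in zip(heads, heads[1:]))

assert all(heads_decrease_with_flow(pts) for pts in curve_data.values())
```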
Users can always add documents manually to references/, even when a
retrieval backend is configured. The two approaches coexist:
```
step1_scope_and_research/references/
├── [auto-retrieved] performance_curves_35KA001A.pdf   (from backend)
├── [auto-retrieved] datasheet_35KA001A.pdf            (from backend)
├── [manual] vendor_email_attachment.pdf               (user dropped in)
├── [manual] field_test_report_2025.xlsx               (user dropped in)
└── [manual] photo_nameplate.jpg                       (user dropped in)
```
The retrieval manifest tracks the source of each document:
```json
{
  "documents_retrieved": [
    {"filename": "performance_curves.pdf", "source": "backend", "doc_type": "CE"},
    {"filename": "vendor_email_attachment.pdf", "source": "manual", "doc_type": "RE"},
    {"filename": "field_test_report_2025.xlsx", "source": "manual", "doc_type": "DS"}
  ]
}
```
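One way to pick up manually added files is to diff the folder against the manifest; `find_unregistered` is an illustrative helper, not an existing devtools function:

```python
import os

def find_unregistered(ref_dir, manifest):
    """Files present in ref_dir that the manifest does not yet track (manual drops)."""
    known = {d["filename"] for d in manifest.get("documents_retrieved", [])}
    return sorted(f for f in os.listdir(ref_dir) if f not in known)
```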
Rules:

- Users can add documents to `references/` at any time; the agent should re-scan the folder if it detects new files

The initial retrieval in Step 1 may not cover everything. During Step 2 (analysis), the agent may discover it needs additional documents — for example:
When the agent identifies a data gap during analysis, it follows this