Name: Mongodb Database Architect
Author: williamzujkowski

Criteria	Embed	Reference
Relationship	1-to-1, 1-to-many (low cardinality)	many-to-many, 1-to-many (high cardinality)
Access Pattern	Always queried together	Often queried independently
Update Frequency	Infrequent updates	Frequent updates to related data
Data Growth	Bounded, predictable	Unbounded, grows over time
Document Size	<16 MB total	Risk of exceeding 16 MB limit
Atomic Writes	Need atomicity across related data	Atomicity not required
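The criteria in the table can be read as a rough decision rule. The sketch below encodes one possible reading in plain JavaScript. The function name `recommendModeling` and its input fields are hypothetical, the atomicity criterion is left out for brevity, and real designs usually weigh these criteria against each other rather than applying them mechanically.

```javascript
// Hypothetical helper: a simplified reading of the embed-vs-reference table.
// Any single "reference" signal wins, because unbounded or independently
// accessed data is the usual reason embedded documents become a problem.
function recommendModeling(rel) {
  const referenceSignals = [
    rel.cardinality === "high" || rel.cardinality === "many-to-many",
    rel.queriedIndependently === true,
    rel.frequentUpdates === true,
    rel.unboundedGrowth === true,
  ];
  return referenceSignals.some(Boolean) ? "reference" : "embed";
}

// A user's few shipping addresses: bounded and always read with the user.
const addresses = recommendModeling({
  cardinality: "low",
  queriedIndependently: false,
  frequentUpdates: false,
  unboundedGrowth: false,
});

// A user's orders: grow without bound and are queried on their own.
const orders = recommendModeling({
  cardinality: "high",
  queriedIndependently: true,
  frequentUpdates: true,
  unboundedGrowth: true,
});

console.log(addresses, orders);
```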

db.createCollection("users", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["email", "created_at"],
      properties: {
        email: {bsonType: "string", pattern: "^.+@.+$"},
        age: {bsonType: "int", minimum: 0, maximum: 150},
        created_at: {bsonType: "date"}
      }
    }
  }
})
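For a collection that already holds data, the same validator can be attached with the `collMod` command. The sketch below only builds the command document as a plain object. Choosing `validationLevel: "moderate"` and `validationAction: "warn"` is an assumption suited to a gradual rollout, not something the original specifies.

```javascript
// Same schema as the createCollection example above.
const userSchema = {
  bsonType: "object",
  required: ["email", "created_at"],
  properties: {
    email: {bsonType: "string", pattern: "^.+@.+$"},
    age: {bsonType: "int", minimum: 0, maximum: 150},
    created_at: {bsonType: "date"},
  },
};

// collMod command document.
// "moderate" does not validate updates to existing documents that are already invalid.
// "warn" logs violations instead of rejecting the write.
const collModCommand = {
  collMod: "users",
  validator: {$jsonSchema: userSchema},
  validationLevel: "moderate",
  validationAction: "warn",
};

// In mongosh, run it with: db.runCommand(collModCommand)
console.log(JSON.stringify(collModCommand, null, 2));
```

Once the logged warnings stop appearing, the action can be switched to `"error"` so that invalid writes are rejected.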

Index Type	Use Case	Example
Single Field	Equality or range on one field	`db.users.createIndex({email: 1})`
Compound (ESR)	Multi-field queries (Equality, Sort, Range)	`db.orders.createIndex({status: 1, created_at: -1, total: 1})`
Multikey	Arrays (e.g., tags, categories)	`db.products.createIndex({tags: 1})`
Text	Full-text search	`db.posts.createIndex({content: "text"})`
Geospatial	Location-based queries (2dsphere)	`db.locations.createIndex({coordinates: "2dsphere"})`
Hashed	Sharding, equality-only queries	`db.sessions.createIndex({session_id: "hashed"})`
Wildcard	Flexible schema with many fields	`db.events.createIndex({"metadata.$**": 1})`
Partial	Index subset of documents	`db.users.createIndex({last_login: 1}, {partialFilterExpression: {active: true}})`

// Covered query: all fields in index, no document access
db.orders.createIndex({user_id: 1, status: 1, total: 1})
db.orders.find({user_id: 12345, status: "shipped"}, {_id: 0, user_id: 1, status: 1, total: 1})
// explain("executionStats") shows: totalDocsExamined: 0 (covered by index)
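The conditions for a covered query can be checked mechanically. The following is a minimal sketch in plain JavaScript; `isCovered` is a hypothetical helper, and it deliberately ignores edge cases such as multikey (array) fields.

```javascript
// Hypothetical helper: checks the basic conditions for a covered query.
function isCovered(indexKeys, filter, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  // Every field in the filter must be part of the index.
  const filterOk = Object.keys(filter).every((f) => indexed.has(f));
  // Every returned field must be part of the index.
  const returned = Object.keys(projection).filter((f) => projection[f] === 1);
  const projectionOk = returned.length > 0 && returned.every((f) => indexed.has(f));
  // _id is returned by default, so it must be excluded unless it is in the index.
  const idOk = projection._id === 0 || indexed.has("_id");
  return filterOk && projectionOk && idOk;
}

const index = {user_id: 1, status: 1, total: 1};
const filter = {user_id: 12345, status: "shipped"};

// Same projection as the example above: covered.
const covered = isCovered(index, filter, {_id: 0, user_id: 1, status: 1, total: 1});
// Forgetting _id: 0 forces a document fetch: not covered.
const notCovered = isCovered(index, filter, {user_id: 1, status: 1, total: 1});

console.log(covered, notCovered);
```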

// Query: Find active users, sort by created_at descending, filter age > 18
db.users.find({status: "active", age: {$gt: 18}}).sort({created_at: -1})

// Correct index (ESR):
// Equality: status (exact match)
// Sort: created_at (sort field)
// Range: age (range filter)
db.users.createIndex({status: 1, created_at: -1, age: 1})
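The ESR ordering can also be applied programmatically. This is a small sketch, assuming the caller has already classified each field; `buildEsrIndex` is a hypothetical helper name, not part of any MongoDB driver.

```javascript
// Hypothetical helper: orders index keys as Equality, then Sort, then Range.
// equality and range are arrays of field names; sort is a {field: direction} object.
function buildEsrIndex({equality = [], sort = {}, range = []}) {
  const keys = {};
  for (const field of equality) keys[field] = 1;
  for (const [field, direction] of Object.entries(sort)) keys[field] = direction;
  for (const field of range) {
    // A field that is already present keeps its earlier position.
    if (!(field in keys)) keys[field] = 1;
  }
  return keys;
}

const spec = buildEsrIndex({
  equality: ["status"],
  sort: {created_at: -1},
  range: ["age"],
});

console.log(JSON.stringify(spec)); // {"status":1,"created_at":-1,"age":1}
```

The resulting object can be passed directly to `db.users.createIndex(spec)` in mongosh.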

Shard Key Type	Use Case	Pros	Cons
Hashed	Monotonically increasing IDs, even distribution	Uniform write distribution, no hotspots	Cannot use range queries efficiently on shard key
Ranged	Time-series data, natural ordering	Efficient range queries, targeted reads	Risk of hotspots if monotonic (e.g., timestamp)
Compound	Multi-tenant apps, complex access patterns	Balances distribution and query targeting	More complex to design
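The hotspot risk in the table can be illustrated with a toy simulation. The sketch below is plain JavaScript and makes several assumptions: three shards, fixed chunk boundaries, and 32-bit FNV-1a as a stand-in for MongoDB's internal hash function. It shows only where new inserts land, not how the balancer would later migrate chunks.

```javascript
// 32-bit FNV-1a, used here only as a stand-in for MongoDB's internal hash.
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

const SHARDS = 3;

// Ranged: chunk boundaries were set when only keys 0..2999 existed.
function rangedShard(key) {
  if (key < 1000) return 0;
  if (key < 2000) return 1;
  return 2; // the open-ended last chunk receives every new, larger key
}

// Hashed: the shard depends on the hash of the key, not its order.
function hashedShard(key) {
  return fnv1a(String(key)) % SHARDS;
}

function distribute(pickShard, keys) {
  const counts = new Array(SHARDS).fill(0);
  for (const k of keys) counts[pickShard(k)] += 1;
  return counts;
}

// 1000 new inserts with monotonically increasing keys.
const newKeys = Array.from({length: 1000}, (_, i) => 3000 + i);
const rangedCounts = distribute(rangedShard, newKeys);
const hashedCounts = distribute(hashedShard, newKeys);

console.log("ranged:", rangedCounts);
console.log("hashed:", hashedCounts);
```

With the ranged key every new insert lands on the last shard, while the hashed key spreads the same inserts across all three.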

// Enable sharding on database
sh.enableSharding("myapp")

// Shard collection with hashed shard key
sh.shardCollection("myapp.users", {user_id: "hashed"})

// MongoDB 8.0: Move unsharded collection to specific shard
db.adminCommand({moveCollection: "myapp.analytics", toShard: "shard02"})

// Shard by timestamp for time-series data
sh.shardCollection("myapp.events", {timestamp: 1})

// Zone sharding (available since MongoDB 3.4): Route data by date ranges to specific shards
sh.addShardToZone("shard01", "recent")
sh.updateZoneKeyRange("myapp.events", {timestamp: ISODate("2025-01-01")}, {timestamp: MaxKey}, "recent")
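Zone key ranges include their lower bound and exclude their upper bound. The sketch below mimics that lookup in plain JavaScript for the range defined above. `zoneFor` is a hypothetical helper, and `Infinity` stands in for `MaxKey`.

```javascript
// Zone ranges mirroring the sh.updateZoneKeyRange call above.
// Infinity stands in for MaxKey in this plain-JavaScript sketch.
const zoneRanges = [
  {min: Date.parse("2025-01-01T00:00:00Z"), max: Infinity, zone: "recent"},
];

// Hypothetical helper: lower bound inclusive, upper bound exclusive.
function zoneFor(timestamp) {
  const t = timestamp.getTime();
  const match = zoneRanges.find((r) => t >= r.min && t < r.max);
  // null means no zone applies, so the chunk may live on any shard.
  return match ? match.zone : null;
}

console.log(zoneFor(new Date("2025-06-01T00:00:00Z")));
console.log(zoneFor(new Date("2024-12-31T23:59:59Z")));
```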

// Optimized aggregation: $match early, $group before $lookup, $project late, use indexes
db.orders.aggregate([
  // Stage 1: $match FIRST (uses index, reduces documents)
  {$match: {status: "shipped", created_at: {$gte: ISODate("2025-01-01")}}},

  // Stage 2: $group (aggregation after filtering; sorting beforehand
  // would be wasted work, because $group does not preserve input order)
  {$group: {
    _id: "$user_id",
    total_spent: {$sum: "$total"},
    order_count: {$sum: 1}
  }},

  // Stage 3: $lookup AFTER $group (one join per user instead of one per order)
  {$lookup: {
    from: "users",
    localField: "_id",
    foreignField: "_id",
    as: "user_details"
  }},

  // Stage 4: $project LAST (reduce network transfer)
  {$project: {_id: 1, total_spent: 1, order_count: 1, "user_details.email": 1}}
])
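To make the `$match` and `$group` stages concrete, the sketch below reproduces their logic on an in-memory array in plain JavaScript. The sample orders are invented for illustration.

```javascript
// Invented sample data.
const orders = [
  {user_id: 1, status: "shipped", total: 40, created_at: new Date("2025-02-01T00:00:00Z")},
  {user_id: 1, status: "shipped", total: 60, created_at: new Date("2025-03-01T00:00:00Z")},
  {user_id: 2, status: "pending", total: 25, created_at: new Date("2025-02-10T00:00:00Z")},
  {user_id: 2, status: "shipped", total: 10, created_at: new Date("2024-12-15T00:00:00Z")},
  {user_id: 3, status: "shipped", total: 15, created_at: new Date("2025-01-20T00:00:00Z")},
];
const cutoff = new Date("2025-01-01T00:00:00Z");

// Equivalent of $match: filter first so later steps see fewer documents.
const matched = orders.filter((o) => o.status === "shipped" && o.created_at >= cutoff);

// Equivalent of $group: one accumulator per user_id.
const groups = new Map();
for (const o of matched) {
  const g = groups.get(o.user_id) || {_id: o.user_id, total_spent: 0, order_count: 0};
  g.total_spent += o.total;
  g.order_count += 1;
  groups.set(o.user_id, g);
}
const result = [...groups.values()];

console.log(result);
```

Only three of the five orders survive the filter, so the grouping step touches three documents instead of five. The same effect, at scale, is why `$match` belongs at the front of the pipeline.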

rs.initiate({
  _id: "myReplicaSet",
  members: [
    {_id: 0, host: "mongo1.example.com:27017", priority: 2},  // Preferred primary (highest priority)
    {_id: 1, host: "mongo2.example.com:27017", priority: 1},  // Secondary
    {_id: 2, host: "mongo3.example.com:27017", arbiterOnly: true}  // Arbiter (no data)
  ]
})
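Elections require a majority of voting members, which is why the three-member set above (two data-bearing nodes plus an arbiter) survives the loss of any single member. The arithmetic can be sketched as follows; `electionMath` is a hypothetical helper name.

```javascript
// Hypothetical helper: majority and fault tolerance for a number of voting members.
function electionMath(votingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return {majority, faultTolerance: votingMembers - majority};
}

for (const n of [3, 4, 5]) {
  const {majority, faultTolerance} = electionMath(n);
  console.log(`${n} voters: majority ${majority}, tolerates ${faultTolerance} failure(s)`);
}
```

A fourth voter raises the majority to 3 without improving fault tolerance, which is the usual argument for an odd number of voting members.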

# WiredTiger cache default: the larger of 50% of (RAM - 1 GB) or 256 MB
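As a worked example of the WiredTiger default, the internal cache is sized at the larger of 50% of (RAM - 1 GB) or 256 MB. The sketch below computes that value in GB; `defaultWiredTigerCacheGB` is a hypothetical helper name.

```javascript
// Hypothetical helper: default WiredTiger internal cache size in GB.
function defaultWiredTigerCacheGB(ramGB) {
  // Larger of 50% of (RAM - 1 GB) or 256 MB (0.25 GB).
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

for (const ram of [1, 4, 16, 64]) {
  console.log(`${ram} GB RAM -> ${defaultWiredTigerCacheGB(ram)} GB cache`);
}
```

On a 16 GB host this gives 7.5 GB, which is the number to start from when deciding whether to override `storage.wiredTiger.engineConfig.cacheSizeGB`.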

Mongodb Database Architect

Purpose & When-To-Use

Pre-Checks

Procedure

T1: Quick Schema Review & Index Recommendations (≤2k tokens)

T2: Complete Architecture Design (≤6k tokens)

1. Document Schema Design

2. Advanced Indexing Strategy

3. Sharding Strategy

4. Aggregation Pipeline Optimization

5. Replica Set Configuration

6. WiredTiger Cache & Connection Pool Tuning
