Social media scraping, business data, e-commerce via Apify actors — with auto-update workflow for actor catalog. USE WHEN Twitter, Instagram, LinkedIn, TikTok, YouTube, Facebook, Google Maps, Amazon scraping, Apify, update Apify actors, social media scraping, lead generation, web scraper.
Before executing, check for user customizations at:
~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/Apify/
If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.
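The existence check above can be made concrete with a small TypeScript sketch; the helper name and the injectable `baseDir` parameter are illustrative, not part of the skill itself:

```typescript
import { existsSync } from 'node:fs'
import { join } from 'node:path'
import { homedir } from 'node:os'

// Returns the customization directory if it exists, otherwise null.
// `baseDir` defaults to the home directory so the path matches the one above.
function customizationDir(baseDir: string = homedir()): string | null {
  const dir = join(baseDir, '.claude/PAI/USER/SKILLCUSTOMIZATIONS/Apify')
  return existsSync(dir) ? dir : null
}
```

When this returns a path, load PREFERENCES.md and any other resources from it before proceeding; when it returns null, use skill defaults.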
You MUST send this notification BEFORE doing anything else when this skill is invoked.
Send voice notification:
curl -s -X POST http://localhost:8888/notify \
-H "Content-Type: application/json" \
-d '{"message": "Running the WORKFLOWNAME workflow in the Apify skill to ACTION"}' \
> /dev/null 2>&1 &
Output text notification:
Running the **WorkflowName** workflow in the **Apify** skill to ACTION...
This is not optional. Execute this curl command immediately upon skill invocation.
Direct TypeScript access to 9 popular Apify actors with 99% token savings.
This skill is a file-based alternative to MCP: a code-first API wrapper that replaces token-heavy MCP protocol calls.
Why file-based? Filtering data in code BEFORE it returns to the model context is what yields the ~99% token savings.
Direct TypeScript access to the 9 most popular Apify actors without MCP overhead. Filter and transform data in code BEFORE it reaches the model context.
import { scrapeInstagramProfile, searchGoogleMaps } from 'actors'
// 1. Call the actor wrapper
const profile = await scrapeInstagramProfile({
username: 'target_username',
maxPosts: 50
})
// 2. Filter in code - BEFORE data reaches model!
const viral = profile.latestPosts?.filter(p => p.likesCount > 10000)
// 3. Only filtered results reach model context
console.log(viral) // ~10 posts instead of 50
Instagram - Track engagement:
import { scrapeInstagramProfile, scrapeInstagramPosts } from 'actors'
// Get profile with recent posts
const profile = await scrapeInstagramProfile({
username: 'competitor',
maxPosts: 100
})
// Filter in code - only high-performing posts from last 30 days
const thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)
const topRecent = profile.latestPosts
?.filter(p =>
new Date(p.timestamp).getTime() > thirtyDaysAgo &&
p.likesCount > 5000
)
.sort((a, b) => b.likesCount - a.likesCount)
.slice(0, 10)
// Only 10 posts reach model instead of 100!
LinkedIn - Job search:
import { searchLinkedInJobs } from 'actors'
const jobs = await searchLinkedInJobs({
keywords: 'AI engineer',
location: 'San Francisco',
remote: true,
maxResults: 200
})
// Filter in code - only senior roles with significant applicant interest
const topJobs = jobs.filter(j =>
j.seniority?.includes('Senior') &&
parseInt(j.applicants || '0') > 50
)
TikTok - Trend analysis:
import { scrapeTikTokHashtag } from 'actors'
const videos = await scrapeTikTokHashtag({
hashtag: 'ai',
maxResults: 500
})
// Filter in code - only viral content
const viral = videos
.filter(v => v.playCount > 1000000)
.sort((a, b) => b.playCount - a.playCount)
.slice(0, 20)
Google Maps - Local business leads:
import { searchGoogleMaps } from 'actors'
// Search with contact info extraction
const places = await searchGoogleMaps({
query: 'restaurants in Austin',
maxResults: 500,
includeReviews: true,
maxReviewsPerPlace: 20,
scrapeContactInfo: true // Extracts emails from websites!
})
// Filter in code - only highly-rated with email/phone
const qualifiedLeads = places
.filter(p =>
p.rating >= 4.5 &&
p.reviewsCount >= 100 &&
(p.email || p.phone)
)
.map(p => ({
name: p.name,
rating: p.rating,
reviews: p.reviewsCount,
email: p.email,
phone: p.phone,
website: p.website,
address: p.address
}))
// Export leads - only qualified results!
console.log(`Found ${qualifiedLeads.length} qualified leads`)
Google Maps - Review sentiment analysis:
import { scrapeGoogleMapsReviews } from 'actors'
const reviews = await scrapeGoogleMapsReviews({
placeUrl: 'https://maps.google.com/maps?cid=12345',
maxResults: 1000
})
// Filter in code - analyze sentiment by rating
const recentNegative = reviews
.filter(r => {
const thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)
return (
r.rating <= 2 &&
new Date(r.publishedAtDate).getTime() > thirtyDaysAgo &&
r.text.length > 50
)
})
// Identify common complaints
const complaints = recentNegative.map(r => r.text)
Amazon - Price monitoring:
import { scrapeAmazonProduct } from 'actors'
const product = await scrapeAmazonProduct({
productUrl: 'https://www.amazon.com/dp/B08L5VT894',
includeReviews: true,
maxReviews: 200
})
// Filter in code - only recent negative reviews
const recentNegative = product.reviews
?.filter(r => {
const weekAgo = Date.now() - (7 * 24 * 60 * 60 * 1000)
return (
r.rating <= 2 &&
new Date(r.date).getTime() > weekAgo
)
})
console.log(`Price: $${product.price}`)
console.log(`Rating: ${product.rating}/5`)
console.log(`Recent issues: ${recentNegative?.length} complaints`)
Any Website - Custom extraction:
import { scrapeWebsite } from 'actors'
const products = await scrapeWebsite({
startUrls: ['https://example.com/products'],
linkSelector: 'a.product-link',
maxPagesPerCrawl: 100,
pageFunction: `
async function pageFunction(context) {
const { request, $, log } = context
return {
url: request.url,
title: $('h1.product-title').text(),
price: $('span.price').text(),
inStock: $('.in-stock').length > 0,
description: $('.description').text()
}
}
`
})
// Filter in code - only available products under $100
const affordable = products.filter(p =>
p.inStock &&
parseFloat(p.price.replace(/[^0-9.]/g, '')) < 100 // strips "$" and thousands separators
)
import {
scrapeInstagramHashtag,
scrapeTikTokHashtag,
searchYouTube
} from 'actors'
// Run all platforms in parallel
const [instagramPosts, tiktokVideos, youtubeVideos] = await Promise.all([
scrapeInstagramHashtag({ hashtag: 'ai', maxResults: 100 }),
scrapeTikTokHashtag({ hashtag: 'ai', maxResults: 100 }),
searchYouTube({ query: '#ai', maxResults: 100 })
])
// Combine and filter - only viral content across all platforms
const allViral = [
...instagramPosts.filter(p => p.likesCount > 10000),
...tiktokVideos.filter(v => v.playCount > 100000),
...youtubeVideos.filter(v => v.viewsCount > 50000)
]
console.log(`Found ${allViral.length} viral posts across 3 platforms`)
import { searchGoogleMaps, scrapeLinkedInProfile } from 'actors'
// 1. Find businesses on Google Maps
const restaurants = await searchGoogleMaps({
query: 'restaurants in SF',
maxResults: 100,
scrapeContactInfo: true
})
// 2. Filter for qualified leads
const qualified = restaurants.filter(r =>
r.rating >= 4.5 &&
r.email &&
r.reviewsCount >= 50
)
// 3. Enrich with LinkedIn data (if available)
const enriched = await Promise.all(
qualified.map(async (restaurant) => {
// Try to find LinkedIn company page
// ... additional enrichment logic
return restaurant
})
)
import {
scrapeInstagramProfile,
scrapeYouTubeChannel,
scrapeTikTokProfile
} from 'actors'
// `average` is a small helper; `calculateEngagement` is left as a user-defined metric
const average = (nums: number[]) =>
  nums.length ? nums.reduce((sum, n) => sum + n, 0) / nums.length : 0
async function analyzeCompetitor(username: string) {
// Gather data from all platforms
const [instagram, youtube, tiktok] = await Promise.all([
scrapeInstagramProfile({ username, maxPosts: 30 }),
scrapeYouTubeChannel({ channelUrl: `https://youtube.com/@${username}`, maxVideos: 30 }),
scrapeTikTokProfile({ username, maxVideos: 30 })
])
// Calculate engagement metrics in code
return {
username,
instagram: {
followers: instagram.followersCount,
avgLikes: average(instagram.latestPosts?.map(p => p.likesCount) || []),
engagementRate: calculateEngagement(instagram)
},
youtube: {
subscribers: youtube.subscribersCount,
avgViews: average(youtube.videos?.map(v => v.viewsCount) || [])
},
tiktok: {
followers: tiktok.followersCount,
avgPlays: average(tiktok.videos?.map(v => v.playCount) || [])
}
}
}
Example: Instagram profile with 100 posts
MCP Approach:
1. search-actors → 1,000 tokens
2. call-actor → 1,000 tokens
3. get-actor-output → 50,000 tokens (100 unfiltered posts)
TOTAL: ~52,000 tokens
File-Based Approach:
const profile = await scrapeInstagramProfile({
username: 'user',
maxPosts: 100
})
// Filter in code - only top 10 posts
const top = profile.latestPosts
?.sort((a, b) => b.likesCount - a.likesCount)
.slice(0, 10)
// TOTAL: ~500 tokens (only 10 filtered posts reach model)
Savings: 99% reduction (52,000 → 500 tokens)
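The arithmetic behind that figure can be checked directly:

```typescript
// Token estimates from the comparison above
const mcpTokens = 1000 + 1000 + 50000 // search-actors + call-actor + unfiltered output
const fileTokens = 500                // ~10 filtered posts
const savings = 1 - fileTokens / mcpTokens
console.log(`${(savings * 100).toFixed(1)}% reduction`) // prints "99.0% reduction"
```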
Instagram:
- scrapeInstagramProfile(input) - Profile + posts
- scrapeInstagramPosts(input) - Posts from user
- scrapeInstagramHashtag(input) - Posts by hashtag
- scrapeInstagramComments(input) - Comments on post

LinkedIn:
- scrapeLinkedInProfile(input) - Profile + experience + email
- searchLinkedInJobs(input) - Job listings
- scrapeLinkedInPosts(input) - Posts from profile/company

TikTok:
- scrapeTikTokProfile(input) - Profile + videos
- scrapeTikTokHashtag(input) - Videos by hashtag
- scrapeTikTokComments(input) - Comments on video

YouTube:
- scrapeYouTubeChannel(input) - Channel + videos
- searchYouTube(input) - Search videos
- scrapeYouTubeComments(input) - Comments on video

Facebook:
- scrapeFacebookPosts(input) - Posts from pages
- scrapeFacebookGroups(input) - Group posts
- scrapeFacebookComments(input) - Post comments

Google Maps:
- searchGoogleMaps(input) - Search places (with contact extraction!)
- scrapeGoogleMapsPlace(input) - Single place details
- scrapeGoogleMapsReviews(input) - Place reviews

Amazon:
- scrapeAmazonProduct(input) - Product details + reviews
- scrapeAmazonReviews(input) - Product reviews only

Generic Web:
- scrapeWebsite(input) - Custom multi-page crawling
- scrapePage(url, pageFunction) - Single page extraction

Environment Variables:
# Required - Get from https://console.apify.com/account/integrations
APIFY_TOKEN=apify_api_xxxxx...
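A fail-fast check avoids confusing downstream errors when the token is missing; a minimal sketch (the helper name is illustrative):

```typescript
// Throws early if APIFY_TOKEN is not configured; `env` is injectable for testing.
function requireApifyToken(env: Record<string, string | undefined> = process.env): string {
  const token = env.APIFY_TOKEN
  if (!token) {
    throw new Error(
      'APIFY_TOKEN is not set. Create one at https://console.apify.com/account/integrations'
    )
  }
  return token
}
```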
Actor Run Options:
{
memory: 2048, // MB: 128, 256, 512, 1024, 2048, 4096, 8192
timeout: 300, // seconds
build: 'latest' // or specific build number
}
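These options can be bundled once and reused across calls. Note that passing them as a second argument to the wrappers is an assumption about this skill's API, not something confirmed by the catalog above:

```typescript
// Reusable run options (values mirror the table above)
const runOptions = {
  memory: 2048,   // MB
  timeout: 300,   // seconds
  build: 'latest',
}

// Hypothetical usage, if the wrappers accept an options argument:
// const profile = await scrapeInstagramProfile({ username: 'user' }, runOptions)
```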
Use File-Based (this skill): when the task maps to one of the 9 wrapped actors and you need to filter or transform large result sets in code before they reach model context.
Use MCP: when you need to discover or run actors outside this catalog.
Remember: Filter data in code BEFORE returning to model context. This is where the 99% token savings happen!