Social media scraping, business data, e-commerce via Apify actors — with auto-update workflow for actor catalog. USE WHEN Twitter, Instagram, LinkedIn, TikTok, YouTube, Facebook, Google Maps, Amazon scraping, Apify, update Apify actors, social media scraping, lead generation, web scraper.
Before executing, check for user customizations at:
~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/Apify/
If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.
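The existence check above can be made concrete with a small TypeScript sketch; the helper name and the injectable `baseDir` parameter are illustrative, not part of the skill itself:

```typescript
import { existsSync } from 'node:fs'
import { join } from 'node:path'
import { homedir } from 'node:os'

// Returns the customization directory if it exists, otherwise null.
// `baseDir` defaults to the home directory so the path matches the one above.
function customizationDir(baseDir: string = homedir()): string | null {
  const dir = join(baseDir, '.claude/PAI/USER/SKILLCUSTOMIZATIONS/Apify')
  return existsSync(dir) ? dir : null
}
```

When this returns a path, load PREFERENCES.md and any other resources from it before proceeding; when it returns null, use skill defaults.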
You MUST send this notification BEFORE doing anything else when this skill is invoked.
Send voice notification:
curl -s -X POST http://localhost:8888/notify \
-H "Content-Type: application/json" \
-d '{"message": "Running the WORKFLOWNAME workflow in the Apify skill to ACTION"}' \
> /dev/null 2>&1 &
Output text notification:
Running the **WorkflowName** workflow in the **Apify** skill to ACTION...
This is not optional. Execute this curl command immediately upon skill invocation.
Direct TypeScript access to 9 popular Apify actors with 99% token savings.
This skill is a file-based alternative to MCP: a code-first API wrapper that replaces token-heavy MCP protocol calls.
Why file-based? Filtering data in code BEFORE it returns to the model context is what yields the ~99% token savings.
Direct TypeScript access to the 9 most popular Apify actors without MCP overhead. Filter and transform data in code BEFORE it reaches the model context.
import { scrapeInstagramProfile, searchGoogleMaps } from 'actors'
// 1. Call the actor wrapper
const profile = await scrapeInstagramProfile({
username: 'target_username',
maxPosts: 50
})
// 2. Filter in code - BEFORE data reaches model!
const viral = profile.latestPosts?.filter(p => p.likesCount > 10000)
// 3. Only filtered results reach model context
console.log(viral) // ~10 posts instead of 50
Instagram - Track engagement:
import { scrapeInstagramProfile, scrapeInstagramPosts } from 'actors'
// Get profile with recent posts
const profile = await scrapeInstagramProfile({
username: 'competitor',
maxPosts: 100
})
// Filter in code - only high-performing posts from last 30 days
const thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)
const topRecent = profile.latestPosts
?.filter(p =>
new Date(p.timestamp).getTime() > thirtyDaysAgo &&
p.likesCount > 5000
)
.sort((a, b) => b.likesCount - a.likesCount)
.slice(0, 10)
// Only 10 posts reach model instead of 100!
LinkedIn - Job search:
import { searchLinkedInJobs } from 'actors'
const jobs = await searchLinkedInJobs({
keywords: 'AI engineer',
location: 'San Francisco',
remote: true,
maxResults: 200
})
// Filter in code - only senior roles with significant applicant interest
const topJobs = jobs.filter(j =>
j.seniority?.includes('Senior') &&
parseInt(j.applicants || '0') > 50
)
TikTok - Trend analysis:
import { scrapeTikTokHashtag } from 'actors'
const videos = await scrapeTikTokHashtag({
hashtag: 'ai',
maxResults: 500
})
// Filter in code - only viral content
const viral = videos
.filter(v => v.playCount > 1000000)
.sort((a, b) => b.playCount - a.playCount)
.slice(0, 20)
Google Maps - Local business leads:
import { searchGoogleMaps } from 'actors'
// Search with contact info extraction
const places = await searchGoogleMaps({
query: 'restaurants in Austin',
maxResults: 500,
includeReviews: true,
maxReviewsPerPlace: 20,
scrapeContactInfo: true // Extracts emails from websites!
})
// Filter in code - only highly-rated with email/phone
const qualifiedLeads = places
.filter(p =>
p.rating >= 4.5 &&
p.reviewsCount >= 100 &&
(p.email || p.phone)
)
.map(p => ({
name: p.name,
rating: p.rating,
reviews: p.reviewsCount,
email: p.email,
phone: p.phone,
website: p.website,
address: p.address
}))
// Export leads - only qualified results!
console.log(`Found ${qualifiedLeads.length} qualified leads`)
Google Maps - Review sentiment analysis:
import { scrapeGoogleMapsReviews } from 'actors'
const reviews = await scrapeGoogleMapsReviews({
placeUrl: 'https://maps.google.com/maps?cid=12345',
maxResults: 1000
})
// Filter in code - analyze sentiment by rating
const recentNegative = reviews
.filter(r => {
const thirtyDaysAgo = Date.now() - (30 * 24 * 60 * 60 * 1000)
return (
r.rating <= 2 &&
new Date(r.publishedAtDate).getTime() > thirtyDaysAgo &&
r.text.length > 50
)
})
// Identify common complaints
const complaints = recentNegative.map(r => r.text)
Amazon - Price monitoring:
import { scrapeAmazonProduct } from 'actors'
const product = await scrapeAmazonProduct({
productUrl: 'https://www.amazon.com/dp/B08L5VT894',
includeReviews: true,
maxReviews: 200
})
// Filter in code - only recent negative reviews
const recentNegative = product.reviews
?.filter(r => {
const weekAgo = Date.now() - (7 * 24 * 60 * 60 * 1000)
return (
r.rating <= 2 &&
new Date(r.date).getTime() > weekAgo
)
})
console.log(`Price: $${product.price}`)
console.log(`Rating: ${product.rating}/5`)
console.log(`Recent issues: ${recentNegative?.length} complaints`)
Any Website - Custom extraction:
import { scrapeWebsite } from 'actors'
const products = await scrapeWebsite({
startUrls: ['https://example.com/products'],
linkSelector: 'a.product-link',
maxPagesPerCrawl: 100,
pageFunction: `
async function pageFunction(context) {
const { request, $, log } = context
return {
url: request.url,
title: $('h1.product-title').text(),
price: $('span.price').text(),
inStock: $('.in-stock').length > 0,
description: $('.description').text()
}
}
`
})
// Filter in code - only available products under $100
const affordable = products.filter(p =>
p.inStock &&
parseFloat(p.price.replace(/[^0-9.]/g, '')) < 100 // strips "$" and thousands separators
)
import {
scrapeInstagramHashtag,
scrapeTikTokHashtag,
searchYouTube
} from 'actors'
// Run all platforms in parallel
const [instagramPosts, tiktokVideos, youtubeVideos] = await Promise.all([
scrapeInstagramHashtag({ hashtag: 'ai', maxResults: 100 }),
scrapeTikTokHashtag({ hashtag: 'ai', maxResults: 100 }),
searchYouTube({ query: '#ai', maxResults: 100 })
])
// Combine and filter - only viral content across all platforms
const allViral = [
...instagramPosts.filter(p => p.likesCount > 10000),
...tiktokVideos.filter(v => v.playCount > 100000),
...youtubeVideos.filter(v => v.viewsCount > 50000)
]
console.log(`Found ${allViral.length} viral posts across 3 platforms`)
import { searchGoogleMaps, scrapeLinkedInProfile } from 'actors'
// 1. Find businesses on Google Maps
const restaurants = await searchGoogleMaps({
query: 'restaurants in SF',
maxResults: 100,
scrapeContactInfo: true
})
// 2. Filter for qualified leads
const qualified = restaurants.filter(r =>
r.rating >= 4.5 &&
r.email &&
r.reviewsCount >= 50
)
// 3. Enrich with LinkedIn data (if available)
const enriched = await Promise.all(
qualified.map(async (restaurant) => {
// Try to find LinkedIn company page
// ... additional enrichment logic
return restaurant
})
)
import {
scrapeInstagramProfile,
scrapeYouTubeChannel,
scrapeTikTokProfile
} from 'actors'
// `average` is a small helper; `calculateEngagement` is left as a user-defined metric
const average = (nums: number[]) =>
  nums.length ? nums.reduce((sum, n) => sum + n, 0) / nums.length : 0
async function analyzeCompetitor(username: string) {
// Gather data from all platforms
const [instagram, youtube, tiktok] = await Promise.all([
scrapeInstagramProfile({ username, maxPosts: 30 }),
scrapeYouTubeChannel({ channelUrl: `https://youtube.com/@${username}`, maxVideos: 30 }),
scrapeTikTokProfile({ username, maxVideos: 30 })
])
// Calculate engagement metrics in code
return {
username,
instagram: {
followers: instagram.followersCount,
avgLikes: average(instagram.latestPosts?.map(p => p.likesCount) || []),
engagementRate: calculateEngagement(instagram)
},
youtube: {
subscribers: youtube.subscribersCount,
avgViews: average(youtube.videos?.map(v => v.viewsCount) || [])
},
tiktok: {
followers: tiktok.followersCount,
avgPlays: average(tiktok.videos?.map(v => v.playCount) || [])
}
}
}
Example: Instagram profile with 100 posts
MCP Approach:
1. search-actors → 1,000 tokens
2. call-actor → 1,000 tokens
3. get-actor-output → 50,000 tokens (100 unfiltered posts)
TOTAL: ~52,000 tokens
File-Based Approach:
const profile = await scrapeInstagramProfile({
username: 'user',
maxPosts: 100
})
// Filter in code - only top 10 posts
const top = profile.latestPosts
?.sort((a, b) => b.likesCount - a.likesCount)
.slice(0, 10)
// TOTAL: ~500 tokens (only 10 filtered posts reach model)
Savings: 99% reduction (52,000 → 500 tokens)
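The arithmetic behind that figure can be checked directly:

```typescript
// Token estimates from the comparison above
const mcpTokens = 1000 + 1000 + 50000 // search-actors + call-actor + unfiltered output
const fileTokens = 500                // ~10 filtered posts
const savings = 1 - fileTokens / mcpTokens
console.log(`${(savings * 100).toFixed(1)}% reduction`) // prints "99.0% reduction"
```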
Instagram:
- scrapeInstagramProfile(input) - Profile + posts
- scrapeInstagramPosts(input) - Posts from user
- scrapeInstagramHashtag(input) - Posts by hashtag
- scrapeInstagramComments(input) - Comments on post

LinkedIn:
- scrapeLinkedInProfile(input) - Profile + experience + email
- searchLinkedInJobs(input) - Job listings
- scrapeLinkedInPosts(input) - Posts from profile/company

TikTok:
- scrapeTikTokProfile(input) - Profile + videos
- scrapeTikTokHashtag(input) - Videos by hashtag
- scrapeTikTokComments(input) - Comments on video

YouTube:
- scrapeYouTubeChannel(input) - Channel + videos
- searchYouTube(input) - Search videos
- scrapeYouTubeComments(input) - Comments on video

Facebook:
- scrapeFacebookPosts(input) - Posts from pages
- scrapeFacebookGroups(input) - Group posts
- scrapeFacebookComments(input) - Post comments

Google Maps:
- searchGoogleMaps(input) - Search places (with contact extraction!)
- scrapeGoogleMapsPlace(input) - Single place details
- scrapeGoogleMapsReviews(input) - Place reviews

Amazon:
- scrapeAmazonProduct(input) - Product details + reviews
- scrapeAmazonReviews(input) - Product reviews only

Generic Web:
- scrapeWebsite(input) - Custom multi-page crawling
- scrapePage(url, pageFunction) - Single page extraction

Environment Variables:
# Required - Get from https://console.apify.com/account/integrations
APIFY_TOKEN=apify_api_xxxxx...
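A fail-fast check avoids confusing downstream errors when the token is missing; a minimal sketch (the helper name is illustrative):

```typescript
// Throws early if APIFY_TOKEN is not configured; `env` is injectable for testing.
function requireApifyToken(env: Record<string, string | undefined> = process.env): string {
  const token = env.APIFY_TOKEN
  if (!token) {
    throw new Error(
      'APIFY_TOKEN is not set. Create one at https://console.apify.com/account/integrations'
    )
  }
  return token
}
```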
Actor Run Options:
{
memory: 2048, // MB: 128, 256, 512, 1024, 2048, 4096, 8192
timeout: 300, // seconds
build: 'latest' // or specific build number
}
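These options can be bundled once and reused across calls. Note that passing them as a second argument to the wrappers is an assumption about this skill's API, not something confirmed by the catalog above:

```typescript
// Reusable run options (values mirror the table above)
const runOptions = {
  memory: 2048,   // MB
  timeout: 300,   // seconds
  build: 'latest',
}

// Hypothetical usage, if the wrappers accept an options argument:
// const profile = await scrapeInstagramProfile({ username: 'user' }, runOptions)
```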
Use File-Based (this skill): when the task maps to one of the 9 wrapped actors and you need to filter or transform large result sets in code before they reach model context.
Use MCP: when you need to discover or run actors outside this catalog.
Remember: Filter data in code BEFORE returning to model context. This is where the 99% token savings happen!