Extract Reddit Posts and Discussions via API
Subreddit posts, comment threads, user history, and cross-platform search. Structured JSON, cursor pagination, no Reddit app registration.
Reddit Has the Best Unfiltered Opinion Data on the Internet
Reddit is where people say what they actually think. Product reviews, technology comparisons, hiring experiences, brand complaints — the discussions on Reddit are more honest than any survey or focus group. Google increasingly surfaces Reddit threads in search results because the content is genuinely useful.
But accessing Reddit data programmatically has gotten harder. Reddit's official API went paid in 2023, killing most third-party apps. The free tier (100 requests per minute for personal use) requires OAuth and a registered app, and comes with strict usage policies that ban commercial data collection.
Pushshift, the academic archive that most researchers relied on, was restricted to moderation tools. The Reddit data pipeline that powered research, product development, and market analysis went dark for most teams.
Five Endpoints for Reddit Data
Subreddit Posts
Extract posts from any subreddit. Returns titles, body text, scores, comment counts, URLs, author info, and timestamps. Supports sort options (hot, new, top, rising).
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| subreddit | string | Yes | Subreddit name (without r/) |
| sort | string | No | Sort order: hot, new, top, rising |
| count | integer | No | Number of posts to return |
| cursor | string | No | Pagination cursor from previous response |
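The count and cursor parameters can be combined into a simple pagination loop. The following is a minimal sketch, assuming each response carries the has_more and cursor fields shown in this endpoint's response example. The names iter_posts, fetch_posts_page, and max_pages are illustrative and not part of the API, and the standard library's urllib is used here only to keep the sketch dependency-free.

```python
import json
import urllib.request
from typing import Callable, Iterator, Optional

BASE = "https://api.anysite.io"  # base URL used in this page's examples
API_KEY = "YOUR_API_KEY"         # placeholder


def fetch_posts_page(payload: dict) -> dict:
    """POST one request to the subreddit posts endpoint and decode the JSON body."""
    request = urllib.request.Request(
        f"{BASE}/api/reddit/posts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"access-token": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_posts(
    subreddit: str,
    sort: str = "new",
    count: int = 25,
    max_pages: int = 10,  # hypothetical safety cap so a loop never runs unbounded
    fetch_page: Callable[[dict], dict] = fetch_posts_page,
) -> Iterator[dict]:
    """Yield posts page by page, following the cursor until has_more is false."""
    cursor: Optional[str] = None
    for _ in range(max_pages):
        payload = {"subreddit": subreddit, "sort": sort, "count": count}
        if cursor:
            payload["cursor"] = cursor
        page = fetch_page(payload)
        yield from page.get("posts", [])
        cursor = page.get("cursor")
        # Assumption: a false has_more or a missing cursor marks the last page.
        if not page.get("has_more") or not cursor:
            break
```

Each page costs one request, so max_pages also acts as a rough credit budget. Passing a different fetch_page makes the loop easy to exercise without network access.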
Response
{
"posts": [
{
"id": "1abc234",
"title": "Just migrated our data pipeline to event-driven architecture. AMA.",
"body": "After 6 months of work, we moved from batch processing to...",
"author": "data_engineer_jane",
"score": 847,
"upvote_ratio": 0.94,
"num_comments": 234,
"url": "https://reddit.com/r/dataengineering/comments/1abc234/",
"created_utc": "2026-03-08T15:30:00Z",
"subreddit": "dataengineering",
"is_self": true,
"link_url": null
}
],
"has_more": true,
"cursor": "eyJhZnRlciI6InQzXzFhYmMyMzQifQ=="
}
Post Comments
Extract the full comment thread for any post. Returns nested comment trees with author, score, timestamps, and reply chains.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| post_url | string | Yes* | Reddit post URL |
| post_id | string | Yes* | Reddit post ID (alternative to post_url) |
| sort | string | No | Comment sort: best, top, new, controversial |

* Provide either post_url or post_id.
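Because comments come back as a nested tree, most analysis starts by flattening it. The sketch below walks the replies arrays recursively and records each comment's depth. It assumes every comment has a replies list, as in this endpoint's response example. The walk_comments helper is illustrative, and the thread data is a trimmed copy of that example.

```python
from typing import Iterator


def walk_comments(comments: list, depth: int = 0) -> Iterator[tuple]:
    """Yield (depth, comment) pairs in thread order: parents before their replies."""
    for comment in comments:
        yield depth, comment
        yield from walk_comments(comment.get("replies", []), depth + 1)


# Trimmed copy of the response example for this endpoint.
thread = {
    "comments": [
        {
            "id": "c5d6e7f",
            "author": "devops_mike",
            "score": 234,
            "replies": [
                {"id": "g8h9i0j", "author": "data_engineer_jane", "score": 89, "replies": []}
            ],
        }
    ]
}

flat = [(depth, c["id"], c["author"]) for depth, c in walk_comments(thread["comments"])]
for depth, comment_id, author in flat:
    # Indent replies under their parent to mirror the thread layout.
    print("  " * depth + f"{author} ({comment_id})")
```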
Response
{
"comments": [
{
"id": "c5d6e7f",
"author": "devops_mike",
"body": "We did the same migration last year. Key lesson: start with...",
"score": 234,
"created_utc": "2026-03-08T16:15:00Z",
"replies": [
{
"id": "g8h9i0j",
"author": "data_engineer_jane",
"body": "Totally agree. We hit the same wall with...",
"score": 89,
"replies": []
}
]
}
]
}
Search Posts
Search across all of Reddit by keyword. Returns matching posts from any subreddit with engagement metrics.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search keywords |
| subreddit | string | No | Limit to specific subreddit |
| sort | string | No | relevance, top, new, comments |
| count | integer | No | Results per page |
| cursor | string | No | Pagination cursor |
User Posts
Extract all posts from a specific user. Useful for understanding a user's expertise, interests, and posting patterns across all subreddits.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| username | string | Yes | Reddit username (without u/) |
| sort | string | No | Sort order: hot, new, top |
| count | integer | No | Number of posts to return |
| cursor | string | No | Pagination cursor |
User Comments
Extract a user's comment history across all subreddits. Reveals their areas of expertise and engagement patterns.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| username | string | Yes | Reddit username (without u/) |
| sort | string | No | Sort order: hot, new, top |
| count | integer | No | Number of comments to return |
| cursor | string | No | Pagination cursor |
Code Examples
Production-ready examples for extracting Reddit data.
Product Sentiment Analysis
import requests

API_KEY = "YOUR_API_KEY"
BASE = "https://api.anysite.io"
headers = {"access-token": API_KEY}

# Search for product discussions
results = requests.post(
    f"{BASE}/api/reddit/search/posts",
    headers=headers,
    json={"query": "TechCorp review", "sort": "top", "count": 50},
).json()

# Pull comments from top discussions
for post in results["posts"][:10]:
    comments = requests.post(
        f"{BASE}/api/reddit/posts/comments",
        headers=headers,
        json={"post_url": post["url"]},
    ).json()
    print(f"\n[{post['score']} pts] {post['title']}")
    print(f"  {post['num_comments']} comments in r/{post['subreddit']}")
    for comment in comments["comments"][:3]:
        print(f"    > {comment['body'][:100]}... ({comment['score']} pts)")
Competitive Intelligence
import requests

API_KEY = "YOUR_API_KEY"
BASE = "https://api.anysite.io"
headers = {"access-token": API_KEY}

# Monitor competitor mentions across key subreddits
subreddits = ["dataengineering", "devops", "machinelearning", "SaaS"]
competitor = "CompetitorName"
all_mentions = []

for sub in subreddits:
    results = requests.post(
        f"{BASE}/api/reddit/search/posts",
        headers=headers,
        json={"query": competitor, "subreddit": sub, "sort": "new", "count": 25},
    ).json()
    all_mentions.extend(results["posts"])

print(f"Found {len(all_mentions)} mentions of {competitor}")
for post in sorted(all_mentions, key=lambda p: p["score"], reverse=True)[:10]:
    print(f"  r/{post['subreddit']} [{post['score']}] {post['title'][:80]}")
Subreddit Posts
# Get top posts from a subreddit
curl -X POST "https://api.anysite.io/api/reddit/posts" \
  -H "access-token: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"subreddit": "dataengineering", "sort": "top", "count": 25}'
Search Across Reddit
# Search all subreddits by keyword
curl -X POST "https://api.anysite.io/api/reddit/search/posts" \
  -H "access-token: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"query": "best data pipeline tool 2026", "count": 50}'
Post Comments
# Get full comment thread for a post
curl -X POST "https://api.anysite.io/api/reddit/posts/comments" \
  -H "access-token: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"post_url": "https://reddit.com/r/dataengineering/comments/abc123/"}'
CLI Commands
anysite api /api/reddit/posts subreddit=dataengineering sort=top count=25
anysite api /api/reddit/search/posts \
  query="kubernetes alternatives" sort=top count=50
anysite api /api/reddit/posts/comments \
  post_url="https://reddit.com/r/dataengineering/comments/abc123/"
anysite api /api/reddit/user/posts username=data_engineer_jane
anysite api /api/reddit/user/comments username=data_engineer_jane
Pipeline YAML — Market Research
name: reddit-market-research
sources:
  product_discussions:
    endpoint: /api/reddit/search/posts
    input:
      query: "data pipeline tool recommendation"
      count: 50
  detailed_threads:
    endpoint: /api/reddit/posts/comments
    depends_on: product_discussions
    input:
      post_url: ${product_discussions.url}
    on_error: skip
  competitor_mentions:
    endpoint: /api/reddit/search/posts
    input:
      query: ${file:competitor_names.txt}
      count: 25
    parallel: 3
storage:
  format: parquet
  path: ./data/reddit-research
Use Cases
Product Research and Feature Validation
Problem
Before building a feature, you want to know if real users are asking for it. Before launching a product, you want to know what frustrations exist with current solutions. Surveys are biased — people give you answers they think you want to hear. Reddit gives you what they actually say.
Solution
Search for discussions about your problem space. Pull comment threads from "What tool do you use for X?" and "Frustrated with Y" posts. Analyze the recurring pain points, feature requests, and tool recommendations.
Result
Unfiltered market intelligence. Know exactly what problems users describe in their own words, which solutions they've tried, and what they wish existed.
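One simple way to surface recurring tool recommendations is to count how many comments mention each candidate. The sketch below does this with a case-insensitive whole-word match. The count_mentions helper, the comment bodies, and the tool names are all hypothetical placeholders, and real analysis would usually need aliases and misspellings handled as well.

```python
import re
from collections import Counter


def count_mentions(texts, terms):
    """Count how many texts mention each term (case-insensitive, whole words)."""
    counts = Counter()
    for text in texts:
        for term in terms:
            if re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE):
                counts[term] += 1
    return counts


# Hypothetical comment bodies and tool names, for illustration only.
bodies = [
    "We moved from ToolA to ToolB last year.",
    "toola is fine until the job count explodes",
    "Honestly, cron plus scripts.",
]
counts = count_mentions(bodies, ["ToolA", "ToolB", "ToolC"])
for term, n in counts.most_common():
    print(f"{term}: mentioned in {n} comments")
```

Counting comments rather than raw occurrences keeps one enthusiastic commenter from dominating the tally.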
Brand and Reputation Monitoring
Problem
Reddit threads about your product rank in Google search results. A negative experience post with 500 upvotes becomes the first thing potential customers see. Without monitoring, you discover these threads weeks later.
Solution
Search for your brand name, product names, and key executive names across Reddit. Pull comment threads to understand sentiment. Respond to legitimate complaints. Track mention volume and sentiment over time.
Result
Real-time brand intelligence from Reddit. Catch negative threads early, understand what drives positive mentions, and track your reputation as it evolves.
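Tracking mention volume over time can start as a per-day count of search results. This is a minimal sketch, assuming each post carries an ISO 8601 created_utc string like the one in the subreddit posts response example. The mentions_per_day helper and the sample posts are illustrative.

```python
from collections import Counter
from datetime import datetime


def mentions_per_day(posts):
    """Count posts per calendar day (UTC) from ISO 8601 created_utc strings."""
    days = Counter()
    for post in posts:
        # Swap a trailing "Z" for "+00:00" so fromisoformat also works before Python 3.11.
        created = datetime.fromisoformat(post["created_utc"].replace("Z", "+00:00"))
        days[created.date().isoformat()] += 1
    return dict(sorted(days.items()))


# Hypothetical search results, reduced to the one field this sketch needs.
posts = [
    {"created_utc": "2026-03-08T15:30:00Z"},
    {"created_utc": "2026-03-08T23:59:00Z"},
    {"created_utc": "2026-03-09T00:10:00Z"},
]
daily = mentions_per_day(posts)
print(daily)
```

A sudden jump in a day's count is a cheap signal that a thread deserves a closer look.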
Content Ideation and SEO Research
Problem
Finding topics that your target audience actively discusses and cares about is the foundation of content strategy. Reddit discussions show you the exact questions people ask, the language they use, and the problems they describe.
Solution
Monitor relevant subreddits for trending topics, recurring questions, and popular comparisons. Use post scores and comment counts as a proxy for audience interest. Extract the exact phrasing people use as seed keywords for SEO research.
Result
Content calendars built from real audience demand, not keyword tool estimates. The questions people ask on Reddit today are the blog posts and landing pages you should write tomorrow.
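Ranking posts by an interest proxy and pulling out question-style titles might look like the sketch below. It assumes posts expose title, score, and num_comments as in the subreddit posts response example. The weighting of comments against score is an arbitrary assumption to tune, and the helper names and sample posts are hypothetical.

```python
def rank_topics(posts, comment_weight=2.0, top_n=10):
    """Sort posts by a simple interest proxy: score + comment_weight * num_comments."""
    def interest(post):
        return post["score"] + comment_weight * post["num_comments"]
    return sorted(posts, key=interest, reverse=True)[:top_n]


def question_titles(posts):
    """Keep titles phrased as questions, as candidate seed keywords."""
    return [p["title"] for p in posts if p["title"].strip().endswith("?")]


# Hypothetical posts, for illustration only.
posts = [
    {"title": "What tool do you use for X?", "score": 120, "num_comments": 80},
    {"title": "Frustrated with Y", "score": 300, "num_comments": 10},
    {"title": "Is Z worth it?", "score": 40, "num_comments": 5},
]
ranked = rank_topics(posts)
questions = question_titles(ranked)
print([p["title"] for p in ranked])
print(questions)
```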
Academic and Social Research
Problem
Reddit is one of the largest sources of public discourse data. Researchers studying online communities, public opinion, technology adoption, or social dynamics need structured access to posts and comments at scale.
Solution
Extract subreddit posts and comment threads programmatically. Build datasets spanning specific time periods, topics, or communities. Store in Parquet or databases for analysis with standard research tools.
Result
Structured datasets from Reddit for academic analysis. Reproducible data collection pipelines that can be re-run to track changes over time.
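For the database option, a re-runnable collection step needs to avoid duplicating rows. The sketch below stores posts in SQLite keyed by post id, so a second run replaces existing rows with refreshed values. It uses the standard library's sqlite3 rather than Parquet, the table layout and the save_posts helper are assumptions, and the sample post reuses fields from the subreddit posts response example.

```python
import sqlite3


def save_posts(conn, posts):
    """Insert posts keyed by id; re-running replaces a row instead of duplicating it."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               id TEXT PRIMARY KEY,
               subreddit TEXT,
               title TEXT,
               score INTEGER,
               num_comments INTEGER,
               created_utc TEXT
           )"""
    )
    conn.executemany(
        """INSERT OR REPLACE INTO posts
               (id, subreddit, title, score, num_comments, created_utc)
           VALUES (:id, :subreddit, :title, :score, :num_comments, :created_utc)""",
        posts,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # use a file path for a persistent dataset
post = {
    "id": "1abc234",
    "subreddit": "dataengineering",
    "title": "Just migrated our data pipeline to event-driven architecture. AMA.",
    "score": 847,
    "num_comments": 234,
    "created_utc": "2026-03-08T15:30:00Z",
}
save_posts(conn, [post])
save_posts(conn, [{**post, "score": 900}])  # simulated second run with a newer score
rows = conn.execute("SELECT id, score FROM posts").fetchall()
print(rows)
```

If the study needs to track how scores change, a separate snapshot table keyed by (id, collected_at) would preserve history instead of overwriting it.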
How Anysite Compares
| Feature | Anysite | Reddit API (Free) | Reddit API (Paid) | Apify | Pushshift |
|---|---|---|---|---|---|
| Subreddit posts | 1 credit per request | 100 req/min | Higher limits | Actor-based | Restricted |
| Comments | 1 credit per thread | 100 req/min | Higher limits | Actor-based | Restricted |
| Search | Cross-platform | Reddit only | Reddit only | Actor-based | Historical only |
| User history | Posts + comments | Available | Available | Actor-based | Restricted |
| Authentication | API key | OAuth 2.0 | OAuth 2.0 + contract | API key | Moderator access |
| Commercial use | Allowed | Restricted | Contract required | Allowed | Not allowed |
| Pipeline support | YAML pipelines | None | None | Actor scheduling | None |
| Other platforms | 9+ sources | Reddit only | Reddit only | 1,800+ actors | Reddit only |
Reddit's free API tier (100 req/min) requires OAuth and prohibits commercial data collection. Their paid enterprise tier requires a sales contract. Anysite provides the same data access with a simple API key and credits — no OAuth, no usage restrictions, no contract negotiations.
Pushshift, the academic archive that researchers relied on, is now restricted to Reddit moderation tools. The historical data that powered most Reddit research is no longer publicly accessible.
Pricing
All endpoints cost 1 credit per request.
Endpoint Costs
| Endpoint | Cost |
|---|---|
| Subreddit posts | 1 credit |
| Post comments | 1 credit |
| Search posts | 1 credit |
| User posts | 1 credit |
| User comments | 1 credit |
Cost Examples
| Use Case | Monthly Volume | Credits | Recommended Plan |
|---|---|---|---|
| Monitor 5 subreddits (daily) | ~150 requests | ~150 | Starter ($49/mo) |
| Brand monitoring (daily search + threads) | ~300 requests | ~300 | Starter ($49/mo) |
| Market research (50 threads + comments) | ~100 requests | ~100 | Starter ($49/mo) |
| Full pipeline (search + comments + user history) | ~1,000 requests | ~1,000 | Starter ($49/mo) |
Reddit data extraction is among the most cost-efficient use cases. Comprehensive monitoring and research stay well within the Starter plan's 15,000 monthly credits.
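The figures in the cost table follow from simple arithmetic, since every request costs one credit. As a rough sketch, assuming a 30-day month and one request per subreddit per day (the monthly_credits helper is illustrative):

```python
STARTER_MONTHLY_CREDITS = 15_000  # Starter plan allowance stated on this page


def monthly_credits(requests_per_day, days=30, credits_per_request=1):
    """Credits used per month when every request costs the same flat rate."""
    return requests_per_day * days * credits_per_request


monitoring = monthly_credits(requests_per_day=5)  # 5 subreddits, one request each per day
share = monitoring / STARTER_MONTHLY_CREDITS
print(f"{monitoring} credits, {share:.1%} of the Starter allowance")
# prints "150 credits, 1.0% of the Starter allowance"
```

Paginated collection multiplies this figure by the number of pages fetched per run, so deep crawls should be budgeted per page rather than per subreddit.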
Frequently Asked Questions
anysite llm classify can categorize posts by sentiment, topic, or custom categories.
Related Endpoints
Start Extracting Reddit Data
7-day free trial with 1,000 credits. Posts, comments, search, and user history. No Reddit account required.