Wikipedia Data API
7 endpoints covering articles, full-text search, categories, links, and multilingual data. Clean JSON, no scraping, no continuation tokens, no HTML parsing.
What the Wikipedia API gives you
Wikipedia holds the world's largest body of structured encyclopaedic knowledge: millions of articles across hundreds of languages, all interlinked and categorised. The problem is that the raw MediaWiki API is verbose, inconsistent, and painful to work with at scale. The Anysite Wikipedia API wraps it in a clean, consistent REST interface: one POST call returns a fully structured article, section tree, category memberships, or search results. No parsing HTML, no dealing with continuation tokens, no maintaining fragile scrapers.
All Wikipedia endpoints
| Endpoint | What it returns |
|---|---|
| /api/wikipedia/articles | Full article metadata: title, description, summary, Wikidata ID, image, categories, langlinks, last revision, and canonical URL |
| /api/wikipedia/articles/categories | Category memberships for an article: category title, hidden flag, and canonical web URL |
| /api/wikipedia/articles/content | Full plain-text article content plus a section index (level, heading, byte offset) for structured navigation |
| /api/wikipedia/articles/langlinks | Interlanguage links: language code and localized title for each linked Wikipedia edition |
| /api/wikipedia/articles/links | Internal links within an article: title and canonical URL for each outbound link |
| /api/wikipedia/articles/search | Full-text search results: ranked articles with snippet, word count, size, and last-updated timestamp |
| /api/wikipedia/categories/members | Pages and subcategories belonging to a category: title, type, added date, and canonical URL |
What you can build
Knowledge graphs
Traverse the Wikipedia link graph at scale. Use /articles/links and /categories/members to map conceptual relationships between topics, then build ontologies, populate knowledge bases, or feed entity resolution pipelines.
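A link-graph traversal of this kind might be sketched as a bounded breadth-first crawl. The function below is illustrative only: `crawl_links`, `fetch_links`, `max_depth`, and `max_nodes` are hypothetical names, and `fetch_links` stands in for whatever adapter you write around /articles/links (the response field names are not specified on this page, so the adapter is left out).

```python
from collections import deque
from typing import Callable, Iterable


def crawl_links(
    start: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_depth: int = 2,
    max_nodes: int = 500,
) -> dict[str, list[str]]:
    """Breadth-first crawl returning an adjacency list {title: [linked titles]}.

    fetch_links is a caller-supplied function (hypothetical) that returns the
    outbound link titles for one article, e.g. by calling /articles/links.
    """
    graph: dict[str, list[str]] = {}
    queue = deque([(start, 0)])
    seen = {start}
    while queue and len(graph) < max_nodes:
        title, depth = queue.popleft()
        neighbours = list(fetch_links(title))
        graph[title] = neighbours
        if depth >= max_depth:
            continue  # keep this node's links, but do not expand them
        for nxt in neighbours:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return graph
```

Capping both depth and node count keeps the number of API calls predictable, since each visited article costs one call to the link endpoint.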
Content research and summarisation
Pull structured article content for any topic with /articles/content. The section tree lets you extract only the sections you need, which suits LLM context windows, research assistants, and automated briefing tools.
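As a rough sketch of section extraction, the helper below slices one section out of a content response using byte offsets. It rests on several assumptions that this page does not confirm: that each section entry uses the keys `heading`, `level`, and `byte_offset`, that offsets index into the UTF-8 encoding of `text`, and that an offset points at the start of the section. The name `extract_section` is hypothetical.

```python
def extract_section(content: dict, heading: str) -> str:
    """Return one section (including its subsections) from a content response.

    Assumed shape: {"text": str, "sections": [{"heading", "level", "byte_offset"}, ...]}.
    """
    raw = content["text"].encode("utf-8")  # offsets are assumed to be UTF-8 byte offsets
    sections = content["sections"]
    for i, sec in enumerate(sections):
        if sec["heading"] != heading:
            continue
        start = sec["byte_offset"]
        end = len(raw)
        # The section ends where the next heading of the same or a higher level begins.
        for later in sections[i + 1:]:
            if later["level"] <= sec["level"]:
                end = later["byte_offset"]
                break
        return raw[start:end].decode("utf-8")
    raise KeyError(heading)
```

Slicing the encoded bytes rather than the Python string matters for non-ASCII text, where character positions and byte positions diverge.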
NLP training datasets
Collect clean plain-text corpora across hundreds of languages. Use /articles/search to find articles by topic, then /articles/content to retrieve full text: multilingual, structured, and consistent.
Multilingual entity linking
Map named entities across languages using /articles/langlinks. Given an article in English, retrieve its equivalents in Spanish, Japanese, or Arabic, enabling cross-lingual disambiguation and translation pipelines.
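One way to turn a langlinks response into a lookup table is sketched below. The entry keys `lang` and `title` are assumptions based on the endpoint description (language code and localized title), and `titles_by_language` is a hypothetical helper name.

```python
from typing import Iterable, Optional


def titles_by_language(
    langlinks: Iterable[dict],
    wanted: Optional[set[str]] = None,
) -> dict[str, str]:
    """Map language code -> localized title, optionally limited to `wanted` codes.

    Each entry is assumed to look like {"lang": "es", "title": "..."}.
    """
    mapping: dict[str, str] = {}
    for link in langlinks:
        code = link["lang"]
        if wanted is None or code in wanted:
            mapping[code] = link["title"]
    return mapping
```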
Quick start
All Wikipedia endpoints use the access-token header for authentication. Here is a minimal example that fetches a Wikipedia article by title:
curl -X POST https://api.anysite.io/api/wikipedia/articles -H "access-token: YOUR_API_KEY" -H "Content-Type: application/json" -d '{
"article": "Python (programming language)",
"language": "en"
}'
To retrieve the full plain-text content and section structure of an article:
curl -X POST https://api.anysite.io/api/wikipedia/articles/content -H "access-token: YOUR_API_KEY" -H "Content-Type: application/json" -d '{
"article": "Albert Einstein"
}'
To search Wikipedia for articles matching a query:
curl -X POST https://api.anysite.io/api/wikipedia/articles/search -H "access-token: YOUR_API_KEY" -H "Content-Type: application/json" -d '{
"keyword": "machine learning",
"count": 10
}'
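The same calls can be made from Python. The sketch below mirrors the curl examples using only the standard library; the helper names `build_request` and `call` are hypothetical, and the shape of the JSON that comes back is not assumed here.

```python
import json
import urllib.request

BASE_URL = "https://api.anysite.io"  # host taken from the curl examples


def build_request(path: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Wikipedia endpoint."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"access-token": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def call(path: str, payload: dict, api_key: str, timeout: float = 30.0):
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `call("/api/wikipedia/articles/search", {"keyword": "machine learning", "count": 10}, "YOUR_API_KEY")` corresponds to the search request shown in curl. Separating request construction from sending makes the payload easy to inspect without touching the network.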
Plans
Every Wikipedia endpoint is included in all Anysite plans, so there's no per-endpoint pricing to track. Use MCP Unlimited for flat-rate access through your AI tools, or a credit plan for REST & CLI at scale.
Frequently asked questions
"Albert Einstein"), a numeric page ID (e.g. "736"), or a full Wikipedia URL (e.g. "https://en.wikipedia.org/wiki/Photosynthesis"). Any of these forms resolves to the same article.language parameter with the two-letter Wikipedia language code โ for example "de" for German, "fr" for French, or "ja" for Japanese. The default is "en". Every endpoint that accepts an article or category parameter also accepts language.text field, along with a sections array. Each section entry carries an index, heading level, anchor slug, and byte offset into text โ so you can slice out any individual section without parsing the full body.Category: prefix โ both "Machine learning" and "Category:Machine learning" are valid. The response lists every direct member (articles and subcategories), with each entry carrying its title, type, addition date, and canonical Wikipedia URL./articles/links returns the internal Wikipedia links within the article body โ the blue links to other Wikipedia pages. /articles/categories returns the categories the article has been assigned to by editors (shown at the bottom of a Wikipedia page). They serve different graph-traversal purposes: links give you the conceptual neighbourhood; categories give you the editorial classification.Related endpoints
Start using the Wikipedia API
Structured Wikipedia data (articles, search, categories, and link graphs) via one clean REST API.