Wikipedia Data API

7 endpoints covering articles, full-text search, categories, links, and multilingual data. Clean JSON, no scraping, no continuation tokens, no HTML parsing.

What the Wikipedia API gives you

Wikipedia holds the world's largest body of structured encyclopaedic knowledge: millions of articles across hundreds of languages, all interlinked and categorised. The problem is that the raw MediaWiki API is verbose, inconsistent, and painful to work with at scale. The Anysite Wikipedia API wraps it in a clean, consistent REST interface: one POST call returns a fully structured article, section tree, category memberships, or search results. No parsing HTML, no dealing with continuation tokens, no maintaining fragile scrapers.

All Wikipedia endpoints

Endpoint                            What it returns
/api/wikipedia/articles             Full article metadata: title, description, summary, Wikidata ID, image, categories, langlinks, last revision, and canonical URL
/api/wikipedia/articles/categories  Category memberships for an article: category title, hidden flag, and canonical web URL
/api/wikipedia/articles/content     Full plain-text article content plus a section index (level, heading, byte offset) for structured navigation
/api/wikipedia/articles/langlinks   Interlanguage links: language code and localized title for each linked Wikipedia edition
/api/wikipedia/articles/links       Internal links within an article: title and canonical URL for each outbound link
/api/wikipedia/articles/search      Full-text search results: ranked articles with snippet, word count, size, and last-updated timestamp
/api/wikipedia/categories/members   Pages and subcategories belonging to a category: title, type, added date, and canonical URL

What you can build

Knowledge graphs

Traverse the Wikipedia link graph at scale. Use /articles/links and /categories/members to map conceptual relationships between topics, then build ontologies, populate knowledge bases, or feed entity resolution pipelines.
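
As a rough illustration of link-graph traversal, here is a minimal Python sketch of a breadth-first walk over /articles/links. It rests on assumptions that are not confirmed by this page: `post` is a hypothetical callable you supply that sends a payload to an endpoint and returns the decoded JSON, and the response is assumed to be a list of objects with a "title" key.

```python
from collections import deque
from typing import Any, Callable


def crawl_links(post: Callable[[str, dict], Any], start: str,
                max_depth: int = 1, language: str = "en") -> dict:
    """Breadth-first walk of outbound links, returning an adjacency map.

    Articles up to `max_depth` hops from `start` are expanded.
    `post(endpoint, payload)` is a hypothetical transport helper.
    """
    graph: dict = {}
    queue = deque([(start, 0)])
    while queue:
        title, depth = queue.popleft()
        if title in graph:
            continue  # already expanded, avoid cycles and repeat calls
        links = post("/api/wikipedia/articles/links",
                     {"article": title, "language": language})
        # Assumption: each entry exposes the linked page under "title".
        graph[title] = [link["title"] for link in links]
        if depth < max_depth:
            queue.extend((linked, depth + 1) for linked in graph[title])
    return graph
```

Injecting `post` keeps the traversal logic independent of the HTTP client, so it can be exercised with a stub before spending any credits. Keep `max_depth` small: the number of calls grows quickly with each extra hop.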

Content research and summarisation

Pull structured article content for any topic with /articles/content. The section tree lets you extract only the sections you need, which is ideal for LLM context windows, research assistants, and automated briefing tools.

NLP training datasets

Collect clean plain-text corpora across hundreds of languages. Use /articles/search to find articles by topic, then /articles/content to retrieve the full text. The output is multilingual, structured, and consistent.
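
The search-then-fetch flow described above might be sketched in Python as follows. Several details are assumptions: `post` is a hypothetical callable that sends a payload and returns decoded JSON, search results are assumed to be a list of objects with a "title" key, and passing "language" to the search endpoint is a guess (this page only confirms it for endpoints that take an article or category).

```python
from typing import Any, Callable, Iterator


def collect_corpus(post: Callable[[str, dict], Any], keyword: str,
                   count: int = 10, language: str = "en") -> Iterator[tuple]:
    """Yield (title, plain_text) pairs for the top search hits.

    `post(endpoint, payload)` is a hypothetical transport helper.
    """
    # Assumption: the search endpoint accepts "language" like the others.
    hits = post("/api/wikipedia/articles/search",
                {"keyword": keyword, "count": count, "language": language})
    for hit in hits:
        # Assumption: each hit carries the article title under "title".
        content = post("/api/wikipedia/articles/content",
                       {"article": hit["title"], "language": language})
        yield hit["title"], content["text"]
```

Because the function is a generator, documents can be written to disk one at a time rather than held in memory, which matters once `count` grows.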

Multilingual entity linking

Map named entities across languages using /articles/langlinks. Given an article in English, retrieve its equivalents in Spanish, Japanese, or Arabic, enabling cross-lingual disambiguation and translation pipelines.
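
A small Python sketch of that lookup is shown below. It is illustrative only: `post` is a hypothetical callable that sends a payload and returns decoded JSON, and the response is assumed to be a list of objects whose language code and localized title live under "language" and "title" (the real key names may differ).

```python
from typing import Any, Callable, Iterable


def localized_titles(post: Callable[[str, dict], Any], article: str,
                     wanted: Iterable[str], language: str = "en") -> dict:
    """Map each wanted language code to the article's title in that edition.

    Codes with no interlanguage link are left out of the result.
    `post(endpoint, payload)` is a hypothetical transport helper.
    """
    links = post("/api/wikipedia/articles/langlinks",
                 {"article": article, "language": language})
    # Assumption: entries look like {"language": "es", "title": "..."}.
    by_code = {link["language"]: link["title"] for link in links}
    return {code: by_code[code] for code in wanted if code in by_code}
```

Returning only the codes that exist makes a missing edition explicit to the caller instead of hiding it behind a placeholder value.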

Quick start

All Wikipedia endpoints use the access-token header for authentication. Here is a minimal example that fetches a Wikipedia article by title:

SHELL
curl -X POST https://api.anysite.io/api/wikipedia/articles \
  -H "access-token: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "article": "Python (programming language)",
    "language": "en"
  }'
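
For readers calling the API from code rather than the shell, the same request could be expressed with Python's standard library roughly as below. The host, header names, and body fields come from the curl example; the helper names `build_request` and `post` are hypothetical, and the sketch assumes the response body is JSON.

```python
import json
import urllib.request

BASE_URL = "https://api.anysite.io"  # host taken from the curl example


def build_request(endpoint: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for one endpoint."""
    return urllib.request.Request(
        url=BASE_URL + endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"access-token": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def post(endpoint: str, payload: dict, api_key: str, timeout: float = 30.0):
    """Send the request and decode the JSON body (assumed JSON response)."""
    request = build_request(endpoint, payload, api_key)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With these helpers, `post("/api/wikipedia/articles", {"article": "Python (programming language)", "language": "en"}, "YOUR_API_KEY")` would issue the same call as the curl command above.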

To retrieve the full plain-text content and section structure of an article:

SHELL
curl -X POST https://api.anysite.io/api/wikipedia/articles/content \
  -H "access-token: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "article": "Albert Einstein"
  }'
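
Once the content response is in hand, a single section can be cut out using the byte offsets in the section index. The Python sketch below shows one way to do that under stated assumptions: the key names "heading", "level", and "byte_offset" are hypothetical, offsets are assumed to count UTF-8 bytes, and sections are assumed to be listed in document order.

```python
def slice_section(content: dict, heading: str) -> str:
    """Return one section, including its subsections, from a content response."""
    # Offsets are in bytes, so slice the encoded text, not the str.
    raw = content["text"].encode("utf-8")
    sections = content["sections"]
    for i, section in enumerate(sections):
        if section["heading"] != heading:
            continue
        start = section["byte_offset"]
        end = len(raw)
        # The section ends where the next heading of equal or higher rank begins.
        for later in sections[i + 1:]:
            if later["level"] <= section["level"]:
                end = later["byte_offset"]
                break
        return raw[start:end].decode("utf-8")
    raise KeyError(f"no section titled {heading!r}")


# Hand-made sample in the assumed shape; "é" occupies two bytes in UTF-8,
# so "Life" starts at byte 9 even though it is character 8.
sample = {
    "text": "Intro é\nLife\nBorn.\nWork\nPhysics.\n",
    "sections": [
        {"index": 1, "level": 2, "heading": "Life", "byte_offset": 9},
        {"index": 2, "level": 2, "heading": "Work", "byte_offset": 20},
    ],
}
```

Slicing the encoded bytes rather than the string is the important design choice: with non-ASCII text, byte offsets and character positions diverge, and indexing the string directly would cut sections in the wrong place.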

To search Wikipedia for articles matching a query:

SHELL
curl -X POST https://api.anysite.io/api/wikipedia/articles/search \
  -H "access-token: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "keyword": "machine learning",
    "count": 10
  }'

Plans

Every Wikipedia endpoint is included in all Anysite plans, so there's no per-endpoint pricing to track. Use MCP Unlimited for flat-rate access through your AI tools, or a credit plan for REST & CLI at scale.

Frequently asked questions

What can I pass as the article identifier?
All article endpoints accept a title (e.g. "Albert Einstein"), a numeric page ID (e.g. "736"), or a full Wikipedia URL (e.g. "https://en.wikipedia.org/wiki/Photosynthesis"). Any of these forms resolves to the same article.

How do I retrieve articles in languages other than English?
Pass a language parameter with the two-letter Wikipedia language code, for example "de" for German, "fr" for French, or "ja" for Japanese. The default is "en". Every endpoint that accepts an article or category parameter also accepts language.

What does /articles/content return and how is it structured?
It returns the complete plain text of the article (no HTML markup) in a text field, along with a sections array. Each section entry carries an index, heading level, anchor slug, and byte offset into text, so you can slice out any individual section without parsing the full body.

How does the category members endpoint work?
Pass the category name with or without the Category: prefix; both "Machine learning" and "Category:Machine learning" are valid. The response lists every direct member (articles and subcategories), with each entry carrying its title, type, addition date, and canonical Wikipedia URL.

What is the difference between /articles/links and /articles/categories?
/articles/links returns the internal Wikipedia links within the article body, that is, the blue links to other Wikipedia pages. /articles/categories returns the categories the article has been assigned to by editors (shown at the bottom of a Wikipedia page). They serve different graph-traversal purposes: links give you the conceptual neighbourhood; categories give you the editorial classification.

Start using the Wikipedia API

Structured Wikipedia data (articles, search, categories, and link graphs) via one clean REST API.