A service for mining changes and interactions from revisioned writing platforms.
The APIs provide provenance and change information about the tokens a Wikipedia article consists of, for
several languages. Apart from the source language edition they draw from, their specifications and usage
are identical, as described below.
WikiWho API EN
WikiWho API DE
WikiWho API EU
WikiWho API TR
WikiWho API ES
The WikiWho wrapper provides easy access to the API:
pip install wikiwho_wrapper
from wikiwho_wrapper import WikiWho

ww = WikiWho(lng='de')  # or WikiWho(USERNAME, PASSWORD, lng='de')

# You can either use api to directly access the JSON (raw format from api.wikiwho.net)
response = ww.api.all_content("Bioglass")

# Or you can use the dataview to obtain a pandas DataFrame with the data
dataView = ww.dv.all_content("Bioglass")
For each article page, the API mirrors its current state on Wikipedia. The API is based on the WikiWho algorithm (~95% accuracy).
Currently, there is a limit of 2000 requests/day for unregistered users and a limit of 60 requests/minute for all users.
Terminology used:
See the description of the different query types for more information.
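A client that issues many calls may want to stay under the 60 requests/minute limit mentioned above on its own side. The following is a minimal sketch of a sliding-window throttle, not part of the WikiWho wrapper; the class name `RequestThrottle` and its parameters are hypothetical and chosen only for illustration.

```python
import time
from collections import deque


class RequestThrottle:
    """Hypothetical client-side limiter: at most max_calls per period seconds."""

    def __init__(self, max_calls=60, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._stamps = deque()  # times of the calls still inside the window

    def wait(self):
        """Block until another request may be sent, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the sliding window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: sleep until the oldest call expires, then re-check.
            self._sleep(self.period - (now - self._stamps[0]))
```

Calling `throttle.wait()` immediately before each API request (for example before `ww.api.all_content(...)`) keeps the client within the per-minute limit. The daily limit for unregistered users would need a separate counter.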
A dataset with this data (until Nov. 2016, no redirects) is available for download at https://doi.org/10.5281/zenodo.345571.
Please cite it as well if you use data from this API in your research (note that the dataset excludes redirect articles and tokenization can slightly differ from the API version, as we continuously improve it).
An example call: Cologne
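As an illustration of what such a call might look like when built by hand, the sketch below assembles a request URL for an article title. The host `api.wikiwho.net` is the one named in the wrapper example; the path layout, the version string `v1.0.0-beta`, and the query flags shown in the usage note are assumptions that should be checked against the query type descriptions of the respective language API.

```python
from urllib.parse import quote, urlencode

BASE = "https://api.wikiwho.net"   # host named in the wrapper example
API_VERSION = "v1.0.0-beta"        # assumption: verify against the API description


def all_content_url(title, lng="en", **flags):
    """Build an all_content URL for one article (path layout is an assumption)."""
    # Percent-encode the title so spaces and slashes are safe in the path.
    path = f"{BASE}/{lng}/api/{API_VERSION}/all_content/{quote(title, safe='')}/"
    # Boolean flags are rendered as lowercase true/false query parameters.
    query = urlencode({k: str(v).lower() for k, v in sorted(flags.items())})
    return f"{path}?{query}" if query else path
```

For example, `all_content_url("Cologne", editor=True, o_rev_id=True)` yields a URL ending in `/all_content/Cologne/?editor=true&o_rev_id=true`; the flag names `editor` and `o_rev_id` are likewise assumed here.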
This collection of APIs can be thought of as an additional service on top of the core WikiWho data described above, available for the same languages. The same term descriptions as above apply.
The goal is to deliver annotated HTML of a Wiki article that can be read by a browser (instead of annotated, tokenized Wikitext as delivered by the primary API).
Annotations available per token (realized via <span>) are currently:
plus certain metadata (e.g., ‘present’ authors and their percentages of words originally written in the current revision, revision list with metadata).
The main use case so far (hence the name) is colored annotation of text parts with this meta information, for example in a Grease-/Tampermonkey userscript we developed, which runs in a browser extension and can be used on any Wikipedia article to find out who wrote and changed which words via simple visual inspection. Find out more here.
Note that this API project is still in alpha. Several elements of Wiki pages cannot yet be annotated, such as tables, infoboxes (or any templates), and certain references. To find out more about the backend that delivers this data, and to file issues or even contribute (always welcome!), check our GitHub repository.