iPool is the (i)ntelligent content Pool of Axel Springer SE. The aim of this project is to establish a company-wide semantic content pool for unified, standardized retrieval and distribution of content and publications. One of its key features is the semantic enrichment of content through natural language processing.
We use Basic Auth to authorize requests and protect our API.
When making a request to our API, you must always send the correct authorization header:
Authorization: Basic < TOKEN >
The token is built from the username and the secret, which are joined with a colon
and then Base64 encoded:
token(user: 'user', secret: 'password') --> Base64('user' + ':' + 'password') -->
A complete basic auth call using curl looks like the following:
curl -k --user user:password -X GET --header 'Accept: application/json'
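The token construction described above can be sketched in Python with the standard library only. The function name `basic_auth_header` is illustrative and not part of the iPool API, and the credentials are the placeholder values from the example.

```python
import base64


def basic_auth_header(user: str, secret: str) -> dict:
    """Build the Authorization header for HTTP Basic Auth."""
    # Join user name and secret with a colon, then Base64-encode the bytes.
    raw = f"{user}:{secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, as in the example above.
headers = basic_auth_header("user", "password")
print(headers["Authorization"])
```

The resulting dictionary can be passed as request headers to any HTTP client.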
The iPool provides a general search over news articles as well as a search over social content such as Twitter and Instagram. Each document (for now, except social content) is enriched with additional keywords, extracted geo locations, organizations, events and mentioned persons. Depending on the endpoint, you will also get a reference to an external image or a link to a scaled-down thumbnail image.
To semantically enrich a piece of content, you POST the content wrapped in JSON to the endpoint. For example:
Text to enrich:
Ein Dorf in Nordschweden steht unter Schock: US-Präsident Donald Trump lag mit seiner anfangs nur belächelten Behauptung richtig, dass etwas in Schweden passiert sei.
You POST the following JSON to our endpoint:
"body": "Ein Dorf in Nordschweden steht unter Schock: US-Präsident Donald Trump lag mit seiner anfangs nur belächelten Behauptung richtig, dass etwas in Schweden passiert sei.",
The response tells you that Donald Trump is a person and that Nordschweden is a geolocation.
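The enrichment request described above could be prepared in Python roughly as follows. This is a minimal sketch under assumptions: `ENRICH_URL` is a hypothetical placeholder because the real endpoint is not named in this section, the helper name `build_enrich_request` is illustrative, and only the `body` field from the example is sent (the service may accept further fields).

```python
import json
import urllib.request

# Hypothetical URL: the real enrichment endpoint is not given in this section.
ENRICH_URL = "https://ipool.example.invalid/api/enrich"

sample_text = (
    "Ein Dorf in Nordschweden steht unter Schock: US-Präsident Donald Trump "
    "lag mit seiner anfangs nur belächelten Behauptung richtig, dass etwas in "
    "Schweden passiert sei."
)


def build_enrich_request(text: str, auth_header: str) -> urllib.request.Request:
    """Wrap the text in JSON and prepare (but do not send) a POST request."""
    payload = json.dumps({"body": text}).encode("utf-8")
    return urllib.request.Request(
        ENRICH_URL,
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": auth_header,
        },
    )


# "Basic <TOKEN>" stands in for a real Basic Auth header value.
request = build_enrich_request(sample_text, "Basic <TOKEN>")
print(request.get_method(), len(request.data), "bytes")
```

Sending is left out on purpose; `urllib.request.urlopen(request)` would perform the call once a real URL and token are filled in.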
We import a lot of social content from Twitter and extract any hashtags used. The content is grouped into categories.
To get the list of existing categories make a call to:
When you have a category (for example 'DOSSIER - Journalismus'), you can get the hashtags used in this category by calling:
As a result, you will get an array of buckets by time and topic.
Dates use the format yyyy-MM-dd'T'HH:mm:ss.SSSZZ, for example "2016-03-15T03:02:13.552Z".
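A timestamp in this format can be parsed with Python's standard library. The sketch below assumes Python 3.7 or newer, where the `%z` directive accepts both `Z` and offsets written with a colon; `parse_timestamp` is an illustrative helper name, not part of the API.

```python
from datetime import datetime, timezone

# strptime counterpart of the pattern yyyy-MM-dd'T'HH:mm:ss.SSSZZ.
# %f reads the fractional seconds, %z reads "Z" or an offset such as +01:00.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_timestamp(value: str) -> datetime:
    """Parse an iPool-style timestamp into a timezone-aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


parsed = parse_timestamp("2016-03-15T03:02:13.552Z")
print(parsed.astimezone(timezone.utc).isoformat())
```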
To get all articles related to Merkel you can do a basic search:
To get all articles related to "Angela Merkel" you can do a phrase search with quotes:
To get all articles related to Merkel AND Putin you can do a basic search combining the words
with AND:
api/search?q=Merkel AND Putin
To get all articles related to Merkel OR Putin you can do a basic search combining the words
with OR:
api/search?q=Merkel OR Putin
To get all articles related to Putin AND NOT Merkel you can do a search combining the words
with AND and NOT:
api/search?q=Putin AND NOT Merkel
To get all articles related to Putin AND NOT "Angela Merkel" you can do a search combining the
phrases with AND and NOT:
api/search?q=Putin AND NOT "Angela Merkel"
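The queries above contain spaces and double quotes, which need to be percent-encoded before they are sent as part of a URL. A minimal sketch, assuming the relative path `api/search` and the `q` parameter from the examples; `search_url` is an illustrative helper name.

```python
from urllib.parse import quote, urlencode

SEARCH_PATH = "api/search"  # relative path as used in the examples above


def search_url(query: str) -> str:
    """Return the search path with a percent-encoded q parameter."""
    # quote_via=quote encodes spaces as %20 and double quotes as %22.
    return f"{SEARCH_PATH}?{urlencode({'q': query}, quote_via=quote)}"


example_queries = [
    "Merkel",
    "Merkel AND Putin",
    "Merkel OR Putin",
    "Putin AND NOT Merkel",
    'Putin AND NOT "Angela Merkel"',
]
for q in example_queries:
    print(search_url(q))
```

Most HTTP clients do this encoding themselves when the query is passed as a parameter rather than pasted into the URL.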
Articles provided by web providers such as WELT usually contain a reference to an image
thumbnail. To get an image related to Obama from the publisher WELT you could do the
api/search?q="Obama"&publisher="WELT" and extract the image url from the
To include or exclude geo locations or persons you can use the text filter functionality. It works
for the following fields: events, geos, keywords, orgs, persons, products and categories.
To search for the event "Landtagswahl" that is related to geo location "Baden" but not "Stuttgart" you can do the following search:
The social search contains documents from Instagram and Twitter. Each returned document contains
To get all images from Instagram related to a query, such as Trump, you can do a social search:
First get an article you are interested in:
You now have a list of documents with each document having a field called 'identifier'. Use this
value to query the related endpoint.
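Collecting the 'identifier' values from a list of documents can be sketched as below. The sample documents are made up for illustration, and every field other than `identifier` is an assumption. How the documents are nested in the real search response is not shown in this section, so the helper simply takes a plain list.

```python
from typing import Any, Dict, List


def extract_identifiers(documents: List[Dict[str, Any]]) -> List[str]:
    """Return the 'identifier' of every document that has one."""
    return [doc["identifier"] for doc in documents if "identifier" in doc]


# Made-up sample data; real documents carry more fields.
sample_documents = [
    {"identifier": "doc-1", "title": "First article"},
    {"identifier": "doc-2", "title": "Second article"},
    {"title": "Document without identifier"},
]

identifiers = extract_identifiers(sample_documents)
print(identifiers)
```

Each value in the resulting list can then be used to query the related endpoint.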
Some publishers also provide categories for each document. This field is only available for the
The following publishers provide this additional category information: DPA, RTR, BLOOM, AFP, SID, AGT, AP and GLP
To get all articles related to Sport from publisher DPA you can do a basic search:
To get a list of all available publishers for articles you can do the following query:
This includes RSS feeds such as www.bild.de but also news agencies such as DPA.
To get a list of all available social media sources you can do the following query:
To get a list of all available news agencies you can do the following query: