News API Python: Importing web-based news articles

Overview

The way we consume information has changed a lot. We have become news junkies recently; in particular, we have been reading a lot of international news to gauge what stage of Covid-19 our country is in.

To do this, we visit news sites from many countries. This gave us an idea: why not build an international news dataset for the coronavirus? This post describes how we created that dataset with the News API.

News API

The data comes from the News API, which gives free access to articles from leading news outlets in many countries. The caveats for free accounts are that we can call the API only 500 times a day and that a single query returns at most 100 results.

We worked around these limits by keeping the number of API calls low and by splitting the past month of news across multiple filters, so that each query stays under the result cap while the combined dataset is still large.
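One way to split a month of news across several queries is to give each query its own date window. The helper below is a minimal sketch of that idea, not the exact filters we used; the name daily_windows, the 30-day default and the one-day window size are assumptions chosen for illustration.

```python
from datetime import date, datetime, time, timedelta


def daily_windows(end_day, days=30):
    """Return one (start, end) ISO 8601 timestamp pair per day, oldest first."""
    windows = []
    for offset in range(days - 1, -1, -1):
        day = end_day - timedelta(days=offset)
        # Each window runs from the first to the last second of one day.
        start = datetime.combine(day, time(0, 0, 0)).isoformat()
        end = datetime.combine(day, time(23, 59, 59)).isoformat()
        windows.append((start, end))
    return windows


windows = daily_windows(date(2020, 5, 31), days=3)
for start, end in windows:
    print(start, "->", end)
```

Each pair can then be used as the date range of one query, so a busy month is spread over many small result sets instead of one capped at 100.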

Python has no built-in support for the News API, so we first install the client package, newsapi-python, with pip (pip install newsapi-python) before running the first queries.

Now let's look at the code that uses this package to retrieve news data in bulk:

# Import the required libraries
import pandas as pd
import datetime
from datetime import timedelta
from newsapi.newsapi_client import NewsApiClient

# Create the client; replace the placeholder with your own API key
newsapi = NewsApiClient(api_key='YOUR_API_KEY')

Paste in your own API key, which you get after registering on the News API website. The client gives us access to three main functions.
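If you would rather not paste the key into the script, one option is to read it from an environment variable. This is only a sketch: the variable name NEWSAPI_KEY and the helper load_api_key are assumptions made for this example, not something the News API or its client requires.

```python
import os


def load_api_key(env=None, name="NEWSAPI_KEY"):
    """Read the API key from an environment mapping, failing loudly if absent."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable to your News API key")
    return key


# Demonstration with an explicit mapping so the example runs anywhere.
# In a real script, call load_api_key() with no arguments instead.
demo_key = load_api_key({"NEWSAPI_KEY": "example-key"})
print(demo_key)
```

The returned string can be passed to NewsApiClient(api_key=...) in place of a hard-coded value, which keeps the key out of shared notebooks and version control.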

a) A function to get the top headlines (recent news) for a country:

json_data = newsapi.get_top_headlines(language='en', country='in')
data = pd.DataFrame(json_data['articles'])
data.head()

Here country='in' stands for India; pass the two-letter code of whichever supported country you want.
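Each article in the response carries its source as a nested dictionary, which ends up as a column of dictionaries in the DataFrame. The sketch below flattens that field before building the table. The helper name flatten_article and the sample record are made up for illustration; the sample only mimics the shape of one entry in json_data['articles'].

```python
def flatten_article(article):
    """Copy an article dict, replacing the nested 'source' dict with two flat keys."""
    row = {key: value for key, value in article.items() if key != "source"}
    source = article.get("source") or {}
    row["source_id"] = source.get("id")
    row["source_name"] = source.get("name")
    return row


# Made-up record shaped like one entry of json_data['articles'].
sample = {
    "source": {"id": "the-hindu", "name": "The Hindu"},
    "title": "Example headline",
    "url": "https://example.com/story",
    "publishedAt": "2020-05-31T08:00:00Z",
}
row = flatten_article(sample)
print(row)
```

With real data, pd.DataFrame([flatten_article(a) for a in json_data['articles']]) gives plain source_id and source_name columns that are easier to filter and group on.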


b) A function to get “Everything” related to a query, optionally restricted to particular sources. Its parameters are described in the News API documentation for the everything endpoint.
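As a minimal sketch of such a query: the wrapper search_articles, the search term and the StubClient class are assumptions made for this example. The method get_everything and its keyword arguments (q, sources, from_param, to, language, sort_by, page_size) are, to the best of our knowledge, those of the newsapi-python client, so check them against the version you have installed. The stub stands in for the real client so the example runs without an API key.

```python
def search_articles(client, query, sources, start, end):
    """Fetch one page (up to 100 articles) matching query from the given sources."""
    response = client.get_everything(
        q=query,
        sources=",".join(sources),  # the API expects a comma-separated string
        from_param=start,
        to=end,
        language="en",
        sort_by="publishedAt",
        page_size=100,  # the per-query result limit for free accounts
    )
    return response.get("articles", [])


class StubClient:
    """Stand-in for NewsApiClient so the sketch runs without an API key."""

    def __init__(self):
        self.last_call = None

    def get_everything(self, **params):
        self.last_call = params
        return {
            "status": "ok",
            "totalResults": 1,
            "articles": [{"title": "Example", "url": "https://example.com/a"}],
        }


stub = StubClient()
articles = search_articles(
    stub, "covid", ["the-hindu", "the-times-of-india"], "2020-05-01", "2020-05-31"
)
print(len(articles), stub.last_call["sources"])
```

In a real run, pass the newsapi object created earlier instead of the stub.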

c) A function to get the list of sources for a country programmatically. We can then use these sources to pull data from the “everything” API:

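def get_sources(country):
    # Ask the API for every source registered under the given country code
    response = newsapi.get_sources(country=country)
    # Keep only the source identifiers, which the "everything" API expects
    return [source['id'] for source in response['sources']]

sources = get_sources(country='in')
print(sources[:5])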

Output: ['google-news-in', 'the-hindu', 'the-times-of-india']

We used all of the functions above to build a dataset that refreshes at a regular cadence, calling them in a loop to download the data.
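The loop itself is not reproduced in this post, so what follows is only a sketch of how such a loop might look, not the original code. The function collect_articles, its max_requests budget and the StubClient class are hypothetical; the get_everything keyword arguments are assumed to match the newsapi-python client. The sketch queries every source for every date window, drops duplicate URLs, and stops early when the request budget runs out or the client raises an error.

```python
def collect_articles(client, query, sources, windows, max_requests=500):
    """Query every (source, window) pair once and de-duplicate results by URL."""
    seen, rows, requests_made = set(), [], 0
    for source in sources:
        for start, end in windows:
            if requests_made >= max_requests:
                return rows  # daily budget used up; keep what we have
            requests_made += 1
            try:
                response = client.get_everything(
                    q=query, sources=source, from_param=start, to=end, page_size=100
                )
            except Exception as error:  # for example, a rate-limit error
                print(f"Stopping early: {error}")
                return rows
            for article in response.get("articles", []):
                url = article.get("url")
                if url not in seen:
                    seen.add(url)
                    rows.append(article)
    return rows


class StubClient:
    """Stand-in that returns the same article for every call."""

    def __init__(self):
        self.calls = 0

    def get_everything(self, **params):
        self.calls += 1
        return {"articles": [{"title": "Example", "url": "https://example.com/a"}]}


stub = StubClient()
rows = collect_articles(
    stub,
    "covid",
    ["the-hindu", "the-times-of-india"],
    [("2020-05-30", "2020-05-30"), ("2020-05-31", "2020-05-31")],
)
print(stub.calls, len(rows))
```

Querying one source per request costs more calls than joining the sources into one string, but each source then gets its own 100-result allowance. The collected rows can be turned into a table with pd.DataFrame(rows).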

If any of this is unfamiliar, start with the basics first. The News API is free to use, but its rate limits can still kick in even after our attempts to stay under them.

Conclusion

In this post we extracted web-based Covid news using Python and the News API. The resulting data is sufficient for many kinds of analysis. Thanks to the News API for saving us time.
