This is an introduction to the Article module of Newspaper3k for Python 3 (on Python 2, use Newspaper instead).
Installation
$ pip install newspaper3k
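To confirm that the install worked, the package version can be looked up from Python. This is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is hypothetical and not part of Newspaper3k.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# The distribution is named "newspaper3k", while the import name is "newspaper".
print(installed_version("newspaper3k"))
```

If this prints None, pip most likely installed the package into a different Python environment than the one running the check.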
Usage
>>> from newspaper import Article
>>> article = Article('http://www.independent.co.uk/life-style/motoring/small-suvs-go-head-to-head-peugeot-3008-vs-toyota-c-hr-vs-seat-ateca-a7617746.html') # article URL
>>> article.download()
>>> article.html # get the raw HTML source
'<!DOCTYPE html>\n<!--[if IE 8]>\n<html class=...'
>>> article.parse() # parse the article content
>>> article.title # get the article title
'Small SUVs go head-to-head: Peugeot 3008 vs Toyota C-HR vs Seat Ateca'
>>> article.authors # get the article authors
['Graham Scott']
>>> article.publish_date # get the publication date and time
datetime.datetime(2017, 3, 8, 10, 44, 46, tzinfo=tzutc())
>>> article.text # get the article body text
"For it seemed like ever, if you had a small SUV big..."
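The interactive steps above can be collected into a small script. The following is a minimal sketch rather than anything prescribed by Newspaper3k: the helper names `fetch_article` and `article_to_dict` and the `preview_chars` parameter are hypothetical, and `fetch_article` assumes that newspaper3k is installed and the URL is reachable.

```python
def fetch_article(url):
    """Download and parse one article (needs network access and newspaper3k)."""
    # Imported here so the rest of the module works without newspaper3k installed.
    from newspaper import Article

    article = Article(url)
    article.download()  # fetch the raw HTML
    article.parse()     # extract title, authors, date and body text
    return article


def article_to_dict(article, preview_chars=80):
    """Collect the fields shown above from an already parsed article.

    `article` only needs the attributes title, authors, publish_date and
    text, so any object with those attributes works (hypothetical helper).
    """
    published = article.publish_date
    return {
        "title": article.title,
        "authors": list(article.authors),
        # publish_date may be empty when no date is found on the page.
        "published": published.isoformat() if published else None,
        "preview": article.text[:preview_chars],
    }
```

Calling `article_to_dict(fetch_article(url))` with the article URL from the session above would return the title, authors, an ISO 8601 date string, and the first 80 characters of the text in one dictionary. Keeping the download separate from the formatting means the second helper can be tried on any parsed article without touching the network.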