A function that requests multiple RSS feeds in parallel, then merges and returns them

Posted at 2014-05-22

This feature is in fairly high demand, so I wrapped it in a function that you can copy and paste.

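  • Most of this is lifted from @Hiraku's article.
  • Behavior is undefined when the returned XML does not conform to RSS 2.0. To handle that case properly, every field you need must be checked with isset or a similar construct.
Function definition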
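
As a rough illustration of the isset-based checking mentioned above, the sketch below tests whether a feed item carries every field that fetch_rss_items() reads. The helper name rss_item_is_complete and the sample feed are invented for this example and are not part of the original function.

```php
<?php
/**
 * Hypothetical helper (not part of fetch_rss_items): reports whether a parsed
 * feed item has every field that fetch_rss_items() reads.
 */
function rss_item_is_complete(SimpleXMLElement $channel, SimpleXMLElement $item)
{
    // Channel-level fields copied into site_title, site_description, site_link
    foreach (array('title', 'description', 'link') as $field) {
        if (!isset($channel->$field)) {
            return false;
        }
    }
    // Item-level fields
    foreach (array('title', 'description', 'link', 'pubDate') as $field) {
        if (!isset($item->$field)) {
            return false;
        }
    }
    // strtotime() returns false when the date string cannot be parsed
    return strtotime((string)$item->pubDate) !== false;
}

// Sample feed invented for this example: the second item lacks most fields
$xml = simplexml_load_string(
    '<rss version="2.0"><channel>'
    . '<title>Site</title><link>http://example.com/</link><description>Demo</description>'
    . '<item><title>A</title><link>http://example.com/a</link>'
    . '<description>First</description><pubDate>Thu, 22 May 2014 00:00:00 +0000</pubDate></item>'
    . '<item><title>B</title></item>'
    . '</channel></rss>'
);

foreach ($xml->channel->item as $item) {
    echo $item->title, ': ', rss_item_is_complete($xml->channel, $item) ? 'ok' : 'skip', "\n";
}
```

A check along these lines could guard the body of the item loop so that incomplete items are skipped instead of being stored with empty strings. Whether skipping is the right policy depends on the feeds you consume.
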
/**
 * fetch_rss_items
 *
 * @param array $urls One-dimensional array of RSS feed URLs
 * @return array Two-dimensional array of items, merged and sorted newest first (pubDate is already converted to a timestamp)
 * @link http://qiita.com/Hiraku/items/1c67b51040246efb4254
 * @link http://qiita.com/mpyw/items/77288c948b64eab1a3b3
 */
function fetch_rss_items(array $urls) {
    
    /* 0. Initialize the result array */
    $items = array();
    if (!$urls) {
        return $items;
    }
    
    /* 1. Prepare the cURL handles */
    $mh = curl_multi_init();
    foreach ($urls as $url) {
        $ch = curl_init();
        curl_setopt_array($ch, array(
            CURLOPT_URL => filter_var($url),
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_TIMEOUT => 5,
            CURLOPT_CONNECTTIMEOUT => 5,
            CURLOPT_ENCODING => 'gzip', // gzip-compressed responses usually transfer faster
        ));
        curl_multi_add_handle($mh, $ch);
    }
    
    /* 2. Start the requests */
    while (curl_multi_exec($mh, $running) === CURLM_CALL_MULTI_PERFORM);
    
    /* 3. Wait for responses */
    do {
        
        /* curl_multi_select() returns -1 on failure and 0 on timeout */
        if (curl_multi_select($mh, 5) === -1) {
            usleep(10);
        }
        while (curl_multi_exec($mh, $running) === CURLM_CALL_MULTI_PERFORM);
        
        /* Collect every transfer that has finished so far, so none is lost
           even when select failed or timed out on this pass */
        while ($info = curl_multi_info_read($mh)) {
            $xml = (string)curl_multi_getcontent($info['handle']);
            curl_multi_remove_handle($mh, $info['handle']);
            curl_close($info['handle']);
            if (!$xml = @simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA)) {
                continue;
            }
            foreach ($xml->channel->item as $item) {
                $items[] = array(
                    'title' => (string)$item->title,
                    'description' => (string)$item->description,
                    'link'  => (string)$item->link,
                    'timestamp' => strtotime((string)$item->pubDate),
                    'site_title' => (string)$xml->channel->title,
                    'site_description' => (string)$xml->channel->description,
                    'site_link' => (string)$xml->channel->link,
                );
            }
        }
        
    } while ($running);
    curl_multi_close($mh);
    
    /* 4. Sort by timestamp, newest first */
    usort($items, function ($a, $b) {
        return $b['timestamp'] - $a['timestamp'];
    });
    
    /* 5. Return the array */
    return $items;
    
}
Usage example
print_r(fetch_rss_items(array(
    'http://www.feedforall.com/sample.xml',
    'http://www.feedforall.com/sample-feed.xml',
)));
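
For reference, here is one way the returned array might be consumed. This is only a sketch: format_rss_items is a hypothetical helper, and the sample array is hand-written in the shape that fetch_rss_items() returns, with invented values and the description keys left out for brevity.

```php
<?php
/**
 * Hypothetical helper: formats the newest $limit entries returned by
 * fetch_rss_items() as plain-text lines.
 */
function format_rss_items(array $items, $limit = 10)
{
    $lines = array();
    // The input is assumed to be sorted newest first already
    foreach (array_slice($items, 0, $limit) as $item) {
        // strtotime() yields false when pubDate could not be parsed
        $date = $item['timestamp'] === false
            ? 'unknown date'
            : gmdate('Y-m-d', $item['timestamp']);
        $lines[] = sprintf('%s [%s] %s', $date, $item['site_title'], $item['title']);
    }
    return $lines;
}

// Invented sample data; real data would come from fetch_rss_items()
$items = array(
    array(
        'title' => 'Newer',
        'link' => 'http://example.com/a',
        'timestamp' => 1400716800,
        'site_title' => 'Site A',
    ),
    array(
        'title' => 'Undated',
        'link' => 'http://example.org/b',
        'timestamp' => false,
        'site_title' => 'Site B',
    ),
);

echo implode("\n", format_rss_items($items, 2)), "\n";
```

Because the comparison in the sort step treats false as 0, items with an unparsable pubDate end up at the tail of the array, which is why the sketch handles that case explicitly.
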