"Input is not proper UTF-8" when outputting an ad feed

More than 3 years have passed since last update.

Symptom

After outputting an ad feed file, loading it in a browser such as Chrome or Firefox results in an error.

This page contains the following errors:
error on line 752 at column 910:Input is not proper UTF-8,indicate encoding!
Bytes: 0xE7 0x9C 0x8B 0xE8

Cause

As the message says, the file contains bytes that are invalid as UTF-8 (they cannot be decoded).
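
To make the cause concrete, here is a small sketch built from the bytes in the error message. The first three bytes (0xE7 0x9C 0x8B) form one complete character, while 0xE8 is the lead byte of another three-byte character. The helper `utf8_from_bytes` and the trailing 0x3C byte are assumptions for illustration; the error message does not show what actually followed 0xE8 in the feed.

```ruby
# Hypothetical helper: builds a string tagged as UTF-8 from raw byte values.
def utf8_from_bytes(bytes)
  bytes.pack('C*').force_encoding(Encoding::UTF_8)
end

# The first three bytes from the error message are a complete character.
complete = utf8_from_bytes([0xE7, 0x9C, 0x8B])

# 0xE8 must be followed by two continuation bytes (0x80..0xBF).
# Here it is followed by 0x3C ("<"), an assumed next byte, so decoding fails.
truncated = utf8_from_bytes([0xE7, 0x9C, 0x8B, 0xE8, 0x3C])

puts complete.valid_encoding?
puts truncated.valid_encoding?
```

A character cut off in the middle like this is one plausible way to end up with such bytes, for example when text is truncated by byte length rather than by character count.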

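Solution

Removing the invalid bytes is enough, so I used the following as a reference.
http://stackoverflow.com/questions/8635578/how-to-check-whether-the-character-is-utf-8

The scrub method does this in one step.
The method below takes a file path, removes the invalid bytes, and writes the result back to the same file.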

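# Removes bytes that are invalid as UTF-8 and rewrites the file in place.
# String#scrub requires Ruby 2.1 or later.
def delete_invalid_char(file_path)
  # scrub('') replaces each invalid byte sequence with an empty string
  valid_string = File.read(file_path, encoding: Encoding::UTF_8).scrub('')
  File.write(file_path, valid_string)
end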
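
As a quick check of how this behaves, the sketch below runs the same approach against a temporary file. The sample content and the `scrub_demo` helper are made up for illustration; a real feed would be much larger.

```ruby
require 'tempfile'

# Same approach as the method above, repeated so this sketch runs on its own.
def delete_invalid_char(file_path)
  valid_string = File.read(file_path, encoding: Encoding::UTF_8).scrub('')
  File.write(file_path, valid_string)
end

# Hypothetical demo: writes sample bytes containing a stray 0xE8 to a
# temporary file, cleans the file, and returns the cleaned content.
def scrub_demo
  Tempfile.create('feed') do |file|
    file.binmode
    file.write("<title>ok\xE8</title>".b)
    file.flush
    delete_invalid_char(file.path)
    File.read(file.path, encoding: Encoding::UTF_8)
  end
end

cleaned = scrub_demo
puts cleaned
puts cleaned.valid_encoding?
```

Passing `''` drops the invalid bytes silently. Calling `scrub` with no argument on a UTF-8 string substitutes U+FFFD instead, which may be preferable when you want the damaged spots to stay visible in the output.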

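Verification

  • Check that the file no longer contains invalid byte sequences

    File.read('feed.xml', encoding: Encoding::UTF_8).valid_encoding?
    
  • Validate that the feed format itself is correct
    https://github.com/alexdunae/w3c_validators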

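    require 'w3c_validators'
    
    include W3CValidators
    
    # FeedValidator submits the file to the W3C feed validation service
    @validator = FeedValidator.new
    
    # results.errors holds any problems that were reported
    results = @validator.validate_file('feed.xml')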
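
If the first check returns false, it can help to know where the bad bytes are, much like the line number in the browser error. The following is a minimal sketch, not part of the original fix; `invalid_utf8_lines` is a hypothetical helper name.

```ruby
# Hypothetical helper: returns the 1-based numbers of the lines in the file
# that are not valid UTF-8.
def invalid_utf8_lines(file_path)
  bad_lines = []
  # Read in binary mode so that line splitting does not depend on the encoding.
  File.open(file_path, 'rb') do |file|
    file.each_line.with_index(1) do |line, number|
      # Tag the raw bytes as UTF-8, then ask Ruby whether they decode cleanly.
      bad_lines << number unless line.force_encoding(Encoding::UTF_8).valid_encoding?
    end
  end
  bad_lines
end

# Only runs when a feed.xml exists in the current directory.
p invalid_utf8_lines('feed.xml') if File.exist?('feed.xml')
```

An empty array means every line decoded cleanly, which should agree with `valid_encoding?` returning true for the whole file.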
    