0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 1 year has passed since last update.

はじめに

移植やってます。
( from python 3.7 to ruby 2.7 )

lxml.etree (Python)

from lxml import etree

parser = etree.XMLParser(remove_comments=True, ns_clean=True)
tree = etree.parse(source, parser=parser)

データをxml形式で取り扱いますので、外部ライブラリであるlxmlが使用されます。

classlxml.etree.XMLParser(self, encoding=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, schema: XMLSchema = None, huge_tree=False, remove_blank_text=False, resolve_entities=True, remove_comments=False, remove_pis=False, strip_cdata=True, collect_ids=True, target=None, compact=True)

オプション 初期値 備考
remove_comments False コメントを破棄する
ns_clean False 冗長な名前空間宣言をクリーンアップします

rexml (Ruby)

まずは、標準ライブラリであるrexmlを調べました。

rexml.rb
xml_string = <<-EOT
<root>
  <ele_0/>
  text 0
  <!--comment 0-->
  <?target_0 pi_0?>
  <![CDATA[cdata 0]]>
  <ele_1/>
  text 1
  <!--comment 1-->
  <?target_0 pi_1?>
  <![CDATA[cdata 1]]>
</root>
EOT
context = {ignore_whitespace_nodes: :all, compress_whitespace: :all}
d = REXML::Document.new(xml_string, context)
root = d.root
root.children.size # => 10
root.each {|child| p "#{child.class}: #{child}" }

"REXML::Element: <ele_0/>"
"REXML::Text: \n text 0\n "
"REXML::Comment: comment 0"
"REXML::Instruction: <?target_0 pi_0?>"
"REXML::CData: cdata 0"
"REXML::Element: <ele_1/>"
"REXML::Text: \n text 1\n "
"REXML::Comment: comment 1"
"REXML::Instruction: <?target_0 pi_1?>"
"REXML::CData: cdata 1"

<!-- -->を読み込まないか、"REXML::Comment:を出力しなければ、目的は達成できそうです。

メモ

  • Python の lxml.etree を学習した
  • 百里を行く者は九十里を半ばとす
0
0
3

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
0
0

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?