Module: MARC::NokogiriReader

Includes:
GenericPullParser
Defined in:
lib/marc/xml_parsers.rb

Overview

NokogiriReader uses the Nokogiri SAX parser to read a MARCXML document quickly. Because dynamically subclassing MARC::XMLReader is a little ugly, this module does not inherit from Nokogiri::XML::SAX::Document; instead it recreates all of that class's SAX event methods here.

Class Method Summary

Instance Method Summary

Methods included from GenericPullParser

#characters, #end_element_namespace, #start_element_namespace, #yield_record

Dynamic Method Handling

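This module handles dynamic methods through the method_missing method. SAX callbacks named in sax_methods are accepted and ignored; any other undefined method raises NoMethodError.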

- (Object) method_missing(methName, *args)



# File 'lib/marc/xml_parsers.rb', line 113

def method_missing(methName, *args)
  sax_methods = [:xmldecl, :start_document, :end_document, :start_element,
    :end_element, :comment, :warning, :error, :cdata_block]
  unless sax_methods.index(methName)
    raise NoMethodError.new("undefined method '#{methName}' for #{self}", 'no_meth')
  end
end
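The whitelist pattern above can be sketched in isolation. This is a minimal illustration, not the library's code: QuietHandler is a hypothetical class, and the respond_to_missing? override is an addition that the original method does not have. It shows known SAX callbacks being swallowed while any other call still ends in NoMethodError.

```ruby
# Hypothetical stand-in for a SAX handler; not part of ruby-marc.
class QuietHandler
  SAX_METHODS = [:xmldecl, :start_document, :end_document, :start_element,
                 :end_element, :comment, :warning, :error, :cdata_block].freeze

  # Swallow known SAX callbacks; defer to Ruby's default for anything else,
  # which raises NoMethodError.
  def method_missing(name, *args)
    return nil if SAX_METHODS.include?(name)
    super
  end

  # Assumption: keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    SAX_METHODS.include?(name) || super
  end
end

handler = QuietHandler.new
handler.start_document        # accepted and ignored
handler.comment("ignored")    # accepted and ignored

begin
  handler.no_such_callback
rescue NoMethodError => e
  puts e.message
end
```

Calling super instead of building the exception by hand lets Ruby produce its standard message, which avoids formatting slips in a hand-written string.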

Class Method Details

+ (Object) extended(receiver)



# File 'lib/marc/xml_parsers.rb', line 93

def self.extended(receiver)
  require 'nokogiri'
  receiver.init
end
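The extended hook is what lets a parser backend be mixed into a single reader object at runtime, with its dependency loaded only at that point. A minimal sketch of the mechanism follows; LazyBackend, Reader, and seen are hypothetical names, and the standard-library 'set' stands in for an optional dependency such as 'nokogiri'.

```ruby
# Hypothetical module and class names, used only for illustration.
module LazyBackend
  # Ruby calls this hook whenever an object does `obj.extend(LazyBackend)`.
  def self.extended(receiver)
    require 'set'    # stand-in for an optional dependency such as 'nokogiri'
    receiver.init    # let the module set up its own state on the object
  end

  def init
    @seen = Set.new
  end

  def seen
    @seen
  end
end

class Reader; end

reader = Reader.new
reader.extend(LazyBackend)   # triggers LazyBackend.extended(reader)
reader.seen << :record
puts reader.seen.size
```

Only the extended instance gains the module's methods; other Reader instances are unaffected.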

Instance Method Details

- (Object) attributes_to_hash(attributes)



# File 'lib/marc/xml_parsers.rb', line 123

def attributes_to_hash(attributes)
  hash = {}
  attributes.each do | att |
    hash[att.localname] = att.value
  end
  hash
end
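To illustrate what attributes_to_hash produces, here is a small sketch that runs without Nokogiri. Attr is a hypothetical Struct that models only the localname and value fields the method reads; real SAX attribute objects are assumed to carry more fields, such as a prefix and a namespace URI.

```ruby
# Hypothetical stand-in for the attribute objects a namespace-aware SAX
# parser passes to its element callbacks; only two fields are modeled.
Attr = Struct.new(:localname, :value)

# Same behavior as the method above, written with each_with_object.
def attributes_to_hash(attributes)
  attributes.each_with_object({}) do |att, hash|
    hash[att.localname] = att.value
  end
end

attrs = [Attr.new("tag", "245"), Attr.new("ind1", "1"), Attr.new("ind2", "0")]
datafield = attributes_to_hash(attrs)
puts datafield["tag"]
```

Keying by localname drops any namespace prefix, so attributes are looked up by their bare names.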

- (Object) each(&block)

Loops through the MARC records in the XML document.



# File 'lib/marc/xml_parsers.rb', line 107

def each(&block)
  @block = block
  @parser.parse(@handle)
end
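The each method stores the caller's block and then hands control to the parser, which drives the handler's callbacks; the stored block is presumably invoked later, when a record is complete (yield_record in GenericPullParser). A minimal sketch of that inversion of control, using hypothetical names throughout (FakeParser, TokenReader, record_found) and whitespace-separated tokens in place of MARC records:

```ruby
# Hypothetical names throughout; FakeParser stands in for a real SAX parser.
class FakeParser
  def initialize(handler)
    @handler = handler
  end

  # "Parses" by reporting each whitespace-separated token to the handler.
  def parse(input)
    input.split.each { |token| @handler.record_found(token) }
  end
end

class TokenReader
  include Enumerable

  def initialize(input)
    @handle = input
    @parser = FakeParser.new(self)
  end

  # Remember the caller's block, then let the parser drive the callbacks.
  def each(&block)
    @block = block
    @parser.parse(@handle)
  end

  # Callback invoked by the parser; hands each finished item to the block.
  def record_found(token)
    @block.call(token)
  end
end

reader = TokenReader.new("rec1 rec2 rec3")
reader.each { |record| puts record }
```

Because each yields items one at a time from inside the parse, the caller can process a large document without holding every record in memory.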

- (Object) init

Sets the instance variables used for SAX parsing and creates the Nokogiri SAX parser.



# File 'lib/marc/xml_parsers.rb', line 99

def init
  @record = {:record=>nil,:field=>nil,:subfield=>nil}
  @current_element = nil
  @ns = "http://www.loc.gov/MARC21/slim"
  @parser = Nokogiri::XML::SAX::Parser.new(self)
end