Pipeline Style: then, tap, and the Art of Chaining
Ruby's then method lets you pipe data through transformations like Unix pipes. tap lets you peek at data mid-chain without changing it. Combined with method chaining, you can build entire text processing pipelines that read top-to-bottom, left-to-right, just like a shell script.
Part 1: then / yield_self
then passes the receiver into a block and returns the block's result. Think of it like a Unix pipe: data flows through transformations.

    require "json"

    # Pipeline style - like Unix pipes
    # (data, parse, transform, and format are placeholders for your own code)
    result = data
      .then { |d| parse(d) }
      .then { |d| transform(d) }
      .then { |d| format(d) }

    # One-liner version
    File.read(path).then { JSON.parse(_1) }.then { _1["users"] }

    # Practical: read, parse, extract
    config = File.read("/etc/app.conf")
      .then { |text| text.split("\n") }
      .then { |lines| lines.reject { _1 =~ %r~^#~ || _1.strip.empty? } }  # skip comments and blank lines
      .then { |lines| lines.map { _1.split("=", 2) } }
      .then { |pairs| pairs.to_h }

No Perl equivalent. You'd use nested function calls or temporary variables. Ruby just lets the data flow.
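yield_self is the original name, introduced in Ruby 2.5; then was added as an alias in Ruby 2.6. Use then on Ruby 2.6+. The _1 block shorthand used in these examples needs Ruby 2.7+.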
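To try the config-parsing pipeline without touching a real file, here is a minimal sketch that runs it on an in-memory string. The sample text and the parse_config name are hypothetical, made up for illustration; only then, yield_self, split, reject, map, and to_h are real Ruby methods.

```ruby
# Hypothetical sample config text; stands in for File.read("/etc/app.conf")
SAMPLE_CONF = <<~CONF
  # database settings
  host=localhost
  port=5432

  url=postgres://localhost/app?mode=rw
CONF

# parse_config is an illustrative name, not part of any library
def parse_config(text)
  text
    .then { |t| t.split("\n") }
    .then { |lines| lines.reject { |l| l.start_with?("#") || l.strip.empty? } }
    .then { |lines| lines.map { |l| l.split("=", 2) } }  # limit 2 keeps "=" inside values
    .then { |pairs| pairs.to_h }
end

config = parse_config(SAMPLE_CONF)
p config

# yield_self and then are the same method under two names
same = 5.yield_self { |n| n + 1 } == 5.then { |n| n + 1 }
p same
```

The limit of 2 passed to split matters here: without it, the url value would be cut at its second "=".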
Part 2: tap (Debug Mid-Chain)
tap passes the receiver into a block but returns THE ORIGINAL OBJECT, not the block's result. Perfect for debugging without breaking the chain:

    # Inspect data at each step without changing it
    data.map(&:chomp)
      .tap { |x| STDERR.puts "After chomp: #{x.size} lines" }
      .select { _1 =~ %r~error~i }
      .tap { |x| STDERR.puts "After filter: #{x.size} lines" }
      .map(&:downcase)
      .tap { |x| STDERR.puts "Final: #{x.first(3).inspect}" }

The key difference:
- then returns the BLOCK'S result (transforms the value)
- tap returns the ORIGINAL value (inspects without changing)
tap is printf debugging for method chains. You'll use it more than you think.
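The then/tap contrast is easy to check side by side. This is a small sketch on made-up data; the variable names are illustrative only.

```ruby
numbers = [3, 1, 2]

# then returns whatever the block returns
then_result = numbers.then { |n| n.sort }

# tap ignores the block's return value and hands back the receiver itself
tap_result = numbers.tap { |n| n.sort }

# Record what each tap sees while the chain keeps flowing
seen = []
final = numbers
  .map { |n| n * 10 }
  .tap { |x| seen << x.size }
  .select { |n| n > 10 }
  .tap { |x| seen << x.size }

p then_result
p tap_result
p final
p seen
```

Because tap hands back the very same object, a mutating call inside the block (sort! instead of sort, for example) would change what flows down the chain. Keep tap blocks read-only when you use them for debugging.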
Part 3: Method Chaining Best Practices

    # Good: clean pipeline
    lines
      .map(&:strip)
      .reject(&:empty?)
      .select { _1 =~ %r~ERROR~ }
      .map { _1.split[0] }
      .uniq
      .sort

    # vs. Perl-style temp variables (also fine, more familiar):
    stripped = lines.map(&:strip)
    non_empty = stripped.reject(&:empty?)
    errors = non_empty.select { _1 =~ %r~ERROR~ }
    timestamps = errors.map { _1.split[0] }
    result = timestamps.uniq.sort

Both approaches are valid. Chaining is more concise; temp variables are more debuggable. Pick what reads best for your situation.
Part 4: A Real Text Processing Pipeline

    #!/usr/bin/env ruby

    # Process Apache access log - extract top IPs with errors

    File.readlines("/var/log/apache2/access.log")
      .map(&:chomp)
      .select { _1 =~ %r~ [45]\d{2} ~ }       # 4xx and 5xx responses
      .map { _1.split[0] }                    # extract IP (first field)
      .tally                                  # count occurrences
      .sort_by { |ip, count| -count }         # sort by count descending
      .first(10)                              # top 10
      .each { |ip, count| printf "%6d %s\n", count, ip }

Twelve lines. Reads from file, filters error responses, extracts IPs, counts them, sorts by frequency, takes the top 10, and prints a formatted report. No temporary variables. No loops. Just data flowing through transformations.
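To experiment with this pipeline without a real log file, one option is to wrap the chain in a method and feed it a few lines. This is a sketch under assumptions: the sample lines below are invented (Apache common log format), and top_error_ips is a hypothetical name, not something from the script above.

```ruby
# Invented sample lines in Apache common log format (status, then byte count last)
SAMPLE_LOG = [
  '10.0.0.1 - - [01/Jan/2026:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
  '10.0.0.2 - - [01/Jan/2026:00:00:02 +0000] "GET /missing HTTP/1.1" 404 128',
  '10.0.0.2 - - [01/Jan/2026:00:00:03 +0000] "GET /admin HTTP/1.1" 403 64',
  '10.0.0.3 - - [01/Jan/2026:00:00:04 +0000] "POST /api HTTP/1.1" 500 32',
  '10.0.0.1 - - [01/Jan/2026:00:00:05 +0000] "GET /ok HTTP/1.1" 304 0',
].freeze

# top_error_ips is an illustrative name; returns [ip, count] pairs, most frequent first
def top_error_ips(lines, limit = 10)
  lines
    .map(&:chomp)
    .select { |line| line =~ / [45]\d{2} / }   # 4xx and 5xx responses
    .map { |line| line.split[0] }              # IP is the first field
    .tally                                     # needs Ruby 2.7+
    .sort_by { |_ip, count| -count }
    .first(limit)
end

report = top_error_ips(SAMPLE_LOG)
report.each { |ip, count| printf "%6d %s\n", count, ip }
```

One thing to watch: the pattern matches any space-delimited three-digit number starting with 4 or 5. In the combined log format, where more fields follow the byte count, a size such as 512 would also match, so a stricter pattern anchored on the status field may be needed there.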
Part 5: Implicit Return
Ruby methods return the last expression automatically, which enables chaining:

    def double(n)
      n * 2  # no 'return' needed
    end

    # This works because each method returns a value that feeds the next
    "hello".upcase.reverse.chars.first  # => "O"

Perl subs also return the last evaluated expression, but an explicit return is the more common idiom there. In Ruby, every expression evaluates to a value, and that's what makes pipelines possible.
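Implicit return also applies to your own methods, so they slot into a then pipeline with no extra ceremony. A minimal sketch follows; normalize, words, and size_label are hypothetical helper names chosen for this example.

```ruby
# The last expression of each method is its return value
def normalize(text)
  text.strip.downcase
end

def words(text)
  text.split
end

# if/else is an expression too, so it can be a method's last line
def size_label(list)
  if list.size > 2
    "long"
  else
    "short"
  end
end

label = "  Hello Ruby World  "
  .then { |t| normalize(t) }
  .then { |t| words(t) }
  .then { |w| size_label(w) }

puts label
```

None of the three helpers contains the word return, yet each one feeds the next step.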
Created By: Wildcard Wizard. Copyright 2026