io_splice RubyGem user+dev discussion/patches/pulls/bugs/help
From: "Iñaki Baz Castillo" <ibc@aliax.net>
To: ruby.io.splice@librelist.com
Subject: Some benchmarks
Date: Wed, 22 Dec 2010 15:01:06 +0100
Message-ID: <AANLkTinXs4NK5D-=hn9shO4r8LMRmLLP8ssvb12JhSBh@mail.gmail.com>
In-Reply-To: AANLkTi=M+qGa7G1PvYyB+fbJUzrersJQwtDkct3hZEiy@mail.gmail.com

Hi, I've run some benchmarks comparing FileUtils vs io_splice when
copying files on my computer (AMD 64 Phenom II X4 965). All times are
in seconds, as returned by Benchmark.realtime:


Test data:
- Source file size:  496 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:            0.00013375282287597656
- FileUtils.copy_stream:   0.0001666545867919922
- IO.splice:               6.341934204101562e-05


Test data:
- Source file size:  496 bytes
- Number of iterations:  10
Results:
- FileUtils.cp:            0.0027534961700439453
- FileUtils.copy_stream:   0.002769947052001953
- IO.splice:               0.0014998912811279297


Test data:
- Source file size:  496 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:            0.02746438980102539
- FileUtils.copy_stream:   0.018292665481567383
- IO.splice:               0.010256767272949219


Test data:
- Source file size:  496 bytes
- Number of iterations:  1000
Results:
- FileUtils.cp:            0.25453615188598633
- FileUtils.copy_stream:   0.13935613632202148
- IO.splice:               0.05126452445983887


Test data:
- Source file size:  16102 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:            0.00014328956604003906
- FileUtils.copy_stream:   0.0004949569702148438
- IO.splice:               0.00028967857360839844


Test data:
- Source file size:  16102 bytes
- Number of iterations:  10
Results:
- FileUtils.cp:            0.0018320083618164062
- FileUtils.copy_stream:   0.001699686050415039
- IO.splice:               0.0004978179931640625


Test data:
- Source file size:  16102 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:            0.039102792739868164
- FileUtils.copy_stream:   0.0330507755279541
- IO.splice:               0.01671743392944336


Test data:
- Source file size:  16102 bytes
- Number of iterations:  1000
Results:
- FileUtils.cp:            0.3610107898712158
- FileUtils.copy_stream:   0.35822200775146484
- IO.splice:               0.08929753303527832


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  1
Results:
- FileUtils.cp:            0.001172780990600586
- FileUtils.copy_stream:   0.0011954307556152344
- IO.splice:               0.0009520053863525391


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  10
Results:
- FileUtils.cp:            0.08811688423156738
- FileUtils.copy_stream:   0.09790825843811035
- IO.splice:               0.014630317687988281


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  100
Results:
- FileUtils.cp:            1.1194334030151367
- FileUtils.copy_stream:   1.5543222427368164
- IO.splice:               0.0931394100189209


Test data:
- Source file size:  1156222 bytes
- Number of iterations:  1000
Results:
- FileUtils.cp:            12.707785606384277
- FileUtils.copy_stream:   13.745135068893433
- IO.splice:               9.723489761352539



The script is below.

It's clear that io_splice pays off for big files (or for many
copies of the same small source file).

I don't understand why io_splice takes so long in the last test
(big file, 1000 copies). Maybe it spends more time initializing each
object within the benchmark block?
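
One difference visible in the script below may be relevant here: the two FileUtils loops overwrite the same DST_FILE on every iteration, while the IO.splice loop writes to a new file (DST_FILE + "_#{n}") each time. The following sketch only does the arithmetic for the last test, using the file size and iteration count reported above. The constant names are made up for illustration, and whether the extra data on disk actually explains the slowdown is an assumption that would need measuring.

```ruby
# Figures taken from the last test in this message.
FILE_SIZE  = 1_156_222 # bytes
ITERATIONS = 1000

# FileUtils.cp / FileUtils.copy_stream loops: the destination is always
# DST_FILE, so only one copy is left on disk at the end.
OVERWRITE_FOOTPRINT = FILE_SIZE

# IO.splice loop: the destination is DST_FILE_0 ... DST_FILE_999, so every
# copy stays on disk.
DISTINCT_FOOTPRINT = FILE_SIZE * ITERATIONS

puts "same destination:      #{OVERWRITE_FOOTPRINT} bytes left on disk"
puts "distinct destinations: #{DISTINCT_FOOTPRINT} bytes left on disk"
```

For a like-for-like comparison, all three loops would probably need to use the same destination scheme.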




Script:
---------------------------------------------------------
#!/usr/bin/ruby

require "fileutils"
require "benchmark"
require "io/splice"


SRC_FILE = ARGV[0]
DST_FILE = ARGV[1]
TIMES = ( ARGV[2] ? ARGV[2].to_i : 1 )


puts "Test data:"
puts "- Source file size:  #{File.size(SRC_FILE)} bytes"
puts "- Number of iterations:  #{TIMES}"


puts "Results:"

print "- FileUtils.cp:            "
puts Benchmark.realtime {
 TIMES.times do
   FileUtils.cp SRC_FILE, DST_FILE
 end
}


print "- FileUtils.copy_stream:   "
puts Benchmark.realtime {
 TIMES.times do
   FileUtils.copy_stream SRC_FILE, DST_FILE
 end
}


print "- IO.splice:               "
puts Benchmark.realtime {
 TIMES.times do |n|
   source = File.open(SRC_FILE, 'rb')
   dest = File.open(DST_FILE + "_#{n}", 'wb')
   source_fd = source.fileno
   dest_fd = dest.fileno

   # We use a pipe as a ring buffer in kernel space.
   # pipes may store up to IO::Splice::PIPE_CAPA bytes
   pipe = IO.pipe
   rfd, wfd = pipe.map { |io| io.fileno }

   begin
     nread = begin
       # first pull as many bytes as possible into the pipe
       IO.splice(source_fd, nil, wfd, nil, IO::Splice::PIPE_CAPA, 0)
     rescue EOFError
       break
     end

     # now move the contents of the pipe buffer into the destination file
     # the copied data never enters userspace
     nwritten = IO.splice(rfd, nil, dest_fd, nil, nread, 0)

     nwritten == nread or
       abort "short splice to destination file: #{nwritten} != #{nread}"
   end while true
 end
}
---------------------------------------------------------


I've tried declaring source, source_fd and pipe = IO.pipe before the
benchmark block, but then I get empty copied files (I need to declare
all of them within the benchmark block). I assume the test script can
be improved.
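
One likely explanation for the empty copies (an assumption, not verified against io_splice itself): an open file keeps a single read offset, so after the first full copy the source sits at EOF and later iterations find nothing to read. The sketch below shows the same effect using only the standard library's IO.copy_stream rather than IO.splice. The helper name copy_repeatedly and the 496-byte test file are made up for illustration.

```ruby
require "tmpdir"

# Hypothetical helper: copy an already-open source IO to dst_path several
# times and return the number of bytes copied on each pass.
def copy_repeatedly(src_io, dst_path, times, rewind:)
  Array.new(times) do
    src_io.rewind if rewind          # move the read offset back to 0
    IO.copy_stream(src_io, dst_path) # copies from the current offset
  end
end

RESULTS = Dir.mktmpdir do |dir|
  src_path = File.join(dir, "src")
  dst_path = File.join(dir, "dst")
  File.binwrite(src_path, "x" * 496)

  File.open(src_path, "rb") do |src|
    {
      without_rewind: copy_repeatedly(src, dst_path, 3, rewind: false),
      with_rewind:    copy_repeatedly(src, dst_path, 3, rewind: true)
    }
  end
end

p RESULTS
```

Without the rewind, only the first pass copies any data. If the same thing happens with IO.splice, reopening or rewinding the source on each iteration would allow the source file and the pipe to be created once, outside the timed block.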



Regards.


--
Iñaki Baz Castillo
<ibc@aliax.net>

