j***@krogh.cc
2014-10-20 19:03:59 UTC
Hi.
One of our "production issues" is that the system generates lots of
wal-files, lots is like 151952 files over the last 24h, which is about
2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself,
but the system does indeed work fine. We do subsequently gzip the files to
limit actual disk-usage, this makes the files roughly 30-50% in size.
That being said, along comes the backup, scheduled ones a day and tries to
read off these wal-files, which to the backup looks like "an awfull lot of
small files", our backup utillized a single thread to read of those files
and levels of at reading through 30-40MB/s from a 21 drive Raid50 of
rotating drives, which is quite bad. That causes a daily incremental run
to take in the order of 24h. Differential picking up larger deltas and
full are even worse.
One could of-course question a lot of the other things here, but
currently the only limiting factor is actually the backup being
able to keep up transporting the wal-log away from the system due
to the small wal size.
A short test like:
find . -type f -ctime -1 | tail -n 50 | xargs cat | pipebench > /dev/null
confirms the backup speed to be roughly the same as seen by the backup
software.
Another test from the same volume doing:
find . -type f -ctime -1 | tail -n 50 | xargs cat > largefile
And then wait for the fs to not cache the file any more and
cat largefile | pipebench > /dev/null
confirms that the disk-subsystem can do way (150-200MB/s) better on larger
files.
Thread here around the same topic:
http://postgresql.1045698.n5.nabble.com/How-do-you-change-the-size-of-the-WAL-files-td3425516.html
But not a warm welcoming workaround.
Suggestions are welcome. An archive-command/restore command that could
combine/split wal-segments might be the easiest workaround, but how about
crash-safeness?
One of our "production issues" is that the system generates lots of
wal-files, lots is like 151952 files over the last 24h, which is about
2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself,
but the system does indeed work fine. We do subsequently gzip the files to
limit actual disk-usage, this makes the files roughly 30-50% in size.
That being said, along comes the backup, scheduled ones a day and tries to
read off these wal-files, which to the backup looks like "an awfull lot of
small files", our backup utillized a single thread to read of those files
and levels of at reading through 30-40MB/s from a 21 drive Raid50 of
rotating drives, which is quite bad. That causes a daily incremental run
to take in the order of 24h. Differential picking up larger deltas and
full are even worse.
One could of-course question a lot of the other things here, but
currently the only limiting factor is actually the backup being
able to keep up transporting the wal-log away from the system due
to the small wal size.
A short test like:
find . -type f -ctime -1 | tail -n 50 | xargs cat | pipebench > /dev/null
confirms the backup speed to be roughly the same as seen by the backup
software.
Another test from the same volume doing:
find . -type f -ctime -1 | tail -n 50 | xargs cat > largefile
And then wait for the fs to not cache the file any more and
cat largefile | pipebench > /dev/null
confirms that the disk-subsystem can do way (150-200MB/s) better on larger
files.
Thread here around the same topic:
http://postgresql.1045698.n5.nabble.com/How-do-you-change-the-size-of-the-WAL-files-td3425516.html
But not a warm welcoming workaround.
Suggestions are welcome. An archive-command/restore command that could
combine/split wal-segments might be the easiest workaround, but how about
crash-safeness?
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers