Andrew Pollack's Blog

Technology, Family, Entertainment, Politics, and Random Noise

50 Percent Reduction in mail file size with no loss of data or functionality

By Andrew Pollack on 06/24/2008 at 01:48 PM EDT

This morning, I set up DAOS on my Domino 8.5 Beta server. It's a Win32 machine running in my office. My customer-facing servers are not upgraded yet. Since my mail file is replicated onto more than one clustered server, I saw no danger in giving DAOS a try today. So far, everyone I know who's been testing it is increasingly comfortable with its ability to handle problems as well as or better than data stored natively in the NSF.

DAOS pulls file attachments out of the NSF and stores them in an arcane file tree on disk maintained by the Domino server. It eliminates duplicates of the same file (based on hash values), and it is TOTALLY transparent to users. When you open documents, you see the attachments as normal. If you have local replicas, they're unaffected. You literally cannot tell unless you're in the Admin client and looking for the information. It's what the old "single copy object store" was supposed to do (at least the way we wanted it to), but that never worked out at all.
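To make the "based on hash values" idea concrete, here is a minimal sketch of a content-addressed store in Python. It is not how DAOS is implemented: the `AttachmentStore` class, its method names, and the choice of SHA-256 are all assumptions made for illustration, since the hash DAOS uses and its on-disk layout aren't covered here.

```python
import hashlib


class AttachmentStore:
    """Toy content-addressed store (hypothetical, not the DAOS format)."""

    def __init__(self):
        self.blobs = {}          # digest -> attachment bytes, one copy per digest
        self.logical_bytes = 0   # total bytes as the documents see them

    def put(self, data: bytes) -> str:
        # SHA-256 is an assumption; any strong content hash would do here.
        digest = hashlib.sha256(data).hexdigest()
        self.logical_bytes += len(data)
        # Identical content hashes to the same key, so it is stored only once.
        self.blobs.setdefault(digest, data)
        return digest  # the document keeps this small "ticket" instead of the file

    def get(self, ticket: str) -> bytes:
        return self.blobs[ticket]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


store = AttachmentStore()
deck = b"x" * 1000
t1 = store.put(deck)        # attached to a message in one mail file
t2 = store.put(deck)        # the same file attached in another mail file
t3 = store.put(b"y" * 500)  # a different attachment
```

Both copies of `deck` get the same ticket, so the store holds 1,500 bytes while the documents collectively reference 2,500 bytes.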

Setting it up means enabling it on the server document, setting a couple of options on the database advanced properties, and running a copy-style compact on the database. I knew I'd see some heavy benefit, but I didn't expect to see a 50% reduction in the total disk space used. At the end of the run, I'd gone from half a gigabyte to well under 250 MB. On top of that, nearly 200 MB of file data is now stored in DAOS, which means that view updates and anything else requiring a database scan are going to be much faster.

The really big value comes when you use DAOS in places where many mail users share the same files. The reduction in duplication should be tremendous.



re: 50 Percent Reduction in mail file size with no loss of data or functionality
By Chuck Hauble on 06/24/2008 at 02:07 PM EDT
Great news. Have you thought at all about how to architect the backups for
the DAOS attachments? I wonder if the Domino Backup APIs see them or if there
is some other process we will need to use.

backups...
By Andrew Pollack on 06/24/2008 at 02:13 PM EDT
One of the things that makes DAOS manageable is that it is more loosely
coupled to the NSF files. As long as your backups include the data tree
containing the attachments, you should be OK.

On top of this, deleting a document doesn't immediately delete the attachment
record (unless you want it to). The attachments are "pruned" periodically after
"n" days. This should also serve to make restores more trouble-free.
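As a rough illustration of pruning after "n" days, here is a small Python sketch. Everything in it is hypothetical: the `PrunableStore` class, the reference counting, and the 30-day window are assumptions chosen for the example, not a description of how the server actually tracks attachments.

```python
from datetime import datetime, timedelta


class PrunableStore:
    """Toy model: an attachment is removed only after being unreferenced for n days."""

    def __init__(self, prune_after_days: int):
        self.prune_after = timedelta(days=prune_after_days)
        self.refs = {}         # attachment key -> number of documents referencing it
        self.orphaned_at = {}  # attachment key -> time its last reference went away

    def add_ref(self, key: str) -> None:
        self.refs[key] = self.refs.get(key, 0) + 1
        # A restored or newly added document rescues an attachment awaiting pruning.
        self.orphaned_at.pop(key, None)

    def drop_ref(self, key: str, now: datetime) -> None:
        self.refs[key] -= 1
        if self.refs[key] == 0:
            self.orphaned_at[key] = now  # start the clock, but keep the data

    def prune(self, now: datetime) -> list:
        expired = [key for key, since in self.orphaned_at.items()
                   if now - since >= self.prune_after]
        for key in expired:
            del self.orphaned_at[key]
            del self.refs[key]
        return expired


store = PrunableStore(prune_after_days=30)  # 30 is an arbitrary example value
t0 = datetime(2008, 6, 24)
store.add_ref("deck")
store.drop_ref("deck", now=t0)                    # the only document is deleted
early = store.prune(now=t0 + timedelta(days=1))   # too soon, nothing is removed
late = store.prune(now=t0 + timedelta(days=30))   # the window has passed
```

The gap between `drop_ref` and `prune` is what helps restores: a document brought back within the window calls `add_ref` again and finds its attachment still there.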

File names in the DAOS tree are not 1:1 with the attachment names. The
on-disk schema is designed to be very robust and repairable.

re: 50 Percent Reduction in mail file size with no loss of data or functionality
By Paul Gagnon on 06/24/2008 at 03:57 PM EDT
That's awesome. We have some users approaching 10 gigs in size for their mail
files, and many of them have the same 15 MB PowerPoint and all its revisions.

What happens to the attachments when you need to do a hardware upgrade and load
Domino on the new box?

Depends
By Andrew Pollack on 06/24/2008 at 05:10 PM EDT
As long as you set up your new box by making a new replica on the new machine,
it would be totally transparent.

If you're doing it by copying data directories, you need to make sure you also
copy the data tree DAOS uses to store its data. Since the server document
references the location of the root of that DAOS tree, you want to make sure
that if it's not the same on the new box, it gets changed before you start the
server up.

It's really just a file system with lots of oddly named files in it. As long as
you make sure the server knows where it is, you should be fine.

One key thing is that you can store it on a different spindle from your
databases. Maybe put it out on a SAN and keep your NSFs on a local RAID
array.

I think this is going to be a big driver for a lot of companies to move to 8.5.

re: 50 Percent Reduction in mail file size with no loss of data or functionality
By Dave Harris on 06/25/2008 at 08:00 AM EDT
Andrew, 50% compared to what? If it's against ND7, then, yeah well, nothing to
write home about really (I mean, it is, but you'll see where I'm going with
this).

If it's against 8.0.1 with document/design compression enabled across the
board, then yes it's truly impressive: I already managed to squeeze a 35%
reduction on mail with that enabled, so a further 50% would be truly
remarkable.

re: 50 Percent Reduction in mail file size with no loss of data or functionality
By Yancy Lent on 06/26/2008 at 03:01 PM EDT
Great post, great feature. I was mainly looking for one piece of detail which
you answered: "based on hash values". There is always a need to know how files
are considered 'the same'.

Here is another great post about this: http://planetlotus.org/27c29b

re: 50 Percent Reduction in mail file size with no loss of data or functionality
By John possi on 06/27/2008 at 02:08 PM EDT
What about backup? My backup utility is file-system based. So if I back up
the DAOS directory and then a year later I need to restore one DB, how do I
know which DAOS files belong to each database?

Also, it sounds like if there are thousands of mails, there will be millions
of attachments in my file system. Not really good.

Somebody told me that a DAOS cache file can sometimes be opened and the
attachment content read, since it's not encrypted. They said that this
happens when you uncheck the compress checkbox in the file-attach dialog.

