Technote on getNthDocument

Technote submitted by Andrew Pollack

There are many ways to speed up processing in loops. The key to most of them is to do as little as possible inside the loop: access the back-end database only when really needed, and never waste time. This technote deals specifically with the "getNthDocument" method.

Back in the R4.x days, I once spent several hours trying to figure out why my data dump got progressively slower as it went. I had assumed it had something to do with ODBC or the back-end database. I went as far as moving from Access to SQL Server as the back end (which meant finding and installing SQL Server). Nothing helped.

On a whim, I decided to comment out all code other than the loop through the NotesDocumentCollection and time it to see what happened. Guess what? Steadily increasing "time/hundred documents" values. That is to say, each hundred documents took longer to read than the previous hundred.
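The measurement described above can be reproduced with something like the following. This is only a sketch, not the original test code: it assumes an agent running against the current database, uses db.AllDocuments as a stand-in for whatever large collection you have, and reports the elapsed time for each batch of 100 documents with the built-in Timer function. The variable name batchStart is made up for illustration.

```lotusscript
Sub Initialize
    Dim session As New NotesSession
    Dim db As NotesDatabase
    Dim collection As NotesDocumentCollection
    Dim doc As NotesDocument
    Dim x As Long
    Dim batchStart As Single

    Set db = session.CurrentDatabase
    ' Assumption: any large collection will do for the measurement
    Set collection = db.AllDocuments

    batchStart = Timer
    For x = 1 To collection.Count
        Set doc = collection.GetNthDocument(x)
        ' Real processing deliberately left out so only the loop is timed
        If x Mod 100 = 0 Then
            Print "Through document " & x & ": " & _
                Format$(Timer - batchStart, "0.00") & " seconds for the last 100"
            batchStart = Timer
        End If
    Next
End Sub
```

Timer counts seconds since midnight, so a run that crosses midnight will show one nonsense value. If the per-batch figures climb steadily as x grows, you are seeing the same effect.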

After switching the method of looping through the documents, I observed that not only did the "time/hundred documents" stop increasing, it actually got slightly better as the server cache "noticed" what I was doing. By the end, I was at a steady, fast rate.

How much change are we talking about?

Each 100 documents I looped through took 0.1 seconds longer than the previous 100 documents. If that doesn't sound like much, go back to your math books! We're talking about a DELTA of 0.1 seconds per 100 documents.

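For every 1,000 documents in the loop, you are adding 1 second to a loop through 100 documents. By the time you hit 10,000 documents, you are adding 10 seconds for each 100 documents you read, and the extra time accumulated along the way already comes to roughly 500 seconds. Because the per-batch cost grows with every batch, the total grows with the square of the collection size.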
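To work these numbers for your own collection size, a small model like this may help. It is a sketch under stated assumptions: the 0.1-second delta comes from the text, the base time of the first batch is not given and is left out, and the names DELTA, BATCH_SIZE, and totalDocs are illustrative only.

```lotusscript
Sub Initialize
    ' Each batch of 100 documents costs DELTA seconds more than the one before it.
    ' Only the added time is modeled; the base time per batch is unknown.
    Const DELTA = 0.1
    Const BATCH_SIZE = 100

    Dim totalDocs As Long
    Dim batches As Long
    Dim batch As Long
    Dim extraForBatch As Double
    Dim cumulativeExtra As Double

    totalDocs = 10000          ' change this to match your collection
    batches = totalDocs \ BATCH_SIZE

    For batch = 1 To batches
        extraForBatch = DELTA * batch                     ' added cost of this batch
        cumulativeExtra = cumulativeExtra + extraForBatch ' running total of added cost
    Next

    Print "Extra seconds for the final batch of 100: " & extraForBatch
    Print "Total extra seconds over " & totalDocs & " documents: " & cumulativeExtra
End Sub
```

Doubling totalDocs doubles the cost of the final batch but roughly quadruples the total, which is why the slowdown feels mild at first and crippling later.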

This made working with the data virtually impossible.

I made only one change. Instead of code like this:

For x = 1 To collection.Count
    Set doc = collection.GetNthDocument(x)
    <code here>
Next x

I set my code to look like this:

Set doc = collection.GetFirstDocument
While Not doc Is Nothing
    <code here>
    Set doc = collection.GetNextDocument(doc)
Wend

And the result? No increase in the "time/hundred documents" value. In fact, because the server was caching what I was doing, I saw a net decrease over the first few thousand documents, followed by an astonishingly steady rate.

For me? What hadn't finished in 6 hours ran in about 10 minutes.

Test Environment:

Notes Server:
Pentium II - 233 MHz
128 MB of RAM
Notes 4.61
Partitioned server. This is one of TWO active servers

Database Server:
Microsoft SQL Server 6.5

Network:
10-Base-T - Single low-end hub between workstation & server

Workstation:
Notes 4.61b
Windows 95 (OEM w/ all service packs)
Pentium Pro - 233 MHz
60 MB of RAM